Dify

Overview

Dify is an open-source development platform for building, managing, and deploying applications powered by large language models (LLMs). It offers an intuitive interface that integrates key features such as AI workflows, Retrieval-Augmented Generation (RAG) pipelines, agent-based capabilities, model management, and observability tools, all designed to help users move quickly from prototyping to production.

Key Features

AI Workflow Management: Streamlines the end-to-end development process for LLM apps.
RAG Pipelines and Agent Capabilities: Supports Retrieval-Augmented Generation and agent functionality to handle complex data queries and automate tasks.
Model Management: Enables efficient model management for various LLM deployments.
Observability: Provides tools to monitor and improve app performance in real time.

Integrations

On the Apolo platform, Dify can be integrated with:

PostgreSQL with PGVector application for scalable, low-latency semantic data retrieval
Apolo Buckets to store binary objects (datasets) that users upload to the platform
vLLM application (streaming and non-streaming completions). This integration must be performed manually.
Text embeddings application, or vLLM serving an embedding model, for document and context processing

This design allows users to upload documents, build production-grade ingestion pipelines and create chat applications on top of them. All of this is done with zero data leakage: all assets are securely stored within the user's Apolo tenant.

Installing

In this guide, we presume you have already deployed the vLLM Inference, Text embeddings and PostgreSQL applications, since we are going to show the integration process with those apps. An overview of the application installation process via the web console can be found here.

Below are detailed instructions for installing the Dify application using the Apolo Console. For instructions on how to install it using the Apolo CLI, visit the dedicated page.

Select the "Dify" application in the Apolo web console

Open the Apolo web console and go to the "Apps" section. We presume you are already logged in to the web console and are a member of an organization and a project.

Configure application

HTTP Ingress, authentication and authorization: allows you to enable or disable the creation of a public domain for the application, as well as Apolo-powered HTTP authorization for the application API and the web UI public domain.

We enable ingress but disable authentication and authorization, since Dify has its own AAA mechanisms.

Dify API

Under the hood, this component coordinates communication among the other application components and serves user requests. It does not perform heavy compute by itself.

Replicas count: controls the number of underlying replicas for the API. If you expect the load to be significant, deploy more than one instance.

We select a single replica here.

Resource preset: here you select an appropriate preset that specifies CPU and memory. Since the API does not perform heavy compute, a preset with 1 vCPU and 2 GiB of RAM should be sufficient to serve our needs.

For our use case we use cpu-medium, which has exactly those resources.

Dify worker

Under the hood, this component runs asynchronous data analysis tasks, processes datasets and stores the results in the configured storage. It does not serve LLMs itself. However, the number of workers and their resources affect the speed at which your datasets get processed.

Replicas count: controls the number of underlying workers to deploy. The right value also depends on the amount of data you are going to upload for processing.

We select a single replica here.

Resource preset: here you select an appropriate preset that specifies CPU and memory. Under the hood, the worker communicates with external services such as vLLM and the text embeddings API to process datasets. It also does not perform heavy compute by itself, so a preset with 1 vCPU and 2 GiB of RAM should be sufficient to serve your needs.

For our use case we use cpu-medium, which has exactly those resources.

Dify proxy

This component stands between API + Web services and the end user. Under the hood it's a simple Nginx reverse proxy. The smallest preset and a single replica should be fine here.

Dify web

This is the web UI application server, which has no heavy compute footprint. A preset with roughly 1 vCPU and 2 GiB of RAM is required.

Dify Redis

This is a stand-alone cache service used while processing datasets and while serving some user requests. Currently it is embedded into the Dify app; in the future it will be extracted into a dedicated application. Select a resource preset similar to the one chosen for the API.

PostgreSQL integration

Dify uses PostgreSQL to store documents, their embeddings and metadata for later retrieval in chat. It also serves as storage for metrics and user metadata for the Dify platform. In the window that opens, select the PostgreSQL credentials from the PostgreSQL application previously installed in Apolo.

When integrating with the PostgreSQL app, make sure you are not using the postgres user. This will not work for security reasons.

You can select the same user credentials for both cases: the first Postgres User Credentials entry is used to store metadata and user information, while the second one is used to store embeddings using the PGVector extension.

You can also configure access to a PostgreSQL instance that you manage yourself. The requirements are: the user must be able to connect to the specified database and create tables in its own or the public schema, and the PGVector extension must be pre-installed in the database.
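For a self-managed PostgreSQL instance, the requirements above can be checked before installation. The following is a minimal sketch, not part of Apolo or Dify: the function name check_prerequisites and the CHECKS table are hypothetical, and the sketch assumes you pass in an already opened DB-API connection (for example one created with psycopg2.connect). It only checks the public schema, and it relies on the fact that the PGVector extension registers itself under the name "vector".

```python
from typing import Any, Dict

# The guide says the built-in superuser must not be used.
FORBIDDEN_USERS = {"postgres"}

# Hypothetical check table: each query returns a single boolean column.
CHECKS = {
    "can_connect": (
        "SELECT has_database_privilege(current_user, current_database(), 'CONNECT')"
    ),
    "can_create_in_public": (
        "SELECT has_schema_privilege(current_user, 'public', 'CREATE')"
    ),
    "pgvector_installed": (
        "SELECT EXISTS (SELECT 1 FROM pg_extension WHERE extname = 'vector')"
    ),
}


def check_prerequisites(conn: Any, username: str) -> Dict[str, bool]:
    """Run every check on an open DB-API connection; return check name -> passed."""
    results = {"not_postgres_user": username not in FORBIDDEN_USERS}
    cur = conn.cursor()
    try:
        for name, sql in CHECKS.items():
            cur.execute(sql)
            row = cur.fetchone()
            results[name] = bool(row and row[0])
    finally:
        cur.close()
    return results
```

If any value in the returned dictionary is False, fix the corresponding privilege or install the extension before pointing Dify at that database.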

Installation

When all of the required parameters are provided, click "Install" to start the installation process. You will be redirected to the application details page, where the app status, inputs and outputs are displayed. Wait until the status is healthy.

Once the app is installed:

Click the Open App button at the top of the app details page to launch the Dify web UI in a new browser tab.
Find the init password in the Outputs section (scroll down on the app details page) - you'll need this to create the first root user.

Integrating vLLM and Text embedding

For now, Dify does not allow programmatic configuration of LLM and embedding endpoints, so users have to add those configurations manually.

Follow the instructions on each tab to integrate the vLLM and Text embeddings applications into Dify.

After adding the vLLM and Text embeddings providers to Dify, the system is ready to create and run your applications.
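Before entering endpoint URLs into Dify by hand, it can help to confirm that the endpoints respond. The sketch below uses only the Python standard library. It assumes OpenAI-compatible routes: vLLM's server exposes /v1/models and, when serving an embedding model, /v1/embeddings; whether your Text embeddings application exposes the same /v1/embeddings route is an assumption you should verify. The helper names and any host names you pass in are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_request(
    base_url: str,
    path: str,
    payload: Optional[Dict[str, Any]] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build a GET request (no payload) or a JSON POST request (with payload)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers=headers,
        method="POST" if payload is not None else "GET",
    )


def list_models(
    base_url: str, api_key: Optional[str] = None, timeout: float = 10.0
) -> List[str]:
    """Return the model ids reported by an OpenAI-compatible /v1/models route."""
    req = build_request(base_url, "/v1/models", api_key=api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return [model["id"] for model in body.get("data", [])]


def embedding_dimension(
    base_url: str, model: str, api_key: Optional[str] = None, timeout: float = 10.0
) -> int:
    """Embed a short probe string and return the length of the resulting vector."""
    payload = {"model": model, "input": "ping"}
    req = build_request(base_url, "/v1/embeddings", payload, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return len(body["data"][0]["embedding"])
```

The model id returned by list_models is the value to type into the model name field when adding the provider in Dify.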

Usage

Creating sample RAG application

Creating the app

In the Dify studio (the landing page of the Dify platform), click "Create from template" and select the Knowledge Retrieval + Chatbot template.
Name your app and click "Create". This creates a basic flow for your application.

Set up knowledge

Select the knowledge retrieval block (1), click the "+" sign in the Knowledge context menu (2) and navigate to the Knowledge creation page (3).

Creating a knowledge

A Knowledge in the Dify platform is a collection of related data. More information can be found in the Dify documentation. We will not explain knowledge creation in detail. Instead, we create a Knowledge based on a Kubernetes book by uploading the book via the web UI and selecting the default embedding model to process the data. This processing might take some time.
When done, you will see the processed dataset item and the approximate number of words in it:

Figure: Dify processed knowledge

Set up LLM

In the LLM block, select the model from the previously configured vLLM integration and, for the context, select the output of the knowledge retrieval block. Do not forget to select the previously prepared Knowledge in the knowledge retrieval block. You can also parameterize the LLM prompt and adjust a number of other settings.
The setup should now look like this:

Figure: Dify LLM setup

Publishing and testing

At this step, your first RAG application is ready to be used by end users. Click the "Publish" button. There you can find the API description for this particular app, which you can use to embed it into your product, as well as a direct link for sharing the application.

Figure: Dify application publishing options

Click the "Run app" button to open a dedicated chat window, where you can ask questions about the provided context.

Figure: Using Dify RAG application

Monitoring applications

Application usage logs, metrics and much more can be found in the corresponding sections of the application studio.

Figure: Dify application usage overview

As for API usage, each application created within the Dify platform comes with embedded API documentation, which can be viewed by navigating to "Access API reference" from the Dify app's publishing options.
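As an illustration of calling a published chat application from your own code, here is a minimal sketch using only the Python standard library. It assumes the chat endpoint is POST /chat-messages under the app's API base URL, authenticated with the app's API key as a Bearer token, and that a blocking response carries the reply in an "answer" field. Verify the exact route and fields against the API reference embedded in your app. The function names, the base URL and the user identifier are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def build_chat_request(
    base_url: str,
    api_key: str,
    query: str,
    user: str,
    conversation_id: str = "",
) -> urllib.request.Request:
    """Build a blocking (non-streaming) chat request for a published Dify app."""
    payload = {
        "inputs": {},                     # app-defined input variables, none here
        "query": query,                   # the end user's question
        "response_mode": "blocking",      # wait for the complete answer
        "user": user,                     # stable identifier of the end user
        "conversation_id": conversation_id,  # empty string starts a new conversation
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat-messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(
    base_url: str, api_key: str, query: str, user: str, timeout: float = 60.0
) -> Dict[str, Any]:
    """Send the question and return the decoded JSON response."""
    req = build_chat_request(base_url, api_key, query, user)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

A call such as ask(base_url, api_key, "What is a Pod?", "demo-user") would then return a dictionary from which the reply text can be read, with base_url set to the API base URL shown in your app's API reference.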

