Dify
Overview
Dify is an open-source development platform for building, managing, and deploying applications powered by large language models (LLMs). It offers an intuitive interface that integrates key features such as AI workflows, Retrieval-Augmented Generation (RAG) pipelines, agent-based capabilities, model management, and observability tools — all designed to help users move quickly from prototyping to production
Key Features
AI Workflow Management: Streamlines the end-to-end development process for LLM apps.
RAG Pipelines and Agent Capabilities: Supports Retrieval-Augmented Generation and agent functionality to handle complex data queries and automate tasks.
Model Management: Enables efficient model management for various LLM deployments.
Observability: Provides tools to monitor and improve app performance in real time.
Integrations
On the Apolo platform, Dify might be integrate enhanced with:
PostgreSQL with PGVector application for scalable, low-latency semantic data retrieval
Apolo Buckets to store binary objects (datasets) that users upload to the platform
vLLM application (streaming & non‑streaming completions). This integration should be performed manually
Text embeddings / vLLM serving embeddings model for document/context processing
This design allows users to upload documents, build production-grade ingestion pipelines and create chat applications on top of them. All of this done with zero data leakage — all assets are securely stored within users Apolo tenant.
Installing
In this guide, we presume you've already deployed vLLM Inference, Text embeddings and PostgreSQL applications, since we are going to show the integration process with those apps. The overview of application installation process via web console could be found here.
Below are the detailed instructions for installing Dify application using Apolo Console. For instructions on how to install it using Apolo CLI, visit the dedicated page.
Configure application
HTTP Ingress, authentication and authorization: allows you to enable or disable public domain creation for application and Apolo-powered HTTP authorization for the application API and web UI public domain.
We enable ingress, but disable authentication and authorization since Dify has it's own AAA mechanisms.
Dify API
This component under the hood coordinates communication among other application components and serves user's requests. It does not perform heavy compute by itself.
Replicas count: controls number of underlying replicas for the API. If you expect the load to be significant, deploy more than one instances.
We select a single replica here.
Resource preset: Here you select an appropriate preset that specifies CPU, memory. Since API does not do any compute, the preset with 1 vCPU and 2 GiB of RAM should be sufficient to serve our needs.
For our use-case we use cpu-medium
with exactly those numbers.
Dify worker
This component under the hood runs async data analysis tasks, processes datasets and stores the results in the configured storages, etc. It does not serve LLMs itself, however the amount of workers and their resources impact speed with which your datasets gets processed.
Replicas count: controls number of underlying workers to deploy. This also depends on the amount of data you as a user is going to upload for processing
We select a single replica here.
Resource preset: Here you select an appropriate preset that specifies CPU, memory. This application under the hood communicates with external services such as vLLM and text embeddings API to process datasets. It also does not perform heavy compute by itself, so the preset with 1 vCPU and 2 GiB of RAM should be sufficient to serve your needs.
For our use-case we use cpu-medium
with exactly those numbers.
Dify proxy
This component stands between API + Web services and the end user. Under the hood it's a simple Nginx reverse proxy. The smallest preset and a single replica should be fine here.
Dify web
This is the web UI application server without heavy compute footprint. The preset with ~ 1vCPU and 2 GiB of RAM is required.
Dify Redis
This is stand-alone cache service used while processing datasets and while serving some user requests. Currently, it's embedded into the Dify app, while in future it will be extracted to the dedicated application. Select similar to API resource preset here.
PostgreSQL integration
Dify uses PostgreSQL to store documents, their embeddings and metadata for later retrieval in chat. It also serves as a metrics storage and users metadata storage for Dify platform. In the opened window select PostgreSQL credentials from previously installed in Apolo PostgreSQL application.
When integrating with PostgreSQL app, make sure you are not using postgres
user, this will not work due to security reasons.
You can select the same user credentials for both cases: first Postgres User Credentials is used to store metadata and users information, while the second one is used to store embeddings using PGVector extension.
You can also configure access to the PostgreSQL instance managed by you, among requirements: user should be able to connect, create tables in own or public schema for the specified database. PGVector add-on should also be pre-installed in the database.
Installation
When all of the required parameters are provided, click "install" to start the installation process. You will be redirected to the application details page, where app status, inputs and outputs are displayed. Wait till the status is healty and outputs appear.
Afterwards, use external web app URL do access the Dify platform web UI and init password to create the first, root user within the platform.

Integrating vLLM and Text embedding
As for now, Dify does not allow programmatic configuration of LLM / embedding endpoints, therefore, users are obliged to add those configurations.
Follow the instructions on each tab in order to integrate vLLM and text embedding applications into the Dify.
After adding vLLM and Text embedding providers to Dify, the system will be ready to create and run your applications
Usage
Creating sample RAG application
Creating the app
In the Dify studio (landing page of Dify platform), click "Create from template" and select Knowledge Retreival + Chatbot template.
Name your app and hit "Create". This will create a basic flow for your application:

Setup knowledge
Select knowledge retrieval block (1), click on "+" sign on the Knowledge context menu (2) and navigate to the Knowledge creation page (3).
Creating a knowledge The Knowledge in the Dify platform is a collection of correlated data. More information could be fond in Dify documentation. We will not explain the knowledge creation in details, instead we create a Knowledge based on Kubernetes book, by uploading the book via web UI, selecting the default embedding model to process the data. This processing might take some time.
When done, you will see the processed dataset item and the approximate number of words in it:
Dify processed knowledge Setup LLM
In the LLM block, select the model from previously configured vLLM integration, while for the context, select the output of knowledge retrieval. Do not forget to select the previously prepared knowledge in the knowledge block. You can also parameterize the LLM prompt and bunch of other features.
The setup should look like this now:
Dify LLM setup Publishing and testing
At this step, your first RAG application is ready to be consumed by the end users. Click "publish" button. There you can find API description for this particular app that you can use to embed into your product, as well as directly share the application link.
Dify application publishing options Click "Run app" button to open a dedicated chat window and now you can ask some questions about the provided context.
Using Dify RAG application Monitoring applications
Application usage logs, metrics and much more can be found on the corresponding sections within the application studio.
Dify application usage overview
As for the API usage, each particular application created within the Dify platform comes with embedded API documentation, that can be viewed by navigating to "Access API reference" from Dify apps publishing options.
References
Last updated
Was this helpful?