How to Self-Host RAGFlow for Deep-Document RAG

When your corporate knowledge lives in gnarly PDFs, scanned contracts and dense spreadsheets, retrieval quality is everything — and that is RAGFlow’s whole point. This recipe stands it up on a VM you own, with Microsoft Entra login, Office/PDF/Markdown ingestion, and multiple per-team and personal knowledge bases.

Prep time: a day for a tuned pilot

One-time cost: €0 for software (RAGFlow is Apache-2.0 open source)

Ongoing cost: one beefier VM (~€60–150/mo) plus your LLM tokens

Ingredients

A Linux VM — Ubuntu 24.04. RAGFlow is hungrier than a chat UI: budget 4+ vCPU and 16 GB RAM minimum, 32 GB recommended (it runs Elasticsearch, MySQL, MinIO and Redis alongside the app), plus 100 GB+ disk.
Docker ≥ 24 and the Docker Compose v2 plugin.
The kernel setting vm.max_map_count ≥ 262144 (Elasticsearch refuses to start otherwise).
A DNS name and TLS (Caddy gives you one-line HTTPS).
Microsoft Entra ID admin rights to register an app for OIDC login.
An LLM + embedding backend — local Ollama, or any OpenAI-compatible / Azure OpenAI endpoint. RAGFlow’s full image also ships embedding models so you can start with nothing external.
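Before installing anything, a quick preflight can confirm the VM meets the minimums above. This is only a sketch: check_min is a hypothetical helper (not part of RAGFlow), the thresholds are simply the figures from this list, and free space on the root filesystem stands in for "disk".

```shell
#!/usr/bin/env bash
# Preflight sketch: compare this VM against the minimums from the list above.
# check_min is a hypothetical helper, not something RAGFlow ships.

check_min() {  # usage: check_min LABEL ACTUAL REQUIRED
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (need >= $3)"
  else
    echo "FAIL $1: $2 (need >= $3)"
    return 1
  fi
}

cpus=$(nproc)
# MemTotal is reported in kB; convert to whole GB, rounding up
ram_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 + 1048575) / 1048576}' /proc/meminfo)
# Free space on the root filesystem in GB (assumption: data lives under /)
disk_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
map_count=$(cat /proc/sys/vm/max_map_count)

check_min "vCPU"             "$cpus"      4
check_min "RAM (GB)"         "$ram_gb"    16
check_min "free disk (GB)"   "$disk_gb"   100
check_min "vm.max_map_count" "$map_count" 262144
```

A FAIL on vm.max_map_count is expected on a fresh VM; step 1 fixes it.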

Why RAGFlow over a chat-first tool? Its DeepDoc engine does layout-aware parsing: it understands tables, columns, headers and figures instead of flattening a PDF into a wall of text. It also offers template-based chunking per document type. If retrieval accuracy on complex documents is your top priority, RAGFlow is a strong candidate among the self-hostable options.

1 Prepare the VM

Install Docker, then raise the memory-map limit Elasticsearch needs and make it survive reboots.

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker "$USER"   # log out/in afterwards

# Elasticsearch requirement — apply now and persist
sudo sysctl -w vm.max_map_count=262144
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-ragflow.conf

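Open only ports 80 and 443 to the internet on the firewall, plus SSH (ideally restricted to your admin IPs) so you keep access to the VM. Every backend service stays on the internal Docker network; do not publish their ports on the host, because ports published by Docker typically bypass host firewalls such as ufw.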
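If the VM uses ufw (the usual firewall front end on Ubuntu), the rules could look like the sketch below. It assumes SSH is provided by the OpenSSH package, and it is a dry run by default: it only prints the commands until you set UFW="sudo ufw". The firewall_rules function name is made up for this example.

```shell
#!/usr/bin/env bash
# Firewall sketch using ufw. Dry run by default: it only prints the commands.
# Set UFW="sudo ufw" in the environment to apply them for real.
UFW="${UFW:-echo sudo ufw}"

firewall_rules() {
  $UFW default deny incoming
  $UFW default allow outgoing
  $UFW allow OpenSSH      # keep admin access, or you lock yourself out
  $UFW allow 80/tcp       # HTTP (redirects to HTTPS)
  $UFW allow 443/tcp      # HTTPS
  $UFW --force enable     # --force skips the interactive confirmation
}

firewall_rules
```

Review the printed commands first, then rerun with UFW="sudo ufw" once they look right.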

2 Get RAGFlow and choose an image

RAGFlow ships its own Compose stack. Clone the repo and drop into the docker folder.

git clone https://github.com/infiniflow/ragflow.git
cd ragflow/docker

Open the .env file and pick your image. Two flavours:

Full image (recommended to start)

Tag: :v0.21.1 (no -slim suffix)
Size: ~9 GB
Includes: built-in embedding models
Use when: you want it to just work offline

Slim image (lean)

Tag: :v0.21.1-slim
Size: ~2 GB
Includes: no embedding models
Use when: you supply embeddings via API or Ollama

Check the current tag. RAGFlow moves fast, so confirm the latest release on GitHub and set RAGFLOW_IMAGE in .env accordingly. The default document engine is Elasticsearch; if RAM is tight, set DOC_ENGINE=infinity to use the lighter Infinity engine instead. Make that choice before you load documents, since the two engines keep separate indexes.
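Editing .env by hand works fine; if you would rather script it, a small helper along these lines keeps the change repeatable. set_env_var is a hypothetical function written for this recipe, and the image name in the example is an assumption to confirm against the .env in your checkout.

```shell
#!/usr/bin/env bash
# set_env_var is a hypothetical helper: it replaces KEY=... in an env file,
# or appends the line if the key is not present yet.
# Limitation: VALUE must not contain "|" or "&" (they are special to sed here).
set_env_var() {  # usage: set_env_var FILE KEY VALUE
  local file=$1 key=$2 value=$3
  if grep -q "^${key}=" "$file"; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example (run inside ragflow/docker; the image name is an assumption):
# set_env_var .env RAGFLOW_IMAGE infiniflow/ragflow:v0.21.1
# set_env_var .env DOC_ENGINE elasticsearch
# grep -E '^(RAGFLOW_IMAGE|DOC_ENGINE)=' .env
```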

3 Register an app in Microsoft Entra

In Azure portal → Microsoft Entra ID → App registrations → New registration:

Name it RAGFlow; single-tenant is fine.
Redirect URI — platform Web: https://rag.example.com/v1/user/oauth/callback/microsoft (the trailing microsoft is the channel key you will use in config).
Copy the client ID and tenant ID; create a client secret and copy its value.
Add Microsoft Graph delegated permissions openid, email, profile, User.Read.

Your OIDC issuer for a single tenant is: https://login.microsoftonline.com/<tenant-id>/v2.0.
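To sanity-check the tenant ID before touching RAGFlow, you can fetch the tenant's OIDC discovery document, which by the OIDC convention lives at the issuer URL plus /.well-known/openid-configuration. A minimal sketch follows; discovery_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# discovery_url is a hypothetical helper that builds the standard OIDC
# discovery URL (issuer + /.well-known/openid-configuration) for a tenant.
discovery_url() {  # usage: discovery_url TENANT_ID
  printf 'https://login.microsoftonline.com/%s/v2.0/.well-known/openid-configuration\n' "$1"
}

# Example: fetch the document and show the issuer it advertises.
# Replace <tenant-id> with the value copied from the app registration.
# curl -fsS "$(discovery_url '<tenant-id>')" | grep -o '"issuer": *"[^"]*"'
```

The issuer printed by the example should match the issuer you put into the RAGFlow config.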

4 Wire up OIDC login

RAGFlow reads an oauth block from its service config. Edit ragflow/docker/service_conf.yaml.template (it is rendered into the running config on startup) and add a Microsoft channel:

# service_conf.yaml.template
oauth:
  microsoft:
    type: oidc
    display_name: "Microsoft"
    client_id: "<application-client-id>"
    client_secret: "<client-secret-value>"
    issuer: "https://login.microsoftonline.com/<tenant-id>/v2.0"
    scope: "openid email profile"
    redirect_uri: "https://rag.example.com/v1/user/oauth/callback/microsoft"

Config keys shift between releases. RAGFlow’s OAuth/OIDC support is young and the exact field names occasionally change. If a key is rejected on boot, diff your block against the service_conf.yaml.template shipped in your checked-out version — that file is the source of truth.
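A common slip is starting the stack with a placeholder still in the file. The sketch below greps for the placeholder strings used in this recipe; find_placeholders is a hypothetical helper, and the placeholder names are the ones from the block above, not anything RAGFlow defines.

```shell
#!/usr/bin/env bash
# find_placeholders is a hypothetical helper: it prints (with line numbers)
# any line that still contains one of this recipe's <...> placeholders.
# Exit status is 0 if placeholders were found, 1 if the file is clean.
find_placeholders() {  # usage: find_placeholders FILE
  grep -nE '<(application-client-id|client-secret-value|tenant-id)>' "$1"
}

# Example (run inside ragflow/docker):
# if find_placeholders service_conf.yaml.template; then
#   echo "Fill in the values above before starting the stack"
# fi
```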

5 Put it behind TLS and start

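RAGFlow’s own nginx listens on port 80 inside the stack. Front it with Caddy for automatic HTTPS: add a caddy service to docker-compose.yml (or run Caddy separately on the same Docker network) pointing at the RAGFlow web container. Caddy needs host ports 80 and 443 for itself, so if the shipped Compose file publishes those ports for RAGFlow, remove or remap them first.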

# Caddyfile
rag.example.com {
    reverse_proxy ragflow-server:80
}
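One way to add the Caddy service without editing the shipped Compose file is a second Compose file layered on top. The sketch below writes one. Treat it as a starting point under stated assumptions: write_caddy_compose and the file name docker-compose.caddy.yml are made up for this recipe, the network name ragflow is a guess to verify in the shipped Compose files, and nothing else may publish host ports 80/443.

```shell
#!/usr/bin/env bash
# write_caddy_compose is a hypothetical helper that writes a small Compose
# file adding a Caddy service. Assumptions to verify in your checkout:
#   - the stack's Docker network is named "ragflow"
#   - no other service publishes host ports 80/443
write_caddy_compose() {  # usage: write_caddy_compose TARGET_DIR
  cat > "$1/docker-compose.caddy.yml" <<'EOF'
services:
  caddy:
    image: caddy:2
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data        # certificates persist across restarts
    networks:
      - ragflow                 # assumed name of the stack's network
volumes:
  caddy_data:
EOF
}

# Example (run inside ragflow/docker, next to the Caddyfile):
# write_caddy_compose .
# docker compose -f docker-compose.yml -f docker-compose.caddy.yml up -d
```

If Compose complains about an undefined network, look up the real name in the shipped Compose files and adjust the networks entry.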

Bring the whole stack up. First boot pulls several gigabytes and initialises Elasticsearch, so give it a few minutes.

docker compose -f docker-compose.yml up -d

# follow startup and wait for the RAGFlow banner
# (ragflow-server is the container name; confirm it with docker ps)
docker logs -f ragflow-server

When you see the ASCII banner and * Running on all addresses, browse to https://rag.example.com. You should see a Sign in with Microsoft button alongside the form.
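If you would rather not watch the logs, a small polling loop can tell you when the site answers. This is a generic sketch, not a RAGFlow feature: retry is a hypothetical helper that reruns any command until it succeeds or the attempts run out.

```shell
#!/usr/bin/env bash
# retry is a hypothetical helper: run COMMAND up to ATTEMPTS times,
# sleeping DELAY_SECONDS between tries; succeed as soon as COMMAND does.
retry() {  # usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then sleep "$delay"; fi
  done
  return 1
}

# Example: poll the site every 10 seconds for up to 5 minutes.
# retry 30 10 curl -fsS -o /dev/null https://rag.example.com && echo "RAGFlow is up"
```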

6 Register the models

In the RAGFlow UI, open Model providers and add:

A chat model, e.g. an Ollama endpoint (http://ollama:11434) or your Azure OpenAI / OpenAI key.
An embedding model: the built-in one (full image), an Ollama embedding model, or an API embedding model.

Then set them as the system default models so every new knowledge base inherits them.
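When the backend is Ollama, it helps to confirm the model is actually pulled before pointing RAGFlow at it. Ollama lists local models at /api/tags; the sketch below checks that listing for a name. has_model is a hypothetical helper, bge-m3 is only an example model name, and the match assumes the compact JSON Ollama normally returns (reach for jq if you need something sturdier).

```shell
#!/usr/bin/env bash
# has_model is a hypothetical helper: it reads the JSON from Ollama's
# /api/tags endpoint on stdin and succeeds if any model name starts with $1.
# Assumes compact JSON ("name":"...") with no spaces around the colon.
has_model() {  # usage: ... | has_model MODEL_NAME
  grep -qF "\"name\":\"$1"
}

# Example (run from somewhere that can resolve the ollama hostname,
# e.g. a container on the same Docker network):
# curl -fsS http://ollama:11434/api/tags | has_model bge-m3 \
#   && echo "embedding model present"
```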

7 Create knowledge bases

In RAGFlow, what other tools call a “database” or collection is a knowledge base (dataset). This is where the need for multiple knowledge bases and the deep parsing both pay off.

Click Create knowledge base, name it e.g. Contracts 2026.
Pick a chunking method (template) that matches the content: General, Paper, Manual, Laws, Q&A, Table, Presentation, and more. The template is usually the biggest single lever on answer quality.
Upload .pdf, .docx, .xlsx, .pptx, .md and friends, then click Parse. DeepDoc shows you the detected layout and the resulting chunks — you can inspect and correct them before they are embedded.

Corporate vs personal — the multi-tenant model

RAGFlow is multi-tenant. Each user owns their knowledge bases, and you share the corporate ones with a team:

Create a team and invite the relevant staff.
Share curated knowledge bases (e.g. HR Handbook, Contracts 2026) with that team — members query but the owner controls content.
Individual users freely create their own private knowledge bases for experiments, sitting right next to the shared corporate ones. That is exactly the “users try their own small RAGs” pattern.

8 Build a chat assistant

Knowledge bases are the data; a Chat assistant is how people use them. Under Chat → Create an assistant:

Attach one or more knowledge bases (a single assistant can span several, e.g. HR + IT policies).
Set the system prompt and enable citations so every answer links back to the source chunk — with DeepDoc that citation even highlights the region of the original PDF.
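Tune the similarity threshold and top N: lower the threshold or raise top N if answers are too narrow (relevant chunks are missing), and raise the threshold or lower top N if they are too noisy.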
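To see how the two retrieval knobs interact, here is a toy illustration with made-up scores. It is not RAGFlow's ranking code (the real scoring is more involved); filter_chunks is a hypothetical function that drops chunks below the threshold and then keeps the N best of what remains.

```shell
#!/usr/bin/env bash
# Toy model of retrieval filtering: threshold first, then top N.
# filter_chunks is hypothetical; scores and chunk titles are invented.
filter_chunks() {  # usage: filter_chunks THRESHOLD TOP_N  (reads "score<TAB>text")
  LC_ALL=C awk -F '\t' -v t="$1" '($1 + 0) >= (t + 0)' \
    | LC_ALL=C sort -t "$(printf '\t')" -k1,1nr \
    | head -n "$2"
}

chunks=$(printf '%s\t%s\n' \
  0.82 "Termination clause" \
  0.35 "Cover page" \
  0.67 "Notice period" \
  0.18 "Table of contents" \
  0.54 "Renewal terms")

echo "threshold 0.5, top N 2:"
printf '%s\n' "$chunks" | filter_chunks 0.5 2
echo "threshold 0.2, top N 10:"
printf '%s\n' "$chunks" | filter_chunks 0.2 10
```

The strict setting keeps only the two strongest chunks, while the loose one lets the cover page through as well: fewer misses, more noise.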

Power users can instead use the visual Agent canvas to chain retrieval across knowledge bases — but a plain chat assistant covers the common company case.

Troubleshooting

Elasticsearch container exits immediately

Almost always vm.max_map_count is too low or the VM is out of RAM. Re-check sysctl vm.max_map_count (must be ≥ 262144) and give the box more memory — ES alone wants a couple of gigabytes. Switching DOC_ENGINE=infinity lowers the footprint.

Microsoft button missing or callback fails

Confirm the oauth block actually loaded (check the startup output of docker logs ragflow-server).
The Entra redirect URI must match .../v1/user/oauth/callback/microsoft character for character, including the channel key.
Make sure the redirect_uri in the oauth block uses the public HTTPS domain, and always open RAGFlow via that domain rather than the raw IP, so the callback lands on the URI registered in Entra.

Parsing is slow or stuck

DeepDoc’s layout analysis is CPU-heavy. Large scanned PDFs can take minutes each; watch the task executor logs. For big bulk imports, give the VM more cores or scale the ragflow task workers.

Answers ignore an uploaded file

A file only becomes searchable after it shows parsed/success and has been embedded. If it is stuck at pending, the embedding model is likely unset or unreachable — re-check Model providers and the system default embedding model.

What you end up with

A self-hosted RAG at https://rag.example.com that actually understands your messy documents (tables stay tables, contracts keep their clauses), with staff signing in via Microsoft, querying team-shared corporate knowledge bases, and spinning up their own private ones beside them. Every answer is grounded and cited, and your documents and indexes stay on the VM you control; prompts and retrieved chunks leave it only if you chose an external LLM or embedding API.
