How to Self-Host RAGFlow for Deep-Document RAG

When your corporate knowledge lives in gnarly PDFs, scanned contracts and dense spreadsheets, retrieval quality is everything — and that is RAGFlow’s whole point. This recipe stands it up on a VM you own, with Microsoft Entra login, Office/PDF/Markdown ingestion, and multiple per-team and personal knowledge bases.

Prep time: a day for a tuned pilot
One-time cost: €0 software (RAGFlow is Apache-2.0 open source)
Ongoing cost: one beefier VM (~€60–150/mo) plus your LLM tokens

Ingredients

Why RAGFlow over a chat-first tool? Its DeepDoc engine does layout-aware parsing — it understands tables, columns, headers and figures instead of flattening a PDF into a wall of text — and it offers template-based chunking per document type. If retrieval accuracy on complex documents is your top priority, this is the strongest of the self-hostable options.

1 Prepare the VM

Install Docker, then raise the memory-map limit Elasticsearch needs and make it survive reboots.

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker "$USER"   # log out/in afterwards

# Elasticsearch requirement — apply now and persist
sudo sysctl -w vm.max_map_count=262144
echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-ragflow.conf

Open only ports 80 and 443 on the firewall. Every backend service stays on the internal Docker network.
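A quick preflight can confirm the kernel setting took effect before you start the stack. The following is a minimal sketch; the helper names map_count_ok and check_map_count are illustrative and not part of RAGFlow or Docker.

```shell
#!/bin/sh
# Minimum value taken from the sysctl command above.
REQUIRED_MAP_COUNT=262144

# map_count_ok VALUE: succeeds when VALUE meets the minimum.
map_count_ok() {
    [ "$1" -ge "$REQUIRED_MAP_COUNT" ]
}

# check_map_count (hypothetical helper): reads the live kernel value
# and reports whether Elasticsearch's requirement is satisfied.
check_map_count() {
    current=$(sysctl -n vm.max_map_count) || return 2
    if map_count_ok "$current"; then
        echo "ok: vm.max_map_count=$current"
    else
        echo "too low: vm.max_map_count=$current (need >= $REQUIRED_MAP_COUNT)" >&2
        return 1
    fi
}
```

Run check_map_count once after a reboot to verify that the file in /etc/sysctl.d was picked up.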

2 Get RAGFlow and choose an image

RAGFlow ships its own Compose stack. Clone the repo and drop into the docker folder.

git clone https://github.com/infiniflow/ragflow.git
cd ragflow/docker

Open the .env file and pick your image. Two flavours:

Full image (recommended to start)

  Tag: :v0.21.1 (no -slim)
  Size: ~9 GB
  Includes: built-in embedding models
  Use when: you want it to just work offline

Slim image (lean)

  Tag: :v0.21.1-slim
  Size: ~2 GB
  Includes: no embedding models
  Use when: you supply embeddings via API/Ollama

Check the current tag. RAGFlow moves fast, so confirm the latest release on GitHub and set RAGFLOW_IMAGE in .env accordingly. The default doc engine is Elasticsearch; you can switch to the lighter infinity via DOC_ENGINE if RAM is tight.
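If you would rather script the .env change than edit by hand, something like the sketch below could work. The helper set_env_var is hypothetical, and the repository name infiniflow/ragflow in the usage line is an assumption; check it against the RAGFLOW_IMAGE line already present in your .env.

```shell
#!/bin/sh
# set_env_var FILE KEY VALUE (hypothetical helper)
# Replaces any uncommented KEY=... line in FILE with KEY=VALUE,
# appending the new assignment at the end of the file.
set_env_var() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Keep every line except existing uncommented assignments of KEY.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the file keeps its original permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

From ragflow/docker, a call such as set_env_var .env RAGFLOW_IMAGE infiniflow/ragflow:v0.21.1 would pin the full image, and set_env_var .env DOC_ENGINE infinity would switch the doc engine.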

3 Register an app in Microsoft Entra

In Azure portal → Microsoft Entra ID → App registrations → New registration:

  1. Name it RAGFlow; single-tenant is fine.
  2. Redirect URI — platform Web: https://rag.example.com/v1/user/oauth/callback/microsoft (the trailing microsoft is the channel key you will use in config).
  3. Copy the client ID and tenant ID; create a client secret and copy its value.
  4. Add Microsoft Graph delegated permissions openid, email, profile, User.Read.

Your OIDC issuer for a single tenant is: https://login.microsoftonline.com/<tenant-id>/v2.0.
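To sanity-check the tenant ID before touching RAGFlow, you can fetch the issuer's OpenID Connect discovery document, which by the OIDC Discovery convention lives at the issuer URL plus /.well-known/openid-configuration. The sketch below assumes curl is installed; the helper names are illustrative.

```shell
#!/bin/sh
# oidc_discovery_url TENANT_ID: prints the discovery document URL
# for the single-tenant issuer shown above.
oidc_discovery_url() {
    printf 'https://login.microsoftonline.com/%s/v2.0/.well-known/openid-configuration\n' "$1"
}

# check_issuer TENANT_ID (hypothetical helper): fetches the discovery
# document and prints its "issuer" field so you can compare it with
# the value you plan to put in the config.
check_issuer() {
    curl -fsS "$(oidc_discovery_url "$1")" | grep -o '"issuer": *"[^"]*"'
}
```

Calling check_issuer with your tenant ID should print an issuer that matches the URL above; an HTTP error usually means the tenant ID is mistyped.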

4 Wire up OIDC login

RAGFlow reads an oauth block from its service config. Edit ragflow/docker/service_conf.yaml.template (it is rendered into the running config on startup) and add a Microsoft channel:

# service_conf.yaml.template
oauth:
  microsoft:
    type: oidc
    display_name: "Microsoft"
    client_id: "<application-client-id>"
    client_secret: "<client-secret-value>"
    issuer: "https://login.microsoftonline.com/<tenant-id>/v2.0"
    scope: "openid email profile"
    redirect_uri: "https://rag.example.com/v1/user/oauth/callback/microsoft"

Config keys shift between releases. RAGFlow’s OAuth/OIDC support is young and the exact field names occasionally change. If a key is rejected on boot, diff your block against the service_conf.yaml.template shipped in your checked-out version; that file is the source of truth.
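A mismatch between the redirect URI registered in Entra and the one in the config is easy to introduce, so a small check may help. This sketch assumes the redirect_uri line is written exactly in the quoted form shown above; both helper names are hypothetical.

```shell
#!/bin/sh
# expected_redirect_uri HOST CHANNEL: prints the callback URL in the
# form used in this recipe.
expected_redirect_uri() {
    printf 'https://%s/v1/user/oauth/callback/%s\n' "$1" "$2"
}

# redirect_uri_matches FILE HOST CHANNEL (hypothetical helper)
# Succeeds when FILE contains a redirect_uri line holding exactly the
# expected URL, including the trailing channel key.
redirect_uri_matches() {
    expected=$(expected_redirect_uri "$2" "$3")
    grep -F -q "redirect_uri: \"$expected\"" "$1"
}
```

For example, redirect_uri_matches service_conf.yaml.template rag.example.com microsoft should succeed for the block above and fail if the channel key or hostname drifts.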

5 Put it behind TLS and start

RAGFlow’s own nginx listens on port 80 inside the stack. Front it with Caddy for automatic HTTPS. Add a caddy service to docker-compose.yml (or run Caddy separately) pointing at the RAGFlow web container:

# Caddyfile
rag.example.com {
    reverse_proxy ragflow-server:80
}

Bring the whole stack up. First boot pulls several gigabytes and initialises Elasticsearch, so give it a few minutes.

docker compose -f docker-compose.yml up -d

# follow startup and wait for the RAGFlow banner
# (ragflow-server is the container name, so use plain docker logs)
docker logs -f ragflow-server

When you see the ASCII banner and * Running on all addresses, browse to https://rag.example.com. You should see a Sign in with Microsoft button alongside the form.
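Instead of watching the logs, you could poll the public URL until it answers. The sketch below is a generic retry loop; the helper name retry and the attempt count and delay in the usage note are assumptions to adapt.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND... (hypothetical helper)
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between tries. Returns 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

A call such as retry 30 10 curl -fsS -o /dev/null https://rag.example.com would wait up to roughly five minutes for the site to respond over HTTPS.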

6 Register the models

Log in (the first user can be made the owner), then open Avatar → Model providers. Add your chat model (LLM) and, unless you rely on the full image's built-in ones, an embedding model.

Then set them as the system default models so every new knowledge base inherits them.

7 Create knowledge bases

In RAGFlow a “database” is a Knowledge base (dataset). This is where the multi-RAG requirement and the deep parsing both pay off.

  1. Click Create knowledge base, name it e.g. Contracts 2026.
  2. Pick a chunking method / template that matches the content — General, Paper, Manual, Laws, Q&A, Table, Presentation, and more. The right template is the single biggest lever on answer quality.
  3. Upload .pdf, .docx, .xlsx, .pptx, .md and friends, then click Parse. DeepDoc shows you the detected layout and the resulting chunks — you can inspect and correct them before they are embedded.

Corporate vs personal — the multi-tenant model

RAGFlow is multi-tenant. Each user owns their knowledge bases, and you share the corporate ones with a team.

8 Build a chat assistant

Knowledge bases are the data; a Chat assistant is how people use them. Create one under Chat → Create an assistant.

Power users can instead use the visual Agent canvas to chain retrieval across knowledge bases — but a plain chat assistant covers the common company case.

Troubleshooting

Elasticsearch container exits immediately

Almost always vm.max_map_count is too low or the VM is out of RAM. Re-check sysctl vm.max_map_count (must be ≥ 262144) and give the box more memory; ES alone wants a couple of gigabytes. Setting DOC_ENGINE=infinity in .env lowers the footprint.

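Microsoft button missing or callback fails

Check that the redirect URI registered in Entra matches redirect_uri in the oauth block exactly, including the trailing channel key microsoft. If the button does not appear at all, diff your oauth block against the service_conf.yaml.template shipped in your checked-out version, since key names change between releases.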

Parsing is slow or stuck

DeepDoc’s layout analysis is CPU-heavy. Large scanned PDFs can take minutes each; watch the task executor logs. For big bulk imports, give the VM more cores or scale the ragflow task workers.

Answers ignore an uploaded file

A file only becomes searchable after it shows parsed/success and has been embedded. If it is stuck at pending, the embedding model is likely unset or unreachable — re-check Model providers and the system default embedding model.

What you end up with

A self-hosted RAG at https://rag.example.com that actually understands your messy documents — tables stay tables, contracts keep their clauses — with staff signing in via Microsoft, querying team-shared corporate knowledge bases, and spinning up their own private ones beside them. Every answer is grounded and cited, and nothing leaves the VM you control.
