ADR 03: Vector Database Selection

Status: accepted
Author: @freinold
Voters: @l0renor, @lenabMUC, @freinold
Drafted: 2025-11-07
Accepted: 2025-11-10

Context and Problem Statement

Vector databases are specialized systems designed to efficiently store, index, and search high-dimensional vector representations of data, such as text embeddings. They are essential for semantic search and retrieval-augmented generation (RAG) scenarios, where finding similar documents or context based on vector similarity is required.
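
To make "search by vector similarity" concrete, here is a minimal brute-force sketch using cosine similarity over toy three-dimensional vectors. The document ids and vector values are made up for illustration. A real vector database replaces the linear scan with an approximate nearest-neighbor index, and the distance metric is configurable.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, documents, k=2):
    """Return the ids of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy "embeddings" (hypothetical values); real ones have hundreds of dimensions.
documents = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
nearest = top_k([1.0, 0.05, 0.0], documents, k=2)
print(nearest)
```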

For our use case, the vector database must meet the following criteria:

  • Lightweight and easy to operate
  • Support storing documents or metadata alongside vectors
  • Enable flexible and efficient filtering of stored data
  • Integrate smoothly with frameworks like LangChain

Considered Options

  • Qdrant
  • pgVector (Postgres extension)

Evaluation

Qdrant

  • Slim footprint: Qdrant is a standalone, lightweight service with minimal dependencies.
  • Good document support: Supports storing payloads (documents/metadata) alongside vectors.
  • Easy filtering: Advanced filtering capabilities on payloads.
  • LangChain integration: Official integration and good community support.
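
The payload filtering mentioned above can be sketched as a request body for a filtered similarity search. The sketch only builds the JSON and sends nothing. The payload field `source`, its value, and the query vector are hypothetical. The body follows the general shape of Qdrant's REST search request (`vector`, `filter` with `must` conditions, `limit`, `with_payload`), but the exact endpoint and field names should be checked against the API reference of the deployed Qdrant version.

```python
import json

def build_search_request(query_vector, source, limit=3):
    """Build a JSON body for a similarity search restricted by a payload field."""
    return {
        "vector": query_vector,
        "filter": {
            # Every condition under "must" has to match the point's payload.
            "must": [
                {"key": "source", "match": {"value": source}},  # hypothetical field
            ]
        },
        "limit": limit,
        "with_payload": True,  # return the stored document/metadata with each hit
    }

body = build_search_request([0.1, 0.2, 0.3], source="handbook")
print(json.dumps(body, indent=2))
```

The filter and the vector search travel in one request, so the service applies the payload condition as part of the search rather than the client filtering results afterwards.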

pgVector

  • Slim footprint: Requires running a full Postgres instance, which is heavier than Qdrant.
  • Good document support: Documents and metadata can be stored in regular table columns, but the setup is less optimized for combined vector and payload use cases.
  • Easy filtering: Filtering uses standard SQL, which is expressive but may require more complex queries.
  • LangChain integration: Supported, but less feature-rich than the Qdrant integration.
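
For comparison, SQL-based filtering with pgVector can be sketched as a parameterized query. The table `documents` and its columns (`id`, `content`, `metadata` as JSONB, `embedding` as a vector column) are assumptions for illustration, not a schema from this ADR. `<=>` is pgvector's cosine-distance operator, and the named placeholders follow the psycopg style. The sketch only prepares the query and parameters and does not open a connection.

```python
# Hypothetical schema: documents(id, content, metadata JSONB, embedding vector).
FILTERED_SEARCH_SQL = """
SELECT id, content, metadata
FROM documents
WHERE metadata->>'source' = %(source)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(limit)s
"""

def to_vector_literal(values):
    """Format a list of floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(v) for v in values) + "]"

params = {
    "source": "handbook",  # hypothetical metadata value
    "query": to_vector_literal([0.1, 0.2, 0.3]),
    "limit": 3,
}
print(FILTERED_SEARCH_SQL, params)
```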

Overview

Criterion               Qdrant   pgVector
Slim footprint          ++       -
Document support        ++       +
Easy filtering          ++       +
LangChain integration   ++       +

Decision Made

We choose Qdrant as our vector database. It offers a slim footprint, excellent document and filtering support, and integrates seamlessly with LangChain. While pgVector is a solid option for Postgres-centric stacks, Qdrant is better suited for our requirements.