
Choosing a Vector Database

About 10 minutes

Target audience: People designing RAG infrastructure, and those undecided on which vector DB to choose
Prerequisites: Familiarity with What Is RAG? and the basics of Embeddings & Vector Representations

A vector database is a database designed to store text or images as numerical vectors and to quickly search for “semantically similar vectors.” In RAG, documents are vectorized by an embedding model, stored in a vector DB, and retrieved when a user submits a query.

As explained in Embeddings, each document is converted to a vector with hundreds or thousands of dimensions. With 10,000 documents, there are 10,000 vectors. Computing the distance between the query vector and every stored vector on each search takes time proportional to the size of the collection, which becomes impractical as the collection grows into the millions.
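
To make the cost of exhaustive search concrete, here is a minimal sketch of exact (brute-force) top-k search in plain Python. The function names and the toy 3-dimensional vectors are illustrative assumptions, not part of any vector DB API; real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, vectors, k):
    """Exact search: score every stored vector, then keep the k best.

    One similarity computation is needed per stored vector, so the work
    grows linearly with the size of the collection.
    """
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]


# Toy 3-dimensional vectors stand in for real embeddings.
vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(brute_force_top_k([1.0, 0.0, 0.0], vectors, k=2))
```

Every query touches every vector, which is the linear scan that an ANN index is designed to avoid.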

Vector DBs use ANN (Approximate Nearest Neighbor) search to find vectors that are “close enough” — not strictly the nearest, but sufficiently accurate — in milliseconds. This remains practical even for millions of vectors. Algorithms like HNSW (Hierarchical Navigable Small World) are widely used.
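
"Sufficiently accurate" is commonly quantified as recall: the share of the true nearest neighbors that the approximate search also returns. The sketch below is one simple way to measure it, assuming the exact result IDs are available from a brute-force scan on a sample of queries. The function name and the example IDs are hypothetical.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-k neighbors that the ANN result also returned."""
    if not exact_ids:
        return 1.0
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)


# Hypothetical result IDs for one query: exact scan vs. ANN index.
exact = [12, 7, 43, 5, 91]
approx = [12, 43, 7, 60, 5]
print(recall_at_k(exact, approx))  # 4 of the 5 true neighbors were found
```

Index parameters generally trade recall against latency and memory, so measuring recall on a sample of real queries is a reasonable check before settling on a configuration.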

graph LR
    A["Documents (text)"] --> B["Embedding model"]
    B --> C["Vectors"]
    C --> D["Vector DB (ANN index)"]
    E["User query"] --> F["Embedding model"]
    F --> G["Query vector"]
    G --> D
    D --> H["Top K similar results"]
    H --> I["Pass to LLM"]

Vector DBs can be broadly categorized into four types based on how they are deployed.

Managed services: The service provider manages the infrastructure. Setup is simple, and scaling and backups are automated.

Examples: Pinecone

Open source with a managed option: Open-source vector DBs that can be self-hosted. Managed cloud options are also available.

Examples: Weaviate, Qdrant

Embedded and in-process: Runs inside the application process or locally, with no separate server required. Suitable for prototypes and small-scale development.

Examples: Chroma, Faiss

Database extensions: Adds vector search capability to an existing relational database.

Examples: pgvector (PostgreSQL extension)

| Name | Type | Hosting | Hybrid Search | Metadata Filtering | Scaling | Free Tier |
|---|---|---|---|---|---|---|
| Chroma | Embedded | Local / self-host | Limited | Yes | Small–medium | Fully free (OSS) |
| Pinecone | Managed | Cloud (AWS/GCP) | Yes | Yes | Large scale | Yes (Starter) |
| Weaviate | OSS / Managed | Self / Cloud | Built-in | Yes | Large scale | Yes (cloud) |
| Qdrant | OSS / Managed | Self / Cloud | Yes | Yes | Large scale | Yes (cloud) |
| pgvector | SQL extension | PostgreSQL server | Manual implementation | PostgreSQL SQL | Medium–large | Free (with PostgreSQL) |
| Faiss | Library | In-process (library) | No | No | Offline / research | Fully free (OSS) |

Chroma is a simple vector DB that stores data in a local file or in memory. Its documentation describes it as AI data infrastructure that includes embeddings, metadata storage, vector search, and full-text search.[1]

# pip install chromadb
import chromadb

# Persistent local storage
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("documents")

# Add documents
collection.add(
    documents=["RAG stands for Retrieval-Augmented Generation", "Embeddings convert text to vectors"],
    metadatas=[{"source": "intro.md"}, {"source": "embeddings.md"}],
    ids=["doc1", "doc2"]
)

# Search
results = collection.query(
    query_texts=["Explain how RAG works"],
    n_results=2
)
print(results["documents"])

Best for: Prototype development, local experiments, thousands to tens of thousands of vectors

Pinecone is a managed vector database that provides semantic search, full-text search, hybrid search, metadata filtering, and reranking capabilities.[2]

# pip install pinecone
import os

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])  # Read the key from the environment, not from source code

# Create index (first time only)
pc.create_index(
    name="rag-documents",
    dimension=1536,  # Dimensions for text-embedding-3-small
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

index = pc.Index("rag-documents")

# Add vectors (embedding_vector is a placeholder for a 1536-dimensional list of floats from the embedding model)
index.upsert(vectors=[
    {"id": "doc1", "values": embedding_vector, "metadata": {"source": "intro.md"}},
])

# Search
results = index.query(vector=query_vector, top_k=5, include_metadata=True)

Best for: Production environments, team development, reducing infrastructure management overhead

Weaviate is an open-source vector DB that can handle BM25 + vector hybrid search.[3]

# pip install weaviate-client
import weaviate

client = weaviate.connect_to_local()  # Local Docker setup

try:
    # Assumes a "Documents" collection with a vectorizer configured already exists
    collection = client.collections.get("Documents")

    # Hybrid search (BM25 + vector)
    results = collection.query.hybrid(
        query="How to read a file in Python",
        alpha=0.5,   # 0 = BM25 only, 1 = vector only
        limit=5
    )

    for obj in results.objects:
        print(obj.properties["content"][:100])
finally:
    client.close()  # Release the connection when done

Best for: Hybrid search requirements, large-scale self-hosted operations, Docker-based deployments
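
To illustrate what the alpha parameter controls, here is a simplified sketch of weighted score fusion in plain Python. It is an assumption-laden illustration, not Weaviate's implementation: the function names, the min-max normalization step, and the example scores are all hypothetical, and Weaviate's own fusion algorithms may differ in detail.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(bm25_scores, vector_scores, alpha=0.5, limit=5):
    """Blend keyword and vector scores: alpha=0 is BM25 only, alpha=1 is vector only."""
    bm25 = min_max_normalize(bm25_scores)
    vec = min_max_normalize(vector_scores)
    doc_ids = set(bm25) | set(vec)
    fused = {
        d: alpha * vec.get(d, 0.0) + (1 - alpha) * bm25.get(d, 0.0)
        for d in doc_ids
    }
    # Sort by fused score (descending), breaking ties by document ID.
    return sorted(fused, key=lambda d: (-fused[d], d))[:limit]


# Hypothetical raw scores from a keyword search and a vector search.
bm25_scores = {"a": 3.0, "b": 1.0, "c": 0.5}
vector_scores = {"b": 0.9, "c": 0.8, "d": 0.2}
for alpha in (0.0, 0.5, 1.0):
    print(alpha, hybrid_rank(bm25_scores, vector_scores, alpha=alpha))
```

Normalizing first matters because BM25 scores and vector similarities live on different scales; without it, one signal would dominate regardless of alpha.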

Qdrant is an open-source vector DB implemented in Rust. It provides search features that combine filtering with vector search.[4]

# pip install qdrant-client
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(":memory:")  # In-memory mode (development)
# client = QdrantClient(url="http://localhost:6333")  # Local server

client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=embedding_vector,
            payload={"source": "intro.md", "content": "RAG is..."}
        )
    ]
)

# query_points supersedes the deprecated search method in recent qdrant-client releases
results = client.query_points(
    collection_name="documents",
    query=query_vector,
    limit=5
).points

Best for: High throughput requirements, preference for Rust-based reliability, self-hosted operations
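
Conceptually, combining filtering with vector search means restricting the candidate set by payload (metadata) and ranking only what remains. The plain-Python sketch below shows that idea under simplifying assumptions: vectors are already normalized, scoring uses a dot product, and the whole collection fits in a list. The `filtered_search` helper and the sample points are hypothetical, and this is not how Qdrant implements filtering internally.

```python
def filtered_search(points, query_vector, must_match, limit=5):
    """Keep points whose payload matches every key/value pair, then rank by dot product."""

    def matches(payload):
        return all(payload.get(key) == value for key, value in must_match.items())

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    candidates = [p for p in points if matches(p["payload"])]
    candidates.sort(key=lambda p: dot(query_vector, p["vector"]), reverse=True)
    return [p["id"] for p in candidates[:limit]]


# Hypothetical points with 2-dimensional unit vectors and a "source" payload field.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"source": "intro.md"}},
    {"id": 2, "vector": [0.8, 0.6], "payload": {"source": "embeddings.md"}},
    {"id": 3, "vector": [0.6, 0.8], "payload": {"source": "intro.md"}},
]
print(filtered_search(points, [1.0, 0.0], {"source": "intro.md"}))
```

Applying the filter as part of the search, rather than trimming the top-k results afterwards, avoids returning fewer than `limit` results when many of the nearest vectors fail the filter.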

pgvector — Best When Already Using PostgreSQL

pgvector is a PostgreSQL extension that adds vector search to an existing PostgreSQL database. It supports exact and approximate nearest-neighbor search plus distance functions such as L2 distance, inner product, and cosine distance.[5]

-- Enable the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create a table with a vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    source VARCHAR(255),
    embedding vector(1536)  -- Dimensions for text-embedding-3-small
);

-- Create an HNSW index
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops);

# pip install psycopg2-binary pgvector
from pgvector.psycopg2 import register_vector
import psycopg2

conn = psycopg2.connect("postgresql://user:password@localhost/dbname")
register_vector(conn)

cursor = conn.cursor()

# Vector search (top 5)
cursor.execute(
    "SELECT content, source, 1 - (embedding <=> %s::vector) AS similarity "
    "FROM documents ORDER BY embedding <=> %s::vector LIMIT 5",
    (query_vector, query_vector)
)
results = cursor.fetchall()

Best for: Leveraging existing PostgreSQL infrastructure, complex SQL-based filtering, centralizing data management
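
Because pgvector search is ordinary SQL, a metadata filter is just a WHERE clause next to the distance ordering. The sketch below builds such a parameterized query for the `documents` table defined above. The `build_filtered_search` helper is a hypothetical name introduced for illustration, and the vector is passed as a text literal cast to `vector`, which is one of several ways to supply it.

```python
def build_filtered_search(query_vector, source=None, limit=5):
    """Return (sql, params) for a cosine-distance search with an optional source filter."""
    # pgvector accepts a text literal such as '[0.1,0.2,0.3]' cast to vector.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_vector) + "]"
    sql = (
        "SELECT content, source, 1 - (embedding <=> %s::vector) AS similarity "
        "FROM documents"
    )
    params = [vector_literal]
    if source is not None:
        sql += " WHERE source = %s"
        params.append(source)
    sql += " ORDER BY embedding <=> %s::vector LIMIT %s"
    params.extend([vector_literal, limit])
    return sql, tuple(params)


sql, params = build_filtered_search([0.1, 0.2, 0.3], source="intro.md")
print(sql)
print(params)
```

The returned pair can be passed to `cursor.execute(sql, params)`. Keeping values in `params` rather than formatting them into the string leaves escaping to the driver.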

Faiss (Facebook AI Similarity Search), developed by Meta, is a library for similarity search and clustering of dense vectors.[6]

# pip install faiss-cpu
import faiss
import numpy as np

dimension = 1536
index = faiss.IndexFlatIP(dimension)  # Inner product based search

# Normalize before adding (for cosine similarity)
vectors = np.array([embedding_vector]).astype("float32")
faiss.normalize_L2(vectors)
index.add(vectors)

# Search
query = np.array([query_vector]).astype("float32")
faiss.normalize_L2(query)
distances, indices = index.search(query, k=5)

Best for: Offline environments, research and experimentation, cases with no metadata filtering requirements
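
The normalization step in the Faiss example matters because an inner-product index only yields cosine similarity when the vectors have unit length. The following plain-Python check illustrates that relationship with two small example vectors; the helper names are illustrative and independent of the Faiss API.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    norms = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return inner_product(a, b) / norms


a, b = [3.0, 4.0], [4.0, 3.0]
print(inner_product(a, b))                              # raw inner product depends on vector length
print(inner_product(l2_normalize(a), l2_normalize(b)))  # after normalization it matches the cosine
print(cosine_similarity(a, b))
```

Skipping normalization on either the stored vectors or the query would make longer vectors score higher regardless of direction.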

For prototypes and local experiments, Chroma is the recommended starting point. No additional server setup is required, and integration with LangChain and LlamaIndex is straightforward.

When PostgreSQL is already in use, pgvector is the most natural choice. It avoids introducing a new database and reuses existing backup, access control, and SQL query infrastructure.

Production Without Infrastructure Management

Pinecone’s managed service is a viable option. Setup requires only an API key, and scaling is automatic.

When hybrid search is required, Weaviate and Qdrant both offer it as a built-in feature. Pairing Elasticsearch (keyword search) with pgvector (vector search) is also a strong option.

For large-scale self-hosted deployments, Qdrant (fast, Rust-based) and Weaviate (feature-rich) are both strong candidates. Both support deployment via Docker/Kubernetes.

When documents are updated or deleted, the corresponding vectors must be updated or deleted as well. Most vector DBs support upsert (update or insert) and delete.

# Delete in Chroma
collection.delete(ids=["doc1"])

# Update in Pinecone (upsert = update or insert)
index.upsert(vectors=[
    {"id": "doc1", "values": new_embedding_vector, "metadata": {"source": "intro_v2.md"}}
])

Changing the embedding model or chunking strategy requires re-creating all vectors. In production, keep the original document text stored separately so it can be re-vectorized when needed.
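
One way to keep an index in step with its source documents is to store a fingerprint next to each vector and compare it against the current source text. The sketch below assumes the fingerprint covers both the text and the embedding model name, so a model change marks every document as stale. The function names, the model names, and the metadata layout are hypothetical.

```python
import hashlib


def fingerprint(text, model_name):
    """Hash of the embedding model name plus the document text."""
    return hashlib.sha256(f"{model_name}\n{text}".encode("utf-8")).hexdigest()


def plan_sync(source_docs, indexed_fingerprints, model_name):
    """Compare source documents against what is currently indexed.

    source_docs: {doc_id: text} from the system of record.
    indexed_fingerprints: {doc_id: fingerprint} stored as metadata with each vector.
    Returns (ids_to_upsert, ids_to_delete).
    """
    to_upsert = sorted(
        doc_id
        for doc_id, text in source_docs.items()
        if indexed_fingerprints.get(doc_id) != fingerprint(text, model_name)
    )
    to_delete = sorted(
        doc_id for doc_id in indexed_fingerprints if doc_id not in source_docs
    )
    return to_upsert, to_delete


docs = {"doc1": "RAG is...", "doc2": "Embeddings convert text to vectors"}
indexed = {doc_id: fingerprint(text, "model-a") for doc_id, text in docs.items()}
docs["doc1"] = "RAG stands for Retrieval-Augmented Generation"  # edited document
print(plan_sync(docs, indexed, "model-a"))  # only the edited document needs re-embedding
print(plan_sync(docs, indexed, "model-b"))  # a model change marks every document stale
```

The resulting ID lists can then drive the upsert and delete calls of whichever vector DB is in use.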

When migrating to a different vector DB, export the original document text and metadata, then re-index in the new DB. The vectors themselves typically do not need to be exported — they are regenerated from the source text.
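
A migration along these lines can be sketched as a batched loop that re-embeds exported records and writes them to the new store. In this sketch, `embed_batch` and `upsert_batch` are hypothetical callables standing in for the embedding model and the target DB client, and the record layout is an assumption.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def reindex(records, embed_batch, upsert_batch, batch_size=100):
    """Re-embed exported records and write them to the new vector DB.

    records: list of {"id", "text", "metadata"} dicts from the source of truth.
    embed_batch: callable taking a list of texts and returning a list of vectors.
    upsert_batch: callable taking a list of {"id", "values", "metadata"} dicts.
    Returns the number of records written.
    """
    total = 0
    for batch in batched(records, batch_size):
        vectors = embed_batch([r["text"] for r in batch])
        upsert_batch([
            {"id": r["id"], "values": v, "metadata": r["metadata"]}
            for r, v in zip(batch, vectors)
        ])
        total += len(batch)
    return total


# Stand-ins for a real embedding model and a real DB client.
written = []
records = [{"id": f"doc{i}", "text": f"text {i}", "metadata": {}} for i in range(5)]
count = reindex(
    records,
    embed_batch=lambda texts: [[float(len(t))] for t in texts],
    upsert_batch=written.extend,
    batch_size=2,
)
print(count, len(written))
```

Batching keeps each request to the embedding service and the database bounded in size, and makes it easier to resume after a failure partway through.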

  • A vector DB enables fast search for semantically similar vectors
  • Practical starting points: Chroma for prototypes, pgvector when PostgreSQL is already in use, and Pinecone for managed production
  • For hybrid search requirements, Weaviate or Qdrant are strong choices
  • Plan for document update, deletion, and re-indexing operations from the start

Q: Can I use a regular database (MySQL, PostgreSQL) instead?

A: Without a vector index, a regular database must scan every row to find the nearest neighbors, so search time grows with the number of documents and eventually exceeds practical limits. For PostgreSQL specifically, the pgvector extension enables fast vector search while staying within the PostgreSQL ecosystem.

Q: Can I migrate to a different vector DB later?

A: Yes. When switching vector DBs, having the original document text and metadata stored separately allows re-vectorization and re-indexing in the new DB. If using a framework like LangChain or LlamaIndex, swapping the vector DB component is relatively straightforward.

Q: Are there free vector DBs?

A: Chroma (OSS) and Faiss (OSS) are fully free. The cloud offerings of Pinecone, Weaviate, and Qdrant all have free tiers. pgvector is available at no additional cost if PostgreSQL is already in use. For minimal cost, self-hosted Chroma or pgvector are the main options.

Q: Is data stored in vector DBs encrypted?

A: Managed services like Pinecone provide encryption in transit (TLS) and at rest. For self-hosted deployments, encryption must be configured at the infrastructure level. When handling confidential documents, select the hosting location according to the applicable security policy.

  1. Chroma — Official Documentation
  2. Pinecone — Official Documentation
  3. Weaviate — Official Documentation
  4. Qdrant — Official Documentation
  5. pgvector — GitHub
  6. Faiss — GitHub