Database Design Patterns

About 10 minutes

Engineers learning data design for AI systems, those who want to understand the different database types and when to use each

Understanding the overall structure from Cloud Architecture Overview will help

A database is a system for persistently storing, retrieving, and updating data. “Which database to choose” directly affects system performance, reliability, and development cost. AI systems handle data with very different characteristics — user information, conversation history, document embedding vectors, and session caches — making a multi-database design that matches each purpose standard practice.

Database Types

Databases used in modern system development fall into six major categories.

Relational DB (SQL)

A relational database (RDB) manages data in tabular rows and columns and uses SQL (Structured Query Language) for data operations. Multiple tables are connected via “foreign keys” and data can be combined with JOIN operations.

Key characteristics:

ACID properties: guarantees Atomicity, Consistency, Isolation, and Durability
Schema (table definition) is required; data types and constraints are strict
Complex join queries are a strength

Examples: PostgreSQL (the leading open-source option), MySQL, Cloud SQL, Aurora

Document DB (NoSQL)

A document database stores data in flexible structures (documents) like JSON. Without a fixed schema, it handles data with varying fields more easily.

{
  "_id": "user_123",
  "name": "Taro Tanaka",
  "preferences": {
    "language": "en",
    "theme": "dark"
  },
  "tags": ["premium", "beta-tester"]
}

Examples: MongoDB, Firebase Firestore

Key-Value DB

A key-value database stores data in simple key (name) and value (data) pairs. The structure is simple, but because it operates in memory it is extremely fast.

Examples: Redis (most widely used), Memcached

Vector Database

A vector database stores “embedding vectors” (arrays of floating-point numbers) produced when AI models numerically represent text, images, or audio, and enables semantic similarity search. It is essential for implementing RAG (Retrieval-Augmented Generation).

See RAG and Vector Databases for details.

Examples: Pinecone, Weaviate, Chroma, pgvector (PostgreSQL extension)

Time-Series DB

A time-series database is optimized for storing and aggregating timestamped data (metrics, logs, sensor data).

Examples: InfluxDB, TimescaleDB (PostgreSQL extension)

Graph DB

A graph database represents data as entities (nodes) and relationships (edges). It is suited for social graphs, knowledge graphs, and recommendation engines.

Examples: Neo4j

Comparison: When to Use Each DB Type

DB Type	Strengths	Weaknesses	Typical Use Case	Example DB
Relational	ACID guarantees, complex JOINs	High schema change cost	User info, orders, conversation history	PostgreSQL
Document	Flexible schema, easy horizontal scaling	Poor at JOINs	User profiles, product catalogs	MongoDB
Key-Value	Ultra-fast (microseconds)	Complex queries not possible	Sessions, caching, rate limit counters	Redis
Vector	Semantic similarity search	Inefficient for regular CRUD	Document search (RAG), similar product search	Pinecone, pgvector
Time-Series	Fast time-series aggregation	Poor at general queries	Metrics, log aggregation	InfluxDB
Graph	Cross-relational search	High learning curve	Social graphs, recommendations	Neo4j

Typical Data Architecture for AI Systems

Production AI services combine multiple databases.

graph TD
    User["User"] --> App["Application Server\n（FastAPI / Node.js）"]

    App --> PG["PostgreSQL\nUser info\nConversation history\nMetadata"]
    App --> Redis["Redis\nSession cache\nRate limit counters\nLLM response cache"]
    App --> VDB["Vector DB\nPinecone / pgvector\nDocument embeddings"]
    App --> Storage["Object Storage\nS3 / GCS\nRaw documents\nAttachments"]

    App --> LLMAPI["LLM API\nAnthropic / OpenAI"]

    PG --> PGReplica["PostgreSQL\nRead Replica (read-only)"]
    Redis --> RedisSentinel["Redis Sentinel\n(high availability)"]

Data Model Example for AI Services

PostgreSQL (main DB):

-- Users table
CREATE TABLE users (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email       TEXT UNIQUE NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Conversations table
CREATE TABLE conversations (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id     UUID NOT NULL REFERENCES users(id),
    title       TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Messages table
CREATE TABLE messages (
    id               UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    conversation_id  UUID NOT NULL REFERENCES conversations(id),
    role             TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
    content          TEXT NOT NULL,
    input_tokens     INTEGER,
    output_tokens    INTEGER,
    created_at       TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Redis (cache and sessions):

# Session info (TTL: 24 hours)
SET session:abc123 '{"user_id":"uuid-xxx","plan":"pro"}' EX 86400

# Rate limit counter (TTL: 60 seconds)
INCR ratelimit:user:uuid-xxx:2026050312
EXPIRE ratelimit:user:uuid-xxx:2026050312 60

# LLM response cache (reuse for identical prompts)
SET cache:llm:sha256:... '{"content":"...","tokens":450}' EX 3600

Connection Pooling

Connection pooling is a technique where multiple database connections are established in advance and reused across requests, rather than creating a new connection for each request.

PostgreSQL supports up to 100 connections by default, but as more application server instances run, connections can quickly hit that limit. A connection pooler like PgBouncer aggregates the many connections from application servers into a much smaller number of connections to PostgreSQL.

10 app servers × 10 connections each = 100 connections
      ↓ PgBouncer
PostgreSQL (only 20 connections)

Index Basics

An index is a data structure that enables fast lookup of specific column values. Like the index at the back of a book, it finds matching data without scanning every row (a full table scan).

-- If conversations are frequently searched by user_id:
CREATE INDEX idx_conversations_user_id ON conversations(user_id);

-- For user lookup by email (UNIQUE constraint automatically creates an index):
CREATE UNIQUE INDEX idx_users_email ON users(email);

Indexes speed up reads but add overhead to INSERT/UPDATE operations. Create indexes only on columns that are frequently used in search conditions.

Schema Migration

Schema migration is the practice of safely changing table definitions in a production database in sync with application code changes.

Principles for production migrations:

Maintain forward compatibility: Add new columns with DEFAULT or NULL rather than NOT NULL without a default (or use a default value that doesn’t conflict with existing data)
Be careful with ALTER on large tables: These can cause locks; run during off-peak hours
Prepare rollback procedures: Have a reverse migration ready before running so you can revert immediately if something goes wrong

Common tools: Alembic (Python), Flyway, Liquibase

Backup and Recovery

Metric	Meaning	Typical Target
RTO (Recovery Time Objective)	Maximum acceptable time from failure to system restoration	Production service: within 1 hour
RPO (Recovery Point Objective)	Maximum acceptable data loss period at restoration	Critical data: within 1 hour

Managed DB services (AWS RDS, Cloud SQL, Supabase) provide automated backups and point-in-time recovery (the ability to restore to any arbitrary point in time) as standard features.

Summary

PostgreSQL is highly versatile and serves as the main DB for most AI services
Redis is an essential ultra-fast key-value DB for sessions, caching, and rate limit counters
Vector DBs are necessary for RAG semantic search (pgvector is a lightweight starting point for small scale)
The standard composition for production AI services is PostgreSQL + Redis + vector DB + object storage
Connection pooling, indexes, and migration management are the foundations of production operations

Frequently Asked Questions

Q: Can I use PostgreSQL for everything?

A: In most cases, yes. With pgvector you can handle vector search, and with the TimescaleDB extension you can handle time-series data as well. Session management in PostgreSQL without Redis is also possible, but high-frequency reads and writes (like rate limit counters) perform better with Redis, which also saves on connection count.

Q: When should I add a separate cache?

A: The signal to add a cache is when “the same query is issued frequently,” “DB latency has become a bottleneck,” or “identical prompts are being sent to the LLM API repeatedly.” Adding Redis from the start increases monitoring scope, so adding it when the need arises is also a valid approach.[1][2][3][4][5]

Q: Should I choose MongoDB or PostgreSQL?

A: For most AI service use cases, PostgreSQL is recommended. Its JSONB column type handles flexible schemas too, so PostgreSQL can manage data with variable schemas. MongoDB is an option when the team already has expertise in it, or when the data is predominantly highly irregular in structure.

Q: How do I scale as data grows?

A: First add a read replica to distribute read load. Next, cache frequently accessed data in Redis. For further growth, consider partitioning (splitting the table by time period, user ID, etc.) or sharding. Most AI services are well-served at the replica + cache stage.

See the references for the external specifications and background sources used on this page.[1][2][3][4][5]

References

Observability & Monitoring

API Design & Gateway