Introduction to Databases

Almost every web application stores data: user accounts, posts, orders, settings. Databases are the systems purpose-built to store, retrieve, and manage that data reliably. Understanding what a database is and how the two major categories (relational and NoSQL) differ is foundational for any engineer.

Target reader: Beginners who have not worked with databases before.
Estimated time: 15 min read
Prerequisites: Servers and Clients (recommended)

What a Database Is

A database is an organized collection of data, along with software (a database management system, or DBMS) for storing, retrieving, and managing it efficiently.

The key properties that distinguish a database from a plain file:

| Property          | Database                                                         | Plain file                                             |
|-------------------|------------------------------------------------------------------|--------------------------------------------------------|
| Speed             | Indexed lookups take milliseconds even with millions of rows     | Reading a large file requires scanning from the start  |
| Concurrent access | Multiple users can read/write simultaneously without corruption  | Files are not designed for concurrent writes           |
| Reliability       | Transactions ensure partial writes do not corrupt data           | A crash mid-write can leave a file in a broken state   |
| Query power       | Structured queries filter and join data efficiently              | Text search is limited and slow                        |

Why Not Just Use Files?

For very small projects, storing data in a text or JSON file is sometimes acceptable. The problems emerge quickly:

Search: Finding a user by email in a 100,000-line file means reading every line.
Updates: Changing one record usually means rewriting the entire file, and a crash partway through the rewrite can leave it corrupted.
Concurrency: Two requests writing to the same file at the same time can corrupt each other’s data.
Relationships: Expressing “this order belongs to this user” across files requires manual management.

Databases solve all of these problems. For any application beyond a small prototype, a database is the correct choice.
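
To make the search problem concrete, the sketch below contrasts a line-by-line scan with an indexed lookup. It uses Python's built-in sqlite3 module with an in-memory database. The users table, the generated email addresses, and the function names are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical data: 100,000 users, as (id, email) pairs.
users = [(i, f"user{i}@example.com") for i in range(100_000)]

# The same data as the lines of a flat text file would hold it.
lines = [f"{user_id},{email}" for user_id, email in users]


def find_in_file(email):
    """Linear scan: in the worst case every line is examined."""
    for line in lines:
        user_id, line_email = line.split(",")
        if line_email == email:
            return int(user_id)
    return None


# The same data in a database table. The UNIQUE constraint makes SQLite
# build an index on email, so a lookup does not have to visit every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.executemany("INSERT INTO users (id, email) VALUES (?, ?)", users)


def find_in_db(email):
    """Indexed lookup: the database locates the row through the index."""
    row = conn.execute(
        "SELECT id FROM users WHERE email = ?", (email,)
    ).fetchone()
    return row[0] if row else None


print(find_in_file("user99999@example.com"))  # 99999
print(find_in_db("user99999@example.com"))    # 99999
```

Both functions return the same answer; the difference is how much data each has to touch to find it, and that gap widens as the data grows.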

What a Relational Database (RDB) Is

A relational database (RDB) organizes data into tables — similar to spreadsheets — where:

A table stores one type of entity (users, products, orders).
A row is one record (one user, one product).
A column is one attribute (name, email, price).
Foreign keys link rows across tables (an order row references the user row that placed it).

Relational databases are queried using SQL (Structured Query Language), a declarative language designed specifically for this purpose.

Example: Product Table for an E-Commerce Site

Table: products

| id  | name             | price  | category    | in_stock |
|-----|------------------|--------|-------------|----------|
| 1   | Wireless Mouse   | 29.99  | Electronics | true     |
| 2   | Notebook (A5)    | 4.99   | Stationery  | true     |
| 3   | USB-C Hub        | 49.99  | Electronics | false    |

A SQL query to find all in-stock electronics under $50:

SELECT name, price
FROM products
WHERE category = 'Electronics'
  AND in_stock = true
  AND price < 50.00;

Result:

| name           | price |
|----------------|-------|
| Wireless Mouse | 29.99 |
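
The query above can be tried without installing a database server. The following is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The column types are assumptions, and because SQLite stores booleans as the integers 0 and 1, the condition is written as in_stock = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        price    REAL NOT NULL,     -- assumed type; real systems often use a decimal type for money
        category TEXT NOT NULL,
        in_stock INTEGER NOT NULL   -- SQLite stores booleans as 0 or 1
    )
    """
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Wireless Mouse", 29.99, "Electronics", 1),
        (2, "Notebook (A5)", 4.99, "Stationery", 1),
        (3, "USB-C Hub", 49.99, "Electronics", 0),
    ],
)

# The same query as in the text, with the boolean written as 1.
rows = conn.execute(
    """
    SELECT name, price
    FROM products
    WHERE category = 'Electronics'
      AND in_stock = 1
      AND price < 50.00
    """
).fetchall()

print(rows)  # [('Wireless Mouse', 29.99)]
```

The USB-C Hub matches the category and price conditions but is excluded because it is out of stock.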

Tables can also reference one another. An orders table, for example, points at rows in other tables:

Table: orders

| id  | user_id | product_id | quantity | ordered_at          |
|-----|---------|------------|----------|---------------------|
| 101 | 55      | 1          | 2        | 2026-03-10 14:23:00 |
| 102 | 82      | 2          | 5        | 2026-03-11 09:15:00 |

user_id and product_id are foreign keys — they reference rows in the users and products tables, linking the data together without duplication.
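
As a sketch of how foreign keys behave in practice, the example below declares the references in SQLite (through Python's sqlite3 module), joins the three tables, and then tries to insert an order for a user that does not exist. The users table's columns and the names Alice and Bob are hypothetical; the original only gives the user IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id         INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(id),
        product_id INTEGER NOT NULL REFERENCES products(id),
        quantity   INTEGER NOT NULL,
        ordered_at TEXT NOT NULL
    );
    INSERT INTO users VALUES (55, 'Alice'), (82, 'Bob');  -- hypothetical names
    INSERT INTO products VALUES (1, 'Wireless Mouse'), (2, 'Notebook (A5)');
    INSERT INTO orders VALUES
        (101, 55, 1, 2, '2026-03-10 14:23:00'),
        (102, 82, 2, 5, '2026-03-11 09:15:00');
    """
)

# Follow the foreign keys to combine data from all three tables.
rows = conn.execute(
    """
    SELECT orders.id, users.name, products.name, orders.quantity
    FROM orders
    JOIN users    ON users.id = orders.user_id
    JOIN products ON products.id = orders.product_id
    ORDER BY orders.id
    """
).fetchall()
print(rows)
# [(101, 'Alice', 'Wireless Mouse', 2), (102, 'Bob', 'Notebook (A5)', 5)]

# An order that points at a user who does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO orders VALUES (103, 999, 1, 1, '2026-03-12 08:00:00')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```

The user's name is stored once, in users, and every order reaches it through user_id rather than carrying its own copy.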

What NoSQL Is

NoSQL (often expanded as “Not Only SQL”) is an umbrella term for databases that do not use the relational table model as their primary way of storing data. One of the most common formats is a document: a JSON-like object that can have nested fields.

A product document in a NoSQL database:

{
  "id": "abc123",
  "name": "Wireless Mouse",
  "price": 29.99,
  "category": "Electronics",
  "tags": ["wireless", "USB", "ergonomic"],
  "specs": {
    "dpi": 1600,
    "buttons": 6,
    "battery_life_hours": 12
  }
}

Notice that specs is a nested object and tags is an array — structures that do not map cleanly to a single table row.
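
The following sketch shows the same point from the application's side, using only Python's json module rather than any particular NoSQL database. Reading nested values from the document is direct, while fitting the tags into a relational model takes a second set of rows. The tuple layouts for product_row and tag_rows are assumptions chosen for illustration.

```python
import json

document = json.loads(
    """
    {
      "id": "abc123",
      "name": "Wireless Mouse",
      "price": 29.99,
      "category": "Electronics",
      "tags": ["wireless", "USB", "ergonomic"],
      "specs": {"dpi": 1600, "buttons": 6, "battery_life_hours": 12}
    }
    """
)

# Nested fields and arrays are read directly from the document.
dpi = document["specs"]["dpi"]
is_wireless = "wireless" in document["tags"]
print(dpi, is_wireless)  # 1600 True

# In a relational model, the scalar fields fit in one row...
product_row = (
    document["id"],
    document["name"],
    document["price"],
    document["category"],
)

# ...but the tags need one row each in a separate table,
# linked back to the product by its id.
tag_rows = [(document["id"], tag) for tag in document["tags"]]
print(tag_rows)
# [('abc123', 'wireless'), ('abc123', 'USB'), ('abc123', 'ergonomic')]
```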

Other NoSQL types include key-value stores (like Redis), column-family stores (like Cassandra), and graph databases (like Neo4j).

When to Use RDB vs NoSQL

Neither type is universally better. The choice depends on the data and the use case.

| Criterion      | Relational DB                                 | NoSQL (document)                                   |
|----------------|-----------------------------------------------|----------------------------------------------------|
| Data structure | Well-defined, consistent schema               | Flexible, varies per record                        |
| Relationships  | Complex joins between many tables             | Few cross-document relationships                   |
| Consistency    | Strong (ACID transactions)                    | Often eventual consistency                         |
| Scaling        | Traditionally scaled vertically (a bigger server) | Typically easier to scale horizontally (more servers) |
| Best for       | Financial records, e-commerce, user accounts  | Content, catalogs, user activity, real-time data   |

Rule of thumb: Start with a relational database. Move to NoSQL only when there is a specific reason, such as a data volume or write rate that a single relational server cannot handle, or records whose structure genuinely varies from one to the next.

Popular Databases

| Database        | Type                | Notes                                                                        |
|-----------------|---------------------|------------------------------------------------------------------------------|
| PostgreSQL      | Relational          | Open-source, feature-rich, excellent for most production apps                |
| MySQL / MariaDB | Relational          | Widely used; powers WordPress and many legacy apps                           |
| SQLite          | Relational          | Embedded and file-based (no separate server process); great for local development and small apps |
| MongoDB         | Document (NoSQL)    | Widely used document database; flexible JSON-like documents                  |
| Redis           | Key-value (NoSQL)   | In-memory, extremely fast; used for caching and sessions                     |
| Supabase        | Relational (hosted) | PostgreSQL as a service with a developer-friendly dashboard                  |

Where Databases Fit in Web Architecture

A database sits behind the application server, never directly exposed to the browser:

Browser (client)
     |
     |  HTTP request
     v
Web / Application Server
     |
     |  SQL query  /  ORM call
     v
Database Server
     |
     |  Query result
     v
Application Server  (formats response)
     |
     |  HTTP response (HTML or JSON)
     v
Browser (client)

The browser never talks directly to the database. The application server acts as the intermediary, applying authentication, authorization, and business logic before touching data.
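
A minimal sketch of that intermediary role follows. The get_order function, its status codes, and the orders data are hypothetical; a real application would sit behind a web framework and a proper login system. The point is the order of the checks: the server decides who is asking and what they may see, and only then returns data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(101, 55, 2), (102, 82, 5)]
)


def get_order(current_user_id, order_id):
    """Hypothetical request handler: returns (status_code, body)."""
    # Authentication: is there a logged-in user at all?
    if current_user_id is None:
        return 401, {"error": "not logged in"}

    # The query uses a placeholder (?) so user input is never
    # pasted into the SQL text.
    row = conn.execute(
        "SELECT id, user_id, quantity FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return 404, {"error": "order not found"}

    # Authorization: users may only see their own orders.
    if row[1] != current_user_id:
        return 403, {"error": "not your order"}

    # Business logic and response formatting would go here.
    return 200, {"id": row[0], "quantity": row[2]}


print(get_order(55, 101))    # (200, {'id': 101, 'quantity': 2})
print(get_order(82, 101))    # (403, {'error': 'not your order'})
print(get_order(None, 101))  # (401, {'error': 'not logged in'})
```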

Summary

A database stores and retrieves data reliably and efficiently — better than files for almost all production uses.
Relational databases organize data into tables with rows and columns, linked by foreign keys, and queried with SQL.
NoSQL databases use flexible formats (often JSON-like documents) suited to variable data structures.
Start with a relational database; switch to NoSQL only when there is a concrete reason.
Databases sit behind the application server and are never directly accessible from the browser.

FAQ

Q: Do I need to learn SQL?

A: Yes, at some point. Even when using an ORM (Object-Relational Mapper) that abstracts SQL, understanding what queries are generated and how indexes work is essential for debugging performance issues. SQL is a practical, durable skill.
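
As one illustration of why this matters, SQLite can report how it intends to run a query. The sketch below asks for the plan before and after an index is created. The index name idx_products_category and the helper query_plan are assumptions for this example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL, category TEXT)"
)


def query_plan(sql):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


sql = "SELECT name, price FROM products WHERE category = 'Electronics'"

# Without an index, the only option is to scan the whole table.
plan_before = query_plan(sql)

# With an index on category, SQLite can search the index instead.
conn.execute("CREATE INDEX idx_products_category ON products (category)")
plan_after = query_plan(sql)

print(plan_before)  # describes a scan of the products table
print(plan_after)   # describes a search using idx_products_category
```

Other databases offer a similar facility, usually under a name such as EXPLAIN, and reading its output is a common first step when a query is slow.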

Q: What is an ORM?

A: An ORM (Object-Relational Mapper) is a library that lets a developer work with database rows as objects in the application's programming language, generating the SQL behind the scenes instead of requiring it to be written by hand. Prisma, Drizzle, SQLAlchemy, and ActiveRecord are popular examples. ORMs are convenient but do not replace the need to understand SQL.
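
The idea can be sketched with a hand-written toy. This is not the API of any of the libraries named above; Product and ProductRepository are hypothetical names, and a real ORM generates the SQL automatically from the class definition rather than having it typed out. What it shows is the shape of the exchange: the calling code deals in objects, and SQL appears only inside the mapping layer.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Product:
    id: int
    name: str
    price: float


class ProductRepository:
    """Toy stand-in for an ORM: objects go in and come out, SQL stays inside."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS products "
            "(id INTEGER PRIMARY KEY, name TEXT, price REAL)"
        )

    def add(self, product):
        # Object -> row
        self.conn.execute(
            "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
            (product.id, product.name, product.price),
        )

    def cheaper_than(self, limit):
        # Rows -> objects
        rows = self.conn.execute(
            "SELECT id, name, price FROM products WHERE price < ? ORDER BY price",
            (limit,),
        )
        return [Product(*row) for row in rows]


repo = ProductRepository(sqlite3.connect(":memory:"))
repo.add(Product(1, "Wireless Mouse", 29.99))
repo.add(Product(2, "Notebook (A5)", 4.99))
repo.add(Product(3, "USB-C Hub", 49.99))

cheap = repo.cheaper_than(30)
print([p.name for p in cheap])  # ['Notebook (A5)', 'Wireless Mouse']
```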

Q: How does a database ensure that data is not corrupted if the server crashes?

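A: Relational databases use transactions with ACID properties (Atomicity, Consistency, Isolation, Durability). An atomic transaction either completes entirely or is rolled back, so no partial state is left behind. Many databases also use write-ahead logging (WAL): each change is recorded in a log on disk before it is applied to the main data files, so after a crash the database can replay the log and recover every committed transaction.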
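
Atomicity can be observed directly. In the sketch below, a transfer between two accounts is made of two updates, and the second one fails partway through. The accounts table, the names, and the transfer function are hypothetical. The example demonstrates rollback only; it does not demonstrate crash recovery or write-ahead logging.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # balances may not go negative
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(amount):
    """Move money from alice to bob as one atomic unit."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def current_balances():
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Alice cannot afford 500: the second update violates the CHECK constraint,
# so the first update (crediting bob) is undone as well.
ok = transfer(500)
balances = current_balances()
print(ok, balances)  # False {'alice': 100, 'bob': 0}

# A transfer that alice can afford goes through as a whole.
ok_small = transfer(30)
balances_after = current_balances()
print(ok_small, balances_after)  # True {'alice': 70, 'bob': 30}
```

Without the transaction, the failed transfer would have left bob with 500 that was never taken from anyone.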
