JSON Processing in Python

About 5 minutes

Target audience: Readers who know Python basics and want to learn how to process API responses

JSON (JavaScript Object Notation) is a lightweight, text-based format for representing data. It appears throughout software engineering: API responses, configuration files, logs, and more. Python ships with a built-in json module for reading and writing JSON, and the third-party pydantic library adds type-safe data processing with automatic validation.

JSON represents data in the following format:

{
  "name": "Alice",
  "age": 30,
  "is_active": true,
  "tags": ["python", "ai", "engineer"],
  "address": {
    "city": "Tokyo",
    "zip": "100-0001"
  },
  "notes": null
}

Type correspondence between JSON and Python:

JSON Type     Python Type      Example
object        dict             {"key": "value"}
array         list             [1, 2, 3]
string        str              "hello"
number        int / float      42, 3.14
true/false    True / False     True
null          None             None
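
The mapping in the table can be checked directly. The following is a minimal sketch; the sample document and its key names are made up for illustration.

```python
import json

# Hypothetical sample document containing one value of every JSON type
text = '{"obj": {"k": "v"}, "arr": [1, 2], "s": "hello", "i": 42, "f": 3.14, "t": true, "n": null}'

parsed = json.loads(text)

# Record the Python type that each JSON value was converted to
type_names = {key: type(value).__name__ for key, value in parsed.items()}
print(type_names)

# Converting back maps Python values to their JSON spellings;
# a tuple has no JSON counterpart and is written as an array
round_tripped = json.dumps({"flag": True, "missing": None, "pair": (1, 2)})
print(round_tripped)  # {"flag": true, "missing": null, "pair": [1, 2]}
```

Note that the conversion is not perfectly symmetric: a tuple written with json.dumps() comes back as a list after json.loads().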

Python’s standard json module converts between strings and Python objects.

json.loads(): JSON string to Python object

import json

# A JSON string (e.g., a response received from an API)
json_string = '{"name": "Alice", "age": 30, "is_active": true}'

# Convert to a Python dictionary
data = json.loads(json_string)

print(type(data))         # <class 'dict'>
print(data["name"])       # Alice
print(data["age"])        # 30
print(data["is_active"])  # True (Python bool)

json.dumps(): Python object to JSON string

import json

data = {
    "model": "claude-opus-4-5",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Hello"}
    ]
}

# Convert Python dictionary to JSON string
json_string = json.dumps(data)
print(json_string)
# {"model": "claude-opus-4-5", "max_tokens": 1024, "messages": [{"role": "user", "content": "Hello"}]}

# With indentation for readability
json_string_readable = json.dumps(data, indent=2)
print(json_string_readable)
# {
#   "model": "claude-opus-4-5",
#   "max_tokens": 1024,
#   "messages": [
#     {
#       "role": "user",
#       "content": "Hello"
#     }
#   ]
# }

Reading and writing JSON files with json.dump() and json.load():

import json

# Write to a JSON file
data = {"version": "1.0", "settings": {"theme": "dark", "language": "en"}}

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

# Read from a JSON file
with open("config.json", "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded["settings"]["theme"])  # dark

API responses often have nested structures:

import json

# Example modeled on a GitHub API response
response_json = """
{
  "id": 12345,
  "name": "my-project",
  "owner": {
    "login": "alice",
    "type": "User"
  },
  "topics": ["ai", "python", "api"],
  "license": {
    "name": "MIT License",
    "spdx_id": "MIT"
  },
  "private": false,
  "stargazers_count": 128,
  "homepage": null
}
"""

repo = json.loads(response_json)

# Accessing nested data
print(repo["owner"]["login"])         # alice
print(repo["topics"][0])              # ai
print(repo["license"]["name"])        # MIT License

# Use get() to safely access keys that may not exist
homepage = repo.get("homepage")               # None (no KeyError)
description = repo.get("description", "N/A")  # Specify a default value
print(homepage)     # None
print(description)  # N/A
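
A single get() call is not enough when an intermediate value may itself be null. The sketch below assumes a hypothetical response in which "license" is null and "owner" is absent; the `or {}` fallback is one common idiom, not the only option.

```python
import json

# Hypothetical response: "license" is null and "owner" is missing entirely
repo = json.loads('{"name": "my-project", "license": null}')

# repo["license"]["name"] would raise TypeError because the value is None.
# `or {}` substitutes an empty dict for both a missing key and a null value.
license_name = (repo.get("license") or {}).get("name", "N/A")
owner_login = (repo.get("owner") or {}).get("login", "N/A")

print(license_name)  # N/A
print(owner_login)   # N/A
```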

Processing a JSON array of objects:

import json

# A JSON list of users
users_json = """
[
  {"id": 1, "name": "Alice", "role": "admin"},
  {"id": 2, "name": "Bob",   "role": "user"},
  {"id": 3, "name": "Carol", "role": "user"}
]
"""

users = json.loads(users_json)

# Extract all names
names = [user["name"] for user in users]
print(names)  # ['Alice', 'Bob', 'Carol']

# Filter by role
admins = [user for user in users if user["role"] == "admin"]
print(admins)  # [{'id': 1, 'name': 'Alice', 'role': 'admin'}]

A common mistake is to index the raw JSON string before parsing it:

import json

# Forgetting json.loads() leaves the data as a plain string
raw = '{"name": "Alice"}'
try:
    print(raw["name"])
except TypeError as e:
    print(f"TypeError: {e}")  # A string cannot be indexed by key

data = json.loads(raw)
print(data["name"])  # OK: Alice

Note the difference between json.loads() and json.load():

import json

# loads() → converts from a string (the "s" stands for string)
data = json.loads('{"key": "value"}')

# load() → converts from a file object
with open("data.json", encoding="utf-8") as f:
    data = json.load(f)

Non-ASCII characters are escaped by default:

import json

data = {"city": "Tōkyō", "greeting": "Bonjour"}

# By default, non-ASCII characters are Unicode-escaped
print(json.dumps(data))
# {"city": "T\u014dky\u014d", "greeting": "Bonjour"}

# Use ensure_ascii=False to preserve non-ASCII characters
print(json.dumps(data, ensure_ascii=False))
# {"city": "Tōkyō", "greeting": "Bonjour"}

pydantic uses Python type hints to automatically validate and convert data. It is especially useful when working with API responses.

pip install pydantic

Define a model with type hints, then create an instance from a dictionary:

from pydantic import BaseModel
from typing import Optional

# Define the data model
class Owner(BaseModel):
    login: str
    type: str

class Repository(BaseModel):
    id: int
    name: str
    owner: Owner
    topics: list[str] = []               # Default value
    private: bool = False
    stargazers_count: int = 0
    homepage: Optional[str] = None       # Allow None
    description: Optional[str] = None

# Create a model instance from a dictionary
data = {
    "id": 12345,
    "name": "my-project",
    "owner": {"login": "alice", "type": "User"},
    "topics": ["ai", "python"],
    "stargazers_count": 128,
    "homepage": None,
    "private": False
}

repo = Repository(**data)

# Access fields with type safety
print(repo.name)              # my-project
print(repo.owner.login)       # alice (nested model auto-converted)
print(repo.topics)            # ['ai', 'python']
print(repo.homepage)          # None

Invalid data raises a ValidationError:

from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str
    age: int

# Pass invalid data
try:
    user = User(id="not-an-int", name="Alice", age=30)
except ValidationError as e:
    print(e)
    # 1 validation error for User
    # id
    #   Input should be a valid integer, unable to parse string as an integer
    #   [type=int_parsing, input_value='not-an-int', ...]

When a type mismatch occurs, a clear, descriptive error message is produced.
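
Besides the printed message, the exception also exposes the problems as structured data through errors(). The sketch below assumes pydantic v2 is installed; `summarize_errors` is a hypothetical helper introduced only for this illustration.

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    name: str
    age: int

def summarize_errors(payload: dict) -> list[str]:
    """Return one 'field: message' line per validation problem (hypothetical helper)."""
    try:
        User(**payload)
    except ValidationError as exc:
        # Each entry in errors() is a dict with keys such as "loc" and "msg"
        return [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
    return []

# "id" has the wrong type and "age" is missing, so two problems are reported
problems = summarize_errors({"id": "not-an-int", "name": "Alice"})
for line in problems:
    print(line)
```

Returning the messages as a list makes it easy to log them or send them back to an API client instead of printing the raw exception.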

Practical Example: Parsing an API Response with pydantic

import requests
from pydantic import BaseModel
from typing import Optional

class GitHubUser(BaseModel):
    login: str
    id: int
    name: Optional[str] = None
    company: Optional[str] = None
    public_repos: int = 0
    followers: int = 0

def get_github_user(username: str) -> GitHubUser | None:
    """Fetch GitHub user info and return it as a pydantic model"""
    try:
        response = requests.get(
            f"https://api.github.com/users/{username}",
            headers={"Accept": "application/vnd.github.v3+json"},
            timeout=10
        )
        response.raise_for_status()

        # Convert the response dictionary to a pydantic model
        # Fields not defined in the model are ignored by default
        return GitHubUser(**response.json())

    except requests.exceptions.RequestException as e:
        print(f"API call error: {e}")
        return None

# Run it
user = get_github_user("torvalds")
if user:
    print(f"Name: {user.name}")
    print(f"Public repos: {user.public_repos}")
    print(f"Followers: {user.followers:,}")

To convert a model back to a dictionary or JSON string:

user_dict = user.model_dump()
user_json = user.model_dump_json()
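
The round trip can be tried without a network call. This is a minimal sketch assuming pydantic v2; the payload is invented, and "site_admin" stands in for any field that the model does not declare.

```python
from typing import Optional
from pydantic import BaseModel

class GitHubUser(BaseModel):
    login: str
    id: int
    name: Optional[str] = None
    public_repos: int = 0

# Hypothetical payload; "site_admin" is not declared on the model
payload = {"login": "alice", "id": 1, "public_repos": 3, "site_admin": False}
user = GitHubUser(**payload)

user_dict = user.model_dump()       # Plain dict; the undeclared field is dropped
user_json = user.model_dump_json()  # JSON string

# model_validate_json() parses a JSON string straight into a model
restored = GitHubUser.model_validate_json(user_json)

print(user_dict)
print(restored == user)  # True
```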

Key points:

  • json.loads() converts a JSON string to a dictionary; json.dumps() converts a dictionary to a JSON string
  • Use json.load() / json.dump() for file operations
  • Use ensure_ascii=False when working with non-ASCII characters
  • pydantic’s BaseModel provides type-safe data processing and validation
  • Convert an API response dictionary to a pydantic model with Model(**response.json())

As a next step, see “Python AI SDK in Practice” to learn how to combine pydantic with Anthropic API responses.

Q: How should I choose between the json module and pydantic?

A: The json module is sufficient for simple configuration file reading and writing. pydantic is valuable for processing API responses and when type checking or validation is required. FastAPI uses pydantic internally, making the integration seamless.

Q: How do I handle dates in JSON (e.g., “2026-05-13T09:00:00Z”)?

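A: json.loads() does not convert ISO 8601 date strings to datetime objects; they remain plain strings. With pydantic, declaring the field as datetime triggers automatic conversion. With the standard library, use datetime.fromisoformat(). Note that it accepts the trailing “Z” only on Python 3.11 and later; on earlier versions, replace “Z” with “+00:00” first.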
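
A minimal standard-library sketch of that conversion follows. The "created_at" key is an assumption for illustration, and the replace() call is there only so the code also runs on Python versions before 3.11.

```python
import json
from datetime import datetime, timezone

# Hypothetical payload with an ISO 8601 timestamp
payload = json.loads('{"created_at": "2026-05-13T09:00:00Z"}')
raw = payload["created_at"]  # Still a plain str after json.loads()

# Replacing the trailing "Z" keeps this working on Python versions before 3.11
created_at = datetime.fromisoformat(raw.replace("Z", "+00:00"))

print(type(raw).__name__)      # str
print(created_at.isoformat())  # 2026-05-13T09:00:00+00:00
```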

Q: How do I efficiently process large JSON files?

A: For files larger than a few hundred MB, avoid loading the whole file into memory with json.load(). The ijson library parses the file as a stream instead, so only a small part is held in memory at a time.

Q: What changed between pydantic v1 and v2?

A: pydantic v2 (released in 2023) introduced breaking API changes. For example, dict() became model_dump() and json() became model_dump_json(). New projects should use v2.

See the references for the external specifications and background sources used on this page.[1][2][3]

  1. json module (Python official docs)
  2. pydantic official documentation
  3. JSON specification (RFC 8259)