Deployment & CI/CD

Reading time: about 10 minutes

Audience: engineers learning to containerize applications and build CI/CD pipelines, and anyone who wants to set up automated deployment

Prerequisite: familiarity with the overall structure described in Cloud Architecture Overview is helpful

Deployment is the process of getting a developed application running in a production environment. Containers address the “works on my machine” problem by shipping the application together with its dependencies. CI/CD (Continuous Integration / Continuous Deployment) is a pipeline that automatically tests, builds, and deploys code changes. With CI/CD in place, errors from manual deployment steps are largely removed and releases can happen safely many times a day.

Why Invest in the Deployment Process

Manual deployment has these problems:

Behavioral inconsistencies caused by environment differences (OS, library versions) between development and production
Human error from runbooks or knowledge siloed in individuals
Service downtime on every deployment
No way to track which version is running in production

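Containers and CI/CD address these problems together. Containers remove the environment differences, while a CI/CD pipeline replaces manual steps with automation, records which version was deployed, and, combined with the deployment strategies described later, avoids downtime.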

Containers (Docker)

A container is an isolated execution environment that bundles an application with all its dependencies (libraries, runtime, configuration files). The goal is “runs the same way anywhere”: the same image runs unchanged on any host with a compatible container runtime.

Docker is the most widely used tool for creating and running containers.

VM vs Container

Comparison	Virtual Machine (VM)	Container
Startup time	Minutes	Seconds
Size	Several GB (includes OS)	Tens to hundreds of MB
Resource efficiency	Low (runs entire OS)	High (shares host OS kernel)
Isolation level	Strong (full OS isolation)	Medium (process-level isolation)
Portability	Low (VM images are heavy)	High (container images run anywhere)

Dockerfile Basics

A Dockerfile is a text file that describes the steps to build a container image.

# Dockerfile example for a Python FastAPI app

# Base image (lightweight Python 3.12)
FROM python:3.12-slim

# Working directory inside the container
WORKDIR /app

# Copy the dependency file first so the install layer is cached between builds
COPY requirements.txt .
# Install libraries
RUN pip install --no-cache-dir -r requirements.txt

# Copy the app source code
COPY . .

# Port the app listens on (documentation only; publish it with docker run -p)
EXPOSE 8000

# Start the ASGI server on all interfaces
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

# Build the image
docker build -t my-ai-app:latest .

# Start the container (for local verification)
docker run -p 8000:8000 --env-file .env my-ai-app:latest
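
The CMD line in the Dockerfile tells uvicorn to import an object named app from main.py. As a minimal sketch of what that file could contain, the following uses a bare ASGI callable with no third-party dependencies. This is an assumption for illustration: the article's app is FastAPI-based and its real main.py is not shown, and the /health path is hypothetical.

```python
# main.py: hypothetical stand-in for the FastAPI app that the Dockerfile starts.
import json


async def app(scope, receive, send):
    """Bare ASGI callable; `uvicorn main:app` imports this name from main.py."""
    if scope["type"] != "http":
        return  # this sketch ignores non-HTTP scopes such as lifespan

    # /health is an assumed path, useful for smoke tests and load balancer checks.
    if scope["path"] == "/health":
        status, payload = 200, {"status": "ok"}
    else:
        status, payload = 404, {"detail": "not found"}

    body = json.dumps(payload).encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode("ascii")),
        ],
    })
    await send({"type": "http.response.body", "body": body})
```

With this file next to the Dockerfile and uvicorn listed in requirements.txt, the docker run command above should answer requests to /health on port 8000.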

Container Registries

A container registry is a service for storing and distributing Docker images. Just as GitHub manages code, a registry manages container images.

Registry	Characteristics
Docker Hub	The most well-known. Rich in public images
Amazon ECR	Easy integration with AWS. Used alongside ECS/EKS
GitHub Container Registry	Simple integration with GitHub Actions
Google Artifact Registry	Integrated with GCP
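
Whichever registry is used, an image is addressed as registry host, repository, and tag. The sketch below shows how that name is assembled. The helper names are hypothetical, and the account ID and owner in the usage note are placeholders.

```python
def image_ref(registry: str, repository: str, tag: str) -> str:
    """Build the full name passed to `docker build -t` and `docker push`."""
    if not tag:
        raise ValueError("tag must not be empty")
    return f"{registry.rstrip('/')}/{repository.strip('/')}:{tag}"


def ecr_registry(account_id: str, region: str) -> str:
    """Host name of a private Amazon ECR registry for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"
```

For example, image_ref(ecr_registry("123456789012", "ap-northeast-1"), "my-ai-app", "latest") yields an ECR name, while image_ref("ghcr.io", "some-owner/my-ai-app", "latest") targets GitHub Container Registry.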

The CI/CD Pipeline

CI (Continuous Integration) and CD (Continuous Deployment)

CI (Continuous Integration): Automatically runs linting, tests, and builds every time code is pushed
CD (Continuous Deployment): Automatically deploys code that passes tests to staging and production

CI/CD Pipeline with GitHub Actions

graph LR
    Push["Developer pushes\n(feature → main)"] --> CI["GitHub Actions\nCI Job"]

    CI --> Lint["① Lint check\n(flake8 / eslint)"]
    Lint --> Test["② Run tests\n(pytest / jest)"]
    Test --> Build["③ Build Docker\nimage"]
    Build --> Push2["④ Push image\nto registry"]

    Push2 --> DeployStaging["⑤ Deploy to\nstaging environment"]
    DeployStaging --> SmokeTest["⑥ Smoke test\n(basic operation check)"]

    SmokeTest -->|Pass| DeployProd["⑦ Deploy to\nproduction"]
    SmokeTest -->|Fail| Rollback["Automatic rollback"]

    DeployProd --> Notify["⑧ Slack notification\n(deployment complete)"]
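
The control flow in the diagram (run stages in order, stop at the first failure, roll back if something has already been deployed) can be sketched in a few lines. This is an illustrative model, not how GitHub Actions is implemented. The stage names and the run_pipeline function are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(
    stages: Sequence[Stage],
    rollback: Callable[[], None],
    deploy_stage: str = "deploy-staging",
) -> Tuple[List[str], bool]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed and whether all of them did.
    rollback() is called only if the failure happens after deploy_stage passed.
    """
    completed: List[str] = []
    for name, step in stages:
        if not step():
            if deploy_stage in completed:
                rollback()
            return completed, False
        completed.append(name)
    return completed, True
```

A failing lint or test stage simply stops the run, while a failing smoke test also triggers the rollback, matching the Fail branch in the diagram.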

GitHub Actions Workflow Example

# .github/workflows/deploy.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install flake8 pytest   # in case requirements.txt lists runtime dependencies only

      - name: Run linter
        run: flake8 src/

      - name: Run tests
        run: pytest tests/ -v
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

  build-and-deploy:
    needs: test                  # Runs only if the test job succeeds
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'  # Only on push to main

    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ap-northeast-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

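      - name: Build and push Docker image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          # Tag with the commit SHA for traceability and with latest for the deploy step
          docker build -t $ECR_REGISTRY/my-ai-app:$IMAGE_TAG -t $ECR_REGISTRY/my-ai-app:latest .
          docker push $ECR_REGISTRY/my-ai-app:$IMAGE_TAG
          docker push $ECR_REGISTRY/my-ai-app:latest

      - name: Deploy to ECS
        run: |
          # Assumes the ECS task definition references my-ai-app:latest;
          # --force-new-deployment restarts the tasks so they pull that tag again
          aws ecs update-service \
            --cluster production \
            --service ai-chat-service \
            --force-new-deployment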
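
The workflow above stops at the deploy step and does not include the smoke test from the pipeline diagram. A minimal sketch of such a check follows, using only the standard library. The /health path and the function names are assumptions, and the staging URL would have to come from your own setup.

```python
import urllib.error
import urllib.request
from typing import Callable, List, Sequence


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, timeout, ...


def smoke_test(
    base_url: str,
    paths: Sequence[str] = ("/health",),
    fetch: Callable[[str], int] = http_status,
) -> List[str]:
    """Check each path and return a description of every failed check."""
    failures: List[str] = []
    for path in paths:
        url = base_url.rstrip("/") + path
        status = fetch(url)
        if status != 200:
            failures.append(f"{url} -> {status}")
    return failures
```

A workflow step could run a small script that calls smoke_test with the staging URL and exits with a non-zero code when the returned list is not empty, which fails the job before production is touched.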

Deployment Environments

Production AI services typically have three environments.

development (dev environment)
  ↓ Pull request created
staging (staging environment)
  ↓ After merge to main, behavioral verification
production (prod environment)

Environment	Purpose	Infrastructure Scale
development	Developer’s local machine or personal dev server	Minimal
staging	Behavioral verification in a production-like configuration	Reduced version of production
production	The environment real users use	Full scale

Deployment Strategies

Three strategies minimize service downtime during deployment.

Strategy	Mechanism	Downtime	Rollback Speed	Complexity
Rolling Update	Replace old instances one by one with the new version	None	Medium	Low
Blue-Green	Keep the old environment (Blue) running, prepare new environment (Green), then switch all traffic at once	None	Fast (immediate rollback)	Medium
Canary Release	Route a fraction of traffic (e.g., 5%) to the new version, then gradually increase if no issues	None	Fast	High
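
To make the canary row concrete, here is a sketch of percentage-based routing. In practice the split is usually configured in a load balancer or service mesh rather than in application code. The function names and the hashing scheme are assumptions chosen for illustration.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_version(user_id: str, canary_percent: int) -> str:
    """Route roughly canary_percent of users to the canary, the rest to stable."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return "canary" if bucket(user_id) < canary_percent else "stable"
```

Hashing the user ID instead of picking at random keeps each user on the same version between requests, and raising the percentage only ever adds users to the canary group.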

Blue-Green Deployment Visualization

Before deployment:
  Traffic 100% → [Blue Environment: v1.2]

During deployment:
  Launch and test [Green Environment: v1.3]
  Traffic 100% → [Blue Environment: v1.2]

After switch:
  Traffic 100% → [Green Environment: v1.3]
  [Blue Environment: v1.2] retained for a period (for rollback)
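
The same sequence can be modeled as a small state machine. This is a toy sketch: the BlueGreenRouter class is hypothetical, and in a real setup the switch is typically a load balancer or DNS change.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BlueGreenRouter:
    versions: Dict[str, str]        # environment name -> deployed version
    live: str = "blue"              # environment receiving 100% of traffic
    standby: Optional[str] = None   # previous live environment, kept for rollback

    def switch_to(self, target: str, healthy: bool) -> None:
        """Send all traffic to target, but only if it passed its checks."""
        if target not in self.versions:
            raise KeyError(target)
        if not healthy:
            raise RuntimeError(f"{target} failed its checks; traffic stays on {self.live}")
        self.standby, self.live = self.live, target

    def rollback(self) -> None:
        """Send all traffic back to the environment that was live before."""
        if self.standby is None:
            raise RuntimeError("no previous environment to roll back to")
        self.live, self.standby = self.standby, self.live

    @property
    def live_version(self) -> str:
        return self.versions[self.live]
```

Rollback is fast because the old environment is still running: reverting only changes which environment is marked live.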

Container Orchestration: Kubernetes

Kubernetes (K8s) is an orchestration system that automatically manages and scales large numbers of containers. It automates operations like “automatically restart a container that goes down” and “automatically scale out when load increases.”

Key Kubernetes concepts:

Concept	Role
Pod	The smallest deployable unit; one or more containers grouped together
Deployment	Defines the desired state of Pods (e.g., replica count)
Service	A stable network endpoint that load-balances traffic across a set of Pods
Ingress	Routes external HTTP/HTTPS traffic to Services
ConfigMap / Secret	Manages configuration values and secrets
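
The “desired state” idea behind a Deployment can be illustrated with a toy reconciliation function. It compares the desired replica count with the Pods that are actually running and corrects the difference. This sketch does not use the Kubernetes API. The function, the status strings, and the Pod names are hypothetical.

```python
from typing import Dict, Iterator


def reconcile(desired: int, pods: Dict[str, str], names: Iterator[str]) -> Dict[str, str]:
    """Return the Pod set after one reconciliation pass.

    pods maps a Pod name to its status ("Running" or "Failed").
    names supplies names for newly started Pods.
    """
    # Keep only healthy Pods; a crashed Pod is replaced rather than repaired.
    running = {name: status for name, status in pods.items() if status == "Running"}

    # Scale in: remove surplus Pods.
    while len(running) > desired:
        running.popitem()

    # Scale out (or replace failed Pods) until the desired count is reached.
    while len(running) < desired:
        running[next(names)] = "Running"

    return running
```

Running such a pass repeatedly is what makes “restart a container that goes down” and “scale out when load increases” automatic: operators change the desired number, and the loop does the rest.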

Kubernetes is powerful but complex. For small-to-medium AI services, I recommend starting with a managed service (Cloud Run, Fargate, Railway) before adopting Kubernetes.

Choosing a Cloud Deployment Target

Service	Characteristics	Best For
AWS ECS / Fargate	Serverless container execution. Simpler than Kubernetes	Production on AWS
Google Cloud Run	Automatically scales in response to HTTP requests	Stateless APIs on GCP
Railway	Deploy in minutes if you have a Dockerfile	Personal projects, startups
Render	Easy deployment integrated with GitHub	Personal projects, startups
Heroku	The easiest PaaS option	Small scale / prototypes
Kubernetes (EKS/GKE)	Large scale, complex requirements	Large-scale production

Environment Variables and Secret Management

Never hardcode API keys or DB passwords directly in source code.

# Local development: write to .env file and add to .gitignore
ANTHROPIC_API_KEY=sk-ant-xxxxx
DATABASE_URL=postgresql://user:pass@localhost:5432/mydb
REDIS_URL=redis://localhost:6379

# .gitignore
.env
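
On the application side, these values are read from environment variables at startup (for example, passed in with docker run --env-file .env as shown earlier). A minimal sketch that fails fast when a variable is missing and keeps secrets out of logs might look like this. The Settings class and load_settings function are hypothetical names.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # repr=False keeps secret values out of logs and tracebacks.
    anthropic_api_key: str = field(repr=False)
    database_url: str = field(repr=False)
    redis_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read required settings from the environment, failing fast if any is missing."""
    source = os.environ if env is None else env
    required = ("ANTHROPIC_API_KEY", "DATABASE_URL", "REDIS_URL")
    missing = [name for name in required if not source.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(
        anthropic_api_key=source["ANTHROPIC_API_KEY"],
        database_url=source["DATABASE_URL"],
        redis_url=source["REDIS_URL"],
    )
```

Because the code only looks at environment variables, the same image works locally with a .env file and in production with values injected from a secret management service.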

In production, use a dedicated secret management service instead of placing a .env file on the server.

Service	Description
AWS Secrets Manager	Manages secrets including rotation
AWS SSM Parameter Store	Simple key-value format secrets
Google Secret Manager	GCP secret management
GitHub Actions Secrets	Secrets used only within CI pipelines
Vault (HashiCorp)	Cloud-agnostic OSS secret management

Summary

Containers (Docker) guarantee “runs the same everywhere,” solving the “works on my machine” problem
Package the app and its dependencies with a Dockerfile; distribute via a container registry
Build CI/CD with GitHub Actions to automate test → build → deploy
Rolling, blue-green, and canary are three strategies for zero-downtime deployment
Kubernetes is a powerful container orchestration tool, but Cloud Run / Fargate / Railway is the practical starting point
Never put secrets in source code; always use a secret management service

Frequently Asked Questions

Q: Do I need Kubernetes?

A: In most cases, no — not at the start. Managed services like AWS ECS/Fargate, Google Cloud Run, and Railway can deliver automatic scaling and automatic restarts. Kubernetes becomes necessary when custom scaling logic, complex networking configurations, or large-scale production requirements arise.

Q: How do I roll back a broken deployment?

A: With blue-green deployment, switch all traffic back to the old environment immediately. With a rolling update, redeploy by specifying the previous container image tag. Having a “roll back to previous SHA” job in the GitHub Actions workflow makes this straightforward; alternatively, roll back to a previous revision through the ECS or Cloud Run console.
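
A “roll back to previous SHA” job first has to decide which tag to redeploy. One way to sketch that decision, assuming you keep an ordered list of successfully deployed image tags (the function name and the history format are hypothetical):

```python
from typing import List, Optional


def previous_image_tag(history: List[str], current: str) -> Optional[str]:
    """Return the tag deployed before `current`, or None if there is none.

    history lists successfully deployed tags, oldest first.
    """
    if current not in history:
        return None
    # Position of the most recent deployment of the current tag.
    index = len(history) - 1 - history[::-1].index(current)
    # Walk backwards to the nearest earlier tag that differs from the current one.
    for tag in reversed(history[:index]):
        if tag != current:
            return tag
    return None
```

Tagging images with the commit SHA, as the workflow example does, is what makes this lookup possible: every deployed version stays addressable in the registry.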

Q: Is a staging environment strictly necessary?

A: For services with real users, it is strongly recommended. Deploying directly to production without staging eliminates any opportunity to verify behavior, and the blast radius of an incident is much larger. To keep costs down, staging can use smaller instance sizes and a minimal configuration.

Q: Are there CI/CD tools other than GitHub Actions?

A: GitLab CI/CD (for GitLab users), CircleCI (a long-established cloud CI), and Jenkins (OSS with fine-grained on-premises control) are available alternatives. For GitHub users, GitHub Actions offers the simplest integration and is free to use on public repositories.

See the references for the external specifications and background sources used on this page.[1][2][3][4][5]

References
