Docker and Containers - From Fundamentals to Production

21 Nov 2025 27 min read [ docker containers devops sre linux fundamentals ]

A practical guide covering container internals, Docker architecture, networking, storage, and production-grade practices. This isn’t a beginner tutorial - we’ll go deep into how things actually work.

Chapter 1: Container Fundamentals

What Are Containers, Really?

Containers are isolated processes running on a shared kernel. They’re not lightweight VMs - they’re a clever use of Linux kernel features to create isolated environments.

Containers vs Virtual Machines

┌─────────────────────────────────────────────────────────────────────┐
│                    VIRTUAL MACHINES                                  │
├─────────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                 │
│  │    App A    │  │    App B    │  │    App C    │                 │
│  ├─────────────┤  ├─────────────┤  ├─────────────┤                 │
│  │  Bins/Libs  │  │  Bins/Libs  │  │  Bins/Libs  │                 │
│  ├─────────────┤  ├─────────────┤  ├─────────────┤                 │
│  │  Guest OS   │  │  Guest OS   │  │  Guest OS   │  ← Full OS each │
│  └─────────────┘  └─────────────┘  └─────────────┘                 │
│  ┌─────────────────────────────────────────────────┐               │
│  │              HYPERVISOR (VMware, KVM)           │               │
│  └─────────────────────────────────────────────────┘               │
│  ┌─────────────────────────────────────────────────┐               │
│  │                   HOST OS                        │               │
│  └─────────────────────────────────────────────────┘               │
│  ┌─────────────────────────────────────────────────┐               │
│  │                  HARDWARE                        │               │
│  └─────────────────────────────────────────────────┘               │
└─────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────┐
│                       CONTAINERS                                     │
├─────────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                 │
│  │    App A    │  │    App B    │  │    App C    │                 │
│  ├─────────────┤  ├─────────────┤  ├─────────────┤                 │
│  │  Bins/Libs  │  │  Bins/Libs  │  │  Bins/Libs  │                 │
│  └─────────────┘  └─────────────┘  └─────────────┘                 │
│  ┌─────────────────────────────────────────────────┐               │
│  │           CONTAINER RUNTIME (Docker)            │               │
│  └─────────────────────────────────────────────────┘               │
│  ┌─────────────────────────────────────────────────┐               │
│  │              HOST OS (Shared Kernel)            │  ← One kernel │
│  └─────────────────────────────────────────────────┘               │
│  ┌─────────────────────────────────────────────────┐               │
│  │                  HARDWARE                        │               │
│  └─────────────────────────────────────────────────┘               │
└─────────────────────────────────────────────────────────────────────┘

Aspect	Virtual Machines	Containers
Isolation Level	Hardware-level (hypervisor)	Process-level (kernel)
Boot Time	Seconds to minutes	Milliseconds to seconds
Size	GBs (full OS)	MBs (just app + deps)
Resource Overhead	High (each VM runs full OS)	Low (shared kernel)
Density	Typically tens of VMs per host	Often hundreds of containers per host
Security Isolation	Stronger (separate kernels)	Weaker (shared kernel)
Use Case	Multi-tenancy, different OS	Microservices, CI/CD

The Linux Kernel Features Behind Containers

Containers leverage three key kernel features:

1. Namespaces - Isolation of system resources

# View namespaces for a process
ls -la /proc/$$/ns/

# Types of namespaces:
# - pid    : Process IDs (container sees its own PID 1)
# - net    : Network interfaces, routing tables
# - mnt    : Mount points (filesystem)
# - uts    : Hostname and domain name
# - ipc    : Inter-process communication
# - user   : User and group IDs
# - cgroup : Control group root directory

2. Control Groups (cgroups) - Resource limits

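# View cgroup limits for a container (cgroup v1 layout)
cat /sys/fs/cgroup/memory/docker/<container-id>/memory.limit_in_bytes
cat /sys/fs/cgroup/cpu/docker/<container-id>/cpu.shares

# On cgroup v2 hosts using the systemd cgroup driver, the equivalent
# files are typically found here instead:
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/memory.max
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/cpu.weight

# cgroups control:
# - Memory limits
# - CPU shares/quotas
# - Block I/O
# - Network traffic classes (cgroup v1 net_cls; the shaping itself is done by tc)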

3. Union Filesystems - Layered storage

# Layers are stacked - each filesystem-changing Dockerfile instruction (RUN, COPY, ADD) creates a layer
# Read-only layers + thin writable layer on top = container filesystem
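
To make the layering idea concrete, here is a toy model in Python. It is not how overlayfs is implemented: each layer is just a dict mapping paths to contents, lookups fall through from the top layer down, and writes only touch the top writable layer. `collections.ChainMap` happens to behave this way. All paths and contents are made up for illustration.

```python
from collections import ChainMap

# Read-only image layers, listed bottom (base) to top. Illustrative contents only.
base_layer = {"/bin/sh": "shell", "/etc/os-release": "debian"}
deps_layer = {"/usr/lib/flask": "flask package"}
app_layer = {"/app/app.py": "print('v1')"}

# Thin writable container layer, empty when the container starts.
container_layer = {}

# ChainMap searches its maps left to right, so the writable layer comes first
# and the base layer last. Writes only ever touch the first map.
fs = ChainMap(container_layer, app_layer, deps_layer, base_layer)

# Modifying a file from a lower layer: the change lands in the container layer,
# similar in spirit to a union filesystem's copy-up.
fs["/app/app.py"] = "print('patched')"

print(fs["/app/app.py"])         # the container sees the patched file
print(app_layer["/app/app.py"])  # the image layer is untouched
print(sorted(container_layer))   # only the modified path lives in the writable layer
```

Removing the container corresponds to throwing away `container_layer`; the image layers underneath stay intact and can be shared with other containers.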

Docker Architecture

Understanding Docker’s components helps when debugging issues.

┌─────────────────────────────────────────────────────────────────────┐
│                         DOCKER CLIENT                                │
│                     (docker CLI, Docker API)                         │
│                              │                                       │
│                              │ REST API                              │
│                              ▼                                       │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                      DOCKER DAEMON (dockerd)                   │  │
│  │                                                                 │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐   │  │
│  │  │   Images    │  │ Containers  │  │      Networks       │   │  │
│  │  └─────────────┘  └─────────────┘  └─────────────────────┘   │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐   │  │
│  │  │   Volumes   │  │   Plugins   │  │   Build Cache       │   │  │
│  │  └─────────────┘  └─────────────┘  └─────────────────────┘   │  │
│  │                              │                                 │  │
│  └──────────────────────────────┼─────────────────────────────────┘  │
│                                 │                                    │
│                                 ▼                                    │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                     CONTAINERD (container runtime)             │  │
│  │                              │                                 │  │
│  └──────────────────────────────┼─────────────────────────────────┘  │
│                                 │                                    │
│                                 ▼                                    │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                         RUNC (OCI runtime)                     │  │
│  │                    Creates actual containers                   │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                      │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                      LINUX KERNEL                              │  │
│  │              (namespaces, cgroups, union fs)                   │  │
│  └───────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Component Breakdown

Component	Purpose
Docker CLI	User interface - sends commands to daemon
Docker Daemon	Background service managing images, containers, networks, volumes
containerd	Industry-standard container runtime (manages container lifecycle)
runc	Low-level runtime that actually creates containers using kernel features
Registry	Stores and distributes images (Docker Hub, private registries)

# Check Docker system info
docker info

# Check component versions
docker version

# Check daemon status
systemctl status docker

Images and Layers

Images are read-only templates made of stacked layers. Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) adds a layer; instructions such as ENV or CMD only add metadata.

How Layers Work

┌─────────────────────────────────────────────┐
│         CONTAINER (Running Instance)         │
├─────────────────────────────────────────────┤
│    Thin Read-Write Layer (Container Layer)   │  ← Changes go here
├─────────────────────────────────────────────┤
│                                              │
│              IMAGE LAYERS                    │
│           (Read-Only, Shared)                │
│                                              │
│  ┌────────────────────────────────────────┐ │
│  │ Layer 4: COPY app.py /app              │ │  ← Your code
│  ├────────────────────────────────────────┤ │
│  │ Layer 3: RUN pip install flask         │ │  ← Dependencies
│  ├────────────────────────────────────────┤ │
│  │ Layer 2: RUN apt-get update && install │ │  ← System packages
│  ├────────────────────────────────────────┤ │
│  │ Layer 1: Base Image (python:3.11-slim) │ │  ← Base OS + runtime
│  └────────────────────────────────────────┘ │
│                                              │
└─────────────────────────────────────────────┘

Inspect Image Layers

# View image history (each layer)
docker history python:3.11-slim

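# Detailed layer info (layer digests)
docker inspect python:3.11-slim --format '{{json .RootFS.Layers}}' | jq

# Show the full, untruncated command behind each layer (sizes are in the SIZE column)
docker history --no-trunc python:3.11-slim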

Why Layers Matter

Caching - Unchanged layers are cached, speeding up builds
Sharing - Multiple containers can share base layers (saves disk space)
Distribution - Only changed layers need to be pushed/pulled

# Force rebuild without cache
docker build --no-cache -t myapp .

# Bust the cache from a chosen point onward
# (requires `ARG CACHEBUST` in the Dockerfile before the steps you want re-run)
docker build --build-arg CACHEBUST=$(date +%s) -t myapp .
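
The distribution point above boils down to a set difference: a pull only needs the layers that are not already in the local store. A toy sketch in Python, using made-up placeholder digests rather than real ones; `layers_to_pull` is a hypothetical helper, not a Docker API.

```python
def layers_to_pull(image_layers, local_store):
    """Return the layers of an image that are missing locally, in image order."""
    return [layer for layer in image_layers if layer not in local_store]


# Placeholder digests for illustration only.
local_store = {"sha256:base", "sha256:syspkgs", "sha256:deps"}
new_image = ["sha256:base", "sha256:syspkgs", "sha256:deps", "sha256:app-v2"]

print(layers_to_pull(new_image, local_store))  # ['sha256:app-v2']
```

This is why a code-only change to an image with a large base usually transfers quickly: only the top layer differs.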

Container Lifecycle

                    docker create
                         │
                         ▼
┌──────────┐       ┌──────────┐        docker start        ┌──────────┐
│  Image   │ ────► │ Created  │ ─────────────────────────► │ Running  │
└──────────┘       └──────────┘                            └────┬─────┘
                                                                │
                        ┌───────────────────────────────────────┤
                        │                                       │
                        │ docker stop                           │ docker pause
                        ▼                                       ▼
                  ┌──────────┐                            ┌──────────┐
                  │  Exited  │                            │  Paused  │
                  └────┬─────┘                            └──────────┘
                       │
                       │ docker rm
                       ▼
                  ┌──────────┐
                  │ Removed  │
                  └──────────┘

Essential Container Commands

# Create and start (most common)
docker run -d --name myapp nginx

# Just create (don't start)
docker create --name myapp nginx

# Start existing container
docker start myapp

# Stop gracefully (SIGTERM, then SIGKILL after a 10-second default timeout; change with -t)
docker stop myapp

# Stop immediately (SIGKILL)
docker kill myapp

# Pause (freezes all processes via the cgroup freezer; no signal is sent to them)
docker pause myapp
docker unpause myapp

# Remove container
docker rm myapp

# Remove running container
docker rm -f myapp

# Remove all stopped containers
docker container prune

Container Inspection

# List running containers
docker ps

# List all containers (including stopped)
docker ps -a

# Detailed container info
docker inspect myapp

# Get specific info
docker inspect myapp --format '{{.State.Status}}'
docker inspect myapp --format '{{.NetworkSettings.IPAddress}}'
docker inspect myapp --format '{{json .Mounts}}' | jq

# Resource usage
docker stats myapp

# Processes inside container
docker top myapp

Docker Networking Deep Dive

Docker networking is worth understanding in detail - many hard-to-debug container problems turn out to be networking problems.

Network Drivers

┌─────────────────────────────────────────────────────────────────────┐
│                      DOCKER NETWORK DRIVERS                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                        BRIDGE (default)                      │   │
│  │  • Private internal network on the host                      │   │
│  │  • Containers reach each other by IP (names: custom nets)    │   │
│  │  • Need port mapping (-p) for external access               │   │
│  │  • Best for: Single-host container communication             │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                           HOST                               │   │
│  │  • Container uses host's network stack directly              │   │
│  │  • No network isolation                                      │   │
│  │  • No port mapping needed (container binds to host ports)   │   │
│  │  • Best for: Performance-critical apps, network tools       │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                          NONE                                │   │
│  │  • No networking at all                                      │   │
│  │  • Container is completely isolated                          │   │
│  │  • Best for: Batch jobs, security-sensitive workloads       │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                        OVERLAY                               │   │
│  │  • Multi-host networking (built for Docker Swarm)           │   │
│  │  • Containers on different hosts can communicate            │   │
│  │  • Uses VXLAN encapsulation                                  │   │
│  │  • Best for: Distributed applications, orchestration        │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                        MACVLAN                               │   │
│  │  • Container gets its own MAC address                        │   │
│  │  • Appears as physical device on network                     │   │
│  │  • Direct L2 connectivity                                    │   │
│  │  • Best for: Legacy apps that need direct network access    │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Bridge Network (Default)

┌─────────────────────────────────────────────────────────────────────┐
│                           HOST MACHINE                               │
│                                                                      │
│    ┌──────────────────────────────────────────────────────────┐    │
│    │                    docker0 bridge                         │    │
│    │                    (172.17.0.1)                           │    │
│    │                         │                                  │    │
│    │         ┌───────────────┼───────────────┐                 │    │
│    │         │               │               │                 │    │
│    │    ┌────┴────┐    ┌────┴────┐    ┌────┴────┐            │    │
│    │    │  veth   │    │  veth   │    │  veth   │            │    │
│    │    └────┬────┘    └────┬────┘    └────┬────┘            │    │
│    └─────────┼──────────────┼──────────────┼───────────────────┘    │
│              │              │              │                         │
│         ┌────┴────┐    ┌────┴────┐    ┌────┴────┐                  │
│         │Container│    │Container│    │Container│                  │
│         │   A     │    │   B     │    │   C     │                  │
│         │.17.0.2  │    │.17.0.3  │    │.17.0.4  │                  │
│         └─────────┘    └─────────┘    └─────────┘                  │
│                                                                      │
│    eth0 (Host NIC) ─────────── Internet                             │
│                         NAT (iptables)                               │
└─────────────────────────────────────────────────────────────────────┘

Network Commands

# List networks
docker network ls

# Create custom bridge network
docker network create --driver bridge my-network

# Create network with specific subnet
docker network create \
  --driver bridge \
  --subnet 192.168.100.0/24 \
  --gateway 192.168.100.1 \
  my-custom-network

# Connect container to network
docker network connect my-network mycontainer

# Disconnect container from network
docker network disconnect my-network mycontainer

# Inspect network
docker network inspect my-network

# Remove network
docker network rm my-network

# Remove unused networks
docker network prune
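
When picking `--subnet` and `--gateway` values, it can help to sanity-check them first. The sketch below uses Python's standard `ipaddress` module with the values from the example above. `check_network` is a hypothetical helper, not part of any Docker tooling, and it assumes an ordinary IPv4 subnet larger than /31.

```python
import ipaddress


def check_network(subnet, gateway):
    """Validate that gateway is a usable host address inside subnet.

    Returns the number of addresses left over for containers.
    """
    net = ipaddress.ip_network(subnet)  # raises ValueError if host bits are set
    gw = ipaddress.ip_address(gateway)
    if gw not in net:
        raise ValueError(f"gateway {gw} is outside {net}")
    if gw in (net.network_address, net.broadcast_address):
        raise ValueError(f"gateway {gw} is not a usable host address")
    usable_hosts = net.num_addresses - 2  # minus network and broadcast addresses
    return usable_hosts - 1               # minus the gateway itself


print(check_network("192.168.100.0/24", "192.168.100.1"))  # 253
```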

Port Mapping

# Map host port 8080 to container port 80
docker run -p 8080:80 nginx

# Map to specific interface
docker run -p 127.0.0.1:8080:80 nginx

# Map random host port
docker run -p 80 nginx
docker port <container>  # See assigned port

# Map UDP port
docker run -p 53:53/udp dns-server

# Map multiple ports
docker run -p 80:80 -p 443:443 nginx
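
The `-p` forms above all follow one pattern: an optional host IP, an optional host port, a container port, and an optional protocol. The following Python sketch spells that out. It is a simplified illustration, not Docker's own parser: `parse_publish` is a made-up name, and port ranges and IPv6 host addresses are deliberately ignored.

```python
def parse_publish(spec):
    """Split a -p value such as '127.0.0.1:8080:80/udp' into its parts."""
    ports, _, proto = spec.partition("/")
    parts = ports.split(":")
    if len(parts) == 1:      # "80"
        host_ip, host_port, container_port = None, None, parts[0]
    elif len(parts) == 2:    # "8080:80"
        host_ip = None
        host_port, container_port = parts
    elif len(parts) == 3:    # "127.0.0.1:8080:80"
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"unsupported publish spec: {spec!r}")
    return {
        # No host IP means the port is published on all host interfaces.
        "host_ip": host_ip or "0.0.0.0",
        # No host port means Docker picks a free one.
        "host_port": int(host_port) if host_port else None,
        "container_port": int(container_port),
        "protocol": proto or "tcp",
    }


print(parse_publish("127.0.0.1:8080:80"))
print(parse_publish("53:53/udp"))
```

The default host IP is the part that tends to surprise people: `-p 8080:80` is reachable from other machines unless a firewall blocks it, whereas `-p 127.0.0.1:8080:80` is local only.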

Container DNS and Service Discovery

# On custom networks, containers can reach each other by name
docker network create app-network

docker run -d --name db --network app-network postgres
docker run -d --name api --network app-network myapi

# From 'api' container, can reach postgres at hostname 'db'
docker exec api ping db  # Works!

# On default bridge network, must use IP addresses
# Container names don't resolve on default bridge

Network Debugging

# Check container's network settings
docker inspect mycontainer --format '{{json .NetworkSettings}}' | jq

# Get container IP (one per attached network; also works on user-defined networks)
docker inspect mycontainer --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}'

# Check what ports are exposed
docker inspect mycontainer --format '{{json .NetworkSettings.Ports}}' | jq

# Test connectivity from inside container
docker exec mycontainer ping google.com
docker exec mycontainer curl -v http://other-container:8080

# Check host iptables rules (port forwarding)
sudo iptables -t nat -L -n -v

# Check if port is listening
docker exec mycontainer netstat -tlnp
docker exec mycontainer ss -tlnp

Network Comparison Table

Driver	Isolation	Multi-Host	Port Mapping	Use Case
bridge	Yes	No	Required	Default, single-host
host	No	No	Not needed	Performance, network tools
none	Complete	No	N/A	Security, offline tasks
overlay	Yes	Yes	Optional	Swarm clusters
macvlan	Yes	No	Not needed	Direct LAN access

Docker Volumes and Storage

Containers are ephemeral - anything written to a container’s writable layer is gone once the container is removed. Volumes solve this.

Storage Types

┌─────────────────────────────────────────────────────────────────────┐
│                      DOCKER STORAGE OPTIONS                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    VOLUMES (Recommended)                     │   │
│  │  • Managed by Docker (/var/lib/docker/volumes/)             │   │
│  │  • Best for persistent data                                  │   │
│  │  • Can be shared between containers                          │   │
│  │  • Works on Linux, macOS, Windows                           │   │
│  │  • Supports volume drivers (NFS, cloud storage)             │   │
│  │                                                              │   │
│  │  docker run -v myvolume:/data nginx                         │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                      BIND MOUNTS                             │   │
│  │  • Maps host directory to container path                     │   │
│  │  • Good for development (code sync)                          │   │
│  │  • With --mount, host path must exist (-v creates it)        │   │
│  │  • Performance varies by host OS                             │   │
│  │                                                              │   │
│  │  docker run -v /host/path:/container/path nginx             │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                       TMPFS MOUNTS                           │   │
│  │  • Stored in host memory only                                │   │
│  │  • Never written to host filesystem                          │   │
│  │  • Fast, but data lost on container stop                     │   │
│  │  • Good for sensitive data, caches                           │   │
│  │                                                              │   │
│  │  docker run --tmpfs /app/cache nginx                        │   │
│  └─────────────────────────────────────────────────────────────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Volume Commands

# Create a volume
docker volume create mydata

# List volumes
docker volume ls

# Inspect volume
docker volume inspect mydata

# Use volume in container
docker run -v mydata:/app/data myapp

# Remove volume
docker volume rm mydata

# Remove unused anonymous volumes (the default since Docker 23.0)
docker volume prune

# Remove ALL unused volumes, including named ones (careful!)
docker volume prune -a

Bind Mounts vs Volumes

# VOLUME - Docker manages the location
docker run -v myvolume:/data nginx
# Data stored at /var/lib/docker/volumes/myvolume/_data

# BIND MOUNT - You specify exact host path
docker run -v /home/user/data:/data nginx
# Data stored at /home/user/data

# Modern syntax (--mount) - more explicit
docker run --mount type=volume,source=myvolume,target=/data nginx
docker run --mount type=bind,source=/home/user/data,target=/data nginx

Volume Use Cases

# Database persistence
docker run -d \
  --name postgres \
  -v pgdata:/var/lib/postgresql/data \
  -e POSTGRES_PASSWORD=secret \
  postgres

# Share data between containers
docker run -d --name writer -v shared:/data alpine sh -c "while true; do date >> /data/log.txt; sleep 1; done"
docker run -d --name reader -v shared:/data:ro alpine tail -f /data/log.txt

# Development with live reload (bind mount)
docker run -d \
  -v "$(pwd)/src:/app/src" \
  -p 3000:3000 \
  node-dev

# Read-only mount (security)
docker run -v myconfig:/etc/app/config:ro myapp

Storage Debugging

# See what's using disk space
docker system df
docker system df -v  # Verbose

# Find where volume data is stored
docker volume inspect myvolume --format '{{.Mountpoint}}'

# Check mount points inside container
docker exec mycontainer df -h
docker exec mycontainer mount | grep /data

Docker Compose Fundamentals

Compose defines multi-container applications in a single YAML file. Essential for local development.

Basic Structure

# docker-compose.yml
# (no top-level `version` key: Compose v2 treats it as obsolete and ignores it)

services:
  web:
    build: ./web
    ports:
      - "8080:80"
    depends_on:
      - api
      - db
    environment:
      - API_URL=http://api:3000
    networks:
      - frontend
      - backend

  api:
    build: ./api
    ports:
      - "3000:3000"
    depends_on:
      - db
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/app
    volumes:
      - ./api/src:/app/src  # Dev: live reload
    networks:
      - backend

  db:
    image: postgres:15-alpine
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=app
    networks:
      - backend

volumes:
  pgdata:

networks:
  frontend:
  backend:

Compose Commands

# Start all services
docker compose up

# Start in background
docker compose up -d

# Build and start
docker compose up --build

# Stop services
docker compose down

# Stop and remove volumes
docker compose down -v

# View logs
docker compose logs
docker compose logs -f api  # Follow specific service

# Scale a service
docker compose up -d --scale api=3

# Execute command in running service
docker compose exec api sh

# Run one-off command
docker compose run --rm api npm test

# View running services
docker compose ps

# Rebuild specific service
docker compose build api

Compose Networking

# Services on same network can reach each other by service name
services:
  api:
    networks:
      - backend

  db:
    networks:
      - backend

# api can reach db at hostname 'db'
# No need to expose db port to host

Environment Variables

services:
  api:
    environment:
      # Direct values
      - NODE_ENV=production
      - DEBUG=false
      # From host environment: takes the value of the host's $API_KEY
      - API_KEY

    # From env files (for a variable set in both, the last file wins)
    env_file:
      - .env
      - .env.local

Health Checks in Compose

services:
  api:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  db:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d app"]
      interval: 10s
      timeout: 5s
      retries: 5
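
How `retries` and `start_period` interact is easier to see as a small state machine. The Python sketch below is a simplified model of the documented behavior, not Docker's implementation: real health checks are time-based, whereas here the start period is expressed as a number of leading probes, and `health_status` is a name made up for this example.

```python
def health_status(probe_results, retries=3, probes_in_start_period=0):
    """Return 'starting', 'healthy', or 'unhealthy' after a sequence of probes.

    probe_results: booleans in order, True meaning the check command exited 0.
    """
    status = "starting"
    consecutive_failures = 0
    for index, ok in enumerate(probe_results):
        if ok:
            status = "healthy"
            consecutive_failures = 0
            continue
        # Failures during the start period are ignored until the first success.
        if index < probes_in_start_period and status == "starting":
            continue
        consecutive_failures += 1
        if consecutive_failures >= retries:
            status = "unhealthy"
    return status


print(health_status([False, False, True]))                           # healthy
print(health_status([True, False, False, False]))                    # unhealthy
print(health_status([False, False, False], probes_in_start_period=3))  # starting
```

The practical consequence is that a single failed probe never flips a container to unhealthy; with the `db` settings above it takes five consecutive failures, each 10 seconds apart.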

Dependency Management

services:
  api:
    depends_on:
      db:
        condition: service_healthy  # Wait for health check
      redis:
        condition: service_started  # Just wait for start
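
Conceptually, `depends_on` describes a dependency graph, and the start order is a topological sort of it. The sketch below uses Python's standard `graphlib` (3.9+) on the `web`/`api`/`db` dependencies from the Basic Structure example. It illustrates the ordering only and makes no claim about how Compose computes it internally.

```python
from graphlib import TopologicalSorter

# service -> services it depends on (mirrors depends_on in the Compose file)
depends_on = {
    "web": {"api", "db"},
    "api": {"db"},
    "db": set(),
}

# static_order() yields dependencies before the services that need them.
start_order = list(TopologicalSorter(depends_on).static_order())
print(start_order)  # ['db', 'api', 'web']
```

Ordering alone says nothing about readiness: without `condition: service_healthy`, `api` may start while `db` is still initializing.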

Chapter 2: Production-Grade Docker

Multi-Stage Builds

Multi-stage builds create minimal production images by separating build-time dependencies from runtime.

The Problem with Single-Stage Builds

# BAD: Single stage - image is huge!
FROM node:18

WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# This image contains:
# - Node.js runtime
# - npm and all dev dependencies (node_modules)
# - Source code
# - Build tools
# Result: 1+ GB image

Multi-Stage Solution

# GOOD: Multi-stage build

# Stage 1: Build
FROM node:18-alpine AS builder

WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
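RUN npm run build
# Drop devDependencies so only runtime packages are copied to the final image
RUN npm prune --omit=dev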

# Stage 2: Production
FROM node:18-alpine AS production

WORKDIR /app

# Only copy what we need
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Create non-root user
RUN addgroup -g 1001 -S appgroup && \
    adduser -u 1001 -S appuser -G appgroup

USER appuser

EXPOSE 3000
CMD ["node", "dist/index.js"]

# Result: ~150MB image (vs 1GB+)

Multi-Stage for Compiled Languages

# Go application - even smaller final image

# Stage 1: Build
FROM golang:1.21-alpine AS builder

WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download

COPY . .
# CGO_ENABLED=0 produces a statically linked binary that runs on a minimal base image
RUN CGO_ENABLED=0 GOOS=linux go build -o main .

# Stage 2: Minimal runtime
FROM alpine:3.18 AS production

# Add CA certificates for HTTPS
RUN apk --no-cache add ca-certificates

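# Create the non-root user first
RUN adduser -D -g '' appuser

# Use a directory the non-root user can read (/root is not accessible to it)
WORKDIR /app
COPY --from=builder /app/main .

# Run as non-root
USER appuser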

EXPOSE 8080
CMD ["./main"]

# Result: ~15MB image!

Python Multi-Stage Build

# Python with virtual environment

# Stage 1: Build dependencies
FROM python:3.11-slim AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Production
FROM python:3.11-slim AS production

# Install only runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq5 \
    && rm -rf /var/lib/apt/lists/*

# Copy virtual environment from builder
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

WORKDIR /app
COPY . .

# Create non-root user
RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
USER appuser

EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]

Image Optimization

Base Image Selection

Base Image	Size	Use Case
`ubuntu:22.04`	~77MB	When you need apt and full tooling
`debian:bookworm-slim`	~74MB	Smaller Debian, most packages available
`alpine:3.18`	~7MB	Minimal, uses musl libc (some compatibility issues)
`distroless/base`	~20MB	No shell, minimal attack surface
`scratch`	0MB	For statically compiled binaries only

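# Alpine-based images are smallest:
# roughly ~50MB for python:3.11-alpine vs ~150MB for python:3.11-slim
# (Dockerfile comments must be on their own line, not after an instruction)
FROM python:3.11-alpine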

# But watch for compatibility issues with musl libc
# Some Python packages need compilation fixes

Layer Optimization

# BAD: Many layers, poor caching
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN apt-get clean

# GOOD: Single layer, cleanup in same layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        git \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

.dockerignore

# .dockerignore - exclude from build context

# Version control
.git
.gitignore

# Dependencies (will be installed in container)
node_modules
vendor
__pycache__
*.pyc
venv/
.venv/

# Build outputs
dist
build
*.egg-info

# IDE and editor files
.idea
.vscode
*.swp
*.swo

# Test and docs
tests
test
*.md
docs
coverage
.coverage

# CI/CD
.github
.gitlab-ci.yml
Jenkinsfile

# Docker files (not needed in image)
Dockerfile*
docker-compose*
.docker

# Environment files (security!)
.env
.env.*
*.pem
*.key

# Logs
*.log
logs

Caching Best Practices

# Order matters! Put things that change least at top

# 1. Base image (changes rarely)
FROM python:3.11-slim

# 2. System dependencies (change occasionally)
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# 3. Application dependencies (change sometimes)
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 4. Application code (changes frequently)
COPY . .

# This way, code changes don't invalidate dependency cache
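
The ordering rule follows from how the build cache works: the first step whose inputs changed is a cache miss, and every step after it is rebuilt too. A small Python sketch of that rule, using abbreviated step names from the Dockerfile above; `steps_to_rebuild` is a hypothetical helper, and real builders decide "changed" from instruction text and file checksums.

```python
def steps_to_rebuild(steps, changed):
    """Return the steps that must re-run when the inputs of `changed` steps differ."""
    for index, step in enumerate(steps):
        if step in changed:
            return steps[index:]  # first cache miss invalidates everything after it
    return []


good_order = [
    "FROM python:3.11-slim",
    "RUN apt-get install",
    "COPY requirements.txt",
    "RUN pip install",
    "COPY . .",
]
bad_order = [
    "FROM python:3.11-slim",
    "RUN apt-get install",
    "COPY . .",
    "RUN pip install",
]

# A source-code edit changes the inputs of "COPY . ." only.
print(steps_to_rebuild(good_order, {"COPY . ."}))  # ['COPY . .']
print(steps_to_rebuild(bad_order, {"COPY . ."}))   # ['COPY . .', 'RUN pip install']
```

With the bad ordering, every code edit forces a full dependency reinstall, which is usually the slowest step in the build.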

Security Hardening

1. Run as Non-Root User

# Create user and group
RUN groupadd -r appgroup && useradd -r -g appgroup appuser

# Or on Alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Set ownership
COPY --chown=appuser:appgroup . /app

# Switch to non-root user
USER appuser

# Alternative: Use numeric UID (more portable)
USER 1000:1000

2. Use Read-Only Filesystem

# Run container with read-only root filesystem
docker run --read-only myapp

# If app needs to write, use tmpfs for specific directories
docker run --read-only \
  --tmpfs /tmp \
  --tmpfs /app/cache \
  myapp

3. Drop Capabilities

# Drop all capabilities, add only what's needed
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE myapp

# Capabilities worth dropping from Docker's default set:
# - CAP_NET_RAW (prevent packet spoofing)
# - CAP_SETUID/CAP_SETGID (prevent privilege escalation)
# CAP_SYS_ADMIN is not granted by default; avoid adding it, since it enables many container escapes

4. No New Privileges

# Prevent privilege escalation via setuid binaries
docker run --security-opt=no-new-privileges myapp

5. Scan Images for Vulnerabilities

# Docker Scout (built into Docker Desktop)
docker scout quickview myimage:latest
docker scout cves myimage:latest

# Trivy (open source)
trivy image myimage:latest

# Snyk
snyk container test myimage:latest

# Grype
grype myimage:latest

6. Use Specific Image Tags

# BAD: Using 'latest' - unpredictable
FROM python:latest

# BETTER: Use specific version
FROM python:3.11-slim

# BEST: Use digest for reproducibility
FROM python:3.11-slim@sha256:abc123...
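
A CI job can enforce this rule by classifying image references before a build. This is a minimal sketch with a hypothetical classify_image_ref helper; it handles the common name:tag and name@sha256:... forms and does not try to validate the full reference grammar.

```python
def classify_image_ref(ref):
    """Classify an image reference as 'digest', 'tag', or 'latest'."""
    if "@sha256:" in ref:
        return "digest"  # pinned to immutable content
    # Only the last path segment can carry a tag; earlier colons are registry ports
    last_segment = ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return "latest"  # no tag means Docker pulls :latest
    tag = last_segment.split(":", 1)[1]
    return "latest" if tag == "latest" else "tag"

for ref in ["python:latest", "python", "python:3.11-slim", "localhost:5000/python"]:
    print(ref, "->", classify_image_ref(ref))
```

Rejecting anything classified as latest catches both the explicit :latest tag and the untagged form.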

7. Don’t Store Secrets in Images

# BAD: Secrets in image layers
ENV API_KEY=secret123
COPY credentials.json /app/

# GOOD: Use runtime environment variables
# or Docker secrets/Kubernetes secrets

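# Build-time secrets (Docker BuildKit)
# The "# syntax=" directive only takes effect as the very first line of the Dockerfile
# syntax=docker/dockerfile:1.4
# Pass the secret at build time, e.g.: docker build --secret id=api_key,src=api_key.txt .
RUN --mount=type=secret,id=api_key \
    API_KEY=$(cat /run/secrets/api_key) ./configure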
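
At runtime, the application can read secrets from mounted files rather than baking them into the image. The sketch below assumes secrets are mounted under /run/secrets (the location Docker secrets use) and falls back to an upper-cased environment variable; the read_secret helper and its fallback rule are illustrative assumptions.

```python
import os
from pathlib import Path

def read_secret(name, secrets_dir="/run/secrets", env_fallback=True):
    """Return a secret from a mounted file, falling back to an environment variable."""
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Strip the trailing newline most tools add when writing the file
        return secret_file.read_text().strip()
    if env_fallback:
        value = os.environ.get(name.upper())
        if value is not None:
            return value
    raise RuntimeError(f"secret {name!r} not found in {secrets_dir} or environment")
```

Calling read_secret("api_key") then works the same way whether the value arrives as a mounted file or as an API_KEY environment variable.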

8. Minimal Attack Surface

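# Remove unnecessary tools
# Note: files deleted in a later layer still exist in earlier layers,
# so prefer not installing them, or remove them in the same RUN that installed them
RUN apt-get remove --purge -y \
    curl \
    wget \
    && apt-get autoremove -y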

# Use distroless for minimal surface
FROM gcr.io/distroless/base-debian11
# No shell, no package manager - just your app

Security Checklist

# Security best practices checklist:
✓ Non-root user (USER instruction)
✓ Specific base image tags (not :latest)
✓ Multi-stage builds (minimize attack surface)
✓ No secrets in images (use env vars or secrets management)
✓ Image vulnerability scanning in CI/CD
✓ Read-only filesystem where possible
✓ Drop unnecessary capabilities
✓ Resource limits (--memory, --cpus)
✓ No privileged mode (--privileged=false)
✓ Network segmentation (custom networks)

Health Checks

Health checks tell Docker whether your container is actually working, not just running.

Dockerfile Health Check

# HTTP health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

# For containers without curl
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:8080/health || exit 1

# Using Python
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')" || exit 1

# Database health check
HEALTHCHECK --interval=10s --timeout=5s --start-period=30s --retries=5 \
  CMD pg_isready -U postgres || exit 1

# Redis health check
HEALTHCHECK --interval=10s --timeout=3s --start-period=5s --retries=3 \
  CMD redis-cli ping || exit 1
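
When the one-liner grows (custom timeouts, status checks), a small standard-library probe can replace curl. This is a sketch only: the probe function and the idea of shipping it as a separate script are assumptions, not part of Docker itself.

```python
import urllib.error
import urllib.request

def probe(url, timeout=3.0):
    """Return 0 if the endpoint answers with a 2xx status, 1 otherwise.

    The return value follows the HEALTHCHECK convention:
    exit code 0 means healthy, 1 means unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed URL
        return 1
```

A wrapper script would pass the result to sys.exit(), for example sys.exit(probe("http://localhost:8080/health")), and the HEALTHCHECK instruction would run that script.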

Health Check Parameters

Parameter	Description	Default
`--interval`	Time between checks	30s
`--timeout`	Max time for check to complete	30s
`--start-period`	Grace period before checks count	0s
`--retries`	Failures needed to mark unhealthy	3

Check Health Status

# View health status
docker ps
# CONTAINER ID   IMAGE   STATUS                    NAMES
# abc123         myapp   Up 5 min (healthy)        web

# Detailed health info
docker inspect myapp --format '{{json .State.Health}}' | jq

# Health check logs
docker inspect myapp --format '{{json .State.Health.Log}}' | jq

Resource Limits

Without limits, a single container can consume all host resources.

Memory Limits

# Hard memory limit (OOM killed if exceeded)
docker run --memory=512m myapp

# Memory + swap limit (--memory-swap is the combined total, so this allows 512m of swap)
docker run --memory=512m --memory-swap=1g myapp

# Soft limit (reservation)
docker run --memory=512m --memory-reservation=256m myapp

# Disable OOM killer (processes hang at the limit instead of being killed;
# only use together with --memory, or the host itself can run out of memory)
docker run --memory=512m --oom-kill-disable myapp

CPU Limits

# Limit to 1.5 CPUs
docker run --cpus=1.5 myapp

# CPU shares (relative weight, default 1024)
docker run --cpu-shares=512 myapp  # Half priority

# Pin to specific CPUs
docker run --cpuset-cpus="0,1" myapp  # Use CPU 0 and 1

# CPU quota (microseconds per 100ms period)
docker run --cpu-quota=50000 myapp  # 50% of one CPU
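
The relationship between these flags is simple arithmetic. The sketch below assumes Docker's default CFS period of 100000 microseconds; the helper names are hypothetical.

```python
DEFAULT_PERIOD_US = 100_000  # default CFS period: 100ms

def cpus_to_quota(cpus, period_us=DEFAULT_PERIOD_US):
    """--cpus=N is equivalent to a quota of N * period microseconds."""
    return int(round(cpus * period_us))

def quota_to_cpus(quota_us, period_us=DEFAULT_PERIOD_US):
    """Fraction of CPU time a quota allows per period."""
    return quota_us / period_us

def cpu_share_fraction(own_shares, other_shares):
    """Share of CPU a container gets when every container is busy.

    Shares are relative weights and only matter under contention.
    """
    return own_shares / (own_shares + sum(other_shares))

print(cpus_to_quota(1.5))               # 150000
print(quota_to_cpus(50_000))            # 0.5
print(cpu_share_fraction(512, [1024]))  # one third of the contended CPU
```

Note that a container with 512 shares competing against one with the default 1024 gets a third of the CPU under contention, and is not limited at all when the host is idle.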

Docker Compose Resources

services:
  api:
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M

Monitoring Resource Usage

# Real-time stats
docker stats

# Stats for specific containers
docker stats api db redis

# One-time snapshot
docker stats --no-stream

# Format output
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
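
For alerting, the same data can be consumed programmatically. The sketch below assumes the output of docker stats --no-stream --format '{{json .}}' (one JSON object per line with Name, CPUPerc, and MemPerc fields); the sample lines and the 80% threshold are made up for illustration.

```python
import json

def parse_stats(lines):
    """Parse JSON-lines output from docker stats into numeric records."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        rows.append({
            "name": row["Name"],
            "cpu": float(row["CPUPerc"].rstrip("%")),
            "mem": float(row["MemPerc"].rstrip("%")),
        })
    return rows

def over_threshold(rows, mem_percent=80.0):
    """Names of containers using at least mem_percent of their memory limit."""
    return [row["name"] for row in rows if row["mem"] >= mem_percent]

# Hypothetical sample output
sample = [
    '{"Name": "api", "CPUPerc": "12.50%", "MemPerc": "91.20%"}',
    '{"Name": "db", "CPUPerc": "3.10%", "MemPerc": "40.00%"}',
]
print(over_threshold(parse_stats(sample)))  # ['api']
```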

Logging Best Practices

Log to stdout/stderr

# Application logs should go to stdout/stderr
# Docker captures these automatically

# Example: Configure app to log to stdout
# -u for unbuffered output (Dockerfile comments must be on their own line)
CMD ["python", "-u", "app.py"]

# Symlink log files to stdout/stderr
RUN mkdir -p /var/log/app \
    && ln -sf /dev/stdout /var/log/app/access.log \
    && ln -sf /dev/stderr /var/log/app/error.log

Logging Drivers

# View container logs
docker logs myapp
docker logs -f myapp        # Follow
docker logs --tail 100 myapp  # Last 100 lines
docker logs --since 1h myapp  # Last hour

# Use different logging driver
docker run --log-driver=json-file \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  myapp

# Available drivers:
# - json-file (default)
# - syslog
# - journald
# - fluentd
# - awslogs
# - gcplogs

Structured Logging

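# Use JSON logging for easier parsing
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    def format(self, record):
        log_obj = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
        }
        return json.dumps(log_obj)

# Attach the formatter to a stdout handler so Docker captures the output
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.info("Request processed")
# Example output (when called from api.py):
# {"timestamp": "2024-01-15 10:30:00,123", "level": "INFO", "message": "Request processed", "module": "api"}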

Production Troubleshooting

Issue 1: Container Exits Immediately

Symptoms:

docker run myapp
# Container starts and exits immediately
docker ps -a
# STATUS: Exited (1) 2 seconds ago

Diagnosis:

# Check exit code
docker inspect myapp --format '{{.State.ExitCode}}'

# Check logs
docker logs myapp

# Common exit codes:
# 0   - Normal exit
# 1   - General error
# 137 - SIGKILL (OOM or docker kill)
# 139 - SIGSEGV (segmentation fault)
# 143 - SIGTERM (docker stop)
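
Exit codes above 128 follow the shell convention 128 + signal number, which is where 137, 139, and 143 come from. A small decoder, assuming a Linux host (the describe_exit_code helper is hypothetical), might look like this:

```python
import signal

# Exit codes reserved by docker run itself
DOCKER_CODES = {
    125: "docker daemon or docker run error",
    126: "command found but could not be invoked",
    127: "command not found",
}

def describe_exit_code(code):
    """Explain a container exit code in plain words."""
    if code == 0:
        return "normal exit"
    if code in DOCKER_CODES:
        return DOCKER_CODES[code]
    if code > 128:
        signum = code - 128  # 128 + N means "terminated by signal N"
        try:
            return f"killed by {signal.Signals(signum).name} (signal {signum})"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return f"application error (exit status {code})"

for code in (0, 1, 137, 139, 143):
    print(code, "-", describe_exit_code(code))
```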

Common Causes:

1. Process runs in background - Container exits when main process ends

# BAD: nginx runs in background, container exits
CMD ["nginx"]

# GOOD: Keep nginx in foreground
CMD ["nginx", "-g", "daemon off;"]

2. Command fails

# Debug by running shell
docker run -it myapp /bin/sh
# Then manually run your command

3. Missing environment variables

docker run -e REQUIRED_VAR=value myapp

Issue 2: OOM Killed (Exit Code 137)

Symptoms:

docker inspect myapp --format '{{.State.OOMKilled}}'
# true

Diagnosis:

# Check memory usage before kill
docker stats myapp --no-stream

# Check container memory limit
docker inspect myapp --format '{{.HostConfig.Memory}}'

# Check host memory
free -h
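
Note that .HostConfig.Memory is reported in bytes, and 0 means no limit was set. The hypothetical helpers below convert between that value and the suffix form used on the command line, assuming the binary units Docker applies (1m = 1024 * 1024 bytes).

```python
UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}

def parse_memory(value):
    """Convert a limit such as '512m' into bytes."""
    value = value.strip().lower()
    if value[-1] in UNITS:
        return int(float(value[:-1]) * UNITS[value[-1]])
    return int(value)  # plain number: already bytes

def format_limit(limit_bytes):
    """Convert a byte count from docker inspect back into suffix form."""
    if limit_bytes == 0:
        return "unlimited"
    for unit in ("g", "m", "k"):
        if limit_bytes % UNITS[unit] == 0:
            return f"{limit_bytes // UNITS[unit]}{unit}"
    return f"{limit_bytes}b"

print(parse_memory("512m"))     # 536870912
print(format_limit(536870912))  # 512m
```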

Solutions:

# Increase memory limit
docker run --memory=1g myapp

# Or fix the memory leak in your application
# Add profiling to find the leak

Issue 3: Cannot Connect to Container

Symptoms:

curl http://localhost:8080
# Connection refused

Diagnosis:

# 1. Is container running?
docker ps

# 2. Is port mapping correct?
docker port myapp
# 8080/tcp -> 0.0.0.0:8080

# 3. Is app listening on correct interface?
docker exec myapp netstat -tlnp
# App must listen on 0.0.0.0, not 127.0.0.1

# 4. Check container logs for errors
docker logs myapp

# 5. Test from inside container
docker exec myapp curl localhost:8080

Common Causes:

# BAD: Only listens on localhost (inside container)
app.run(host='127.0.0.1', port=8080)

# GOOD: Listens on all interfaces
app.run(host='0.0.0.0', port=8080)
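
The same rule applies to any server, not just the framework shown above. As a self-contained illustration using only the standard library (the /health route and the make_server helper are assumptions for this sketch):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer 200 on /health, 404 everywhere else
        status = 200 if self.path == "/health" else 404
        body = b"ok\n" if status == 200 else b"not found\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def make_server(host="0.0.0.0", port=8080):
    """Bind to all interfaces by default so published ports can reach the app."""
    return HTTPServer((host, port), HealthHandler)
```

Running make_server().serve_forever() inside the container makes the app reachable through -p 8080:8080. Passing host="127.0.0.1" instead reproduces the connection refused symptom from the host.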

Issue 4: Slow Container Startup

Symptoms:

# Container takes minutes to become healthy
docker ps
# STATUS: Up 2 minutes (health: starting)

Diagnosis:

# Check what's happening during startup
docker logs -f myapp

# Check health check timing
docker inspect myapp --format '{{json .State.Health}}' | jq

Solutions:

# Increase start period for slow apps
HEALTHCHECK --start-period=60s --interval=30s \
  CMD curl -f http://localhost:8080/health || exit 1

# Optimize startup:
# - Lazy load dependencies
# - Defer non-critical initialization
# - Use connection pooling (don't wait for DB on startup)

Issue 5: “No Space Left on Device”

Symptoms:

docker build -t myapp .
# Error: write /var/lib/docker/...: no space left on device

Diagnosis:

# Check Docker disk usage
docker system df

# Detailed breakdown
docker system df -v

# Check host disk
df -h

Solutions:

# Remove unused containers
docker container prune

# Remove unused images
docker image prune
docker image prune -a  # Remove all unused (not just dangling)

# Remove unused volumes
docker volume prune

# Remove unused networks
docker network prune

# Nuclear option - remove everything unused (this also deletes data in unused volumes)
docker system prune -a --volumes

# Clean build cache
docker builder prune

Issue 6: DNS Resolution Failing

Symptoms:

docker exec myapp ping google.com
# ping: bad address 'google.com'

Diagnosis:

# Check DNS configuration
docker exec myapp cat /etc/resolv.conf

# Test with IP (bypass DNS)
docker exec myapp ping 8.8.8.8

# Check Docker daemon DNS settings (if any are configured)
cat /etc/docker/daemon.json
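
Minimal images often lack ping and nslookup. If the image ships Python, name resolution can be tested with the standard library instead; the resolve helper below is a hypothetical sketch.

```python
import socket

def resolve(hostname, port=80):
    """Return the sorted IP addresses for hostname, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))
```

An empty list for a public name, combined with working connectivity to an IP address, points at the resolver configuration in /etc/resolv.conf rather than at the network itself.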

Solutions:

# Specify DNS servers
docker run --dns 8.8.8.8 --dns 8.8.4.4 myapp

# Or configure in daemon.json
# /etc/docker/daemon.json
{
  "dns": ["8.8.8.8", "8.8.4.4"]
}

Issue 7: Permission Denied on Volume

Symptoms:

docker run -v mydata:/data myapp
# Error: Permission denied: /data/file.txt

Diagnosis:

# Check file ownership inside container
docker exec myapp ls -la /data

# Check what user container runs as
docker exec myapp id
# uid=1000(appuser) gid=1000(appuser)

# Check volume ownership on host
sudo ls -la /var/lib/docker/volumes/mydata/_data

Solutions:

# Option 1: Change ownership in Dockerfile
RUN chown -R appuser:appuser /data

# Option 2: Run as root (not recommended for production)
docker run --user root -v mydata:/data myapp

# Option 3: Use init container to fix permissions
docker run --rm -v mydata:/data alpine chown -R 1000:1000 /data

# Option 4: Set permissions in entrypoint
# entrypoint.sh
#!/bin/sh
chown -R appuser:appuser /data
exec gosu appuser "$@"

Issue 8: Container Cannot Reach Other Containers

Symptoms:

docker exec api curl http://db:5432
# curl: (6) Could not resolve host: db

Diagnosis:

# Check if containers are on same network
docker network inspect bridge

# List container networks
docker inspect api --format '{{json .NetworkSettings.Networks}}' | jq
docker inspect db --format '{{json .NetworkSettings.Networks}}' | jq

Solutions:

# The default bridge network doesn't resolve container names via DNS
# Use a user-defined network instead

docker network create app-network
docker run -d --name db --network app-network postgres
docker run -d --name api --network app-network myapi

# Now api can reach db by name
docker exec api ping db  # Works!

Production Docker Compose Example

Complete production-ready docker-compose setup:

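# docker-compose.yml
# The top-level 'version' key is obsolete in the Compose Specification and ignored by Docker Compose v2
version: '3.8'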

services:
  # Application
  api:
    build:
      context: ./api
      dockerfile: Dockerfile
      target: production
    image: mycompany/api:${VERSION:-latest}
    restart: unless-stopped
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgres://user:${DB_PASSWORD}@db:5432/app
      - REDIS_URL=redis://redis:6379
    env_file:
      - .env
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
    networks:
      - frontend
      - backend
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

  # Database
  db:
    image: postgres:15-alpine
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=${DB_PASSWORD}
      - POSTGRES_DB=app
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d app"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 256M
    networks:
      - backend

  # Cache
  redis:
    image: redis:7-alpine
    restart: unless-stopped
    command: redis-server --appendonly yes --maxmemory 128mb --maxmemory-policy allkeys-lru
    volumes:
      - redisdata:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: '0.25'
          memory: 128M
    networks:
      - backend

  # Reverse proxy
  nginx:
    image: nginx:alpine
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
    depends_on:
      - api
    healthcheck:
      # "nginx -t" only validates the configuration; it does not confirm nginx is serving requests
      test: ["CMD", "nginx", "-t"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - frontend

volumes:
  pgdata:
  redisdata:

networks:
  frontend:
  backend:
    internal: true  # No external access to backend network

Quick Reference Commands

# === IMAGES ===
docker build -t name:tag .
docker images
docker rmi image:tag
docker image prune -a

# === CONTAINERS ===
docker run -d --name app -p 8080:80 image
docker ps -a
docker stop/start/restart app
docker rm app
docker logs -f app
docker exec -it app /bin/sh

# === INSPECT & DEBUG ===
docker inspect app
docker stats
docker top app
docker diff app

# === NETWORKS ===
docker network ls
docker network create mynet
docker network connect mynet app
docker network inspect mynet

# === VOLUMES ===
docker volume ls
docker volume create mydata
docker volume inspect mydata
docker volume prune

# === CLEANUP ===
docker system df
docker system prune -a --volumes
docker builder prune

# === COMPOSE ===
docker compose up -d
docker compose down -v
docker compose logs -f
docker compose exec app sh
docker compose ps

Best Practices Summary

Building Images

Use multi-stage builds
Choose minimal base images (Alpine, distroless)
Order Dockerfile for optimal caching
Use .dockerignore
Tag with specific versions, not latest

Security

Run as non-root user
Scan images for vulnerabilities
Don’t store secrets in images
Use read-only filesystems where possible
Drop unnecessary capabilities

Runtime

Set resource limits (CPU, memory)
Use health checks
Log to stdout/stderr
Use custom networks for service discovery
Use volumes for persistent data

Operations

Automate image builds in CI/CD
Implement image retention policies
Monitor container metrics
Have a cleanup strategy for unused resources
