# 🐳 Docker & Infrastructure Standards (The "Zero-Retry" Protocol)

**Source:** The Architect's repository.
**Audience:** AI Agents (specifically You).
**Objective:** **ELIMINATE** the "Build -> Fail -> Retry" loop. Get it right the first time.

> [!IMPORTANT]
> **The Golden Rule of Containerization:**
> "A container is not a VM. It is a process. If you treat it like a VM, The Architect will delete your memory."

## 1. 🧠 The AI "Pre-Flight" Check (READ THIS BEFORE WRITING CODE)

Before you write a single line of `Dockerfile` or `docker-compose.yml`, you must clear these cognitive hurdles:

1. **"Localhost" Hallucination:**
    * *Fact:* `localhost` inside a container refers to the **container itself**, NOT the user's machine or other containers.
    * *Correction:* Use service names (e.g., `db`, `redis`) as hostnames in `docker-compose`. Use `host.docker.internal` ONLY in development, when the container must reach an app running on the host. (On Linux it typically has to be mapped explicitly with `extra_hosts: ["host.docker.internal:host-gateway"]`.)
2. **Ephemerality Check:**
    * *Question:* "If I run `docker-compose down` and then `up`, is the data lost?"
    * *Mandate:* Databases/File Stores **MUST** have a named volume mapped.
3. **Port Blindness:**
    * *Fact:* `EXPOSE` in a Dockerfile publishes NOTHING; it only documents the port the app listens on. You **MUST** map ports in `docker-compose.yml` (e.g., `ports: ["8080:80"]`, host port first) to reach the service from the host.

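
The three pre-flight checks can be read off a single Compose file. The following is a minimal sketch, not a definitive setup: the `web` service, the `DATABASE_HOST`/`DATABASE_PORT` variable names, and the port numbers are assumptions chosen for illustration, and your application may read its connection settings differently.

```yaml
# Hypothetical stack: service names, variables, and ports are illustrative.
services:
  web:
    build: .
    environment:
      # Reach the database by its service name, never "localhost"
      DATABASE_HOST: db
      DATABASE_PORT: "5432"
    ports:
      # HOST:CONTAINER, so port 8080 on the host reaches port 80 in the container
      - "8080:80"

  db:
    image: postgres:15-alpine
    environment:
      # Supplied from the shell or a .env file at runtime, not hard-coded
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      # Named volume: survives `docker-compose down` (but not `down -v`)
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

Note that `db` publishes no ports at all: other services reach it over the Compose network, so only `web` needs a host mapping.
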
## 2. 🏗️ Dockerfile "Platinum" Standard

### The Layer Caching Strategy (Speed)
Agents frequently forget this. **DO NOT** copy source code before installing dependencies. Every source change then invalidates the dependency layer, which kills the cache.

**❌ BAD (Slows down every build):**
```dockerfile
COPY . .
RUN pip install -r requirements.txt
```

**✅ GOOD (Instant builds on code changes):**
```dockerfile
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
```

### Multi-Stage Protocol (Size)
**MANDATORY** for compiled languages (Go, Rust, C++) and frontend builds (Node/React).
**STRONGLY RECOMMENDED** for Python (to purge build tools).

```dockerfile
# Stage 1: Build (compilers and headers live only here)
FROM python:3.11-alpine AS builder
WORKDIR /app
COPY requirements.txt .
RUN apk add --no-cache gcc musl-dev libffi-dev && \
    pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: Run (The only thing that ships)
FROM python:3.11-alpine
WORKDIR /app
# Copy only the installed packages; the build toolchain stays behind in stage 1
COPY --from=builder /install /usr/local
COPY . .
CMD ["python", "main.py"]
```

## 3. 🎼 Docker Compose "Orchestration" Standard

### The Dependency Trap (`depends_on`)
Applications written by AI agents often crash at startup because the app container starts before the database is ready to accept connections.
**Rule:** Simply adding `depends_on` is NOT ENOUGH. By default it only waits for the container to *start*; it does not wait for the *service* inside it to be ready.

**✅ The Correct Pattern (Condition Service Healthy):**

```yaml
services:
  web:
    depends_on:
      db:
        condition: service_healthy  # <--- CRITICAL

  db:
    image: postgres:15-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
```

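### Explicit Networking
Do not rely on implicit networking. Docker's legacy default `bridge` network does not resolve container names, and Compose's auto-created default network puts every service on one shared network, so nothing documents which services are meant to talk to each other.
1. Define a top-level `networks` key.
2. Assign generic names (e.g., `internal_net`) and attach each service to the networks it needs.
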
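
The two networking steps above might look like the sketch below. The service names and the choice of the `bridge` driver are assumptions for illustration; only `internal_net` comes from the text.

```yaml
services:
  web:
    build: .
    networks:
      - internal_net   # web can resolve "db" by name on this network

  db:
    image: postgres:15-alpine
    networks:
      - internal_net

# Step 1: top-level `networks` key; step 2: a generic, reusable name
networks:
  internal_net:
    driver: bridge
```

A service that lists no `networks` entry would still be attached to the project's default network, so listing the network on every service is what keeps the topology explicit.
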
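## 4. 🛡️ Security & Production Constraints

1. **The "Root" Sin:**
    * Apps should NOT run as root, and that includes the container's main process (PID 1).
    * *Fix:* Create an unprivileged user, then add `USER appuser` at the end of the Dockerfile (after any steps that need root, such as package installs).
2. **Secret Leakage:**
    * **NEVER** `ENV API_KEY=sk-123...` in Dockerfile. Values set this way are baked into the image and visible to anyone who can pull or inspect it.
    * **ALWAYS** pass secrets at runtime from a `.env` file in `docker-compose` (for example via `env_file:`), and keep that file out of version control and out of the build context.
3. **Persistence:**
    * Use **Named Volumes** for persistent data (`postgres_data:/var/lib/postgresql/data`).
    * Use **Bind Mounts** (`./src:/app/src`) ONLY for development hot-reloading.
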
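
On the Compose side, these constraints could be sketched as follows. This is an illustration under stated assumptions: the `web` service, the UID `10001`, and the contents of `.env` are hypothetical, and the `user:` key is shown as a complement to `USER appuser` in the Dockerfile, not a replacement for it.

```yaml
# docker-compose.yml (sketch; `web`, `.env` contents, and the UID are assumptions)
services:
  web:
    build: .
    # Run as an unprivileged UID:GID even if the image forgets `USER`
    user: "10001:10001"
    # Variables are read from .env at runtime and never baked into the image
    env_file:
      - .env

  db:
    image: postgres:15-alpine
    env_file:
      - .env
    volumes:
      # Named volume for persistent data
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

A development-only bind mount such as `./src:/app/src` fits best in a separate `docker-compose.override.yml`, which Compose merges automatically on a developer machine, so it never leaks into the production file.
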
## 5. 🤖 The "Self-Correction" Checklist (Run this before submitting)

Agents must simulate this audit before showing code to the user:

- [ ] **Base Image:** Is it `alpine` or `slim`? (If `ubuntu`, reject yourself).
- [ ] **Context:** Did I define `WORKDIR`? (Don't dump files in root `/`).
- [ ] **PID 1:** Does the container handle signals? (Use `exec` form: `CMD ["python", "app.py"]`, NOT shell form `CMD python app.py`, which wraps the app in `/bin/sh -c` so the app may never receive `SIGTERM`).
- [ ] **Zombie Healthchecks:** Is my healthcheck actually testing the app, or just running `echo`?
- [ ] **Orphan Ports:** Did I `EXPOSE` the port in the Dockerfile AND map it in Compose?
- [ ] **Version Pinning:** Did I use `postgres:latest`? -> **CHANGE TO** a pinned tag such as `postgres:15-alpine`.

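
For the healthcheck item, a check that really exercises the application might look like this sketch. The `/health` endpoint and port `8000` are assumptions about a hypothetical app, and the command relies on the BusyBox `wget` that ships in Alpine-based images.

```yaml
services:
  web:
    build: .
    ports:
      - "8000:8000"
    healthcheck:
      # Hits the app's own HTTP endpoint; a bare `echo` would always "pass".
      # `localhost` is correct here because the check runs inside the container.
      test: ["CMD-SHELL", "wget -q --spider http://localhost:8000/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 15s   # grace period while the app boots
```

On a `slim` image without `wget`, the same idea applies with whatever HTTP client the image already contains.
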
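## 6. Emergency Recovery (When things fail)

If a container exits immediately or keeps restarting (Docker's version of Kubernetes' `CrashLoopBackOff`):
1. **Do NOT** just try to run it again.
2. **Action:** Override the startup command so the container sleeps instead of crashing.
    * `command: ["sleep", "infinity"]` (if the image defines an `ENTRYPOINT`, override `entrypoint:` instead, because `command` only replaces `CMD`)
3. **Debug:** Exec into the container -> `docker exec -it <id> sh` -> Try running the original command manually.
4. **Fix:** Read that output and `docker logs <id>`; they usually scream "Missing Dependency" or "Permission Denied".
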
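
One way to apply the sleep override without editing the main file is a throwaway override file. This is a sketch under assumptions: the file name `docker-compose.debug.yml` and the service name `web` are hypothetical.

```yaml
# docker-compose.debug.yml (hypothetical file name)
services:
  web:
    # Replace the image's ENTRYPOINT so the container idles instead of crashing
    entrypoint: ["sleep", "infinity"]
    # Clear any command so no stray arguments are passed to `sleep`
    command: []
    # Stop Docker from restarting the container while you investigate
    restart: "no"
```

It could then be started with `docker-compose -f docker-compose.yml -f docker-compose.debug.yml up -d web`, followed by `docker-compose exec web sh` to run the original command by hand.
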