# 🗄️ Database Standards (The "Relational Integrity" Protocol)

**Audience:** Backend Agents & Architects.

**Objective:** Build scalable, compatible schemas that prefer PostgreSQL but abide by MySQL limitations.

> [!CRITICAL]
> **The Data Mandate:**
> "Code is temporary. Data is permanent. Broken schemas are a life sentence."

## 1. 🏗️ Architecture & Stack

### The Abstraction Layer

- **Mandatory ORM:** Use **SQLAlchemy (Async)** or **Prisma** (if Node).
  - **Rationale:** We need to switch between Postgres and MySQL without rewriting queries. Raw SQL is forbidden unless for specific optimized reports.
- **Migrations:** **Alembic** (Python) or **Prisma Migrate**.
  - *Rule:* Never modify the DB manually. Code-first always.

### The Duel: PostgreSQL vs MySQL

We prefer **PostgreSQL**.

* **Why:** JSONB, Better Indexing, Reliability.
* **MySQL Support:** We must support it, so avoid logic that relies *exclusively* on obscure Postgres extensions unless behind a feature flag.

## 2. 🏛️ Schema Design Rules

### Naming Conventions (Snake_Case)

- **Tables:** Plural, snake_case (`users`, `order_items`, `audit_logs`).
- **Columns:** Singular, snake_case (`created_at`, `user_id`, `is_active`).
- **Keys:**
  - Primary: `id` (UUIDv7 or BigInt optimized).
  - Foreign: `target_id` (e.g., `user_id` referencing `users.id`).

### Type Disciplines

- **Timestamps:** ALWAYS use `UTC`.
  - Column: `created_at` (TIMESTAMP WITH TIME ZONE).
  - Column: `updated_at` (Auto-update trigger).
- **JSON:** Use `JSONB` (Postgres) / `JSON` (MySQL).
  - *Constraint:* Do not treat the DB as a document store. Use JSON only for variable metadata, not core relations.
- **Booleans:** Use `BOOLEAN`. (MySQL sets it to TinyInt(1) automatically, ORM handles this).

## 3. 🛡️ Performance & Reliability

### Indexing Strategy

- **Foreign Keys:** MUST be indexed.
- **Search:** If searching text, use Trigram (Postgres) or FullText (MySQL).
- **Uniqueness:** Enforce at DB level (`unique=True`), not just code level.

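As a rough illustration of how the naming, type, and indexing rules above could look in SQLAlchemy, here is a minimal sketch. The `User` and `OrderItem` models, their columns, and the in-memory SQLite engine are hypothetical and exist only so the example runs on its own. `Integer` keys are used because SQLite autoincrements them; the standard itself calls for UUIDv7 or BigInt. The `onupdate` hook is an ORM-level stand-in for the database trigger mentioned under Type Disciplines.

```python
from sqlalchemy import (
    Boolean, Column, DateTime, ForeignKey, Integer, String,
    create_engine, func, inspect,
)
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class User(Base):
    __tablename__ = "users"  # plural, snake_case table name

    id = Column(Integer, primary_key=True)  # assumption: Integer only for the SQLite demo
    email = Column(String(255), nullable=False, unique=True)  # uniqueness enforced by the DB
    is_active = Column(Boolean, nullable=False, default=True)
    # Timezone-aware timestamps; the database fills in the default.
    created_at = Column(DateTime(timezone=True), nullable=False, server_default=func.now())
    # ORM-level onupdate, standing in for a database auto-update trigger.
    updated_at = Column(
        DateTime(timezone=True), nullable=False,
        server_default=func.now(), onupdate=func.now(),
    )


class OrderItem(Base):
    __tablename__ = "order_items"

    id = Column(Integer, primary_key=True)
    # Foreign key follows the `target_id` pattern, is indexed, and states its delete rule.
    user_id = Column(
        Integer, ForeignKey("users.id", ondelete="CASCADE"),
        nullable=False, index=True,
    )
    created_at = Column(DateTime(timezone=True), nullable=False, server_default=func.now())


# Hypothetical throwaway engine; real schemas would be created through migrations.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

index_names = {ix["name"] for ix in inspect(engine).get_indexes("order_items")}
print(index_names)
```

In a real project the tables would be created by an Alembic migration rather than `create_all`, which is used here only to keep the sketch self-contained.
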
### The "N+1" Sin

- **Eager Loading:** Agents must explicitly load related tables up front (`joinedload` in SQLAlchemy; `select_related` is the Django equivalent).
- **Pagination:** NEVER return `SELECT *` without `LIMIT/OFFSET` (Cursor pagination preferred for large sets).

## 4. 🔒 Compatibility Checklist (Postgres vs MySQL)

Before committing a migration, verify:

1. **Quoting:** Postgres uses double quotes `"table"`, MySQL uses backticks `` `table` ``. *Result: Use the ORM to handle this.*
2. **Case Sensitivity:** MySQL table-name case sensitivity depends on the host OS (case-insensitive on Windows by default, case-sensitive on most Linux setups). Postgres folds unquoted identifiers to lowercase and treats quoted identifiers as case-sensitive. *Result: Stick to lowercase snake_case explicitly.*
3. **Enums:** Native ENUMs are messy in migrations. *Result: Use VARCHAR columns with Application-level Enum validation OR lookup tables.*

## 5. 🤖 The Agent "Self-Query" Audit

"Before I execute this query/migration..."

- [ ] Did I use a migration file?
- [ ] Is `created_at` default set to `now()`?
- [ ] Am I fetching 10,000 rows? (Add LIMIT).
- [ ] If I delete a Parent, what happens to the Child? (Define `ON DELETE CASCADE` or `SET NULL`).

## 6. ⏱️ Performance Self-Diagnosis (The "Slow Query" Check)

Agents must run these mental or actual checks on any complex query:

### Test A: The "Explain" Ritual

Before finalizing a query, run `EXPLAIN` to see the planned execution, or `EXPLAIN ANALYZE` to execute the query and see actual timings.

* **Fail Condition:** Does the result show `Seq Scan` (Postgres) or a full table scan (`type: ALL` in MySQL) on a table with > 1000 rows?
* **Fix:** Add an index on the filtered column (`WHERE column = ...`).

### Test B: The "Limitless" Trap

* **Fail Condition:** A query without `LIMIT` or `PAGE_SIZE` logic.
* **Fix:** Force a `LIMIT 100` during dev/test to verify.

### Test C: The "N+1" Detector

* **Fail Condition:** Using a loop to fetch related data.

```python
users = session.query(User).all()
for user in users:
    print(user.address)  # 🚨 BAD: One query per user
```

* **Fix:** Use Eager Loading.

```python
from sqlalchemy.orm import joinedload

users = session.query(User).options(joinedload(User.address)).all()  # ✅ GOOD: Single JOIN
```
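
One way to turn Test C into an actual check rather than a mental one is to count the `SELECT` statements each access pattern emits. The sketch below is a hypothetical harness, not a prescribed tool. The `User` and `Address` models, the five-row fixture, the in-memory SQLite engine, and the `count_selects` helper are all assumptions made for illustration. It relies on SQLAlchemy's `before_cursor_execute` engine event.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)
    address = relationship("Address", uselist=False, back_populates="user")


class Address(Base):
    __tablename__ = "addresses"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False, index=True)
    city = Column(String(50), nullable=False)
    user = relationship("User", back_populates="address")


# Hypothetical fixture: five users, one address each, in a throwaway SQLite DB.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(
        [User(name=f"user{i}", address=Address(city=f"city{i}")) for i in range(5)]
    )
    session.commit()


def count_selects(load_users):
    """Run load_users in a fresh session, touch every address, count SELECTs."""
    statements = []

    def on_execute(conn, cursor, statement, parameters, context, executemany):
        if statement.lstrip().upper().startswith("SELECT"):
            statements.append(statement)

    event.listen(engine, "before_cursor_execute", on_execute)
    try:
        with Session(engine) as session:
            for user in load_users(session):
                _ = user.address  # triggers a lazy load unless already eager-loaded
    finally:
        event.remove(engine, "before_cursor_execute", on_execute)
    return len(statements)


lazy_count = count_selects(lambda s: s.query(User).all())
eager_count = count_selects(
    lambda s: s.query(User).options(joinedload(User.address)).all()
)
print(f"lazy: {lazy_count} SELECTs, eager: {eager_count} SELECTs")
```

On this five-user fixture the lazy path is expected to issue six `SELECT`s (one for the users plus one per user), against a single `SELECT` for the eager path. A similar counter could be wrapped around a test to fail whenever the number of statements grows with the number of rows.
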