minions-ai-agents/antigravity_brain_export/knowledge/database_standards.md

90 lines
4.0 KiB
Markdown

# 🗄️ Database Standards (The "Relational Integrity" Protocol)
**Audience:** Backend Agents & Architects.
**Objective:** Build scalable, compatible schemas that prefer PostgreSQL but abide by MySQL limitations.
> [!CRITICAL]
> **The Data Mandate:**
> "Code is temporary. Data is permanent. Broken schemas are a life sentence."
## 1. 🏗️ Architecture & Stack
### The Abstraction Layer
- **Mandatory ORM:** Use **SQLAlchemy (Async)** or **Prisma** (if Node).
- **Rationale:** We need to switch between Postgres and MySQL without rewriting queries. Raw SQL is forbidden unless for specific optimized reports.
- **Migrations:** **Alembic** (Python) or **Prisma Migrate**.
- *Rule:* Never modify the DB manually. Code-first always.
### The Duel: PostgreSQL vs MySQL
We prefer **PostgreSQL**.
* **Why:** JSONB, Better Indexing, Reliability.
* **MySQL Support:** We must support it, so avoid logic that relies *exclusively* on obscure Postgres extensions unless behind a feature flag.
## 2. 🏛️ Schema Design Rules
### Naming Conventions (Snake_Case)
- **Tables:** Plural, snake_case (`users`, `order_items`, `audit_logs`).
- **Columns:** Singular, snake_case (`created_at`, `user_id`, `is_active`).
- **Keys:**
- Primary: `id` (UUIDv7 or BigInt optimized).
- Foreign: `target_id` (e.g., `user_id` referencing `users.id`).
### Type Disciplines
- **Timestamps:** ALWAYS use `UTC`.
- Column: `created_at` (TIMESTAMP WITH TIME ZONE).
- Column: `updated_at` (Auto-update trigger).
- **JSON:** Use `JSONB` (Postgres) / `JSON` (MySQL).
- *Constraint:* Do not treat the DB as a document store. Use JSON only for variable metadata, not core relations.
- **Booleans:** Use `BOOLEAN`. (MySQL sets it to TinyInt(1) automatically, ORM handles this).
## 3. 🛡️ Performance & Reliability
### Indexing Strategy
- **Foreign Keys:** MUST be indexed.
- **Search:** If searching text, use Trigram (Postgres) or FullText (MySQL).
- **Uniqueness:** Enforce at DB level (`unique=True`), not just code level.
### The "N+1" Sin
- **Eager Loading:** Agents must explicitly join tables (`select_related` / `joinedload`).
- **Pagination:** NEVER return `SELECT *` without `LIMIT/OFFSET` (Cursor pagination preferred for large sets).
## 4. 🔒 Compatibility Checklist (Postgres vs MySQL)
Before committing a migration, verify:
1. **Quoting:** Postgres uses double quotes `"table"`, MySQL uses backticks `` `table` ``. *Result: Use the ORM to handle this.*
2. **Case Sensitivity:** MySQL on Windows is case-insensitive. Postgres is case-sensitive. *Result: Stick to lowercase snake_case explicitly.*
3. **Enums:** Native ENUMs are messy in migrations. *Result: Use VARCHAR columns with Application-level Enum validation OR lookup tables.*
## 5. 🤖 The Agent "Self-Query" Audit
"Before I execute this query/migration..."
- [ ] Did I use a migration file?
- [ ] Is `created_at` default set to `now()`?
- [ ] Am I fetching 10,000 rows? (Add LIMIT).
- [ ] If I delete a Parent, what happens to the Child? (Define `ON DELETE CASCADE` or `SET NULL`).
## 6. ⏱️ Performance Self-Diagnosis (The "Slow Query" Check)
Agents must run these mental or actual checks on any complex query:
### Test A: The "Explain" Ritual
Before finalizing a query, simulate `EXPLAIN` (Postgres) or `EXPLAIN ANALYZE`.
* **Fail Condition:** Does the result show `Seq Scan` on a table with > 1000 rows?
* **Fix:** Add an index on the filtered column (`WHERE column = ...`).
### Test B: The "Limitless" Trap
* **Fail Condition:** A query without `LIMIT` or `PAGE_SIZE` logic.
* **Fix:** Hard inject `LIMIT 100` during dev/test to verify.
### Test C: The "N+1" Detector
* **Fail Condition:** Using a loop to fetch related data.
```python
users = session.query(User).all()
for user in users:
print(user.address) # 🚨 BAD: One query per user
```
* **Fix:** Use Eager Loading.
```python
users = session.query(User).options(joinedload(User.address)).all() # ✅ GOOD: Single JOIN
```