feat: Add agent execution plan for Zabbix multi-WAN speedtest and strategic intelligence.

João Pedro Toledo Goncalves 2026-01-04 19:28:56 -03:00
parent 0c8a8a0499
commit 2a2a322672
2 changed files with 110 additions and 199 deletions

@@ -1,199 +0,0 @@
# AI Agent Execution Plan: OpenVPN Hybrid Integration
**Objective**: Transform the `pfsense_hybrid_snmp_agent` folder into a production-ready Hybrid solution.
**Context**: The files `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\openvpn-discovery.sh` and `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\userparameter_openvpn.conf` are currently raw copies. The YAML template is the original SNMP version located at `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\template_app_pfsense_snmp.yaml`.
## Execution Strategy & Safety Context
**Goal**: We are building a "Hybrid" Zabbix Template for pfSense that combines standard SNMP monitoring with advanced OpenVPN metrics collected via Zabbix Agent Custom UserParameters.
**Key Features**:
1. **Dynamic Grouping**: Group VPN users by "Company" (Server Name) derived from log filenames.
2. **Security Triggers**: Detect Exfiltration (>10GB/h), Zombie Sessions (>24h), and Session Hijacking (IP Change).
3. **S2S vs User**: Distinguish Site-to-Site tunnels from human users to apply different alert rules via Macros and Discovery Overrides.
**CRITICAL INSTRUCTION**: If any step in this runbook is ambiguous, fails validation, or if you encounter unexpected file structures, **STOP IMMEDIATELY**. Do not guess. Ask the user for clarification before proceeding to the next step.
---
## Phase 1: Script & Agent Configuration
### 1.1. Rewrite `files/openvpn-discovery.sh`
**Path**: `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\openvpn-discovery.sh`
**Logic**:
- Scan `/var/log/openvpn/status*.log`.
- Extract `{#VPN.SERVER}` from the filename (Regex: `status_(.*)\.log` -> Group 1).
- Extract Users, Real IP, and Byte Counts from the file content.
- **Critical**: Output standard Zabbix LLD JSON.
```bash
#!/bin/sh
# OpenVPN Discovery Script (Arthur's Gold Standard)
# Outputs: {#VPN.USER}, {#VPN.SERVER}, {#VPN.REAL_IP}
JSON_OUTPUT="{\"data\":["
FIRST_ITEM=1
# Loop through all status logs
for logfile in /var/log/openvpn/status*.log; do
    [ -e "$logfile" ] || continue
    # Extract Server Name from Filename "status_SERVERNAME.log"
    # Note: Busybox filename parsing
    filename=$(basename "$logfile")
    # Remove prefix "status_" and suffix ".log"
    server_name=$(echo "$filename" | sed -e 's/^status_//' -e 's/\.log$//')
    # Read the file and parse "CLIENT_LIST" lines.
    # Assumed layout: status-version 2 on OpenVPN 2.4+, which matches the
    # cut -f6/-f7/-f9 offsets used in userparameter_openvpn.conf.
    # Format: CLIENT_LIST,CommonName,RealAddress,VirtualAddress,VirtualIPv6Address,BytesReceived,BytesSent,Since,Since(time_t),Username,ClientID,PeerID
    while IFS=, read -r type common_name real_address virtual_address virtual_ipv6 bytes_rx bytes_tx since since_unix username client_id peer_id; do
        if [ "$type" = "CLIENT_LIST" ] && [ "$common_name" != "Common Name" ]; then
            # Extract IP only from RealAddress (IP:PORT)
            real_ip=$(echo "$real_address" | cut -d: -f1)
            # Append to JSON, comma-separating every entry after the first
            if [ "$FIRST_ITEM" -eq 0 ]; then JSON_OUTPUT="$JSON_OUTPUT,"; fi
            JSON_OUTPUT="$JSON_OUTPUT{\"{#VPN.USER}\":\"$common_name\",\"{#VPN.SERVER}\":\"$server_name\",\"{#VPN.REAL_IP}\":\"$real_ip\"}"
            FIRST_ITEM=0
        fi
    done < "$logfile"
done
JSON_OUTPUT="$JSON_OUTPUT]}"
echo "$JSON_OUTPUT"
```
### 1.2. Update `files/userparameter_openvpn.conf`
**Path**: `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\userparameter_openvpn.conf`
**Logic**:
- Simplify. The discovery script now does the heavy lifting of finding users.
- The Items need to fetch data. Since we have multiple files, `grep` needs to search ALL of them.
- **Optimization**: Use `grep -h` (suppresses the filename prefix) so a single command searches all status logs.
```conf
UserParameter=openvpn.discovery,/opt/zabbix/openvpn-discovery.sh
# Fetch raw metrics for a specific user (Usernames must be unique across servers or we grab the first match)
UserParameter=openvpn.user.bytes_received.total[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f6
UserParameter=openvpn.user.bytes_sent.total[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f7
UserParameter=openvpn.user.connected_since[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f9
UserParameter=openvpn.user.real_address.new[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f3 | cut -d: -f1
UserParameter=openvpn.user.status[*],if grep -q "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null; then echo 1; else echo 0; fi
UserParameter=openvpn.version,openvpn --version 2>&1 | head -1 | awk '{print $2}'
```
---
## Phase 2: Template Configuration (YAML)
**Target**: `templates_gold/pfsense_hybrid_snmp_agent/template_pfsense_hybrid_gold.yaml`
### 2.1. Add Macros
Append to `macros` section:
```yaml
- macro: '{$VPN.S2S.PATTERN}'
  value: '^S2S_'
  description: 'Regex to identify Site-to-Site tunnels.'
- macro: '{$VPN.DATA.LIMIT}'
  value: '10737418240'
  description: 'Download limit (10GB) for the Exfiltration alert.'
- macro: '{$VPN.WORK.START}'
  value: '080000'
- macro: '{$VPN.WORK.END}'
  value: '180000'
- macro: '{$VPN.ZOMBIE.LIMIT}'
  value: '86400'
  description: 'Maximum time (24h) before a session is considered a zombie.'
```
### 2.2. Add Discovery Rule
**Name**: `Descoberta de Usuários OpenVPN`
**Key**: `openvpn.discovery`
**Type**: `ZABBIX_ACTIVE` (Preferred for Agents behind NAT/Firewall)
**Overrides**:
```yaml
overrides:
  - name: 'Site-to-Site (S2S)'
    step: '1'
    filter:
      conditions:
        - macro: '{#VPN.USER}'
          value: '{$VPN.S2S.PATTERN}'
          operator: MATCHES_REGEX
    operations:
      - operationobject: ITEM_PROTOTYPE
        operator: REGEXP
        value: 'Stats Year|Forecast'
        status: ENABLED
      - operationobject: TRIGGER_PROTOTYPE
        operator: REGEXP
        value: 'Exfiltração|Horário|Zombie|IP Change'
        status: DISABLED
      - operationobject: ITEM_PROTOTYPE
        operator: LIKE
        value: ''
        tags:
          - tag: Type
            value: S2S
  - name: 'User (Colaborador)'
    step: '2'
    filter:
      conditions:
        - macro: '{#VPN.USER}'
          value: '{$VPN.S2S.PATTERN}'
          operator: NOT_MATCHES_REGEX
    operations:
      - operationobject: ITEM_PROTOTYPE
        operator: REGEXP
        value: 'Stats Year|Forecast'
        status: DISABLED
      - operationobject: TRIGGER_PROTOTYPE
        operator: REGEXP
        value: 'Exfiltração|Horário|Zombie|IP Change'
        status: ENABLED
      - operationobject: ITEM_PROTOTYPE
        operator: LIKE
        value: ''
        tags:
          - tag: Type
            value: User
```
### 2.3. Add Item Prototypes (Selected)
**Common Tags**: `Company: {{#VPN.SERVER}.regsub("(?:CLIENT_|S2S_)?(.*)", "\1")}`
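To illustrate what the `regsub` in the Company tag is expected to yield, here is a minimal Python sketch that applies the same pattern. The server names and the helper `company_from_server` are hypothetical examples; Zabbix evaluates the macro function with its own regex engine, so treat this only as a way to reason about the pattern.

```python
import re

# Same pattern as in the Company tag: an optional CLIENT_ or S2S_ prefix,
# then capture group 1 holding the rest of the server name.
COMPANY_RE = re.compile(r"(?:CLIENT_|S2S_)?(.*)")


def company_from_server(server_name: str) -> str:
    """Return capture group 1, i.e. the name without a CLIENT_/S2S_ prefix."""
    # The pattern can match the empty string, so match() never returns None here.
    return COMPANY_RE.match(server_name).group(1)


# Hypothetical server names, as they would come out of {#VPN.SERVER}.
for name in ("S2S_ACME", "CLIENT_Globex", "HQ"):
    print(name, "->", company_from_server(name))
```

Names without either prefix pass through unchanged, which is why the prefix group is optional.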
1. **Download Total (Raw)**
- Key: `openvpn.user.bytes_received.total[{#VPN.USER}]`
- Type: ZABBIX_ACTIVE
- Units: B
2. **Download Rate (Dependent)**
- Zabbix's built-in `Change per second` preprocessing can derive the rate from the raw counter, so no separate calculated item is needed.
- **Decision**: Create `openvpn.user.bytes_received.rate[{#VPN.USER}]` dependent on Total, with `Change per second`.
3. **Real IP (Inventory)**
- Key: `openvpn.user.real_address.new[{#VPN.USER}]`
- Item prototypes cannot populate a Host Inventory field, so keep this as a plain item value.
**Reporting Metrics (Calculated)**
- **Stats Week**: `trendsum(//openvpn.user.bytes_received.total[{#VPN.USER}], 1w:now/w)`
- **Stats Month**: `trendsum(..., 1M:now/M)`
- **Forecast**: `(last(Stats Month) / dayofmonth()) * <days in current month>` (simplified pseudocode, not literal Zabbix syntax)
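The Forecast metric above is a linear projection: month-to-date usage divided by the days elapsed, multiplied by the number of days in the month. A minimal Python sketch of that arithmetic follows; the function name `forecast_month_total` and the sample figures are assumptions for illustration only, not part of the template.

```python
import calendar
from datetime import date


def forecast_month_total(month_to_date_bytes: float, today: date) -> float:
    """Project the end-of-month total from the average daily usage so far."""
    # monthrange() returns (weekday of the 1st, number of days in the month).
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return month_to_date_bytes / today.day * days_in_month


# Hypothetical input: 100 bytes used by the 10th of a 31-day month.
print(forecast_month_total(100.0, date(2026, 1, 10)))
```

The projection assumes usage stays flat for the rest of the month, so it is noisy during the first few days.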
### 2.4. Add Triggers
1. **Exfiltração**: `(last(Total) - last(Total, 1h)) > {$VPN.DATA.LIMIT}`
2. **IP Change**: `diff(RealIP)=1 and last(Status)=1`
3. **Zombie**: `last(ConnectedSince) < now() - {$VPN.ZOMBIE.LIMIT}`
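The three trigger conditions can be checked offline before they are written as Zabbix expressions. The sketch below restates them as plain Python predicates; the function names are hypothetical, and the default limits are copied from the `{$VPN.DATA.LIMIT}` and `{$VPN.ZOMBIE.LIMIT}` macros.

```python
VPN_DATA_LIMIT = 10_737_418_240  # {$VPN.DATA.LIMIT}, in bytes
VPN_ZOMBIE_LIMIT = 86_400        # {$VPN.ZOMBIE.LIMIT}, in seconds


def is_exfiltration(total_now: int, total_1h_ago: int,
                    limit: int = VPN_DATA_LIMIT) -> bool:
    """Exfiltração: the counter grew by more than the limit within one hour."""
    return (total_now - total_1h_ago) > limit


def is_ip_change(previous_ip: str, current_ip: str, status: int) -> bool:
    """IP Change: the real address differs while the user is connected."""
    return status == 1 and previous_ip != current_ip


def is_zombie(connected_since: int, now: int,
              limit: int = VPN_ZOMBIE_LIMIT) -> bool:
    """Zombie: the session started earlier than now minus the limit."""
    return connected_since < now - limit
```

One consequence worth noting: when a user reconnects, the byte counter restarts, so the one-hour delta goes negative and the exfiltration check stays false for that interval.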
### 2.5. Add Dashboard
Insert `dashboards:` block at root (or modify existing if any).
- Use `type: svggraph` for Trends.
- Use `type: tophosts` for Tables (Columns: Name, Latest Value of Metric).
---
## Phase 3: Validation
1. **Lint**: `validate_zabbix_template.py`.
2. **Docs**: `generate_template_docs.py`.

@@ -0,0 +1,110 @@
# 🚀 Agent Execution Plan: Zabbix Speedtest & Strategic Intelligence (v10)
**Status:** APPROVED
**Target Directory:** `c:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent`
**Objective:** Implement a robust, multi-WAN speedtest solution with Business Intelligence dashboards in the Gold Template.
---
## 🏗️ 1. File Structure & Scripts (Phase 1)
These files must be created in the `files/` subdirectory.
### A. `files/speedtest_discovery.py` (Python 3.11)
- **Goal:** Native pfSense Discovery without external dependencies.
- **Logic:**
1. Parse `/cf/conf/config.xml` using `xml.etree.ElementTree`.
2. Identify interfaces with a `<gateway>` tag (WANs).
3. Extract `descr` (Name), `if` (Interface ID like igb0), `ipaddr`.
- **Output (LLD JSON):**
```json
[{"{#IFNAME}": "igb0", "{#WAN_ALIAS}": "VIVO", "{#WAN_IP}": "x.x.x.x"}]
```
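The discovery logic for `speedtest_discovery.py` could be sketched as follows. This is a minimal illustration, not the final script: it assumes interfaces sit under `<pfsense><interfaces>` as child elements (`<wan>`, `<lan>`, `<opt1>`, ...) carrying `<if>`, `<descr>`, `<ipaddr>`, and `<gateway>`, and it parses an inline sample instead of `/cf/conf/config.xml`. The function name `discover_wans` and the sample values are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def discover_wans(config_xml: str) -> list[dict[str, str]]:
    """Return LLD rows for every interface that has a non-empty <gateway> tag."""
    root = ET.fromstring(config_xml)
    rows = []
    for iface in root.findall("./interfaces/*"):
        # Interfaces without a gateway are treated as non-WAN and skipped.
        if not iface.findtext("gateway"):
            continue
        rows.append({
            "{#IFNAME}": iface.findtext("if", default=""),
            # Fall back to the element name (e.g. "wan") if <descr> is missing.
            "{#WAN_ALIAS}": iface.findtext("descr", default=iface.tag),
            "{#WAN_IP}": iface.findtext("ipaddr", default=""),
        })
    return rows


# Hypothetical sample standing in for /cf/conf/config.xml.
SAMPLE = """<pfsense><interfaces>
<wan><if>igb0</if><descr>VIVO</descr><ipaddr>203.0.113.10</ipaddr><gateway>WANGW</gateway></wan>
<lan><if>igb1</if><descr>LAN</descr><ipaddr>192.168.1.1</ipaddr></lan>
</interfaces></pfsense>"""

print(json.dumps(discover_wans(SAMPLE)))
```

In the real script, the string passed to `discover_wans` would be read from `/cf/conf/config.xml`, and the printed JSON would be what the UserParameter returns to Zabbix.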
### B. `files/speedtest_worker.sh` (Bash)
- **Goal:** Run the speedtest without timing out, and capture telemetry during the run.
- **Arguments:** `--interface <IF>`
- **Logic:**
1. Start a background CPU monitor (`top -P 0` or `vmstat 1`) to capture `idle/interrupt` %.
2. Execute `speedtest --ip <IP_FROM_IF> --format json` (the Ookla CLI binds to an address with `--ip`; `--interface` expects an interface name).
3. Stop CPU monitor. Calculate average CPU Load during test.
4. Inject `cpu_load_during_test` and `interrupts` into the Speedtest JSON.
5. Send to Zabbix via `zabbix_sender` (Trappers).
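Steps 3 and 4 of the worker amount to averaging the CPU samples and merging two extra keys into the speedtest result. The planned worker is a shell script, so the Python sketch below only illustrates the data transformation. The function name `enrich_result`, the idea of sampling idle and interrupt percentages as lists, and the sample payload are assumptions; only the `cpu_load_during_test` and `interrupts` key names come from the plan.

```python
import json
from statistics import mean


def enrich_result(speedtest_json: str,
                  idle_samples: list[float],
                  interrupt_samples: list[float]) -> str:
    """Add average CPU load and interrupt percentages to the speedtest JSON."""
    result = json.loads(speedtest_json)
    # CPU load is taken as the complement of the average idle percentage.
    result["cpu_load_during_test"] = round(100.0 - mean(idle_samples), 2)
    result["interrupts"] = round(mean(interrupt_samples), 2)
    return json.dumps(result)


# Hypothetical payload and samples, not real measurements.
raw = '{"download": {"bandwidth": 1000}}'
print(enrich_result(raw, idle_samples=[80.0, 60.0], interrupt_samples=[2.0, 4.0]))
```

`mean()` raises `StatisticsError` on an empty list, so a worker following this approach would need at least one CPU sample before the speedtest finishes.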
### C. `files/install_speedtest.sh`
- Helper script that installs the CLI with `pkg install -y speedtest`.
---
## 🧠 2. Template Intelligence (Phase 2)
Modify `template_pfsense_hybrid_gold.yaml` (Monolithic Approach).
### A. Discovery Rule
- **Key:** `speedtest.discovery`
- **Type:** Zabbix Agent (UserParameter calling the Python script).
### B. Item Prototypes (Trappers)
- **Master:** `speedtest.json[{#IFNAME}]`
- **Dependent Items:**
- `speedtest.download`, `speedtest.upload`, `speedtest.latency`
- `speedtest.cpu_load` (New)
- `speedtest.bufferbloat` (Calculated: `speedtest.latency - icmppingsec`, with both values converted to the same unit; `icmppingsec` returns seconds)
### C. Calculated Items (Business Logic)
1. **Forecasting (Saturation):**
- `business.bandwidth.saturation_date`: Forecast when `net.if.in` reaches `{$WAN_CONTRACT_DOWN}`.
2. **Financial (Risk):**
- `business.downtime.cost`: `(100 - sla.uptime) * {$COMPANY.HOURLY_COST}`.
3. **Efficiency:**
- `business.link.efficiency`: `avg(net.if.in, 30d) / {$WAN_CONTRACT_DOWN} * 100`.
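The downtime cost and efficiency formulas above are simple arithmetic, sketched here in Python exactly as the plan states them. The function names and sample numbers are hypothetical. As written, the cost formula multiplies percentage points of downtime by the hourly cost; converting that to hours would require the SLA period length, which the plan does not specify.

```python
def downtime_cost(sla_uptime_pct: float, hourly_cost: float) -> float:
    """business.downtime.cost: (100 - sla.uptime) * {$COMPANY.HOURLY_COST}."""
    return (100.0 - sla_uptime_pct) * hourly_cost


def link_efficiency(avg_in_30d: float, contract_down: float) -> float:
    """business.link.efficiency: 30-day average inbound rate as % of contract."""
    # Both arguments must use the same unit (for example bits per second).
    return avg_in_30d / contract_down * 100.0


# Hypothetical figures: 99.5% uptime, 1000 per hour, 50 Mbps on a 100 Mbps link.
print(downtime_cost(99.5, 1000.0))
print(link_efficiency(50_000_000, 100_000_000))
```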
### D. Triggers (Cross-Correlation)
1. **Hardware Bottleneck:**
- `speedtest.download < Expected` AND `speedtest.cpu_load > 90%`.
- Msg: "🔥 Hardware Limiting Speed (Not ISP)".
2. **ISP Quality (Bufferbloat):**
- `speedtest.bufferbloat > 100ms`.
- Msg: "⚠️ ISP Latency Spike under Load".
3. **Financial Opportunity:**
- `business.downtime.cost > {$LINK.SECONDARY.COST}`.
- Msg: "💰 Downtime costs exceed Secondary Link price."
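To make the cross-correlation rules concrete, the sketch below evaluates all three conditions against one set of readings and returns the matching messages. It is an offline illustration, not Zabbix trigger syntax. The function names, parameter names, and sample readings are hypothetical, and the bufferbloat helper assumes the loaded latency is in milliseconds while the idle ping is in seconds, as `icmppingsec` reports it.

```python
def bufferbloat_ms(loaded_latency_ms: float, idle_ping_seconds: float) -> float:
    """Latency added under load, after converting the idle ping to milliseconds."""
    return loaded_latency_ms - idle_ping_seconds * 1000.0


def evaluate_triggers(download: float, expected: float, cpu_load_pct: float,
                      bufferbloat: float, downtime_cost: float,
                      secondary_link_cost: float) -> list[str]:
    """Return the message of every trigger whose condition holds."""
    alerts = []
    if download < expected and cpu_load_pct > 90:
        alerts.append("🔥 Hardware Limiting Speed (Not ISP)")
    if bufferbloat > 100:
        alerts.append("⚠️ ISP Latency Spike under Load")
    if downtime_cost > secondary_link_cost:
        alerts.append("💰 Downtime costs exceed Secondary Link price.")
    return alerts


# Hypothetical readings: slow download with a saturated CPU and high added latency.
bloat = bufferbloat_ms(loaded_latency_ms=150.0, idle_ping_seconds=0.02)
print(evaluate_triggers(300.0, 500.0, 95.0, bloat, 100.0, 200.0))
```

Requiring both a low download and a high CPU load in the first rule is what separates a firewall bottleneck from an ISP problem.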
---
## 📊 3. Dashboards (Phase 3)
Create a Dashboard Resource in the YAML: "Speedtest & Strategic Planning".
**Widgets:**
1. **Executive Summary:** Markdown widget showing Financial Loss & Efficiency.
2. **Forecasting:** Graph showing Traffic Growth vs Contract Limit (Projected).
3. **Technical Correlation:** Graph with 3 Y-axes (Speed, CPU, Latency).
---
## 📝 4. Documentation (Phase 4)
Update `INSTRUCOES_AGENTE.txt`:
- Add "Step 5: Setup Speedtest".
- Instructions for `pkg install` and `cron` setup.
---
## ✅ 5. Validation & Auto-Documentation (MANDATORY)
Once the modifications are complete, the Agent **MUST** run the project's QA tools to ensure quality and compliance.
### A. Validate Template Structure
Run the validation script to check for UUID conflicts, schema errors, or duplicate keys.
```powershell
python c:\Users\joao.goncalves\Desktop\zabbix-itguys\validate_zabbix_template.py c:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\template_pfsense_hybrid_gold.yaml
```
- **Constraint:** If this script returns ANY error, the Agent must fix it immediately before finishing the task.
### B. Generate Documentation
Run the documentation generator to automatically update the markdown documentation for the template.
```powershell
python c:\Users\joao.goncalves\Desktop\zabbix-itguys\generate_template_docs.py c:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\template_pfsense_hybrid_gold.yaml
```
- **Output:** This will generate/update `template_pfsense_hybrid_gold_generated.md`.