feat: Add agent execution plan for Zabbix multi-WAN speedtest and strategic intelligence.
parent
0c8a8a0499
commit
2a2a322672
@ -1,199 +0,0 @@
# AI Agent Execution Plan: OpenVPN Hybrid Integration

**Objective**: Transform the `pfsense_hybrid_snmp_agent` folder into a production-ready hybrid solution.
**Context**: The files `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\openvpn-discovery.sh` and `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\userparameter_openvpn.conf` are currently raw copies. The YAML template is the original SNMP version, located at `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\template_app_pfsense_snmp.yaml`.

## Execution Strategy & Safety Context
**Goal**: We are building a "Hybrid" Zabbix template for pfSense that combines standard SNMP monitoring with advanced OpenVPN metrics collected through Zabbix agent custom UserParameters.
**Key Features**:
1. **Dynamic Grouping**: Group VPN users by "Company" (server name), derived from the status log filenames.
2. **Security Triggers**: Detect exfiltration (>10 GB/h), zombie sessions (>24 h), and session hijacking (IP change).
3. **S2S vs User**: Distinguish Site-to-Site tunnels from human users so that different alert rules apply through macros and discovery overrides.

**CRITICAL INSTRUCTION**: If any step in this runbook is ambiguous or fails validation, or if you encounter unexpected file structures, **STOP IMMEDIATELY**. Do not guess. Ask the user for clarification before proceeding to the next step.

---

## Phase 1: Script & Agent Configuration

### 1.1. Rewrite `files/openvpn-discovery.sh`
**Path**: `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\openvpn-discovery.sh`
**Logic**:
- Scan `/var/log/openvpn/status*.log`.
- Extract `{#VPN.SERVER}` from the filename (regex: `status_(.*)\.log` -> group 1).
- Extract users, real IP, and byte counts from the file content.
- **Critical**: Output standard Zabbix LLD JSON.

```bash
#!/bin/sh
# OpenVPN Discovery Script (Arthur's Gold Standard)
# Outputs: {#VPN.USER}, {#VPN.SERVER}, {#VPN.REAL_IP}

JSON_OUTPUT="{\"data\":["
FIRST_ITEM=1

# Loop through all status logs
for logfile in /var/log/openvpn/status*.log; do
    # Skip the literal pattern when the glob matches nothing
    [ -e "$logfile" ] || continue

    # Extract the server name from the filename "status_SERVERNAME.log"
    # using POSIX parameter expansion (no external processes needed)
    filename=$(basename "$logfile")
    server_name=${filename#status_}
    server_name=${server_name%.log}

    # Parse "CLIENT_LIST" rows; only the first three fields are used here.
    # Format (status-version 2, OpenVPN 2.4+), matching the field numbers in userparameter_openvpn.conf:
    # CLIENT_LIST,CommonName,RealAddress,VirtualAddress,VirtualIPv6Address,BytesReceived,BytesSent,Since,Since(time_t),Username,ClientID,PeerID
    while IFS=, read -r type common_name real_address _rest; do
        if [ "$type" = "CLIENT_LIST" ] && [ "$common_name" != "Common Name" ]; then
            # Keep only the IP from RealAddress (IP:PORT) by dropping the trailing ":PORT"
            real_ip=${real_address%:*}

            # Append to JSON, comma-separating every entry after the first
            if [ "$FIRST_ITEM" -eq 0 ]; then JSON_OUTPUT="${JSON_OUTPUT},"; fi
            JSON_OUTPUT="${JSON_OUTPUT}{\"{#VPN.USER}\":\"$common_name\",\"{#VPN.SERVER}\":\"$server_name\",\"{#VPN.REAL_IP}\":\"$real_ip\"}"
            FIRST_ITEM=0
        fi
    done < "$logfile"
done

JSON_OUTPUT="${JSON_OUTPUT}]}"
echo "$JSON_OUTPUT"
```

### 1.2. Update `files/userparameter_openvpn.conf`
**Path**: `C:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\files\userparameter_openvpn.conf`
**Logic**:
- Simplify. The discovery script now does the heavy lifting of finding users.
- The items only need to fetch data. Since there are multiple status files, `grep` must search ALL of them.
- **Optimization**: Use `grep -h` to suppress the filename prefix that `grep` adds when it searches several status logs at once.

```conf
UserParameter=openvpn.discovery,/opt/zabbix/openvpn-discovery.sh
# Fetch raw metrics for a specific user (Usernames must be unique across servers or we grab the first match)
UserParameter=openvpn.user.bytes_received.total[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f6
UserParameter=openvpn.user.bytes_sent.total[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f7
UserParameter=openvpn.user.connected_since[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f9
UserParameter=openvpn.user.real_address.new[*],grep -h "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null | head -1 | cut -d, -f3 | cut -d: -f1
UserParameter=openvpn.user.status[*],if grep -q "^CLIENT_LIST,$1," /var/log/openvpn/status*.log 2>/dev/null; then echo 1; else echo 0; fi
UserParameter=openvpn.version,openvpn --version 2>&1 | head -1 | awk '{print $2}'
```

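The `cut -d, -f6`, `-f7`, and `-f9` selectors above only line up with the byte counters and the connection timestamp if a `CLIENT_LIST` row carries a Virtual IPv6 Address column in position 5. The following is a minimal Python sketch of that field mapping, not part of the plan itself: the function name `parse_client_list` and the sample row are hypothetical, and the column order is an assumption that should be checked against a real status log on the target firewall.

```python
from typing import Dict, Optional, Union


def parse_client_list(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Map one CLIENT_LIST row to the values the UserParameters extract.

    Assumed column order (1-based, as used by `cut`):
    1 CLIENT_LIST, 2 CommonName, 3 RealAddress, 4 VirtualAddress,
    5 VirtualIPv6Address, 6 BytesReceived, 7 BytesSent,
    8 Since, 9 Since(time_t), ...
    """
    fields = line.rstrip("\n").split(",")
    if len(fields) < 9 or fields[0] != "CLIENT_LIST":
        return None  # header rows and other record types are ignored
    return {
        "user": fields[1],                    # cut -d, -f2
        "real_ip": fields[2].split(":")[0],   # cut -d, -f3 | cut -d: -f1
        "bytes_received": int(fields[5]),     # cut -d, -f6
        "bytes_sent": int(fields[6]),         # cut -d, -f7
        "connected_since": int(fields[8]),    # cut -d, -f9
    }


# Made-up sample row, for illustration only.
SAMPLE_ROW = (
    "CLIENT_LIST,joao,203.0.113.10:51234,10.8.0.2,,"
    "1048576,2097152,2024-01-01 10:00:00,1704103200,UNDEF,0,0"
)

print(parse_client_list(SAMPLE_ROW))
```

If the status files on the firewall lack the IPv6 column, every numeric selector shifts left by one, so it is worth confirming the header row before deploying the UserParameters.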
---

## Phase 2: Template Configuration (YAML)

**Target**: `templates_gold/pfsense_hybrid_snmp_agent/template_pfsense_hybrid_gold.yaml`

### 2.1. Add Macros
Append to the `macros` section:
```yaml
- macro: '{$VPN.S2S.PATTERN}'
  value: '^S2S_'
  description: 'Regex to identify Site-to-Site tunnels.'
- macro: '{$VPN.DATA.LIMIT}'
  value: '10737418240'
  description: 'Download limit (10 GB) for the exfiltration alert.'
- macro: '{$VPN.WORK.START}'
  value: '080000'
- macro: '{$VPN.WORK.END}'
  value: '180000'
- macro: '{$VPN.ZOMBIE.LIMIT}'
  value: '86400'
  description: 'Maximum time (24 h) before a session is considered a zombie.'
```

### 2.2. Add Discovery Rule
**Name**: `Descoberta de Usuários OpenVPN`
**Key**: `openvpn.discovery`
**Type**: `ZABBIX_ACTIVE` (preferred for agents behind NAT or a firewall)

**Overrides**:
```yaml
overrides:
  - name: 'Site-to-Site (S2S)'
    step: '1'
    filter:
      conditions:
        - macro: '{#VPN.USER}'
          value: '{$VPN.S2S.PATTERN}'
          # Filter conditions use MATCHES_REGEX / NOT_MATCHES_REGEX in the export format
          operator: MATCHES_REGEX
    operations:
      - operationobject: ITEM_PROTOTYPE
        operator: REGEXP
        value: 'Stats Year|Forecast'
        status: ENABLED
      - operationobject: TRIGGER_PROTOTYPE
        operator: REGEXP
        value: 'Exfiltração|Horário|Zombie|IP Change'
        status: DISABLED
      - operationobject: ITEM_PROTOTYPE
        operator: LIKE
        value: ''
        tags:
          - tag: Type
            value: S2S
  - name: 'User (Colaborador)'
    step: '2'
    filter:
      conditions:
        - macro: '{#VPN.USER}'
          value: '{$VPN.S2S.PATTERN}'
          operator: NOT_MATCHES_REGEX
    operations:
      - operationobject: ITEM_PROTOTYPE
        operator: REGEXP
        value: 'Stats Year|Forecast'
        status: DISABLED
      - operationobject: TRIGGER_PROTOTYPE
        operator: REGEXP
        value: 'Exfiltração|Horário|Zombie|IP Change'
        status: ENABLED
      - operationobject: ITEM_PROTOTYPE
        operator: LIKE
        value: ''
        tags:
          - tag: Type
            value: User
```

### 2.3. Add Item Prototypes (Selected)

**Common Tags**: `Company: {{#VPN.SERVER}.regsub("(?:CLIENT_|S2S_)?(.*)", "\1")}`

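The `regsub` in the Common Tags line strips an optional `CLIENT_` or `S2S_` prefix from the server name and keeps the remainder as the company tag. A minimal Python sketch of the same pattern is shown here for illustration; the helper name `company_tag` and the sample server names are hypothetical, and Python's `re` module is only standing in for the regex engine Zabbix uses.

```python
import re

# Same pattern as in the tag definition: optional prefix, then capture the rest.
COMPANY_PATTERN = re.compile(r"(?:CLIENT_|S2S_)?(.*)")


def company_tag(server_name: str) -> str:
    """Return capture group 1, i.e. the server name without its type prefix."""
    match = COMPANY_PATTERN.match(server_name)
    # The pattern can match an empty string, so a match is always found.
    return match.group(1) if match else server_name


for name in ("S2S_ACME", "CLIENT_Initech", "Globex"):
    print(name, "->", company_tag(name))
```

Names without a recognized prefix pass through unchanged, so the tag is always populated.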
1. **Download Total (Raw)**
   - Key: `openvpn.user.bytes_received.total[{#VPN.USER}]`
   - Type: ZABBIX_ACTIVE
   - Units: B

2. **Download Rate (Dependent)**
   - Let Zabbix preprocessing compute the rate instead of building a calculated item.
   - **Decision**: Create `openvpn.user.bytes_received.rate[{#VPN.USER}]` as a dependent item of Download Total, with `Change per Second` preprocessing.

3. **Real IP**
   - Key: `openvpn.user.real_address.new[{#VPN.USER}]`
   - This is an item prototype, so it does not populate a Host Inventory field. Keep it as a plain value.

**Reporting Metrics (Calculated)**
- **Stats Week**: `trendsum(//openvpn.user.bytes_received.total[{#VPN.USER}], 1w:now/w)`
- **Stats Month**: `trendsum(..., 1M:now/M)`
- **Forecast**: `(last(Month) / dayat()) * dayofmonth(end)` (simplified pseudo-expression)

### 2.4. Add Triggers

The expressions below are shorthand: `Total`, `RealIP`, `Status`, and `ConnectedSince` stand for the corresponding item prototypes.

1. **Exfiltração**: `(last(Total) - last(Total, 1h)) > {$VPN.DATA.LIMIT}`
2. **IP Change**: `change(RealIP)<>0 and last(Status)=1`
3. **Zombie**: `last(ConnectedSince) < now() - {$VPN.ZOMBIE.LIMIT}`

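The three security conditions can be read as plain comparisons. The sketch below restates them in Python purely to make the arithmetic explicit; the function names are hypothetical, the default limits mirror the `{$VPN.DATA.LIMIT}` and `{$VPN.ZOMBIE.LIMIT}` macro values, and the real logic would live in Zabbix trigger expressions rather than in a script.

```python
DATA_LIMIT_BYTES = 10737418240   # {$VPN.DATA.LIMIT}: 10 * 1024**3 bytes
ZOMBIE_LIMIT_SECONDS = 86400     # {$VPN.ZOMBIE.LIMIT}: 24 hours


def is_exfiltration(total_now: int, total_1h_ago: int,
                    limit: int = DATA_LIMIT_BYTES) -> bool:
    """True when more than `limit` bytes were downloaded in the last hour."""
    return (total_now - total_1h_ago) > limit


def is_ip_change(previous_ip: str, current_ip: str, status: int) -> bool:
    """True when the real IP changed while the user is still connected."""
    return previous_ip != current_ip and status == 1


def is_zombie(connected_since: int, now: int,
              limit: int = ZOMBIE_LIMIT_SECONDS) -> bool:
    """True when the session started more than `limit` seconds ago."""
    return connected_since < now - limit


print(is_exfiltration(11 * 1024**3, 0))                  # one hour, 11 GiB
print(is_ip_change("203.0.113.10", "198.51.100.7", 1))   # IP moved mid-session
print(is_zombie(0, 86401))                               # just over 24 hours
```

One consequence worth noting: the byte counter restarts when a user reconnects, so the hourly delta can be negative and simply will not fire in that case.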
### 2.5. Add Dashboard
Insert a `dashboards:` block at the root of the template (or modify the existing one, if any).
- Use `type: svggraph` for trends.
- Use `type: tophosts` for tables (columns: name, latest value of the metric).

---

## Phase 3: Validation

1. **Lint**: Run `validate_zabbix_template.py`.
2. **Docs**: Run `generate_template_docs.py`.

@ -0,0 +1,110 @@
# 🚀 Agent Execution Plan: Zabbix Speedtest & Strategic Intelligence (v10)

**Status:** APPROVED
**Target Directory:** `c:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent`
**Objective:** Implement a robust, multi-WAN speedtest solution with Business Intelligence dashboards in the Gold Template.

---

## 🏗️ 1. File Structure & Scripts (Phase 1)

These files must be created in the `files/` subdirectory.

### A. `files/speedtest_discovery.py` (Python 3.11)
- **Goal:** Native pfSense discovery without external dependencies.
- **Logic:**
  1. Parse `/cf/conf/config.xml` using `xml.etree.ElementTree`.
  2. Identify interfaces that have a `<gateway>` tag (WANs).
  3. Extract `descr` (name), `if` (interface ID, such as `igb0`), and `ipaddr`.
- **Output (LLD JSON):**
  ```json
  [{"{#IFNAME}": "igb0", "{#WAN_ALIAS}": "VIVO", "{#WAN_IP}": "x.x.x.x"}]
  ```

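The discovery logic above could be sketched roughly as follows. This is an illustration under stated assumptions, not the final script: the function name `discover_wans` and the embedded sample XML are hypothetical, and the layout assumed here (interface entries as direct children of an `<interfaces>` element, each with `if`, `descr`, `ipaddr`, and optionally `gateway`) should be verified against the real `/cf/conf/config.xml`.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical, trimmed-down stand-in for /cf/conf/config.xml.
SAMPLE_XML = (
    "<pfsense><interfaces>"
    "<wan><if>igb0</if><descr>VIVO</descr>"
    "<ipaddr>203.0.113.2</ipaddr><gateway>WANGW</gateway></wan>"
    "<lan><if>igb1</if><descr>LAN</descr><ipaddr>192.168.1.1</ipaddr></lan>"
    "</interfaces></pfsense>"
)


def discover_wans(xml_text: str) -> List[Dict[str, str]]:
    """Return one LLD row per interface that declares a <gateway> tag."""
    root = ET.fromstring(xml_text)
    interfaces = root.find("interfaces")
    rows: List[Dict[str, str]] = []
    if interfaces is None:
        return rows
    for iface in interfaces:
        if iface.find("gateway") is None:
            continue  # no gateway: treated as a non-WAN interface
        rows.append({
            "{#IFNAME}": iface.findtext("if", default=""),
            # Fall back to the element name (e.g. "wan") if descr is missing.
            "{#WAN_ALIAS}": iface.findtext("descr", default=iface.tag),
            "{#WAN_IP}": iface.findtext("ipaddr", default=""),
        })
    return rows


print(json.dumps(discover_wans(SAMPLE_XML)))
```

In the real script the XML would be read from disk instead of a constant. A WAN that receives its address over DHCP may not carry a `<gateway>` tag or a literal IP in the config, which is a gap in the "has a gateway" rule that the plan does not address.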
### B. `files/speedtest_worker.sh` (Bash)
- **Goal:** Run the speedtest without hitting timeouts, and collect telemetry alongside it.
- **Arguments:** `--interface <IF>`
- **Logic:**
  1. Start a background CPU monitor (`top -P 0` or `vmstat 1`) to capture the `idle`/`interrupt` percentages.
  2. Execute `speedtest --ip <IP_FROM_IF> --format json` (assuming the Ookla CLI, where `--ip` binds to an address and `--interface` expects an interface name).
  3. Stop the CPU monitor and calculate the average CPU load during the test.
  4. Inject `cpu_load_during_test` and `interrupts` into the speedtest JSON.
  5. Send the result to Zabbix via `zabbix_sender` (trapper items).

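Steps 3 and 4 of the worker amount to averaging the CPU samples and merging two extra keys into the speedtest result. The plan calls for a shell script, so the Python sketch below only illustrates the data transformation; the function names are hypothetical, the samples are assumed to be percentages already parsed from the monitor output, and the speedtest payload is a placeholder rather than real CLI output.

```python
import json
from typing import List, Optional


def average(samples: List[float]) -> Optional[float]:
    """Arithmetic mean rounded to 2 decimals, or None when nothing was sampled."""
    if not samples:
        return None
    return round(sum(samples) / len(samples), 2)


def inject_telemetry(speedtest_json: str,
                     idle_samples: List[float],
                     interrupt_samples: List[float]) -> str:
    """Add the two telemetry keys from the plan to the speedtest JSON document."""
    result = json.loads(speedtest_json)
    idle_avg = average(idle_samples)
    # CPU load is taken as the complement of the average idle percentage.
    result["cpu_load_during_test"] = (
        None if idle_avg is None else round(100.0 - idle_avg, 2)
    )
    result["interrupts"] = average(interrupt_samples)
    return json.dumps(result)


# Placeholder payload; the real CLI output has many more fields.
merged = inject_telemetry('{"download": {"bandwidth": 1}}', [20.0, 40.0], [5.0, 15.0])
print(merged)
```

Keeping the original keys untouched means the dependent items in the template can still read the speedtest fields from the same master item.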
### C. `files/install_speedtest.sh`
- Helper script that installs the speedtest CLI with `pkg install -y speedtest`.

---

## 🧠 2. Template Intelligence (Phase 2)

Modify `template_pfsense_hybrid_gold.yaml` (monolithic approach).

### A. Discovery Rule
- **Key:** `speedtest.discovery`
- **Type:** Zabbix Agent (UserParameter calling the Python script).

### B. Item Prototypes (Trappers)
- **Master:** `speedtest.json[{#IFNAME}]`
- **Dependent Items:**
  - `speedtest.download`, `speedtest.upload`, `speedtest.latency`
  - `speedtest.cpu_load` (New)
  - `speedtest.bufferbloat` (Calculated: `speedtest.latency - icmppingsec`)

### C. Calculated Items (Business Logic)
1. **Forecasting (Saturation):**
   - `business.bandwidth.saturation_date`: Forecast when `net.if.in` reaches `{$WAN_CONTRACT_DOWN}`.
2. **Financial (Risk):**
   - `business.downtime.cost`: `(100 - sla.uptime) * {$COMPANY.HOURLY_COST}`.
3. **Efficiency:**
   - `business.link.efficiency`: `avg(net.if.in, 30d) / {$WAN_CONTRACT_DOWN} * 100`.

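A worked example may help check the calculated-item formulas before they are written as Zabbix expressions. The Python sketch below transcribes them literally; the function names and all input numbers are made up for illustration, and both values passed to `bufferbloat` are assumed to be in the same unit.

```python
def downtime_cost(sla_uptime_pct: float, hourly_cost: float) -> float:
    """business.downtime.cost: (100 - sla.uptime) * {$COMPANY.HOURLY_COST}."""
    return (100.0 - sla_uptime_pct) * hourly_cost


def link_efficiency(avg_in_bps: float, contract_down_bps: float) -> float:
    """business.link.efficiency: avg(net.if.in, 30d) / {$WAN_CONTRACT_DOWN} * 100."""
    if contract_down_bps <= 0:
        raise ValueError("contracted bandwidth must be positive")
    return avg_in_bps / contract_down_bps * 100.0


def bufferbloat(loaded_latency: float, idle_latency: float) -> float:
    """speedtest.bufferbloat: latency under load minus idle ping, same unit."""
    return loaded_latency - idle_latency


print(downtime_cost(99.5, 1000.0))             # 0.5 points of downtime
print(link_efficiency(250_000_000, 500_000_000))
print(bufferbloat(45.0, 12.0))
```

As written, `business.downtime.cost` multiplies percentage points of unavailability by an hourly cost; turning that into a monetary amount per period would also need the length of the SLA window, which the plan does not state. Similarly, `icmppingsec` reports seconds, so a unit conversion is likely needed if `speedtest.latency` is in milliseconds.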
### D. Triggers (Cross-Correlation)
1. **Hardware Bottleneck:**
   - `speedtest.download < Expected` AND `speedtest.cpu_load > 90%`.
   - Msg: "🔥 Hardware Limiting Speed (Not ISP)".
2. **ISP Quality (Bufferbloat):**
   - `speedtest.bufferbloat > 100ms`.
   - Msg: "⚠️ ISP Latency Spike under Load".
3. **Financial Opportunity:**
   - `business.downtime.cost > {$LINK.SECONDARY.COST}`.
   - Msg: "💰 Downtime costs exceed Secondary Link price."

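The cross-correlation idea is that a slow download only points at the firewall hardware when CPU load is high at the same time. A minimal Python sketch of the three conditions follows; the function name `evaluate_triggers`, the parameter names, and the returned labels are hypothetical, and the thresholds are the ones quoted in the list above.

```python
from typing import List


def evaluate_triggers(download: float, expected: float, cpu_load_pct: float,
                      bufferbloat_ms: float, downtime_cost: float,
                      secondary_link_cost: float) -> List[str]:
    """Return a label for each condition from the trigger list that holds."""
    fired: List[str] = []
    # Slow speed AND busy CPU: the firewall, not the ISP, is the likely limit.
    if download < expected and cpu_load_pct > 90:
        fired.append("hardware_bottleneck")
    # Latency rises by more than 100 ms while the link is loaded.
    if bufferbloat_ms > 100:
        fired.append("isp_bufferbloat")
    # Downtime already costs more than a secondary link would.
    if downtime_cost > secondary_link_cost:
        fired.append("financial_opportunity")
    return fired


print(evaluate_triggers(download=300, expected=500, cpu_load_pct=95,
                        bufferbloat_ms=20, downtime_cost=100,
                        secondary_link_cost=400))
```

A slow download with low CPU load fires nothing here, which leaves the ISP as the suspect by elimination.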
---

## 📊 3. Dashboards (Phase 3)

Create a dashboard resource in the YAML: "Speedtest & Strategic Planning".

**Widgets:**
1. **Executive Summary:** Markdown widget showing Financial Loss & Efficiency.
2. **Forecasting:** Graph showing Traffic Growth vs Contract Limit (Projected).
3. **Technical Correlation:** Graph with 3 Y-axes (Speed, CPU, Latency).

---

## 📝 4. Documentation (Phase 4)

Update `INSTRUCOES_AGENTE.txt`:
- Add "Step 5: Setup Speedtest".
- Add instructions for the `pkg install` and `cron` setup.

---

## ✅ 5. Validation & Auto-Documentation (MANDATORY)

Once the modifications are complete, the Agent **MUST** run the project's QA tools to ensure quality and compliance.

### A. Validate Template Structure
Run the validation script to check for UUID conflicts, schema errors, or duplicate keys.
```powershell
python c:\Users\joao.goncalves\Desktop\zabbix-itguys\validate_zabbix_template.py c:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\template_pfsense_hybrid_gold.yaml
```
- **Constraint:** If this script returns ANY error, the Agent must fix it immediately before finishing the task.

### B. Generate Documentation
Run the documentation generator to automatically update the markdown documentation for the template.
```powershell
python c:\Users\joao.goncalves\Desktop\zabbix-itguys\generate_template_docs.py c:\Users\joao.goncalves\Desktop\zabbix-itguys\templates_gold\pfsense_hybrid_snmp_agent\template_pfsense_hybrid_gold.yaml
```
- **Output:** This will generate/update `template_pfsense_hybrid_gold_generated.md`.