Architecture Documentation for Pterodisbot v1.1.0

2025-11-06 16:35:14 +00:00
commit 6f016cade4

# Architecture Overview
The bot follows a modular, async-first architecture designed for reliability, performance, and maintainability.
### System Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                      Discord Interface                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │    Slash     │   │    Button    │   │    Status    │     │
│  │   Commands   │   │ Interactions │   │    Embeds    │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
└─────────┼──────────────────┼──────────────────┼─────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
┌─────────────────────────────────────────────────────────────┐
│             PterodactylBot (Core Orchestrator)              │
│  ┌────────────────────────────────────────────────────┐     │
│  │  • Command routing and validation                  │     │
│  │  • Embed lifecycle management                      │     │
│  │  • State tracking and change detection             │     │
│  │  • Background task scheduling                      │     │
│  └────────────────────────────────────────────────────┘     │
└───────┬───────────────────────┬─────────────────┬───────────┘
        │                       │                 │
        ▼                       ▼                 ▼
 ┌──────────────┐     ┌──────────────────┐   ┌──────────────┐
 │ Pterodactyl  │     │ Metrics Manager  │   │   Storage    │
 │     API      │     │                  │   │    Layer     │
 │              │     │ • Data tracking  │   │              │
 │ • Client API │     │ • Graph gen.     │   │ • Embed loc. │
 │ • App API    │     │ • Trend analysis │   │ • State data │
 │ • Resource   │     │ • FIFO queues    │   │ • JSON pers. │
 │   monitoring │     │ • Multi-vCPU     │   │              │
 └──────────────┘     └──────────────────┘   └──────────────┘
        │                                           │
        ▼                                           ▼
┌─────────────────────────────────────────────────────────────┐
│               External Systems & Data Stores                │
│ ┌──────────────┐                           ┌──────────────┐ │
│ │ Pterodactyl  │                           │     File     │ │
│ │    Panel     │                           │    System    │ │
│ └──────────────┘                           └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Core Components
#### 1. **PterodactylBot** (`pterodisbot.py`)
Main orchestrator class that coordinates all bot operations.
**Responsibilities:**
- Discord.py bot initialization and lifecycle
- Slash command routing and validation
- Embed creation, tracking, and lifecycle management
- Background task scheduling and execution
- State management and change detection
- Error handling and recovery
**Key Methods:**
- `setup_hook()`: Initializes API clients and background tasks
- `get_server_status_embed()`: Generates Discord embeds with current server data
- `update_status()`: Background task for intelligent embed updates
- `refresh_all_embeds()`: Complete embed refresh across all channels
- `track_new_embed()`: Persists embed location data
**Design Patterns:**
- **Observer Pattern**: Monitors server state changes
- **Strategy Pattern**: Different update strategies based on state changes
- **Factory Pattern**: Creates embeds and views dynamically
#### 2. **PterodactylAPI** (`pterodisbot.py`)
Abstraction layer for all Pterodactyl Panel API interactions.
**Responsibilities:**
- HTTP request management with aiohttp
- API authentication (dual key support)
- Request/response serialization
- Error handling and retry logic
- Rate limit management
**API Coverage:**
- **Client API** (`ptlc_*`): Server resources, power actions
- **Application API** (`ptla_*`): Server lists, allocations, details
**Key Features:**
- Async/await throughout for non-blocking I/O
- Request locking to prevent race conditions
- Automatic retry with exponential backoff
- Comprehensive error response handling
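
The retry behaviour listed above can be illustrated with a minimal sketch. It is not the bot's actual implementation: the function name `retry_with_backoff`, its parameters, and the default delays are assumptions chosen for illustration, and the jitter is an addition that the text does not mention.

```python
import asyncio
import random


async def retry_with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Call an async operation, retrying failures with exponential backoff.

    `operation` is a zero-argument callable returning an awaitable.
    All names and default values are illustrative assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt (base, 2*base, 4*base, ...), capped at max_delay
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            # Small random jitter so several retries do not fire in lockstep
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
```

A caller would wrap a single request, for example `await retry_with_backoff(lambda: api.get_server_resources(server_id))`. Catching only transient error types instead of `Exception` would be a reasonable refinement.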
#### 3. **ServerMetricsGraphs** (`server_metrics_graphs.py`)
Time-series data tracking and visualization for individual servers.
**Responsibilities:**
- Historical data collection (1-minute sliding window)
- FIFO queue management (6 data points max)
- Graph generation using matplotlib
- Multi-vCPU scaling calculations
- Trend analysis and statistics
**Graph Features:**
- **CPU Graph**: Dynamic scaling for multi-vCPU servers
- **Memory Graph**: Adaptive Y-axis based on usage
- **Combined Graph**: Dual subplot for comprehensive view
- **Discord Optimized**: Dark theme matching Discord's interface
**Data Structure:**
```python
from collections import deque

# Each data point: (timestamp, cpu_percent, memory_mb)
deque([
    (datetime(...), 45.5, 1024.0),
    (datetime(...), 50.2, 1100.0),
    # ... up to 6 points
], maxlen=6)
```
#### 4. **ServerMetricsManager** (`server_metrics_graphs.py`)
Global coordinator for all server metrics tracking.
**Responsibilities:**
- Lifecycle management of ServerMetricsGraphs instances
- Server discovery and cleanup
- Bulk operations across all tracked servers
- Memory management
**Key Features:**
- Lazy initialization of graph instances
- Automatic cleanup of removed servers
- Summary statistics generation
- Thread-safe operations
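
A minimal sketch of lazy initialization and cleanup, under stated assumptions: `MetricsRegistry` and its method names are hypothetical stand-ins, not the real `ServerMetricsManager` interface, and per-server history is reduced to a plain deque.

```python
from collections import deque
from datetime import datetime


class MetricsRegistry:
    """Hypothetical stand-in for a metrics manager; not the bot's actual class."""

    def __init__(self, max_points=6):
        self.max_points = max_points
        self._history = {}  # server_id -> deque of (timestamp, cpu_percent, memory_mb)

    def add_data_point(self, server_id, cpu_percent, memory_mb):
        # Lazy initialization: history is created the first time a server is seen
        if server_id not in self._history:
            self._history[server_id] = deque(maxlen=self.max_points)
        self._history[server_id].append((datetime.now(), cpu_percent, memory_mb))

    def cleanup(self, active_server_ids):
        # Drop history for servers that are no longer reported as active
        stale = set(self._history) - set(active_server_ids)
        for server_id in stale:
            del self._history[server_id]
        return stale

    def summary(self):
        # Summary statistics across all tracked servers
        return {
            "tracked_servers": len(self._history),
            "total_points": sum(len(p) for p in self._history.values()),
        }
```

Calling `cleanup()` with the latest server list after each fetch keeps memory bounded by the number of live servers times the queue length.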
#### 5. **ServerStatusView** (`pterodisbot.py`)
Discord UI component providing interactive server controls.
**Responsibilities:**
- Button state management
- User authorization and validation
- Power action execution
- Connection info display
**Buttons:**
- 🟢 **Start**: Sends `start` signal to server
- 🔴 **Stop**: Sends `stop` signal to server
- 🔵 **Restart**: Sends `restart` signal to server
- 📍 **Show Address**: Displays IP and port
**Security:**
- Guild ID validation
- Role-based access control
- Interaction-level authorization checks
- Ephemeral responses for sensitive data
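
The guild and role checks can be sketched independently of Discord. Everything here is an assumption for illustration: `Caller` is a hypothetical stand-in for the data an interaction carries, and `is_authorized` and the role name in the usage note are not taken from the bot's code.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Caller:
    """Hypothetical summary of who triggered an interaction and where."""
    guild_id: Optional[int]  # None would represent a direct message
    role_names: Set[str] = field(default_factory=set)


def is_authorized(caller, allowed_guild_id, required_role):
    # Guild check first: reject interactions from any other guild or from DMs
    if caller.guild_id != allowed_guild_id:
        return False
    # Role check: the caller must hold the required role
    return required_role in caller.role_names
```

For example, `is_authorized(Caller(123, {"Game Server User"}), 123, "Game Server User")` passes both checks, while the same caller in a different guild is rejected before roles are examined.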
### Data Flow
#### Server Status Update Flow
```
1. Background Task (every 10s)
   ├─> Fetch all servers from Pterodactyl
   ├─> Update server cache
   └─> For each tracked embed:
       ├─> Get current resources
       ├─> Check state changes:
       │   ├─> Power state changed? → UPDATE
       │   ├─> CPU change >50%? → UPDATE
       │   ├─> First check? → UPDATE
       │   └─> 10min force update? → UPDATE
       ├─> If running: Collect metrics
       ├─> Generate embed + view
       ├─> Update Discord message
       └─> Update state tracking

2. Metrics Collection (running servers only)
   ├─> Extract CPU/memory data
   ├─> Add to ServerMetricsGraphs
   ├─> Check sufficient data (6 points)
   └─> Generate graph if available

3. State Tracking
   ├─> Store: (state, cpu_usage, last_force_update)
   ├─> Compare with previous state
   └─> Determine if update needed
```
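
The change-detection checks in step 1 can be sketched as a single decision function. This is a hedged illustration, not the bot's code: the name `should_update` and the returned reason strings are hypothetical, and the 50% threshold is interpreted here as an absolute difference in percentage points, which the flow does not specify.

```python
import time

CPU_DELTA_THRESHOLD = 50.0     # assumed to mean percentage points
FORCE_UPDATE_INTERVAL = 600.0  # seconds (10 minutes)


def should_update(previous, state, cpu_usage, now=None):
    """Decide whether an embed needs refreshing.

    `previous` is the stored (state, cpu_usage, last_force_update) tuple,
    or None when the server has not been checked before.
    """
    now = time.time() if now is None else now
    if previous is None:
        return True, "first check"
    prev_state, prev_cpu, last_force_update = previous
    if state != prev_state:
        return True, "power state changed"
    if abs(cpu_usage - prev_cpu) > CPU_DELTA_THRESHOLD:
        return True, "cpu change"
    if now - last_force_update >= FORCE_UPDATE_INTERVAL:
        return True, "force update"
    return False, "no change"
```

Passing `now` explicitly keeps the function deterministic, which makes the thresholds easy to exercise in tests.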
#### User Interaction Flow
```
1. User Types /server_status
   ├─> Guild validation
   ├─> Fetch all servers
   ├─> Generate statistics
   ├─> Create dropdown menu
   └─> Send ephemeral response

2. User Selects Server
   ├─> Validate selection
   ├─> Check for existing embed
   ├─> Delete old embed if exists
   ├─> Create new status embed
   ├─> Track embed location
   └─> Persist to JSON

3. User Clicks Power Button
   ├─> Interaction check
   │   ├─> Verify guild ID
   │   └─> Check user role
   ├─> Send power action to API
   ├─> Await confirmation
   └─> Send ephemeral response
```
### Configuration System
The bot uses a multi-layer configuration system:
1. Environment variables (Docker/container environments)
2. `config.ini` file (traditional deployments)
3. Validation layer (startup checks)
4. Runtime constants
**Validation Checks:**
- Required sections and keys present
- API key prefix validation (`ptlc_` and `ptla_`)
- Guild ID format (valid integer)
- URL format (includes protocol)
- Raises `ConfigValidationError` on failure
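
The validation checks can be sketched with the standard `configparser` module. Only the exception name `ConfigValidationError` and the key prefixes come from the text above; the section names, key names, and the function `validate_config` are assumptions made for illustration.

```python
import configparser


class ConfigValidationError(Exception):
    """Raised when the configuration fails a startup check."""


# Hypothetical layout: the real section and key names may differ
REQUIRED = {
    "Pterodactyl": ["PanelURL", "ClientAPIKey", "ApplicationAPIKey"],
    "Discord": ["Token", "AllowedGuildID"],
}


def validate_config(config):
    # 1. Required sections and keys must be present and non-empty
    missing = [
        f"{section}.{key}"
        for section, keys in REQUIRED.items()
        for key in keys
        if not config.get(section, key, fallback="").strip()
    ]
    if missing:
        raise ConfigValidationError("missing: " + ", ".join(missing))

    errors = []
    ptero = config["Pterodactyl"]
    # 2. API key prefixes identify the key type
    if not ptero["ClientAPIKey"].startswith("ptlc_"):
        errors.append("client key must start with ptlc_")
    if not ptero["ApplicationAPIKey"].startswith("ptla_"):
        errors.append("application key must start with ptla_")
    # 3. The panel URL must include a protocol
    if not ptero["PanelURL"].startswith(("http://", "https://")):
        errors.append("panel URL must include a protocol")
    # 4. The guild ID must parse as an integer
    try:
        int(config["Discord"]["AllowedGuildID"])
    except ValueError:
        errors.append("guild ID must be an integer")
    if errors:
        raise ConfigValidationError("; ".join(errors))
```

Collecting every failure before raising lets an operator fix a broken configuration in one pass rather than one error per restart.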
### Persistent Storage
#### Embed Locations (`embed/embed_locations.json`)
```json
{
  "server_abc123": {
    "channel_id": "123456789",
    "message_id": "987654321"
  }
}
```
**Operations:**
- **Load**: On bot startup
- **Save**: After each embed creation or deletion
- **Cleanup**: Automatic on missing messages
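
A minimal sketch of the load and save operations, assuming the JSON layout shown above. The function names are hypothetical, and the write-then-rename step is an added safeguard rather than something the source describes.

```python
import json
import os
import tempfile


def load_embed_locations(path):
    """Return the stored mapping, or an empty dict if the file is missing or corrupt."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            data = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def save_embed_locations(path, locations):
    """Write the mapping to disk, replacing any previous file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file and rename it, so a crash mid-write
    # cannot leave a truncated JSON file behind
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(locations, fh, indent=2)
    os.replace(tmp_path, path)
```

Cleanup of a missing message then amounts to deleting the server's key from the dictionary and calling `save_embed_locations` again.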
#### Server State Tracking (in-memory)
```python
{
    "server_abc123": (
        "running",     # Current state
        45.5,          # Last CPU usage
        1699123456.0   # Last force update timestamp
    )
}
```
**Purpose:**
- Change detection for smart updates
- Reduces unnecessary Discord API calls
- Tracks force update intervals
### Async Architecture
The bot is built on asyncio for concurrent operations:
```python
import asyncio

# Concurrent API requests (inside a coroutine)
results = await asyncio.gather(
    api.get_server_resources(id1),
    api.get_server_resources(id2),
    api.get_server_resources(id3),
)
# Process results: one entry per call, in the order the calls were passed
```
**Benefits:**
- Non-blocking I/O operations
- Concurrent server status checks
- Responsive to Discord interactions
- Efficient resource utilization
**Synchronization:**
- `asyncio.Lock()` for API rate limiting
- `asyncio.Lock()` for embed update cycles
- No blocking operations in event loop
### Error Handling Strategy
**Levels:**
1. **Graceful Degradation**: Continue operating with reduced functionality
2. **Automatic Retry**: Retry failed operations with backoff
3. **User Notification**: Inform users of transient errors
4. **Logging**: Comprehensive error logging for debugging
5. **Crash Prevention**: Catch and handle all exceptions
**Example:**
```python
try:
    resources = await api.get_server_resources(server_id)
except Exception as e:
    logger.error(f"Failed to fetch resources: {e}")
    # Use cached data or display offline state
    return {'attributes': {'current_state': 'offline'}}
```
### Performance Optimizations
#### 1. Smart Update Logic
Embeds are updated only when one of the following applies:
- Power state changes (always)
- Significant CPU changes (>50% delta)
- Force update interval (every 10 minutes)
- Initial server checks
**Impact**: ~90% reduction in Discord API calls
#### 2. FIFO Metric Queues
`collections.deque(maxlen=6)` automatically rotates old data:
- O(1) append operations
- Automatic memory management
- No manual cleanup needed
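
The rotation behaviour can be seen in a short sketch. The sample values are made up; the 10-second spacing follows the background task interval described earlier, and the tuple layout matches the data structure shown for `ServerMetricsGraphs`.

```python
from collections import deque
from datetime import datetime, timedelta

history = deque(maxlen=6)
start = datetime(2025, 1, 1, 12, 0, 0)

# Append eight samples; the deque silently discards the two oldest
for i in range(8):
    timestamp = start + timedelta(seconds=10 * i)
    history.append((timestamp, 10.0 * i, 1000.0 + i))

# A graph is only worth drawing once the queue is full
has_enough_data = len(history) == history.maxlen
oldest_cpu = history[0][1]   # sample from i == 2
newest_cpu = history[-1][1]  # sample from i == 7
```

Because eviction happens inside `append`, the collection code never needs a separate trimming step.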
#### 3. Request Locking
Prevents concurrent API access:
```python
async with self.lock:
    # Only one request at a time
    response = await self.session.request(...)
```
**Benefits**:
- Avoids rate limits
- Prevents race conditions
- Ensures ordered operations
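
The effect of the lock can be demonstrated with a small self-contained sketch. `SerializedClient` is a hypothetical class, and `asyncio.sleep` stands in for the real HTTP call; the counters exist only to show that requests never overlap.

```python
import asyncio


class SerializedClient:
    """Hypothetical client that allows one in-flight request at a time."""

    def __init__(self):
        self.lock = asyncio.Lock()
        self.in_flight = 0
        self.max_in_flight = 0

    async def request(self, name):
        async with self.lock:
            self.in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self.in_flight)
            await asyncio.sleep(0.01)  # stands in for the real HTTP call
            self.in_flight -= 1
            return name


async def main():
    client = SerializedClient()
    # Three requests are started together, but the lock runs them one by one
    results = await asyncio.gather(
        *(client.request(f"server-{i}") for i in range(3))
    )
    return results, client.max_in_flight
```

The trade-off is throughput: with a single lock, total latency grows linearly with the number of servers, so a semaphore with a small limit is a common alternative when the API tolerates some parallelism.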
#### 4. Caching Strategy
- Server list cached between updates
- State tracking prevents redundant checks
- Embed locations loaded once at startup