Architecture Documentation for Pterodisbot v1.1.0

2025-11-06 16:35:14 +00:00
commit 6f016cade4
1 changed files with 345 additions and 0 deletions
--- a/Architecture.md
+++ b/Architecture.md
@@ -0,0 +1,345 @@
 # Architecture Overview
 The bot follows a modular, async-first architecture designed for reliability, performance, and maintainability.
 ### System Architecture
 ```
 ┌─────────────────────────────────────────────────────────────┐
 │                     Discord Interface                       │
 │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
 │  │ Slash        │  │ Button       │  │ Status       │       │
 │  │ Commands     │  │ Interactions │  │ Embeds       │       │
 │  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
 └─────────┼──────────────────┼──────────────────┼─────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
 ┌─────────────────────────────────────────────────────────────┐
 │              PterodactylBot (Core Orchestrator)             │
 │  ┌────────────────────────────────────────────────────┐     │
 │  │  • Command routing and validation                  │     │
 │  │  • Embed lifecycle management                      │     │
 │  │  • State tracking and change detection             │     │
 │  │  • Background task scheduling                      │     │
 │  └────────────────────────────────────────────────────┘     │
 └───────┬───────────────────────┬─────────────────┬───────────┘
        │                       │                 │
        ▼                       ▼                 ▼
 ┌──────────────┐    ┌──────────────────┐    ┌──────────────┐
 │ Pterodactyl  │    │ Metrics Manager  │    │   Storage    │
 │   API        │    │                  │    │   Layer      │
 │              │    │ • Data tracking  │    │              │
 │ • Client API │    │ • Graph gen.     │    │ • Embed loc. │
 │ • App API    │    │ • Trend analysis │    │ • State data │
 │ • Resource   │    │ • FIFO queues    │    │ • JSON pers. │
 │   monitoring │    │ • Multi-vCPU     │    │              │
 └──────────────┘    └──────────────────┘    └──────────────┘
        │                       │
        ▼                       ▼
 ┌─────────────────────────────────────────────────────────────┐
 │              External Systems & Data Stores                 │
 │  ┌──────────────┐         ┌──────────────┐                  │
 │  │ Pterodactyl  │         │   File       │                  │
 │  │   Panel      │         │   System     │                  │
 │  └──────────────┘         └──────────────┘                  │
 └─────────────────────────────────────────────────────────────┘
 ```
 ### Core Components
 #### 1. **PterodactylBot** (`pterodisbot.py`)
 Main orchestrator class that coordinates all bot operations.
 **Responsibilities:**
 - discord.py bot initialization and lifecycle
 - Slash command routing and validation
 - Embed creation, tracking, and lifecycle management
 - Background task scheduling and execution
 - State management and change detection
 - Error handling and recovery
 **Key Methods:**
 - `setup_hook()`: Initializes API clients and background tasks
 - `get_server_status_embed()`: Generates Discord embeds with current server data
 - `update_status()`: Background task for intelligent embed updates
 - `refresh_all_embeds()`: Complete embed refresh across all channels
 - `track_new_embed()`: Persists embed location data
 **Design Patterns:**
 - **Observer Pattern**: Monitors server state changes
 - **Strategy Pattern**: Different update strategies based on state changes
 - **Factory Pattern**: Creates embeds and views dynamically
 #### 2. **PterodactylAPI** (`pterodisbot.py`)
 Abstraction layer for all Pterodactyl Panel API interactions.
 **Responsibilities:**
 - HTTP request management with aiohttp
 - API authentication (dual key support)
 - Request/response serialization
 - Error handling and retry logic
 - Rate limit management
 **API Coverage:**
 - **Client API** (`ptlc_*`): Server resources, power actions
 - **Application API** (`ptla_*`): Server lists, allocations, details
 **Key Features:**
 - Async/await throughout for non-blocking I/O
 - Request locking to prevent race conditions
 - Automatic retry with exponential backoff
 - Comprehensive error response handling
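
The retry behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `pterodisbot.py`: the function name `retry_with_backoff`, its parameters, and the default delays are all assumptions, and a real client would likely catch a narrower exception type than `Exception`.

```python
import asyncio
import random


async def retry_with_backoff(make_request, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Call make_request() until it succeeds or max_attempts is reached.

    All names and defaults here are hypothetical, not taken from pterodisbot.py.
    """
    for attempt in range(max_attempts):
        try:
            return await make_request()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller
            # Delay doubles each attempt (base, 2*base, 4*base, ...), capped at max_delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            # Small random jitter so simultaneous retries do not line up
            await asyncio.sleep(delay + random.uniform(0, delay * 0.1))
```

`make_request` is any zero-argument callable that returns a fresh coroutine on each call, so every attempt issues a new request instead of awaiting a coroutine that has already finished.
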
 #### 3. **ServerMetricsGraphs** (`server_metrics_graphs.py`)
 Time-series data tracking and visualization for individual servers.
 **Responsibilities:**
 - Historical data collection (1-minute sliding window)
 - FIFO queue management (6 data points max)
 - Graph generation using matplotlib
 - Multi-vCPU scaling calculations
 - Trend analysis and statistics
 **Graph Features:**
 - **CPU Graph**: Dynamic scaling for multi-vCPU servers
 - **Memory Graph**: Adaptive Y-axis based on usage
 - **Combined Graph**: Dual subplot for comprehensive view
 - **Discord Optimized**: Dark theme matching Discord's interface
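
One plausible way to scale the CPU graph for multi-vCPU servers is shown below. It assumes CPU usage is reported with 100% per vCPU (so a 2-vCPU server can reach 200%); the helper name `cpu_axis_limit` and the rounding rule are illustrative assumptions, not the logic in `server_metrics_graphs.py`.

```python
import math


def cpu_axis_limit(cpu_samples, per_vcpu=100.0):
    """Return a Y-axis upper bound rounded up to a whole number of vCPUs."""
    peak = max(cpu_samples, default=0.0)
    # At least one vCPU's worth of axis, even when the server is idle
    vcpus = max(1, math.ceil(peak / per_vcpu))
    return vcpus * per_vcpu
```

With this rule, samples peaking at 50.2% give a limit of 100.0, while samples peaking at 180.5% give 200.0.
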
 **Data Structure:**
 ```python
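 from collections import deque
 from datetime import datetime
 # Each data point: (timestamp, cpu_percent, memory_mb); values are illustrative
 deque([
    (datetime(2025, 11, 6, 16, 35, 0), 45.5, 1024.0),
    (datetime(2025, 11, 6, 16, 35, 10), 50.2, 1100.0),
    # ... up to 6 points
 ], maxlen=6)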
 ```
 #### 4. **ServerMetricsManager** (`server_metrics_graphs.py`)
 Global coordinator for all server metrics tracking.
 **Responsibilities:**
 - Lifecycle management of ServerMetricsGraphs instances
 - Server discovery and cleanup
 - Bulk operations across all tracked servers
 - Memory management
 **Key Features:**
 - Lazy initialization of graph instances
 - Automatic cleanup of removed servers
 - Summary statistics generation
 - Thread-safe operations
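
Lazy initialization and cleanup of removed servers could look something like the sketch below. The class `MetricsManagerSketch` and its method names are hypothetical stand-ins, and each server's history is reduced to a plain `deque` where the real manager holds `ServerMetricsGraphs` instances.

```python
from collections import deque


class MetricsManagerSketch:
    """Hypothetical stand-in for ServerMetricsManager; names are illustrative."""

    def __init__(self, max_points=6):
        self.max_points = max_points
        self._history = {}  # server_id -> deque of (cpu_percent, memory_mb)

    def add_point(self, server_id, cpu_percent, memory_mb):
        # Lazy initialization: the deque is created on first use for this server
        history = self._history.get(server_id)
        if history is None:
            history = self._history[server_id] = deque(maxlen=self.max_points)
        history.append((cpu_percent, memory_mb))

    def cleanup(self, active_server_ids):
        """Drop history for servers that no longer exist; return the removed IDs."""
        removed = [sid for sid in self._history if sid not in active_server_ids]
        for sid in removed:
            del self._history[sid]
        return removed

    def summary(self):
        # Number of stored data points per tracked server
        return {sid: len(points) for sid, points in self._history.items()}
```
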
 #### 5. **ServerStatusView** (`pterodisbot.py`)
 Discord UI component providing interactive server controls.
 **Responsibilities:**
 - Button state management
 - User authorization and validation
 - Power action execution
 - Connection info display
 **Buttons:**
 - 🟢 **Start**: Sends `start` signal to server
 - 🔴 **Stop**: Sends `stop` signal to server
 - 🔵 **Restart**: Sends `restart` signal to server
 - 📍 **Show Address**: Displays IP and port
 **Security:**
 - Guild ID validation
 - Role-based access control
 - Interaction-level authorization checks
 - Ephemeral responses for sensitive data
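
The guild and role checks can be expressed as a small pure function, which keeps them easy to test outside Discord. In this sketch `InteractionInfo` is a simplified, hypothetical stand-in for the interaction object, and the role name used in the usage line is only an example.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionInfo:
    """Hypothetical, simplified stand-in for the data a Discord interaction carries."""
    guild_id: int
    role_names: set = field(default_factory=set)


def is_authorized(info, allowed_guild_id, required_role):
    # Reject interactions from any guild other than the configured one
    if info.guild_id != allowed_guild_id:
        return False
    # Require the configured role before allowing power actions
    return required_role in info.role_names
```

For example, `is_authorized(InteractionInfo(1, {"Game Server User"}), 1, "Game Server User")` returns `True`, while the same call with a different guild ID or without the role returns `False`.
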
 ### Data Flow
 #### Server Status Update Flow
 ```
 1. Background Task (every 10s)
   ├─> Fetch all servers from Pterodactyl
   ├─> Update server cache
   └─> For each tracked embed:
       ├─> Get current resources
       ├─> Check state changes:
       │   ├─> Power state changed? → UPDATE
       │   ├─> CPU change >50%? → UPDATE
       │   ├─> First check? → UPDATE
       │   └─> 10min force update? → UPDATE
       ├─> If running: Collect metrics
       ├─> Generate embed + view
       ├─> Update Discord message
       └─> Update state tracking
 2. Metrics Collection (running servers only)
   ├─> Extract CPU/memory data
   ├─> Add to ServerMetricsGraphs
   ├─> Check sufficient data (6 points)
   └─> Generate graph if available
 3. State Tracking
   ├─> Store: (state, cpu_usage, last_force_update)
   ├─> Compare with previous state
   └─> Determine if update needed
 ```
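
The update checks in the flow above reduce to one decision function. The sketch below is an assumption about how they might be combined: the name `should_update` is hypothetical, and the CPU threshold is treated as an absolute difference in percentage points, which the flow does not state explicitly.

```python
import time

CPU_DELTA_THRESHOLD = 50.0       # percentage points, from the flow above
FORCE_UPDATE_INTERVAL = 10 * 60  # seconds, from the flow above


def should_update(previous, state, cpu, now=None):
    """Decide whether an embed needs refreshing.

    previous is None on the first check, otherwise a
    (state, cpu_usage, last_force_update) tuple as stored in state tracking.
    """
    now = time.time() if now is None else now
    if previous is None:
        return True  # First check
    prev_state, prev_cpu, last_force = previous
    if state != prev_state:
        return True  # Power state changed
    if abs(cpu - prev_cpu) > CPU_DELTA_THRESHOLD:
        return True  # Significant CPU change
    return now - last_force >= FORCE_UPDATE_INTERVAL  # Periodic forced refresh
```

Passing `now` explicitly makes the force-update branch testable without waiting ten minutes.
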
 #### User Interaction Flow
 ```
 1. User Types /server_status
   ├─> Guild validation
   ├─> Fetch all servers
   ├─> Generate statistics
   ├─> Create dropdown menu
   └─> Send ephemeral response
 2. User Selects Server
   ├─> Validate selection
   ├─> Check for existing embed
   ├─> Delete old embed if exists
   ├─> Create new status embed
   ├─> Track embed location
   └─> Persist to JSON
 3. User Clicks Power Button
   ├─> Interaction check
   │   ├─> Verify guild ID
   │   └─> Check user role
   ├─> Send power action to API
   ├─> Await confirmation
   └─> Send ephemeral response
 ```
 ### Configuration System
 The bot uses a multi-layer configuration system:
 ```
 1. Environment Variables (Docker/container environments)
   ↓
 2. config.ini file (traditional deployments)
   ↓
 3. Validation layer (startup checks)
   ↓
 4. Runtime constants
 ```
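
A lookup that prefers environment variables over `config.ini` might be written as below. The `SECTION_KEY` naming scheme for environment variables, the helper name `get_setting`, and the section and key names in the usage note are all assumptions made for illustration.

```python
import configparser
import os


def get_setting(parser, section, key, env=os.environ):
    """Return the value for section/key, preferring an environment variable.

    The SECTION_KEY naming scheme for environment variables is an assumption.
    """
    env_name = f"{section}_{key}".upper()
    if env_name in env:
        return env[env_name]
    # Fall back to the INI file; fallback=None avoids raising on a missing key
    return parser.get(section, key, fallback=None)
```

Given a parser loaded from `config.ini`, `get_setting(parser, "Pterodactyl", "PanelURL")` would return the value of `PTERODACTYL_PANELURL` if that variable is set, and the file's value otherwise.
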
 **Validation Checks:**
 - Required sections and keys present
 - API key prefix validation (`ptlc_` and `ptla_`)
 - Guild ID format (valid integer)
 - URL format (includes protocol)
 - Raises `ConfigValidationError` on failure
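
The validation checks listed above could be implemented along these lines. The function signature and error messages are hypothetical, and `ConfigValidationError` is redefined locally only so the sketch runs on its own.

```python
class ConfigValidationError(Exception):
    """Raised when a configuration value fails a startup check."""


def validate_config(client_key, app_key, guild_id, panel_url):
    # Parameter names are hypothetical; the checks mirror the list above
    errors = []
    if not client_key.startswith("ptlc_"):
        errors.append("client API key must start with 'ptlc_'")
    if not app_key.startswith("ptla_"):
        errors.append("application API key must start with 'ptla_'")
    if not guild_id.isdigit():
        errors.append("guild ID must be an integer")
    if not panel_url.startswith(("http://", "https://")):
        errors.append("panel URL must include http:// or https://")
    if errors:
        # Report every problem at once so one restart is enough to fix them all
        raise ConfigValidationError("; ".join(errors))
```
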
 ### Persistent Storage
 #### Embed Locations (`embed/embed_locations.json`)
 ```json
 {
  "server_abc123": {
    "channel_id": "123456789",
    "message_id": "987654321"
  }
 }
 ```
 **Operations:**
 - **Load**: On bot startup
 - **Save**: After each embed create/delete
 - **Cleanup**: Automatic on missing messages
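
Loading and saving the embed location file can be sketched as follows. The function names are hypothetical, and writing through a temporary file is a design choice of this sketch, not something the original document describes.

```python
import json
from pathlib import Path


def load_embed_locations(path):
    """Return the stored mapping, or an empty dict if the file is missing or corrupt."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_embed_locations(path, locations):
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first, then swap it in, so a crash mid-write
    # cannot leave a truncated JSON file behind
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(locations, indent=2), encoding="utf-8")
    tmp.replace(path)
```
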
 #### Server State Tracking (in-memory)
 ```python
 {
  "server_abc123": (
    "running",           # Current state
    45.5,               # Last CPU usage
    1699123456.0        # Last force update timestamp
  )
 }
 ```
 **Purpose:**
 - Change detection for smart updates
 - Reduces unnecessary Discord API calls
 - Tracks force update intervals
 ### Async Architecture
 The bot is built on asyncio for concurrent operations:
 ```python
 # Concurrent API requests (inside a coroutine)
 results = await asyncio.gather(
    api.get_server_resources(id1),
    api.get_server_resources(id2),
    api.get_server_resources(id3),
 )
 # Process results; they are returned in the same order as the calls
 ```
 **Benefits:**
 - Non-blocking I/O operations
 - Concurrent server status checks
 - Responsive to Discord interactions
 - Efficient resource utilization
 **Synchronization:**
 - `asyncio.Lock()` for API rate limiting
 - `asyncio.Lock()` for embed update cycles
 - No blocking operations in event loop
 ### Error Handling Strategy
 **Levels:**
 1. **Graceful Degradation**: Continue operating with reduced functionality
 2. **Automatic Retry**: Retry failed operations with backoff
 3. **User Notification**: Inform users of transient errors
 4. **Logging**: Comprehensive error logging for debugging
 5. **Crash Prevention**: Catch and handle all exceptions
 **Example:**
 ```python
 try:
    resources = await api.get_server_resources(server_id)
 except Exception as e:
    logger.error(f"Failed to fetch resources: {e}")
    # Use cached data or display offline state
    return {'attributes': {'current_state': 'offline'}}
 ```
 ### Performance Optimizations
 #### 1. Smart Update Logic
 Only updates embeds when necessary:
 - Power state changes (always)
 - Significant CPU changes (>50% delta)
 - Force update interval (every 10 minutes)
 - Initial server checks
 **Impact**: ~90% reduction in Discord API calls
 #### 2. FIFO Metric Queues
 `collections.deque(maxlen=6)` automatically rotates old data:
 - O(1) append operations
 - Automatic memory management
 - No manual cleanup needed
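
The rotation behaviour is easy to see in isolation. The sample values below are made up for illustration; only `deque(maxlen=6)` comes from the text above.

```python
from collections import deque

history = deque(maxlen=6)

# Simulate eight samples taken 10 seconds apart: (seconds_elapsed, cpu_percent)
for i in range(8):
    history.append((i * 10, 40.0 + i))

# Only the six most recent samples remain; the two oldest were dropped
oldest, newest = history[0], history[-1]
```

After the loop, `oldest` is `(20, 42.0)` and `newest` is `(70, 47.0)`: the samples at 0 and 10 seconds were discarded automatically.
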
 #### 3. Request Locking
 Prevents concurrent API access:
 ```python
 async with self.lock:
    # Only one request at a time
    response = await self.session.request(...)
 ```
 **Benefits**:
 - Avoids rate limits
 - Prevents race conditions
 - Ensures ordered operations
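
The effect of the lock can be checked with a small self-contained experiment. The function `demo_serialized_requests` and the fake request inside it are hypothetical; `asyncio.sleep` stands in for the HTTP round trip.

```python
import asyncio


async def demo_serialized_requests(n=3):
    """Run n fake requests through one lock and report the peak overlap."""
    lock = asyncio.Lock()
    in_flight = 0
    peak = 0

    async def fake_request():
        nonlocal in_flight, peak
        async with lock:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # Stand-in for the HTTP round trip
            in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(n)))
    return peak
```

`asyncio.run(demo_serialized_requests())` returns `1`: although three requests are scheduled concurrently, the lock lets only one proceed at a time. The trade-off is throughput, since requests to different servers also wait on each other.
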
 #### 4. Caching Strategy
 - Server list cached between updates
 - State tracking prevents redundant checks
 - Embed locations loaded once at startup