- Implement `Systant.HaDiscovery` module for automatic device registration
- Add comprehensive sensor discovery: CPU, memory, GPU, disk, network, temperature
- Update MQTT client to publish discovery messages on startup
- Add HomeAssistant configuration section to systant.toml
- Create example configuration file with localhost MQTT broker
- Update CLAUDE.md with complete HA integration documentation
- Add mosquitto to development dependencies for testing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Common Commands

### Development
```bash
# Install dependencies
mix deps.get

# Compile the project
mix compile

# Run in development (non-halt mode)
mix run --no-halt

# Run tests
mix test

# Run specific test
mix test test/systant_test.exs

# Enter development shell (via Nix)
nix develop

# Run both server and dashboard together (recommended)
just dev
# or directly: hivemind

# Run components individually
just server     # or: cd server && mix run --no-halt
just dashboard  # or: cd dashboard && mix phx.server

# Other just commands
just deps       # Install dependencies for both projects
just compile    # Compile both projects
just test       # Run tests for both projects
just clean      # Clean both projects
```

### Production
```bash
# Build production release
MIX_ENV=prod mix release

# Run production release
_build/prod/rel/systant/bin/systant start
```

## Architecture Overview

This is an Elixir OTP application that serves as a systemd daemon for MQTT-based system monitoring, designed for deployment across multiple NixOS hosts to integrate with Home Assistant.

### Core Components
- **Systant.Application** (`server/lib/systant/application.ex`): OTP application supervisor that starts the MQTT client
- **Systant.MqttClient** (`server/lib/systant/mqtt_client.ex`): GenServer handling MQTT connection, metrics publishing, and command subscriptions
- **Systant.MqttHandler** (`server/lib/systant/mqtt_handler.ex`): Custom Tortoise handler for processing command messages with security validation
- **Systant.CommandExecutor** (`server/lib/systant/command_executor.ex`): Secure command execution engine with whitelist validation and audit logging
- **Systant.SystemMetrics** (`server/lib/systant/system_metrics.ex`): Comprehensive Linux system metrics collection with configuration support
- **Systant.Config** (`server/lib/systant/config.ex`): TOML-based configuration loader with environment variable overrides
- **Dashboard.Application** (`dashboard/lib/dashboard/application.ex`): Phoenix LiveView dashboard application
- **Dashboard.MqttSubscriber** (`dashboard/lib/dashboard/mqtt_subscriber.ex`): Real-time MQTT subscriber that feeds data to the LiveView dashboard

### Key Libraries
- **Tortoise**: MQTT client library for pub/sub functionality
- **Jason**: JSON encoding/decoding for message payloads
- **Toml**: TOML configuration file parsing
- **Phoenix LiveView**: Real-time dashboard framework

### MQTT Behavior
- Publishes comprehensive system metrics (CPU, memory, disk, GPU, network, temperature, processes) to the stats topic
- Subscribes to the commands topic for incoming events that can trigger user-customizable actions
- Uses a hostname-based randomized client ID to avoid conflicts across multiple hosts
- Configurable startup delay (default 5 seconds) before the first metrics publish
- Real-time metrics collection with configurable intervals

### Configuration System
Systant uses a TOML-based configuration system with environment variable overrides:

- **Config File**: `systant.toml` (current dir, `~/.config/systant/`, or `/etc/systant/`)
- **Module Control**: Enable/disable metric collection modules (cpu, memory, disk, gpu, network, temperature, processes, system)
- **Filtering Options**: Configurable filtering for disks, network interfaces, processes
- **Environment Overrides**: `MQTT_HOST`, `MQTT_PORT`, `SYSTANT_INTERVAL`, `SYSTANT_LOG_LEVEL`

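The lookup and override behavior described above could be sketched roughly as follows. This is a minimal illustration, not the actual `Systant.Config` code: the module name `ConfigSketch`, the search precedence (assumed to follow the order listed), and the config keys `[:mqtt, :host]`, `[:mqtt, :port]`, and `[:general, :interval]` are all assumptions. `SYSTANT_LOG_LEVEL` is left out for brevity.

```elixir
defmodule ConfigSketch do
  @moduledoc "Hypothetical sketch of config lookup and env overrides; not the real Systant.Config."

  # Candidate locations, in the order listed above (the precedence is an assumption).
  def candidate_paths do
    [
      "systant.toml",
      Path.expand("~/.config/systant/systant.toml"),
      "/etc/systant/systant.toml"
    ]
  end

  # Return the first path that exists, or nil so the caller can fall back to defaults.
  def find_config(paths \\ candidate_paths()) do
    Enum.find(paths, &File.exists?/1)
  end

  # Overlay environment variables on an already-parsed config map.
  # `env` defaults to the real environment but can be injected for testing.
  # The :mqtt and :general sections must already exist in `config`.
  def apply_env(config, env \\ System.get_env()) do
    config
    |> override(env, "MQTT_HOST", [:mqtt, :host], & &1)
    |> override(env, "MQTT_PORT", [:mqtt, :port], &String.to_integer/1)
    |> override(env, "SYSTANT_INTERVAL", [:general, :interval], &String.to_integer/1)
  end

  # Replace the value at `path` only when the variable is set.
  defp override(config, env, var, path, cast) do
    case Map.fetch(env, var) do
      {:ok, value} -> put_in(config, path, cast.(value))
      :error -> config
    end
  end
end
```

Passing the environment as an argument keeps the override logic testable without touching real environment variables.
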
#### Key Configuration Sections
- `[general]`: Collection intervals, enabled modules
- `[mqtt]`: Broker settings, client ID prefix, credentials
- `[commands]`: Command execution settings, security options
- `[[commands.available]]`: User-defined command definitions with security parameters
- `[disk]`: Mount filtering, filesystem exclusions
- `[gpu]`: NVIDIA/AMD GPU limits and settings
- `[network]`: Interface filtering, traffic thresholds
- `[processes]`: Top process limits, sorting options
- `[temperature]`: CPU/sensor temperature monitoring

### Default Configuration
- **MQTT Host**: `mqtt.home` (configurable via `MQTT_HOST`)
- **Stats Topic**: `systant/${hostname}/stats` (per-host topics)
- **Command Topic**: `systant/${hostname}/commands` (per-host topics)
- **Response Topic**: `systant/${hostname}/responses` (command responses)
- **Publish Interval**: 30 seconds (configurable via `SYSTANT_INTERVAL`)
- **Command System**: Enabled by default with example commands (restart, info, df, ps, ping)

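As an illustration of the per-host topic scheme and the randomized client ID, here is a small sketch. The module name `TopicSketch`, the `systant` client ID prefix, and the suffix format are assumptions made for this example; the real `Systant.MqttClient` may construct these differently.

```elixir
defmodule TopicSketch do
  @moduledoc "Hypothetical helpers; the real modules may build these differently."

  # Local hostname as a string (:inet.gethostname/0 returns a charlist).
  def hostname do
    {:ok, name} = :inet.gethostname()
    List.to_string(name)
  end

  # Per-host topics, following the defaults listed above.
  def stats_topic(host), do: "systant/#{host}/stats"
  def command_topic(host), do: "systant/#{host}/commands"
  def response_topic(host), do: "systant/#{host}/responses"

  # Hostname plus a random hex suffix, so two instances are unlikely to share a client ID.
  # The prefix and the suffix range are assumptions.
  def client_id(host, prefix \\ "systant") do
    suffix = :rand.uniform(0xFFFFFFFF) |> Integer.to_string(16) |> String.downcase()
    "#{prefix}-#{host}-#{suffix}"
  end
end
```

For example, `TopicSketch.stats_topic(TopicSketch.hostname())` yields the stats topic for the local machine.
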
### NixOS Deployment
This project includes a complete Nix packaging and NixOS module:

- **Package**: `nix/package.nix` - Builds the Elixir release using `beamPackages.mixRelease`
- **Module**: `nix/nixos-module.nix` - Provides `services.systant` configuration options
- **Development**: Use `nix develop` for development shell with Elixir/Erlang

The NixOS module supports:
- Configurable MQTT connection settings
- Per-host topic naming using `${config.networking.hostName}`
- Environment variable configuration for runtime settings
- Systemd service with security hardening
- Auto-restart and logging to systemd journal

## Dashboard

The project includes a Phoenix LiveView dashboard (`dashboard/`) that provides real-time monitoring of all systant instances.

### Dashboard Features
- Real-time host status updates via MQTT subscription
- LiveView interface showing all connected hosts
- Automatic reconnection and error handling

### Dashboard MQTT Configuration
- Subscribes to `systant/+/stats` to receive updates from all hosts
- Uses hostname-based client ID: `systant-dashboard-${hostname}` to avoid conflicts
- Connects to `mqtt.home:1883` (same broker as systant instances)

### Important Implementation Notes
- **Tortoise Handler**: The `handle_message/3` callback must return `{:ok, state}`, not `[]`
- **Topic Parsing**: Topics may arrive as lists or strings; handle both formats
- **Client ID Conflicts**: Use unique client IDs to prevent connection instability

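The first two notes can be illustrated with a dependency-free sketch. A real handler would add `use Tortoise.Handler`; that line is omitted so the snippet runs on its own. The module name `HandlerSketch` and the state shape (a map of hostname to last raw payload) are assumptions, not the actual `Dashboard.MqttSubscriber` code.

```elixir
defmodule HandlerSketch do
  @moduledoc """
  Hypothetical sketch. A real handler would add `use Tortoise.Handler`;
  it is left out here so the snippet runs without dependencies.
  """

  # Accept a topic either as a list of levels or as a "/"-separated string.
  def normalize_topic(topic) when is_list(topic), do: topic
  def normalize_topic(topic) when is_binary(topic), do: String.split(topic, "/")

  # Same shape as the handle_message/3 callback: every clause returns {:ok, state}.
  # State is assumed to be a map of hostname => last raw payload.
  def handle_message(topic, payload, state) do
    case normalize_topic(topic) do
      ["systant", host, "stats"] -> {:ok, Map.put(state, host, payload)}
      _other -> {:ok, state}
    end
  end
end
```

Matching on the normalized topic also extracts the hostname from the `systant/+/stats` wildcard subscription.
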
## Development Roadmap

### Phase 1: System Metrics Collection (Completed)
- ✅ **SystemMetrics Module**: `server/lib/systant/system_metrics.ex` - Comprehensive metrics collection
- ✅ **CPU Metrics**: Load averages (1/5/15min) via `/proc/loadavg`
- ✅ **Memory Metrics**: System memory data via `/proc/meminfo` with usage percentages
- ✅ **Disk Metrics**: Disk usage and capacity via `df` command with configurable filtering
- ✅ **GPU Metrics**: NVIDIA (nvidia-smi) and AMD (rocm-smi) GPU monitoring with temperature, utilization, memory
- ✅ **Network Metrics**: Interface statistics via `/proc/net/dev` with traffic filtering
- ✅ **Temperature Metrics**: CPU temperature and lm-sensors data via system files and `sensors` command
- ✅ **Process Metrics**: Top processes by CPU/memory via `ps` command with configurable limits
- ✅ **System Info**: Uptime via `/proc/uptime`, kernel version, OS info, Erlang runtime data
- ✅ **MQTT Integration**: Real metrics published with configurable intervals replacing simple messages
- ✅ **Configuration System**: Complete TOML-based configuration with environment overrides
- ✅ **Dashboard Integration**: Phoenix LiveView dashboard with real-time graphical metrics display

#### Implementation Details
- Uses native Linux system commands and the `/proc` filesystem rather than Erlang's `os_mon`, for accuracy
- Configuration-driven metric collection with per-module enable/disable capabilities
- Advanced filtering: disk mounts/types, network interfaces, process thresholds
- Graceful error handling with fallbacks when commands/files are unavailable
- JSON payload structure: `{timestamp, hostname, cpu, memory, disk, gpu, network, temperature, processes, system}`
- Dashboard displays metrics as progress bars and cards with color-coded status indicators
- TOML configuration with environment variable overrides for deployment flexibility

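To make the `/proc`-based approach and the graceful fallback concrete, here is a minimal sketch of reading load averages. The module name `LoadavgSketch` and the key names `load_1`, `load_5`, and `load_15` are assumptions; the actual `Systant.SystemMetrics` field names may differ.

```elixir
defmodule LoadavgSketch do
  @moduledoc "Hypothetical sketch of /proc/loadavg parsing; not the real Systant.SystemMetrics."

  # Parse the first three whitespace-separated fields of /proc/loadavg content.
  def parse(content) do
    with [a, b, c | _] <- String.split(content),
         {one, ""} <- Float.parse(a),
         {five, ""} <- Float.parse(b),
         {fifteen, ""} <- Float.parse(c) do
      {:ok, %{load_1: one, load_5: five, load_15: fifteen}}
    else
      _ -> {:error, :unparseable}
    end
  end

  # Read and parse; a missing or unreadable file becomes an error tuple, not a crash.
  def read(path \\ "/proc/loadavg") do
    case File.read(path) do
      {:ok, content} -> parse(content)
      {:error, reason} -> {:error, reason}
    end
  end
end
```

A caller can then omit the `cpu` section or substitute a placeholder when `read/1` returns an error, which is one way to keep publishing on hosts where a data source is missing.
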
### Phase 2: Command System (Completed)
- ✅ **Command Execution**: `server/lib/systant/command_executor.ex` - Secure command processing with whitelist validation
- ✅ **MQTT Handler**: `server/lib/systant/mqtt_handler.ex` - Custom Tortoise handler for command message processing
- ✅ **User Configuration**: Commands fully configurable via `systant.toml` with security parameters
- ✅ **MQTT Integration**: Commands via `systant/{hostname}/commands`, responses via `systant/{hostname}/responses`
- ✅ **Security Features**: Whitelist-only execution, parameter validation, timeouts, comprehensive logging
- ✅ **Built-in Commands**: `list` command shows all available user-defined commands

#### Command System Features
- **User-Configurable Commands**: Define custom commands in `systant.toml` with triggers, allowed parameters, timeouts
- **Enterprise Security**: No arbitrary shell execution, strict parameter validation, execution timeouts
- **Simple Interface**: Send `{"command":"trigger","params":[...]}`, receive structured JSON responses
- **Request Tracking**: Auto-generated request IDs for command/response correlation
- **Comprehensive Logging**: Full audit trail of all command executions with timing and results

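Whitelist validation of an incoming request might look roughly like the sketch below. The module name `WhitelistSketch` and the definition fields `:trigger`, `:command`, and `:allowed_params` are assumptions for illustration and do not reflect the actual `systant.toml` schema or the `Systant.CommandExecutor` implementation.

```elixir
defmodule WhitelistSketch do
  @moduledoc "Hypothetical sketch of whitelist validation; not the real Systant.CommandExecutor."

  # `commands` is a list of definitions; the field names are assumptions.
  # `request` is the decoded JSON message, e.g. %{"command" => "df", "params" => ["/home"]}.
  def validate(commands, %{"command" => trigger} = request) do
    params = Map.get(request, "params", [])

    case Enum.find(commands, &(&1.trigger == trigger)) do
      nil ->
        {:error, :unknown_command}

      %{command: executable, allowed_params: allowed} ->
        # Every parameter must appear verbatim in the allowed list.
        if is_list(params) and Enum.all?(params, &(&1 in allowed)) do
          {:ok, {executable, params}}
        else
          {:error, :invalid_params}
        end
    end
  end

  # Anything without a "command" key is rejected outright.
  def validate(_commands, _request), do: {:error, :malformed_request}
end
```

A validated `{executable, params}` pair can be handed to `System.cmd/3`, which runs the executable with an argument list rather than through a shell, so parameters are never interpreted as shell syntax.
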
#### Example Command Usage
```bash
# Send commands via MQTT
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"list"}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"info"}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"df","params":["/home"]}'
mosquitto_pub -t "systant/hostname/commands" -m '{"command":"restart","params":["nginx"]}'

# Listen for responses
mosquitto_sub -t "systant/+/responses"
```

### Phase 3: Home Assistant Integration (Completed)
- ✅ **MQTT Auto-Discovery**: `server/lib/systant/ha_discovery.ex` - Publishes HA discovery configurations for automatic device registration
- ✅ **Device Registration**: Creates unified "Systant {hostname}" device in Home Assistant with comprehensive sensor suite
- ✅ **Sensor Auto-Discovery**: CPU load averages, memory usage, system uptime, temperatures, GPU metrics, disk usage, network stats
- ✅ **Configuration Integration**: TOML-based enable/disable with `homeassistant.discovery_enabled` setting
- ✅ **Value Templates**: Proper JSON path extraction for nested metrics data with error handling
- ✅ **Real-time Updates**: Seamless integration with existing MQTT stats publishing - no additional topics needed

#### Home Assistant Integration Features
- **Automatic Discovery**: No custom integration required - uses standard MQTT discovery protocol
- **Device Grouping**: All sensors grouped under single "Systant {hostname}" device for clean organization
- **Comprehensive Metrics**: CPU, memory, disk, GPU (NVIDIA/AMD), network, temperature, and system sensors
- **Configuration Control**: Enable/disable discovery via `systant.toml` configuration
- **Template Flexibility**: Advanced Jinja2 templates handle optional/missing data gracefully
- **Topic Structure**: Discovery configurations are published under the `homeassistant/` prefix; stats remain on `systant/{hostname}/stats`

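For orientation, a single sensor's discovery message could be assembled along these lines. The payload keys (`name`, `unique_id`, `state_topic`, `value_template`, `unit_of_measurement`, `device`) are standard Home Assistant MQTT discovery fields. The module name `DiscoverySketch`, the `systant_{hostname}` node and device identifiers, and the JSON path in the usage example are assumptions, not the actual `Systant.HaDiscovery` output.

```elixir
defmodule DiscoverySketch do
  @moduledoc "Hypothetical sketch of a discovery message; not the real Systant.HaDiscovery."

  # Discovery topic layout: <prefix>/<component>/<node_id>/<object_id>/config
  # The node ID format is an assumption.
  def config_topic(host, object_id) do
    "homeassistant/sensor/systant_#{host}/#{object_id}/config"
  end

  # Build the config payload as a map; the caller would JSON-encode it before publishing.
  def sensor_config(host, object_id, name, value_template, unit) do
    %{
      "name" => name,
      "unique_id" => "systant_#{host}_#{object_id}",
      # Sensors read from the existing stats topic, so no extra state topics are needed.
      "state_topic" => "systant/#{host}/stats",
      "value_template" => value_template,
      "unit_of_measurement" => unit,
      # A shared device block groups every sensor under one "Systant {hostname}" device.
      "device" => %{
        "identifiers" => ["systant_#{host}"],
        "name" => "Systant #{host}"
      }
    }
  end
end
```

A call such as `DiscoverySketch.sensor_config("nas", "memory_usage", "Memory Usage", "{{ value_json.memory.used_percent | default(0) }}", "%")` shows how a template with a `default` filter can tolerate a missing field; the `memory.used_percent` path is illustrative only.
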
#### Setup Instructions
1. **Configure MQTT Discovery**: Set `homeassistant.discovery_enabled = true` in `systant.toml`
2. **Start Systant**: Discovery messages are published automatically on startup (1s after the MQTT connection is established)
3. **Check Home Assistant**: The device and its sensors appear automatically in the MQTT integration
4. **Verify Metrics**: All sensors should show current values within 30 seconds (the default publish interval)

#### Available Sensors
- **CPU**: Load averages (1m, 5m, 15m), temperature
- **Memory**: Usage percentage, used/total in GB
- **Disk**: Root and home filesystem usage percentages
- **GPU**: NVIDIA/AMD utilization, temperature, memory usage
- **Network**: RX/TX bytes for primary interface
- **System**: Uptime in hours, kernel version, online status

### Future Plans
- Multi-host deployment for comprehensive system monitoring
- Advanced alerting and threshold monitoring
- Historical data retention and trending