docs: Rewrite README with comprehensive feature documentation

Complete overhaul of README.md with better structure and clarity.

New sections:
- Clear overview of what GuardDen is (and isn't)
- Feature comparison table with costs
- Detailed feature descriptions
- Prerequisites table
- Step-by-step Discord bot setup
- Configuration options explained
- Detection flow diagram
- Cost controls breakdown
- Troubleshooting guide
- Project structure
- Development guide

Improvements:
- Professional formatting with tables
- Clear cost transparency
- Better quick start instructions
- Comprehensive configuration guide
- Troubleshooting section for common issues
2026-01-27 20:19:49 +01:00
parent 562c92dae6
commit b9bc2cf0b5
1 changed files with 333 additions and 193 deletions
--- a/README.md
+++ b/README.md
@@ -1,158 +1,231 @@
 # GuardDen
-A lightweight, cost-conscious Discord moderation bot focused on essential protection. Built for self-hosting with minimal resource usage and AI costs.
+A lightweight, cost-conscious Discord moderation bot focused on automated protection against spam and NSFW content. Built for self-hosting with minimal resource usage and predictable AI costs.
 ## Overview
 GuardDen is a minimal Discord bot designed for small to medium servers (1-2 guilds) that need automated moderation without the complexity of full-featured moderation systems. It focuses on two core areas:
 1. **Spam Detection** - Automatic rate limiting, duplicate detection, and mass mention protection
 2. **NSFW Content Filtering** - AI-powered image analysis with aggressive cost controls
 **What GuardDen is NOT:**
 - Not a full moderation suite (no manual mod commands, logging, or strike systems)
 - Not a verification/captcha system
 - Not a chat moderation bot (no text analysis, banned words, or scam detection)
 **Target Users:**
 - Small community servers that need automated spam + NSFW protection
 - Budget-conscious server owners (~$5-25/month AI costs)
 - Self-hosters who want a simple, maintainable bot
 ---
 ## Features
-### Spam Detection
+| Feature | Description | Cost |
- **Anti-Spam** - Rate limiting, duplicate detection, mass mention protection
+|---------|-------------|------|
- **Automatic Actions** - Message deletion and user timeout for spam violations
+| **Spam Detection** | Rate limiting, duplicate messages, mass mentions | Free |
 | **NSFW Image Detection** | AI-powered analysis of images/GIFs using Claude or GPT | ~$5-25/month |
 | **User Blocklist** | Block ALL media from specific users instantly | Free |
 | **NSFW Domain Blocking** | Instant blocking of known NSFW video domains | Free |
 | **Cost Controls** | Rate limits, deduplication, file size limits | Built-in |
 | **Single Config File** | One YAML file for all settings | Easy |
 | **Owner Commands** | Status, reload, ping | Free |
-### AI-Powered NSFW Image Detection
+### Spam Detection
- **Smart Image Analysis** - AI-powered detection of inappropriate images using Claude or GPT
+
- **Cost Controls** - Conservative rate limits (25 checks/hour/guild by default)
+Automatically detects and deletes spam messages based on:
- **Embed Support** - Optional checking of Discord GIF embeds
+- **Message Rate Limiting**: Max 5 messages per 5 seconds (configurable)
- **NSFW Video Domain Blocking** - Block known NSFW video domains
+- **Duplicate Detection**: Flags repeated identical messages
- **Configurable Sensitivity** - Adjust strictness (0-100)
+- **Mass Mentions**: Limits @mentions per message and per time window
 - **Actions**: Deletes message, no notifications to user
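
The message rate limit described above can be pictured as a sliding window kept per user. The following is a minimal sketch under stated assumptions, not GuardDen's actual implementation: `SlidingWindowLimiter` and its `hit` method are hypothetical names, and the defaults simply mirror `message_rate_limit: 5` and `message_rate_window: 5`. Timestamps are passed in explicitly so the example is deterministic.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: flags a user who sends more than `limit` messages per `window` seconds."""

    def __init__(self, limit: int = 5, window: float = 5.0) -> None:
        self.limit = limit
        self.window = window
        # One queue of message timestamps per user ID.
        self._events = defaultdict(deque)

    def hit(self, user_id: int, now: float) -> bool:
        """Record one message at time `now`; return True if the user is over the limit."""
        events = self._events[user_id]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        events.append(now)
        return len(events) > self.limit


limiter = SlidingWindowLimiter(limit=5, window=5.0)
# Seven messages sent half a second apart by the same user.
results = [limiter.hit(user_id=1, now=t * 0.5) for t in range(7)]
```

With these inputs the first five messages pass and the sixth and seventh are flagged. Once the window has elapsed, the same user can post again without being flagged.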
 ### NSFW Image Detection
 AI-powered analysis of images and GIFs with strict cost controls:
 - **Supported Providers**: Anthropic Claude, OpenAI GPT
 - **Content Types**: Image attachments, Discord GIF embeds (optional)
 - **NSFW Categories**: Suggestive, Partial Nudity, Nudity, Explicit
 - **Filtering Mode**: NSFW-only by default (only blocks sexual content)
 - **Cost Controls**:
  - 25 AI checks/hour/guild (default)
  - 5 AI checks/hour/user (default)
  - Image deduplication (tracks 1000 recent messages)
  - File size limit (skip > 3MB)
  - Max images per message (2 by default)
 - **Actions**: Deletes message, no notifications to user
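
Three of the cost controls listed above (deduplication, the file size limit, and the per-message image cap) can be sketched as plain filtering steps that run before any AI call. This is an illustrative sketch only: `Attachment`, `RecentMessages`, and `select_for_analysis` are hypothetical names, and the defaults are taken from the values quoted in the list.

```python
from collections import OrderedDict
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class Attachment:
    """Hypothetical stand-in for an image attachment."""
    url: str
    size_bytes: int


class RecentMessages:
    """Remembers the last `capacity` message IDs so they are not analyzed twice."""

    def __init__(self, capacity: int = 1000) -> None:
        self.capacity = capacity
        self._seen = OrderedDict()

    def seen_before(self, message_id: int) -> bool:
        if message_id in self._seen:
            return True
        self._seen[message_id] = None
        # Evict the oldest entry once the cache is full.
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)
        return False


def select_for_analysis(attachments, max_images: int = 2, max_size_mb: float = 3.0):
    """Skip files over the size limit, then keep at most `max_images` of the rest."""
    small_enough = [a for a in attachments if a.size_bytes <= max_size_mb * MB]
    return small_enough[:max_images]


attachments = [
    Attachment("a.png", 1 * MB),
    Attachment("b.png", 4 * MB),  # over the 3 MB limit, skipped
    Attachment("c.png", 2 * MB),
    Attachment("d.png", 1 * MB),  # dropped by the 2-image cap
]
picked = select_for_analysis(attachments)
```

Here `picked` contains `a.png` and `c.png`, so only two images would be sent for analysis.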
 ### User Blocklist
 Instantly delete ALL media from specific users:
 - **Blocks**: Images, GIFs, embeds, URLs
 - **No AI Cost**: Instant deletion without analysis
 - **Use Case**: Known problematic users, spam accounts
 ### NSFW Domain Blocking
 Pre-configured list of known NSFW video domains:
 - Blocks: pornhub.com, xvideos.com, xnxx.com, etc.
 - **No AI Cost**: Pattern matching only
 - **Instant**: Deletes message immediately
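
Domain blocking of this kind needs no AI call; it only has to extract hostnames from the message text and compare them against the list. The sketch below is one possible approach, not necessarily how GuardDen does it: `find_blocked_domain` is a hypothetical function, and `blocked.example` stands in for an entry from `nsfw_video_domains`. Matching on hostname suffixes, rather than on substrings, avoids false positives such as `notblocked.example`.

```python
import re
from typing import Optional
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def find_blocked_domain(content: str, blocked: set) -> Optional[str]:
    """Return the first blocked domain linked in `content`, or None."""
    for url in URL_RE.findall(content):
        try:
            host = (urlparse(url).hostname or "").lower()
        except ValueError:
            # Malformed URL (for example a broken IPv6 literal); ignore it.
            continue
        parts = host.split(".")
        # Check the host and each parent domain: www.blocked.example -> blocked.example
        for i in range(len(parts) - 1):
            candidate = ".".join(parts[i:])
            if candidate in blocked:
                return candidate
    return None


blocked = {"blocked.example"}
hit = find_blocked_domain("look at https://WWW.Blocked.Example/video/1", blocked)
miss = find_blocked_domain("see https://notblocked.example/page", blocked)
```

In this example `hit` is `"blocked.example"` and `miss` is `None`.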
 ---
 ## Quick Start
 ### Prerequisites
 - Python 3.11+
 - PostgreSQL 15+
 - Discord Bot Token (see setup below)
 - (Optional) Anthropic or OpenAI API key for AI features
-### Discord Bot Setup
+| Requirement | Version | Purpose |
 |-------------|---------|---------|
 | Python | 3.11+ | Bot runtime |
 | PostgreSQL | 15+ | Database |
 | Discord Bot Token | - | Bot authentication |
 | AI API Key | (Optional) | Claude or OpenAI for NSFW detection |
-1. Go to the [Discord Developer Portal](https://discord.com/developers/applications)
+### 1. Discord Bot Setup
 2. Click **New Application** and give it a name (e.g., "GuardDen")
 3. Go to the **Bot** tab and click **Add Bot**
-4. **Configure Bot Settings:**
+1. **Create Application**
-   - Disable **Public Bot** if you only want yourself to add it
+   - Go to [Discord Developer Portal](https://discord.com/developers/applications)
-   - Copy the **Token** (click "Reset Token") - this is your `GUARDDEN_DISCORD_TOKEN`
+   - Click **New Application** → Name it (e.g., "GuardDen")
   - Go to **Bot** tab → **Add Bot**
-5. **Enable Privileged Gateway Intents** (required):
+2. **Get Bot Token**
-   - **Message Content Intent** - for reading messages (spam detection, image checking)
+   - Click **Reset Token** → Copy the token
   - Save as `GUARDDEN_DISCORD_TOKEN` in `.env`
-6. **Generate Invite URL** - Go to **OAuth2** > **URL Generator**:
+3. **Enable Intents**
-   
+   - Enable **Message Content Intent** (required for reading messages)
   **Scopes:**
   - `bot`
   **Bot Permissions:**
   - Moderate Members (timeout)
   - View Channels
   - Send Messages
   - Manage Messages
   - Read Message History
   Or use permission integer: `275415089216`
-7. Use the generated URL to invite the bot to your server
+4. **Generate Invite URL**
   - Go to **OAuth2** → **URL Generator**
   - Select scopes: `bot`
   - Select permissions:
     - Moderate Members (timeout)
     - View Channels
     - Send Messages
     - Manage Messages
     - Read Message History
   - Or use permission integer: `275415089216`
   - Copy generated URL and invite to your server
-### Docker Deployment (Recommended)
+### 2. Installation
-1. Clone the repository:
+**Option A: Docker (Recommended)**
   ```bash
   git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
   cd guardden
   ```
-2. Create your configuration files:
+```bash
-   ```bash
+# Clone repository
-   # Environment variables
+git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
-   cp .env.example .env
+cd GuardDen
   # Edit .env and add your Discord token
   # Bot configuration
   cp config.example.yml config.yml
   # Edit config.yml with your settings
   ```
-3. Start with Docker Compose:
+# Create configuration files
-   ```bash
+cp .env.example .env
-   docker compose up -d
+cp config.example.yml config.yml
   ```
-### Local Development
+# Edit .env - Add your Discord token
 nano .env
-1. Create a virtual environment:
+# Edit config.yml - Configure settings
-   ```bash
+nano config.yml
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```
-2. Install dependencies:
+# Start with Docker Compose
-   ```bash
+docker compose up -d
   pip install -e ".[dev,ai]"
   ```
-3. Set up configuration:
+# View logs
-   ```bash
+docker logs guardden-bot -f
-   # Environment variables
+```
   cp .env.example .env
   # Edit .env with your Discord token
   # Bot configuration
   cp config.example.yml config.yml
   # Edit config.yml with your settings
   ```
-4. Start PostgreSQL (or use Docker):
+**Option B: Local Development**
   ```bash
   docker compose up db -d
   ```
-5. Run the bot:
+```bash
-   ```bash
+# Clone repository
-   python -m guardden
+git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
-   ```
+cd GuardDen
 # Create virtual environment
 python -m venv venv
 source venv/bin/activate  # Windows: venv\Scripts\activate
 # Install dependencies
 pip install -e ".[dev,ai]"
 # Create configuration files
 cp .env.example .env
 cp config.example.yml config.yml
 # Edit configuration
 nano .env
 nano config.yml
 # Start PostgreSQL (or use Docker)
 docker compose up db -d
 # Run database migrations
 alembic upgrade head
 # Start bot
 python -m guardden
 ```
 ---
 ## Configuration
-GuardDen uses a **single YAML configuration file** (`config.yml`) for managing all bot settings across all guilds.
+### Environment Variables (`.env`)
-### Configuration File (`config.yml`)
+| Variable | Required | Description | Default |
 |----------|----------|-------------|---------|
 | `GUARDDEN_DISCORD_TOKEN` | ✅ | Discord bot token | - |
 | `GUARDDEN_DATABASE_URL` | No | PostgreSQL connection URL | `postgresql://guardden:guardden@localhost:5432/guardden` |
 | `GUARDDEN_LOG_LEVEL` | No | Logging level (DEBUG/INFO/WARNING/ERROR) | `INFO` |
 | `GUARDDEN_AI_PROVIDER` | No | AI provider (`anthropic`/`openai`/`none`) | `none` |
 | `GUARDDEN_ANTHROPIC_API_KEY` | No* | Anthropic API key | - |
 | `GUARDDEN_OPENAI_API_KEY` | No* | OpenAI API key | - |
-Create a `config.yml` file in your project root:
+*Required if `GUARDDEN_AI_PROVIDER` is set to the matching provider (`anthropic` or `openai`)
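
The dependency between the provider setting and the API keys in the table above can be checked before the bot starts. The following is a hedged sketch rather than GuardDen's own startup validation: `check_env` and `KEY_FOR_PROVIDER` are hypothetical names, and only the variable names themselves come from the table.

```python
from typing import List, Mapping

# Which API key each provider needs (variable names taken from the table above).
KEY_FOR_PROVIDER = {
    "anthropic": "GUARDDEN_ANTHROPIC_API_KEY",
    "openai": "GUARDDEN_OPENAI_API_KEY",
}


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the env looks usable."""
    problems = []
    if not env.get("GUARDDEN_DISCORD_TOKEN"):
        problems.append("GUARDDEN_DISCORD_TOKEN is required")

    provider = env.get("GUARDDEN_AI_PROVIDER", "none").lower()
    if provider == "none":
        return problems
    if provider not in KEY_FOR_PROVIDER:
        problems.append(f"unknown GUARDDEN_AI_PROVIDER: {provider!r}")
    elif not env.get(KEY_FOR_PROVIDER[provider]):
        key = KEY_FOR_PROVIDER[provider]
        problems.append(f"{key} is required when GUARDDEN_AI_PROVIDER={provider}")
    return problems


ok = check_env({"GUARDDEN_DISCORD_TOKEN": "token"})
missing_key = check_env(
    {"GUARDDEN_DISCORD_TOKEN": "token", "GUARDDEN_AI_PROVIDER": "anthropic"}
)
```

Passing `os.environ` to `check_env` would run the same checks against the real environment.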
 ### Bot Configuration (`config.yml`)
 ```yaml
 # Bot Settings
 bot:
  prefix: "!"
  owner_ids:
-    - 123456789012345678  # Your Discord user ID
+    - 123456789012345678  # Your Discord user ID (for owner commands)
-# Spam detection settings
+# Spam Detection
 automod:
  enabled: true
  anti_spam_enabled: true
  message_rate_limit: 5           # Max messages per window
  message_rate_window: 5          # Window in seconds
-  duplicate_threshold: 3          # Duplicates to trigger
+  duplicate_threshold: 3          # Duplicate messages to trigger
  mention_limit: 5                # Max mentions per message
  mention_rate_limit: 10          # Max mentions per window
-  mention_rate_window: 60         # Window in seconds
+  mention_rate_window: 60         # Mention window in seconds
-# AI moderation settings
+# AI Moderation (NSFW Detection)
 ai_moderation:
  enabled: true
  sensitivity: 80                  # 0-100 (higher = stricter)
  nsfw_only_filtering: true        # Only filter sexual content
-  max_checks_per_hour_per_guild: 25  # Cost control
+  
-  max_checks_per_user_per_hour: 5    # Cost control
+  # Cost Controls
-  max_images_per_message: 2          # Analyze max 2 images/msg
+  max_checks_per_hour_per_guild: 25  # Conservative limit
-  max_image_size_mb: 3               # Skip images > 3MB
+  max_checks_per_user_per_hour: 5    # Prevent abuse
-  check_embed_images: true           # Check Discord GIF embeds
+  max_images_per_message: 2          # Analyze max 2 images
  max_image_size_mb: 3               # Skip large files
  # Feature Toggles
  check_embed_images: true           # Check Discord GIFs
  check_video_thumbnails: false      # Skip video thumbnails
-  url_image_check_enabled: false     # Skip URL image downloads
+  url_image_check_enabled: false     # Skip URL downloads
-# User blocklist (blocks ALL media from specific users)
+# User Blocklist (instant deletion)
 blocked_user_ids:
  - 123456789012345678  # Discord user ID to block
-# Known NSFW video domains (auto-block)
+# NSFW Domain Blocklist (instant blocking)
 nsfw_video_domains:
  - pornhub.com
  - xvideos.com
@@ -161,64 +234,100 @@ nsfw_video_domains:
  - youporn.com
 ```
-### Key Configuration Options
+### Configuration Options Explained
 **AI Moderation (NSFW Image Detection):**
 - `sensitivity`: 0-100 scale (higher = stricter detection)
 - `nsfw_only_filtering`: Only flag sexual content (violence/harassment allowed)
 - `max_checks_per_hour_per_guild`: Cost control - limits AI API calls
 - `check_embed_images`: Whether to analyze Discord GIF embeds
 **Spam Detection:**
- `message_rate_limit`: Max messages allowed per window
+- `message_rate_limit`: How many messages are allowed in the time window
- `duplicate_threshold`: How many duplicate messages trigger action
+- `duplicate_threshold`: How many identical messages trigger spam detection
 - `mention_limit`: Max @mentions allowed per message
 **AI Moderation:**
 - `sensitivity`: Detection strictness (80 = balanced, 100 = very strict, 50 = lenient)
 - `nsfw_only_filtering`: `true` = only block sexual content (default), `false` = block all inappropriate content
 - `max_checks_per_hour_per_guild`: Hard limit on AI API calls per guild (cost control)
 - `max_checks_per_user_per_hour`: Per-user limit to prevent spam/abuse
 **User Blocklist:**
- `blocked_user_ids`: List of Discord user IDs to block
+- Add Discord user IDs to instantly delete ALL their media
- Automatically deletes ALL images, GIFs, embeds, and URLs from these users
+- No AI cost - instant pattern matching
- No AI cost - instant deletion
+- Useful for repeat offenders or spam bots
 - Useful for known problematic users or spam accounts
-**Cost Controls:**
+**Cost Estimation:**
-The bot includes multiple layers of cost control:
+- Small server (< 100 users): ~$5-10/month
- Rate limiting (25 AI checks/hour/guild, 5/hour/user by default)
+- Medium server (100-500 users): ~$15-25/month
- Image deduplication (tracks last 1000 analyzed messages)
+- Large server (500+ users): costs are capped by the default rate limits; raise the limits for more coverage or disable embed checking to save
 - File size limits (skip images > 3MB)
 - Max images per message (analyze max 2 images)
 - Optional embed checking (disable to save costs)
-### Environment Variables
+---
 | Variable | Description | Default |
 |----------|-------------|---------|
 | `GUARDDEN_DISCORD_TOKEN` | Your Discord bot token | **Required** |
 | `GUARDDEN_DATABASE_URL` | PostgreSQL connection URL | `postgresql://guardden:guardden@localhost:5432/guardden` |
 | `GUARDDEN_LOG_LEVEL` | Logging level | `INFO` |
 | `GUARDDEN_AI_PROVIDER` | AI provider (anthropic/openai/none) | `none` |
 | `GUARDDEN_ANTHROPIC_API_KEY` | Anthropic API key (if using Claude) | - |
 | `GUARDDEN_OPENAI_API_KEY` | OpenAI API key (if using GPT) | - |
 ## Owner Commands
 GuardDen includes a minimal set of owner-only commands for bot management:
 | Command | Description |
 |---------|-------------|
 | `!status` | Show bot status (uptime, guilds, latency, AI provider) |
-| `!reload` | Reload all cogs |
+| `!reload` | Reload all cogs (apply code changes without restart) |
 | `!ping` | Check bot latency |
-**Note:** All configuration is done via the `config.yml` file. There are no in-Discord configuration commands.
+**Note:** All configuration is done via `config.yml`. There are no in-Discord configuration commands.
-## Project Structure
+---
 ## How It Works
 ### Detection Flow
 ```
 Message Received
    ↓
 [1] User Blocklist Check (instant)
    ↓ (if not blocked)
 [2] NSFW Domain Check (instant)
    ↓ (if no NSFW domain)
 [3] Spam Detection (free)
    ↓ (if not spam)
 [4] Has Images/Embeds?
    ↓ (if yes)
 [5] AI Rate Limit Check
    ↓ (if under limit)
 [6] Image Deduplication
    ↓ (if not analyzed recently)
 [7] AI Analysis (cost)
    ↓
 [8] Action: Delete if violation
 ```
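
The ordering in the diagram matters because the free checks run first and the paid AI call runs last. The sketch below models that ordering as a single function; it is an illustration under assumptions, not the bot's real code. `Message` and `decide` are hypothetical names, each check is passed in as a callable, and the blocklist step is assumed to apply only to messages that contain media.

```python
from typing import Callable, NamedTuple


class Message(NamedTuple):
    """Hypothetical stand-in for a Discord message."""
    message_id: int
    author_id: int
    has_media: bool   # images, embeds, or URLs
    has_images: bool  # images or embeds that could be analyzed


def decide(
    msg: Message,
    *,
    blocked_users: set,
    has_nsfw_domain: Callable[[Message], bool],
    is_spam: Callable[[Message], bool],
    under_rate_limit: Callable[[Message], bool],
    already_analyzed: Callable[[Message], bool],
    ai_flags: Callable[[Message], bool],
) -> str:
    """Walk the steps in diagram order and report which one decided."""
    if msg.author_id in blocked_users and msg.has_media:  # [1]
        return "delete: blocked user"
    if has_nsfw_domain(msg):                              # [2]
        return "delete: nsfw domain"
    if is_spam(msg):                                      # [3]
        return "delete: spam"
    if not msg.has_images:                                # [4]
        return "allow: no images"
    if not under_rate_limit(msg):                         # [5]
        return "skip: rate limited"
    if already_analyzed(msg):                             # [6]
        return "skip: duplicate"
    # [7] Only now is the paid AI call made; [8] delete on violation.
    return "delete: nsfw image" if ai_flags(msg) else "allow: clean"
```

Because every earlier step returns before `ai_flags` is reached, a blocked user or a rate-limited message never triggers an API call.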
 ### Action Behavior
 When a violation is detected:
 - ✅ **Message deleted** immediately
 - ✅ **Action logged** to console/log file
 - ❌ **No DM sent** to user (silent)
 - ❌ **No timeout** applied (delete only)
 - ❌ **No moderation log** in Discord
 ### Cost Controls
 Multiple layers to keep AI costs predictable:
 1. **User Blocklist** - Skip AI entirely for known bad actors
 2. **Domain Blocklist** - Skip AI for known NSFW domains
 3. **Rate Limiting** - Hard caps per guild and per user
 4. **Deduplication** - Don't re-analyze the same message
 5. **File Size Limits** - Skip very large files
 6. **Max Images** - Limit images analyzed per message
 7. **Optional Features** - Disable embed checking to save costs
 ---
 ## Development
 ### Project Structure
 ```
 guardden/
 ├── src/guardden/
 │   ├── bot.py                  # Main bot class
 │   ├── config.py               # Settings management
-│   ├── cogs/                   # Discord command groups
+│   ├── cogs/                   # Discord command modules
 │   │   ├── automod.py          # Spam detection
 │   │   ├── ai_moderation.py    # NSFW image detection
 │   │   └── owner.py            # Owner commands
@@ -228,86 +337,117 @@ guardden/
 │   │   ├── ai/                 # AI provider implementations
 │   │   ├── automod.py          # Spam detection logic
 │   │   ├── config_loader.py    # YAML config loading
-│   │   ├── ai_rate_limiter.py  # AI cost control
+│   │   ├── ai_rate_limiter.py  # Cost control
-│   │   ├── database.py         # DB connections
+│   │   └── database.py         # DB connections
 │   │   └── guild_config.py     # Config caching
 │   └── __main__.py             # Entry point
-├── config.yml                  # Bot configuration
+├── config.yml                  # Bot configuration (not in git)
 ├── config.example.yml          # Configuration template
 ├── .env                        # Environment variables (not in git)
 ├── .env.example                # Environment template
 ├── tests/                      # Test suite
 ├── migrations/                 # Database migrations
 ├── docker-compose.yml          # Docker deployment
-├── pyproject.toml              # Dependencies
+└── pyproject.toml              # Dependencies
 └── README.md                   # This file
 ```
 ## How It Works
 ### User Blocklist (Instant, No AI Cost)
 1. Checks if message author is in `blocked_user_ids` list
 2. If message contains ANY media (images, embeds, URLs), instantly deletes it
 3. No AI analysis needed - immediate action
 4. Useful for known spam accounts or problematic users
 ### Spam Detection
 1. Bot monitors message rate per user
 2. Detects duplicate messages
 3. Counts @mentions (mass mention detection)
 4. Violations result in message deletion + timeout
 ### NSFW Image Detection
 1. Checks user blocklist first (instant deletion if matched)
 2. Checks NSFW video domain blocklist (instant deletion)
 3. Bot checks attachments and embeds for images
 4. Applies rate limiting and deduplication
 5. Downloads image and sends to AI provider
 6. AI analyzes for NSFW content categories
 7. Violations result in message deletion + timeout
 ### Cost Management
 The bot includes aggressive cost controls for AI usage:
 - **Rate Limiting**: 25 checks/hour/guild, 5/hour/user (configurable)
 - **Deduplication**: Skips recently analyzed message IDs
 - **File Size Limits**: Skips images larger than 3MB
 - **Max Images**: Analyzes max 2 images per message
 - **Optional Features**: Embed checking, video thumbnails, URL downloads all controllable
 **Estimated Costs** (with defaults):
 - Small server (< 100 users): ~$5-10/month
 - Medium server (100-500 users): ~$15-25/month
 - Large server (500+ users): Consider increasing rate limits or disabling embeds
 ## Development
 ### Running Tests
 ```bash
 # Run all tests
 pytest
-pytest -v                           # Verbose output
+
-pytest tests/test_automod.py        # Specific file
+# Run specific tests
-pytest -k "test_scam"               # Filter by name
+pytest tests/test_automod.py
 # Run with coverage
 pytest --cov=src/guardden --cov-report=html
 ```
 ### Code Quality
 ```bash
-ruff check src tests                # Linting
+# Linting
-ruff format src tests               # Formatting
+ruff check src tests
-mypy src                            # Type checking
+
 # Formatting
 ruff format src tests
 # Type checking
 mypy src
 ```
 ### Database Migrations
 ```bash
 # Apply migrations
 alembic upgrade head
 # Create new migration
 alembic revision --autogenerate -m "description"
 # Rollback one migration
 alembic downgrade -1
 ```
 ---
 ## Troubleshooting
 ### Bot won't start
 **Error: `Config file not found: config.yml`**
 - Solution: Copy `config.example.yml` to `config.yml` and edit settings
 **Error: `Discord token cannot be empty`**
 - Solution: Add `GUARDDEN_DISCORD_TOKEN` to `.env` file
 **Error: `Cannot import name 'ModerationResult'`**
 - Solution: Pull latest changes and rebuild: `docker compose up -d --build`
 ### Bot doesn't respond to commands
 **Check:**
 1. Bot is online in Discord
 2. Bot has correct permissions (Manage Messages, View Channels)
 3. Your user ID is in `owner_ids` in config.yml
 4. Check logs: `docker logs guardden-bot -f`
 ### AI not working
 **Check:**
 1. `ai_moderation.enabled: true` in config.yml
 2. `GUARDDEN_AI_PROVIDER` set to `anthropic` or `openai` in .env
 3. API key is set in .env (`GUARDDEN_ANTHROPIC_API_KEY` or `GUARDDEN_OPENAI_API_KEY`)
 4. Check logs for API errors
 ### High AI costs
 **Reduce costs by:**
 1. Lower `max_checks_per_hour_per_guild` in config.yml
 2. Set `check_embed_images: false` to skip GIF embeds
 3. Add known offenders to `blocked_user_ids` blocklist
 4. Lower `max_image_size_mb` so that more large files are skipped
 ---
 ## License
 MIT License - see LICENSE file for details.
 ---
 ## Support
- **Issues**: Report bugs at https://github.com/anthropics/claude-code/issues
+- **Issues**: [Report bugs](https://git.hiddenden.cafe/Hiddenden/GuardDen/issues)
- **Documentation**: See `docs/` directory
+- **Configuration**: See `CLAUDE.md` for developer guidance
- **Configuration Help**: Check `CLAUDE.md` for developer guidance
+- **Testing**: See `TESTING_TODO.md` for test status
-## Future Considerations
+---
- [ ] Per-guild sensitivity settings (currently global)
+## Roadmap
 - [ ] Per-guild configuration support
 - [ ] Slash commands
- [ ] Custom NSFW category thresholds
+- [ ] Custom NSFW thresholds per category
 - [ ] Whitelist for trusted image sources
 - [ ] Dashboard for viewing stats