diff --git a/README.md b/README.md
index c4c848a..bc1e2d1 100644
--- a/README.md
+++ b/README.md
@@ -1,158 +1,231 @@
 # GuardDen
-A lightweight, cost-conscious Discord moderation bot focused on essential protection. Built for self-hosting with minimal resource usage and AI costs.
+A lightweight, cost-conscious Discord moderation bot focused on automated protection against spam and NSFW content. Built for self-hosting with minimal resource usage and predictable AI costs.
+
+## Overview
+
+GuardDen is a minimal Discord bot designed for small to medium servers (1-2 guilds) that need automated moderation without the complexity of full-featured moderation systems. It focuses on two core areas:
+
+1. **Spam Detection** - Automatic rate limiting, duplicate detection, and mass mention protection
+2. **NSFW Content Filtering** - AI-powered image analysis with aggressive cost controls
+
+**What GuardDen is NOT:**
+- Not a full moderation suite (no manual mod commands, logging, or strike systems)
+- Not a verification/captcha system
+- Not a chat moderation bot (no text analysis, banned words, or scam detection)
+
+**Target Users:**
+- Small community servers that need automated spam + NSFW protection
+- Budget-conscious server owners (~$5-25/month AI costs)
+- Self-hosters who want a simple, maintainable bot
+
+---
 ## Features
-### Spam Detection
-- **Anti-Spam** - Rate limiting, duplicate detection, mass mention protection
-- **Automatic Actions** - Message deletion and user timeout for spam violations
+| Feature | Description | Cost |
+|---------|-------------|------|
+| **Spam Detection** | Rate limiting, duplicate messages, mass mentions | Free |
+| **NSFW Image Detection** | AI-powered analysis of images/GIFs using Claude or GPT | ~$5-25/month |
+| **User Blocklist** | Block ALL media from specific users instantly | Free |
+| **NSFW Domain Blocking** | Instant blocking of known NSFW video domains | Free |
+| **Cost Controls** | Rate limits, deduplication, file size
limits | Built-in |
+| **Single Config File** | One YAML file for all settings | Easy |
+| **Owner Commands** | Status, reload, ping | Free |
-### AI-Powered NSFW Image Detection
-- **Smart Image Analysis** - AI-powered detection of inappropriate images using Claude or GPT
-- **Cost Controls** - Conservative rate limits (25 checks/hour/guild by default)
-- **Embed Support** - Optional checking of Discord GIF embeds
-- **NSFW Video Domain Blocking** - Block known NSFW video domains
-- **Configurable Sensitivity** - Adjust strictness (0-100)
+### Spam Detection
+
+Automatically detects and deletes spam messages based on:
+- **Message Rate Limiting**: Max 5 messages per 5 seconds (configurable)
+- **Duplicate Detection**: Flags repeated identical messages
+- **Mass Mentions**: Limits @mentions per message and per time window
+- **Actions**: Deletes message, no notifications to user
+
+### NSFW Image Detection
+
+AI-powered analysis of images and GIFs with strict cost controls:
+- **Supported Providers**: Anthropic Claude, OpenAI GPT
+- **Content Types**: Image attachments, Discord GIF embeds (optional)
+- **NSFW Categories**: Suggestive, Partial Nudity, Nudity, Explicit
+- **Filtering Mode**: NSFW-only by default (only blocks sexual content)
+- **Cost Controls**:
+  - 25 AI checks/hour/guild (default)
+  - 5 AI checks/hour/user (default)
+  - Image deduplication (tracks 1000 recent messages)
+  - File size limit (skip > 3MB)
+  - Max images per message (2 by default)
+- **Actions**: Deletes message, no notifications to user
+
+### User Blocklist
+
+Instantly delete ALL media from specific users:
+- **Blocks**: Images, GIFs, embeds, URLs
+- **No AI Cost**: Instant deletion without analysis
+- **Use Case**: Known problematic users, spam accounts
+
+### NSFW Domain Blocking
+
+Pre-configured list of known NSFW video domains:
+- Blocks: pornhub.com, xvideos.com, xnxx.com, etc.
+- **No AI Cost**: Pattern matching only
+- **Instant**: Deletes message immediately
+
+---
 ## Quick Start
 ### Prerequisites
-- Python 3.11+
-- PostgreSQL 15+
-- Discord Bot Token (see setup below)
-- (Optional) Anthropic or OpenAI API key for AI features
-### Discord Bot Setup
+| Requirement | Version | Purpose |
+|-------------|---------|---------|
+| Python | 3.11+ | Bot runtime |
+| PostgreSQL | 15+ | Database |
+| Discord Bot Token | - | Bot authentication |
+| AI API Key | (Optional) | Claude or OpenAI for NSFW detection |
-1. Go to the [Discord Developer Portal](https://discord.com/developers/applications)
-2. Click **New Application** and give it a name (e.g., "GuardDen")
-3. Go to the **Bot** tab and click **Add Bot**
+### 1. Discord Bot Setup
-4. **Configure Bot Settings:**
-   - Disable **Public Bot** if you only want yourself to add it
-   - Copy the **Token** (click "Reset Token") - this is your `GUARDDEN_DISCORD_TOKEN`
+1. **Create Application**
+   - Go to [Discord Developer Portal](https://discord.com/developers/applications)
+   - Click **New Application** → Name it (e.g., "GuardDen")
+   - Go to **Bot** tab → **Add Bot**
-5. **Enable Privileged Gateway Intents** (required):
-   - **Message Content Intent** - for reading messages (spam detection, image checking)
+2. **Get Bot Token**
+   - Click **Reset Token** → Copy the token
+   - Save as `GUARDDEN_DISCORD_TOKEN` in `.env`
-6. **Generate Invite URL** - Go to **OAuth2** > **URL Generator**:
-   - **Scopes:**
-     - `bot`
-   - **Bot Permissions:**
-     - Moderate Members (timeout)
-     - View Channels
-     - Send Messages
-     - Manage Messages
-     - Read Message History
-   - Or use permission integer: `275415089216`
+3. **Enable Intents**
+   - Enable **Message Content Intent** (required for reading messages)
-7. Use the generated URL to invite the bot to your server
+4.
**Generate Invite URL**
+   - Go to **OAuth2** → **URL Generator**
+   - Select scopes: `bot`
+   - Select permissions:
+     - Moderate Members (timeout)
+     - View Channels
+     - Send Messages
+     - Manage Messages
+     - Read Message History
+   - Or use permission integer: `275415089216`
+   - Copy generated URL and invite to your server
-### Docker Deployment (Recommended)
+### 2. Installation
-1. Clone the repository:
-   ```bash
-   git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
-   cd guardden
-   ```
+**Option A: Docker (Recommended)**
-2. Create your configuration files:
-   ```bash
-   # Environment variables
-   cp .env.example .env
-   # Edit .env and add your Discord token
-
-   # Bot configuration
-   cp config.example.yml config.yml
-   # Edit config.yml with your settings
-   ```
+```bash
+# Clone repository
+git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
+cd GuardDen
-3. Start with Docker Compose:
-   ```bash
-   docker compose up -d
-   ```
+# Create configuration files
+cp .env.example .env
+cp config.example.yml config.yml
-### Local Development
+# Edit .env - Add your Discord token
+nano .env
-1. Create a virtual environment:
-   ```bash
-   python -m venv venv
-   source venv/bin/activate  # On Windows: venv\Scripts\activate
-   ```
+# Edit config.yml - Configure settings
+nano config.yml
-2. Install dependencies:
-   ```bash
-   pip install -e ".[dev,ai]"
-   ```
+# Start with Docker Compose
+docker compose up -d
-3. Set up configuration:
-   ```bash
-   # Environment variables
-   cp .env.example .env
-   # Edit .env with your Discord token
-
-   # Bot configuration
-   cp config.example.yml config.yml
-   # Edit config.yml with your settings
-   ```
+# View logs
+docker logs guardden-bot -f
+```
-4. Start PostgreSQL (or use Docker):
-   ```bash
-   docker compose up db -d
-   ```
+**Option B: Local Development**
-5.
Run the bot:
-   ```bash
-   python -m guardden
-   ```
+```bash
+# Clone repository
+git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
+cd GuardDen
+
+# Create virtual environment
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+
+# Install dependencies
+pip install -e ".[dev,ai]"
+
+# Create configuration files
+cp .env.example .env
+cp config.example.yml config.yml
+
+# Edit configuration
+nano .env
+nano config.yml
+
+# Start PostgreSQL (or use Docker)
+docker compose up db -d
+
+# Run database migrations
+alembic upgrade head
+
+# Start bot
+python -m guardden
+```
+
+---
 ## Configuration
-GuardDen uses a **single YAML configuration file** (`config.yml`) for managing all bot settings across all guilds.
+### Environment Variables (`.env`)
-### Configuration File (`config.yml`)
+| Variable | Required | Description | Default |
+|----------|----------|-------------|---------|
+| `GUARDDEN_DISCORD_TOKEN` | ✅ | Discord bot token | - |
+| `GUARDDEN_DATABASE_URL` | No | PostgreSQL connection URL | `postgresql://guardden:guardden@localhost:5432/guardden` |
+| `GUARDDEN_LOG_LEVEL` | No | Logging level (DEBUG/INFO/WARNING/ERROR) | `INFO` |
+| `GUARDDEN_AI_PROVIDER` | No | AI provider (`anthropic`/`openai`/`none`) | `none` |
+| `GUARDDEN_ANTHROPIC_API_KEY` | No* | Anthropic API key | - |
+| `GUARDDEN_OPENAI_API_KEY` | No* | OpenAI API key | - |
-Create a `config.yml` file in your project root:
+*Required if `AI_PROVIDER` is set to `anthropic` or `openai`
+
+### Bot Configuration (`config.yml`)
 ```yaml
+# Bot Settings
 bot:
   prefix: "!"
   owner_ids:
-    - 123456789012345678  # Your Discord user ID
+    - 123456789012345678  # Your Discord user ID (for owner commands)
-# Spam detection settings
+# Spam Detection
 automod:
   enabled: true
   anti_spam_enabled: true
   message_rate_limit: 5  # Max messages per window
   message_rate_window: 5  # Window in seconds
-  duplicate_threshold: 3  # Duplicates to trigger
+  duplicate_threshold: 3  # Duplicate messages to trigger
   mention_limit: 5  # Max mentions per message
   mention_rate_limit: 10  # Max mentions per window
-  mention_rate_window: 60  # Window in seconds
+  mention_rate_window: 60  # Mention window in seconds
-# AI moderation settings
+# AI Moderation (NSFW Detection)
 ai_moderation:
   enabled: true
   sensitivity: 80  # 0-100 (higher = stricter)
   nsfw_only_filtering: true  # Only filter sexual content
-  max_checks_per_hour_per_guild: 25  # Cost control
-  max_checks_per_user_per_hour: 5  # Cost control
-  max_images_per_message: 2  # Analyze max 2 images/msg
-  max_image_size_mb: 3  # Skip images > 3MB
-  check_embed_images: true  # Check Discord GIF embeds
+
+  # Cost Controls
+  max_checks_per_hour_per_guild: 25  # Conservative limit
+  max_checks_per_user_per_hour: 5  # Prevent abuse
+  max_images_per_message: 2  # Analyze max 2 images
+  max_image_size_mb: 3  # Skip large files
+
+  # Feature Toggles
+  check_embed_images: true  # Check Discord GIFs
   check_video_thumbnails: false  # Skip video thumbnails
-  url_image_check_enabled: false  # Skip URL image downloads
+  url_image_check_enabled: false  # Skip URL downloads
-# User blocklist (blocks ALL media from specific users)
+# User Blocklist (instant deletion)
 blocked_user_ids:
   - 123456789012345678  # Discord user ID to block
-# Known NSFW video domains (auto-block)
+# NSFW Domain Blocklist (instant blocking)
 nsfw_video_domains:
   - pornhub.com
   - xvideos.com
@@ -161,64 +234,100 @@ nsfw_video_domains:
   - youporn.com
 ```
-### Key Configuration Options
-
-**AI Moderation (NSFW Image Detection):**
-- `sensitivity`: 0-100 scale (higher = stricter detection)
--
`nsfw_only_filtering`: Only flag sexual content (violence/harassment allowed)
-- `max_checks_per_hour_per_guild`: Cost control - limits AI API calls
-- `check_embed_images`: Whether to analyze Discord GIF embeds
+### Configuration Options Explained
 **Spam Detection:**
-- `message_rate_limit`: Max messages allowed per window
-- `duplicate_threshold`: How many duplicate messages trigger action
+- `message_rate_limit`: How many messages allowed in time window
+- `duplicate_threshold`: How many identical messages trigger spam detection
 - `mention_limit`: Max @mentions allowed per message
+**AI Moderation:**
+- `sensitivity`: Detection strictness (80 = balanced, 100 = very strict, 50 = lenient)
+- `nsfw_only_filtering`: `true` = only block sexual content (default), `false` = block all inappropriate content
+- `max_checks_per_hour_per_guild`: Hard limit on AI API calls per guild (cost control)
+- `max_checks_per_user_per_hour`: Per-user limit to prevent spam/abuse
+
 **User Blocklist:**
-- `blocked_user_ids`: List of Discord user IDs to block
-- Automatically deletes ALL images, GIFs, embeds, and URLs from these users
-- No AI cost - instant deletion
-- Useful for known problematic users or spam accounts
+- Add Discord user IDs to instantly delete ALL their media
+- No AI cost - instant pattern matching
+- Useful for repeat offenders or spam bots
-**Cost Controls:**
-The bot includes multiple layers of cost control:
-- Rate limiting (25 AI checks/hour/guild, 5/hour/user by default)
-- Image deduplication (tracks last 1000 analyzed messages)
-- File size limits (skip images > 3MB)
-- Max images per message (analyze max 2 images)
-- Optional embed checking (disable to save costs)
+**Cost Estimation:**
+- Small server (< 100 users): ~$5-10/month
+- Medium server (100-500 users): ~$15-25/month
+- Large server (500+ users): Increase rate limits or disable embed checking
-### Environment Variables
-
-| Variable | Description | Default |
-|----------|-------------|---------|
-|
`GUARDDEN_DISCORD_TOKEN` | Your Discord bot token | **Required** |
-| `GUARDDEN_DATABASE_URL` | PostgreSQL connection URL | `postgresql://guardden:guardden@localhost:5432/guardden` |
-| `GUARDDEN_LOG_LEVEL` | Logging level | `INFO` |
-| `GUARDDEN_AI_PROVIDER` | AI provider (anthropic/openai/none) | `none` |
-| `GUARDDEN_ANTHROPIC_API_KEY` | Anthropic API key (if using Claude) | - |
-| `GUARDDEN_OPENAI_API_KEY` | OpenAI API key (if using GPT) | - |
+---
 ## Owner Commands
-GuardDen includes a minimal set of owner-only commands for bot management:
-
 | Command | Description |
 |---------|-------------|
 | `!status` | Show bot status (uptime, guilds, latency, AI provider) |
-| `!reload` | Reload all cogs |
+| `!reload` | Reload all cogs (apply code changes without restart) |
 | `!ping` | Check bot latency |
-**Note:** All configuration is done via the `config.yml` file. There are no in-Discord configuration commands.
+**Note:** All configuration is done via `config.yml`. There are no in-Discord configuration commands.
-## Project Structure
+---
+
+## How It Works
+
+### Detection Flow
+
+```
+Message Received
+    ↓
+[1] User Blocklist Check (instant)
+    ↓ (if not blocked)
+[2] NSFW Domain Check (instant)
+    ↓ (if no NSFW domain)
+[3] Spam Detection (free)
+    ↓ (if not spam)
+[4] Has Images/Embeds?
+    ↓ (if yes)
+[5] AI Rate Limit Check
+    ↓ (if under limit)
+[6] Image Deduplication
+    ↓ (if not analyzed recently)
+[7] AI Analysis (cost)
+    ↓
+[8] Action: Delete if violation
+```
+
+### Action Behavior
+
+When a violation is detected:
+- ✅ **Message deleted** immediately
+- ✅ **Action logged** to console/log file
+- ❌ **No DM sent** to user (silent)
+- ❌ **No timeout** applied (delete only)
+- ❌ **No moderation log** in Discord
+
+### Cost Controls
+
+Multiple layers to keep AI costs predictable:
+
+1. **User Blocklist** - Skip AI entirely for known bad actors
+2. **Domain Blocklist** - Skip AI for known NSFW domains
+3. **Rate Limiting** - Hard caps per guild and per user
+4.
**Deduplication** - Don't re-analyze same message
+5. **File Size Limits** - Skip very large files
+6. **Max Images** - Limit images analyzed per message
+7. **Optional Features** - Disable embed checking to save costs
+
+---
+
+## Development
+
+### Project Structure
 ```
 guardden/
 ├── src/guardden/
 │   ├── bot.py  # Main bot class
 │   ├── config.py  # Settings management
-│   ├── cogs/  # Discord command groups
+│   ├── cogs/  # Discord command modules
 │   │   ├── automod.py  # Spam detection
 │   │   ├── ai_moderation.py  # NSFW image detection
 │   │   └── owner.py  # Owner commands
@@ -228,86 +337,117 @@ guardden/
 │   │   ├── ai/  # AI provider implementations
 │   │   ├── automod.py  # Spam detection logic
 │   │   ├── config_loader.py  # YAML config loading
-│   │   ├── ai_rate_limiter.py  # AI cost control
-│   │   ├── database.py  # DB connections
-│   │   └── guild_config.py  # Config caching
+│   │   ├── ai_rate_limiter.py  # Cost control
+│   │   └── database.py  # DB connections
 │   └── __main__.py  # Entry point
-├── config.yml  # Bot configuration
+├── config.yml  # Bot configuration (not in git)
+├── config.example.yml  # Configuration template
+├── .env  # Environment variables (not in git)
+├── .env.example  # Environment template
 ├── tests/  # Test suite
 ├── migrations/  # Database migrations
 ├── docker-compose.yml  # Docker deployment
-├── pyproject.toml  # Dependencies
-└── README.md  # This file
+└── pyproject.toml  # Dependencies
 ```
-## How It Works
-
-### User Blocklist (Instant, No AI Cost)
-1. Checks if message author is in `blocked_user_ids` list
-2. If message contains ANY media (images, embeds, URLs), instantly deletes it
-3. No AI analysis needed - immediate action
-4. Useful for known spam accounts or problematic users
-
-### Spam Detection
-1. Bot monitors message rate per user
-2. Detects duplicate messages
-3. Counts @mentions (mass mention detection)
-4. Violations result in message deletion + timeout
-
-### NSFW Image Detection
-1. Checks user blocklist first (instant deletion if matched)
-2.
Checks NSFW video domain blocklist (instant deletion)
-3. Bot checks attachments and embeds for images
-4. Applies rate limiting and deduplication
-5. Downloads image and sends to AI provider
-6. AI analyzes for NSFW content categories
-7. Violations result in message deletion + timeout
-
-### Cost Management
-The bot includes aggressive cost controls for AI usage:
-- **Rate Limiting**: 25 checks/hour/guild, 5/hour/user (configurable)
-- **Deduplication**: Skips recently analyzed message IDs
-- **File Size Limits**: Skips images larger than 3MB
-- **Max Images**: Analyzes max 2 images per message
-- **Optional Features**: Embed checking, video thumbnails, URL downloads all controllable
-
-**Estimated Costs** (with defaults):
-- Small server (< 100 users): ~$5-10/month
-- Medium server (100-500 users): ~$15-25/month
-- Large server (500+ users): Consider increasing rate limits or disabling embeds
-
-## Development
-
 ### Running Tests
 ```bash
+# Run all tests
 pytest
-pytest -v  # Verbose output
-pytest tests/test_automod.py  # Specific file
-pytest -k "test_scam"  # Filter by name
+
+# Run specific tests
+pytest tests/test_automod.py
+
+# Run with coverage
+pytest --cov=src/guardden --cov-report=html
 ```
 ### Code Quality
 ```bash
-ruff check src tests  # Linting
-ruff format src tests  # Formatting
-mypy src  # Type checking
+# Linting
+ruff check src tests
+
+# Formatting
+ruff format src tests
+
+# Type checking
+mypy src
 ```
+### Database Migrations
+
+```bash
+# Apply migrations
+alembic upgrade head
+
+# Create new migration
+alembic revision --autogenerate -m "description"
+
+# Rollback one migration
+alembic downgrade -1
+```
+
+---
+
+## Troubleshooting
+
+### Bot won't start
+
+**Error: `Config file not found: config.yml`**
+- Solution: Copy `config.example.yml` to `config.yml` and edit settings
+
+**Error: `Discord token cannot be empty`**
+- Solution: Add `GUARDDEN_DISCORD_TOKEN` to `.env` file
+
+**Error: `Cannot import name 'ModerationResult'`**
+- Solution:
Pull latest changes and rebuild: `docker compose up -d --build`
+
+### Bot doesn't respond to commands
+
+**Check:**
+1. Bot is online in Discord
+2. Bot has correct permissions (Manage Messages, View Channels)
+3. Your user ID is in `owner_ids` in config.yml
+4. Check logs: `docker logs guardden-bot -f`
+
+### AI not working
+
+**Check:**
+1. `ai_moderation.enabled: true` in config.yml
+2. `GUARDDEN_AI_PROVIDER` set to `anthropic` or `openai` in .env
+3. API key is set in .env (`GUARDDEN_ANTHROPIC_API_KEY` or `GUARDDEN_OPENAI_API_KEY`)
+4. Check logs for API errors
+
+### High AI costs
+
+**Reduce costs by:**
+1. Lower `max_checks_per_hour_per_guild` in config.yml
+2. Set `check_embed_images: false` to skip GIF embeds
+3. Add known offenders to `blocked_user_ids` blocklist
+4. Lower `max_image_size_mb` so that more large files are skipped
+
+---
+
 ## License
 MIT License - see LICENSE file for details.
+---
+
 ## Support
-- **Issues**: Report bugs at https://github.com/anthropics/claude-code/issues
-- **Documentation**: See `docs/` directory
-- **Configuration Help**: Check `CLAUDE.md` for developer guidance
+- **Issues**: [Report bugs](https://git.hiddenden.cafe/Hiddenden/GuardDen/issues)
+- **Configuration**: See `CLAUDE.md` for developer guidance
+- **Testing**: See `TESTING_TODO.md` for test status
-## Future Considerations
+---
-- [ ] Per-guild sensitivity settings (currently global)
+## Roadmap
+
+- [ ] Per-guild configuration support
 - [ ] Slash commands
-- [ ] Custom NSFW category thresholds
+- [ ] Custom NSFW thresholds per category
 - [ ] Whitelist for trusted image sources
+- [ ] Dashboard for viewing stats