# GuardDen

A lightweight, cost-conscious Discord moderation bot focused on automated protection against spam and NSFW content. Built for self-hosting with minimal resource usage and predictable AI costs.

## Overview

GuardDen is a minimal Discord bot for small to medium communities. Each instance is intended to serve one or two guilds (servers) that need automated moderation without the complexity of a full-featured moderation system. It focuses on two core areas:

1. **Spam Detection** - Automatic rate limiting, duplicate detection, and mass mention protection
2. **NSFW Content Filtering** - AI-powered image analysis with aggressive cost controls

**What GuardDen is NOT:**
- Not a full moderation suite (no manual mod commands, logging, or strike systems)
- Not a verification/captcha system
- Not a chat moderation bot (no text analysis, banned words, or scam detection)

**Target Users:**
- Small community servers that need automated spam + NSFW protection
- Budget-conscious server owners (~$5-25/month AI costs)
- Self-hosters who want a simple, maintainable bot

---

## Features

| Feature | Description | Cost |
|---------|-------------|------|
| **Spam Detection** | Rate limiting, duplicate messages, mass mentions | Free |
| **NSFW Image Detection** | AI-powered analysis of images/GIFs using Claude or GPT | ~$5-25/month |
| **User Blocklist** | Block ALL media from specific users instantly | Free |
| **NSFW Domain Blocking** | Instant blocking of known NSFW video domains | Free |
| **Cost Controls** | Rate limits, deduplication, file size limits | Built-in |
| **Single Config File** | One YAML file for all settings | Easy |
| **Owner Commands** | Status, reload, ping | Free |

### Spam Detection

Automatically detects and deletes spam messages based on:
- **Message Rate Limiting**: Max 5 messages per 5 seconds (configurable)
- **Duplicate Detection**: Flags repeated identical messages
- **Mass Mentions**: Limits @mentions per message and per time window
- **Actions**: Deletes message, no notifications to user
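
The rate-limit rule above can be pictured as a per-user sliding window. The following is a minimal sketch only, not GuardDen's actual implementation; the class name `SlidingWindowRateLimiter` and its method are hypothetical, and timestamps are passed in explicitly so the behavior is easy to test.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Hypothetical helper: flags users who exceed `limit` messages per window."""

    def __init__(self, limit: int = 5, window_seconds: float = 5.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # One queue of message timestamps per user ID.
        self._timestamps = defaultdict(deque)

    def is_spam(self, user_id: int, now: float) -> bool:
        """Record a message sent at `now` and report whether it exceeds the limit."""
        stamps = self._timestamps[user_id]
        # Discard timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        stamps.append(now)
        return len(stamps) > self.limit


limiter = SlidingWindowRateLimiter(limit=5, window_seconds=5.0)
burst = [limiter.is_spam(user_id=42, now=t / 10) for t in range(6)]
# The first five messages pass; the sixth within the window is flagged.
```

With the defaults shown (5 messages per 5 seconds), a sixth message inside the same window would be the first one deleted.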

### NSFW Image Detection

AI-powered analysis of images and GIFs with strict cost controls:
- **Supported Providers**: Anthropic Claude, OpenAI GPT
- **Content Types**: Image attachments, Discord GIF embeds (optional)
- **NSFW Categories**: Suggestive, Partial Nudity, Nudity, Explicit
- **Filtering Mode**: NSFW-only by default (only blocks sexual content)
- **Cost Controls**:
  - 25 AI checks/hour/guild (default)
  - 5 AI checks/hour/user (default)
  - Image deduplication (tracks 1000 recent messages)
  - File size limit (skip > 3MB)
  - Max images per message (2 by default)
- **Actions**: Deletes message, no notifications to user
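
As an illustration of how the two hourly caps might interact, here is a rough sketch of a budget that refuses an AI check when either the guild or the user has used up its allowance. `AICheckBudget` and `try_acquire` are hypothetical names, and the real `ai_rate_limiter.py` may work differently.

```python
from collections import defaultdict, deque


class AICheckBudget:
    """Hypothetical helper: hourly caps on AI checks per guild and per user."""

    def __init__(self, per_guild: int = 25, per_user: int = 5, window: float = 3600.0) -> None:
        self.per_guild = per_guild
        self.per_user = per_user
        self.window = window
        self._guild_calls = defaultdict(deque)
        self._user_calls = defaultdict(deque)

    def _prune(self, calls: deque, now: float) -> None:
        # Forget checks older than the window (one hour by default).
        while calls and now - calls[0] >= self.window:
            calls.popleft()

    def try_acquire(self, guild_id: int, user_id: int, now: float) -> bool:
        """Return True and record the check only if both caps allow it."""
        guild_calls = self._guild_calls[guild_id]
        user_calls = self._user_calls[(guild_id, user_id)]
        self._prune(guild_calls, now)
        self._prune(user_calls, now)
        if len(guild_calls) >= self.per_guild or len(user_calls) >= self.per_user:
            return False
        guild_calls.append(now)
        user_calls.append(now)
        return True
```

A refused check costs nothing, which is what keeps the monthly bill bounded regardless of how many images are posted.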

### User Blocklist

Instantly delete ALL media from specific users:
- **Blocks**: Images, GIFs, embeds, URLs
- **No AI Cost**: Instant deletion without analysis
- **Use Case**: Known problematic users, spam accounts

### NSFW Domain Blocking

Pre-configured list of known NSFW video domains:
- Blocks: pornhub.com, xvideos.com, xnxx.com, etc.
- **No AI Cost**: Pattern matching only
- **Instant**: Deletes message immediately
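
A minimal sketch of this kind of domain matching, assuming the check runs on URLs found in the message text. The function name `find_blocked_domain` and the URL regex are illustrative assumptions, not GuardDen's own code. The sketch matches a blocked domain and any of its subdomains, and only considers links that include an `http://` or `https://` scheme.

```python
import re
from typing import Iterable, Optional
from urllib.parse import urlparse

# Assumption: only links with an explicit scheme are inspected.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def find_blocked_domain(content: str, blocked_domains: Iterable[str]) -> Optional[str]:
    """Return the first blocked domain linked in `content`, or None."""
    blocked = {domain.lower() for domain in blocked_domains}
    for url in URL_PATTERN.findall(content):
        try:
            host = (urlparse(url).hostname or "").lower().rstrip(".")
        except ValueError:
            # Malformed URL (for example, unbalanced IPv6 brackets): skip it.
            continue
        for domain in blocked:
            # Match the domain itself or any subdomain, but not lookalikes
            # such as "notxvideos.com".
            if host == domain or host.endswith("." + domain):
                return domain
    return None
```

Comparing parsed hostnames rather than searching for substrings avoids false positives on unrelated domains that merely contain a blocked name.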

---

## Quick Start

### Prerequisites

| Requirement | Version | Purpose |
|-------------|---------|---------|
| Python | 3.11+ | Bot runtime |
| PostgreSQL | 15+ | Database |
| Discord Bot Token | - | Bot authentication |
| AI API Key | (Optional) | Claude or OpenAI for NSFW detection |

### 1. Discord Bot Setup

1. **Create Application**
   - Go to [Discord Developer Portal](https://discord.com/developers/applications)
   - Click **New Application** → Name it (e.g., "GuardDen")
   - Go to **Bot** tab → **Add Bot**

2. **Get Bot Token**
   - Click **Reset Token** → Copy the token
   - Save as `GUARDDEN_DISCORD_TOKEN` in `.env`

3. **Enable Intents**
   - Enable **Message Content Intent** (required for reading messages)

4. **Generate Invite URL**
   - Go to **OAuth2** → **URL Generator**
   - Select scopes: `bot`
   - Select permissions:
     - Moderate Members (timeout)
     - View Channels
     - Send Messages
     - Manage Messages
     - Read Message History
   - Or use permission integer: `275415089216`
   - Copy generated URL and invite to your server

### 2. Installation

**Option A: Docker (Recommended)**

```bash
# Clone repository
git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
cd GuardDen

# Create configuration files
cp .env.example .env
cp config.example.yml config.yml

# Edit .env - Add your Discord token
nano .env

# Edit config.yml - Configure settings
nano config.yml

# Start with Docker Compose
docker compose up -d

# View logs
docker logs guardden-bot -f
```

**Option B: Local Development**

```bash
# Clone repository
git clone https://git.hiddenden.cafe/Hiddenden/GuardDen.git
cd GuardDen

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -e ".[dev,ai]"

# Create configuration files
cp .env.example .env
cp config.example.yml config.yml

# Edit configuration
nano .env
nano config.yml

# Start PostgreSQL (or use Docker)
docker compose up db -d

# Run database migrations
alembic upgrade head

# Start bot
python -m guardden
```

---

## Configuration

### Environment Variables (`.env`)

| Variable | Required | Description | Default |
|----------|----------|-------------|---------|
| `GUARDDEN_DISCORD_TOKEN` | ✅ | Discord bot token | - |
| `GUARDDEN_DATABASE_URL` | No | PostgreSQL connection URL | `postgresql://guardden:guardden@localhost:5432/guardden` |
| `GUARDDEN_LOG_LEVEL` | No | Logging level (DEBUG/INFO/WARNING/ERROR) | `INFO` |
| `GUARDDEN_AI_PROVIDER` | No | AI provider (`anthropic`/`openai`/`none`) | `none` |
| `GUARDDEN_ANTHROPIC_API_KEY` | No* | Anthropic API key | - |
| `GUARDDEN_OPENAI_API_KEY` | No* | OpenAI API key | - |

*Required if `GUARDDEN_AI_PROVIDER` is set to `anthropic` or `openai`.
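
To make the dependency between these variables concrete, the sketch below reads them from the process environment and applies the defaults from the table. It is an illustration under assumptions: `EnvSettings` and `load_env_settings` are hypothetical names, the actual logic lives in the project's `config.py` and may differ, and the `.env` file is assumed to have been loaded into the environment already (for example by Docker Compose).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_DATABASE_URL = "postgresql://guardden:guardden@localhost:5432/guardden"
VALID_PROVIDERS = {"anthropic", "openai", "none"}


@dataclass(frozen=True)
class EnvSettings:
    discord_token: str
    database_url: str
    log_level: str
    ai_provider: str
    ai_api_key: Optional[str]


def load_env_settings(env: Optional[Mapping[str, str]] = None) -> EnvSettings:
    """Read GUARDDEN_* variables, applying the documented defaults."""
    env = os.environ if env is None else env

    token = env.get("GUARDDEN_DISCORD_TOKEN", "").strip()
    if not token:
        raise ValueError("Discord token cannot be empty")

    provider = env.get("GUARDDEN_AI_PROVIDER", "none").strip().lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"Unknown AI provider: {provider!r}")

    api_key = None
    if provider != "none":
        # The matching key becomes mandatory once a provider is selected.
        key_name = f"GUARDDEN_{provider.upper()}_API_KEY"
        api_key = env.get(key_name, "").strip()
        if not api_key:
            raise ValueError(f"{key_name} is required when GUARDDEN_AI_PROVIDER={provider}")

    return EnvSettings(
        discord_token=token,
        database_url=env.get("GUARDDEN_DATABASE_URL", DEFAULT_DATABASE_URL),
        log_level=env.get("GUARDDEN_LOG_LEVEL", "INFO").upper(),
        ai_provider=provider,
        ai_api_key=api_key,
    )
```

Failing at startup with a clear message is usually preferable to discovering a missing key on the first image check.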

### Bot Configuration (`config.yml`)

```yaml
# Bot Settings
bot:
  prefix: "!"
  owner_ids:
    - 123456789012345678  # Your Discord user ID (for owner commands)

# Spam Detection
automod:
  enabled: true
  anti_spam_enabled: true
  message_rate_limit: 5           # Max messages per window
  message_rate_window: 5          # Window in seconds
  duplicate_threshold: 3          # Duplicate messages to trigger
  mention_limit: 5                # Max mentions per message
  mention_rate_limit: 10          # Max mentions per window
  mention_rate_window: 60         # Mention window in seconds

# AI Moderation (NSFW Detection)
ai_moderation:
  enabled: true
  sensitivity: 80                  # 0-100 (higher = stricter)
  nsfw_only_filtering: true        # Only filter sexual content

  # Cost Controls
  max_checks_per_hour_per_guild: 25  # Conservative limit
  max_checks_per_user_per_hour: 5    # Prevent abuse
  max_images_per_message: 2          # Analyze max 2 images
  max_image_size_mb: 3               # Skip large files

  # Feature Toggles
  check_embed_images: true           # Check Discord GIFs
  check_video_thumbnails: false      # Skip video thumbnails
  url_image_check_enabled: false     # Skip URL downloads

# User Blocklist (instant deletion)
blocked_user_ids:
  - 123456789012345678  # Discord user ID to block

# NSFW Domain Blocklist (instant blocking)
nsfw_video_domains:
  - pornhub.com
  - xvideos.com
  - xnxx.com
  - redtube.com
  - youporn.com
```
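
Once the YAML has been parsed into a dictionary, the `ai_moderation` section can be validated before the bot starts. The sketch below is one possible approach, not the project's `config_loader.py`: `AIModerationConfig` and `parse_ai_moderation` are hypothetical names, the defaults simply mirror the example above, and keys the sketch does not model (such as the feature toggles) are ignored.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class AIModerationConfig:
    # Defaults copied from the example config; treat them as assumptions.
    enabled: bool = True
    sensitivity: int = 80
    nsfw_only_filtering: bool = True
    max_checks_per_hour_per_guild: int = 25
    max_checks_per_user_per_hour: int = 5
    max_images_per_message: int = 2
    max_image_size_mb: int = 3

    def __post_init__(self) -> None:
        if not 0 <= self.sensitivity <= 100:
            raise ValueError("sensitivity must be between 0 and 100")
        for name in (
            "max_checks_per_hour_per_guild",
            "max_checks_per_user_per_hour",
            "max_images_per_message",
            "max_image_size_mb",
        ):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be at least 1")


def parse_ai_moderation(config: Mapping[str, Any]) -> AIModerationConfig:
    """Build a validated config from an already-parsed YAML mapping."""
    section = config.get("ai_moderation") or {}
    known = {f.name for f in fields(AIModerationConfig)}
    # Keys outside this sketch's model are dropped rather than rejected.
    return AIModerationConfig(**{k: v for k, v in section.items() if k in known})
```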

### Configuration Options Explained

**Spam Detection:**
- `message_rate_limit`: How many messages a user may send within `message_rate_window` seconds
- `duplicate_threshold`: How many identical messages trigger spam detection
- `mention_limit`: Max @mentions allowed per message
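
To illustrate `duplicate_threshold`, here is a small sketch that counts consecutive identical messages per user. It makes two assumptions the README does not confirm: that only back-to-back repeats count, and that case and whitespace differences are ignored. `DuplicateTracker` is a hypothetical name.

```python
class DuplicateTracker:
    """Hypothetical helper: flags a user's Nth consecutive identical message."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        # Maps user ID to (last normalized message, consecutive repeat count).
        self._last = {}

    def is_duplicate_spam(self, user_id: int, content: str) -> bool:
        # Assumption: compare messages case-insensitively with collapsed whitespace.
        normalized = " ".join(content.lower().split())
        last, count = self._last.get(user_id, (None, 0))
        count = count + 1 if normalized == last else 1
        self._last[user_id] = (normalized, count)
        return count >= self.threshold
```

With `duplicate_threshold: 3`, the third identical message in a row would be the first one flagged.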

**AI Moderation:**
- `sensitivity`: Detection strictness (80 = balanced, 100 = very strict, 50 = lenient)
- `nsfw_only_filtering`: `true` = only block sexual content (default), `false` = block all inappropriate content
- `max_checks_per_hour_per_guild`: Hard limit on AI API calls per guild (cost control)
- `max_checks_per_user_per_hour`: Per-user limit to prevent spam/abuse

**User Blocklist:**
- Add Discord user IDs to instantly delete ALL their media
- No AI cost - a direct user ID lookup, with no image analysis
- Useful for repeat offenders or spam bots

**Cost Estimation:**
- Small server (< 100 users): ~$5-10/month
- Medium server (100-500 users): ~$15-25/month
- Large server (500+ users): Cost depends on your settings; either raise the rate limits (higher cost) or disable embed checking to stay within budget
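
Because the hourly cap is a hard limit, it also gives an upper bound on monthly AI calls. The helper below is a back-of-the-envelope sketch; the function names are hypothetical, and the per-check price is a value you must supply from your provider's current pricing (the `0.001` in the example is a placeholder, not a real price).

```python
HOURS_PER_DAY = 24


def max_monthly_checks(checks_per_hour_per_guild: int, guilds: int = 1, days: int = 30) -> int:
    """Upper bound on AI checks if the hourly cap were hit every hour."""
    return checks_per_hour_per_guild * HOURS_PER_DAY * days * guilds


def monthly_cost_ceiling(
    checks_per_hour_per_guild: int,
    price_per_check_usd: float,
    guilds: int = 1,
    days: int = 30,
) -> float:
    """Worst-case monthly spend for a given (user-supplied) price per check."""
    return max_monthly_checks(checks_per_hour_per_guild, guilds, days) * price_per_check_usd


# Default cap of 25 checks/hour for one guild: 25 * 24 * 30 = 18000 checks at most.
ceiling_checks = max_monthly_checks(25)
# Placeholder price, for illustration only.
ceiling_cost = monthly_cost_ceiling(25, price_per_check_usd=0.001)
```

Real usage is normally well below this ceiling, since most hours will not exhaust the cap.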

---

## Owner Commands

| Command | Description |
|---------|-------------|
| `!status` | Show bot status (uptime, guilds, latency, AI provider) |
| `!reload` | Reload all cogs (apply code changes without restart) |
| `!ping` | Check bot latency |

**Note:** All configuration is done via `config.yml`. There are no in-Discord configuration commands.

---

## How It Works

### Detection Flow

```
Message Received
    ↓
[1] User Blocklist Check (instant)
    ↓ (if not blocked)
[2] NSFW Domain Check (instant)
    ↓ (if no NSFW domain)
[3] Spam Detection (free)
    ↓ (if not spam)
[4] Has Images/Embeds?
    ↓ (if yes)
[5] AI Rate Limit Check
    ↓ (if under limit)
[6] Image Deduplication
    ↓ (if not analyzed recently)
[7] AI Analysis (cost)
    ↓
[8] Action: Delete if violation
```
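
The same flow can be sketched as a chain of early returns, with the free checks ordered before the paid one. This is an illustration rather than the bot's real code: `Verdict`, `Message`, `Checks`, and `moderate` are hypothetical names, each check is injected as a callable, and the sketch assumes that a message skipped by the rate limit or by deduplication is simply left alone.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ALLOW = "allow"
    DELETE_BLOCKLIST = "delete: blocked user"
    DELETE_DOMAIN = "delete: NSFW domain"
    DELETE_SPAM = "delete: spam"
    DELETE_NSFW = "delete: NSFW image"


@dataclass
class Message:
    author_id: int
    content: str
    has_media: bool = False   # images, GIFs, embeds, or URLs
    has_images: bool = False  # image attachments or GIF embeds


@dataclass
class Checks:
    is_blocked_user: Callable[[int], bool]
    has_nsfw_domain: Callable[[str], bool]
    is_spam: Callable[[Message], bool]
    ai_budget_available: Callable[[Message], bool]
    seen_recently: Callable[[Message], bool]
    ai_flags_nsfw: Callable[[Message], bool]


def moderate(message: Message, checks: Checks) -> Verdict:
    # [1] Blocklisted users lose all media, with no analysis.
    if message.has_media and checks.is_blocked_user(message.author_id):
        return Verdict.DELETE_BLOCKLIST
    # [2] Known NSFW domains are matched without AI.
    if checks.has_nsfw_domain(message.content):
        return Verdict.DELETE_DOMAIN
    # [3] Spam rules are free to evaluate.
    if checks.is_spam(message):
        return Verdict.DELETE_SPAM
    # [4] Nothing left to analyze without images or embeds.
    if not message.has_images:
        return Verdict.ALLOW
    # [5] and [6] Skip the paid call when over budget or already analyzed.
    if not checks.ai_budget_available(message) or checks.seen_recently(message):
        return Verdict.ALLOW
    # [7] and [8] Only now is the AI provider called.
    return Verdict.DELETE_NSFW if checks.ai_flags_nsfw(message) else Verdict.ALLOW
```

Ordering the steps this way means the AI provider is only reached by messages that every cheaper check has already passed.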

### Action Behavior

When a violation is detected:
- ✅ **Message deleted** immediately
- ✅ **Action logged** to console/log file
- ❌ **No DM sent** to user (silent)
- ❌ **No timeout** applied (delete only)
- ❌ **No moderation log** in Discord

### Cost Controls

Multiple layers to keep AI costs predictable:

1. **User Blocklist** - Skip AI entirely for known bad actors
2. **Domain Blocklist** - Skip AI for known NSFW domains
3. **Rate Limiting** - Hard caps per guild and per user
4. **Deduplication** - Don't re-analyze the same message
5. **File Size Limits** - Skip very large files
6. **Max Images** - Limit images analyzed per message
7. **Optional Features** - Disable embed checking to save costs
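
As one way to picture the deduplication layer, the sketch below remembers a bounded number of recently analyzed message IDs and evicts the oldest once the limit is reached. `RecentlySeen` is a hypothetical name, and keying on the message ID is an assumption based on the "1000 recent messages" figure mentioned earlier.

```python
from collections import OrderedDict


class RecentlySeen:
    """Hypothetical helper: bounded memory of recently analyzed message IDs."""

    def __init__(self, max_entries: int = 1000) -> None:
        self.max_entries = max_entries
        self._seen = OrderedDict()

    def check_and_add(self, message_id: int) -> bool:
        """Return True if the ID was already seen; otherwise remember it."""
        if message_id in self._seen:
            # Refresh the entry so it is evicted last.
            self._seen.move_to_end(message_id)
            return True
        self._seen[message_id] = None
        if len(self._seen) > self.max_entries:
            # Drop the least recently seen ID to keep memory bounded.
            self._seen.popitem(last=False)
        return False
```

Capping the number of tracked entries keeps memory use flat, which fits the bot's goal of minimal resource usage.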

---

## Development

### Project Structure

```
guardden/
├── src/guardden/
│   ├── bot.py                  # Main bot class
│   ├── config.py               # Settings management
│   ├── cogs/                   # Discord command modules
│   │   ├── automod.py          # Spam detection
│   │   ├── ai_moderation.py    # NSFW image detection
│   │   └── owner.py            # Owner commands
│   ├── models/                 # Database models
│   │   └── guild.py            # Guild settings
│   ├── services/               # Business logic
│   │   ├── ai/                 # AI provider implementations
│   │   ├── automod.py          # Spam detection logic
│   │   ├── config_loader.py    # YAML config loading
│   │   ├── ai_rate_limiter.py  # Cost control
│   │   └── database.py         # DB connections
│   └── __main__.py             # Entry point
├── config.yml                  # Bot configuration (not in git)
├── config.example.yml          # Configuration template
├── .env                        # Environment variables (not in git)
├── .env.example                # Environment template
├── tests/                      # Test suite
├── migrations/                 # Database migrations
├── docker-compose.yml          # Docker deployment
└── pyproject.toml              # Dependencies
```

### Running Tests

```bash
# Run all tests
pytest

# Run specific tests
pytest tests/test_automod.py

# Run with coverage
pytest --cov=src/guardden --cov-report=html
```

### Code Quality

```bash
# Linting
ruff check src tests

# Formatting
ruff format src tests

# Type checking
mypy src
```

### Database Migrations

```bash
# Apply migrations
alembic upgrade head

# Create new migration
alembic revision --autogenerate -m "description"

# Rollback one migration
alembic downgrade -1
```

---

## Troubleshooting

### Bot won't start

**Error: `Config file not found: config.yml`**
- Solution: Copy `config.example.yml` to `config.yml` and edit settings

**Error: `Discord token cannot be empty`**
- Solution: Add `GUARDDEN_DISCORD_TOKEN` to `.env` file

**Error: `Cannot import name 'ModerationResult'`**
- Solution: Pull latest changes and rebuild: `docker compose up -d --build`

### Bot doesn't respond to commands

**Check:**
1. Bot is online in Discord
2. Bot has correct permissions (Manage Messages, View Channels)
3. Your user ID is in `owner_ids` in config.yml
4. Check logs: `docker logs guardden-bot -f`

### AI not working

**Check:**
1. `ai_moderation.enabled: true` in config.yml
2. `GUARDDEN_AI_PROVIDER` set to `anthropic` or `openai` in .env
3. API key is set in .env (`GUARDDEN_ANTHROPIC_API_KEY` or `GUARDDEN_OPENAI_API_KEY`)
4. Check logs for API errors

### High AI costs

**Reduce costs by:**
1. Lower `max_checks_per_hour_per_guild` in config.yml
2. Set `check_embed_images: false` to skip GIF embeds
3. Add known offenders to `blocked_user_ids` blocklist
4. Lower `max_image_size_mb` so that more large files are skipped

---

## License

MIT License - see LICENSE file for details.

---

## Support

- **Issues**: [Report bugs](https://git.hiddenden.cafe/Hiddenden/GuardDen/issues)
- **Developer guidance**: See `CLAUDE.md`
- **Testing**: See `TESTING_TODO.md` for test status

---

## Roadmap

- [ ] Per-guild configuration support
- [ ] Slash commands
- [ ] Custom NSFW thresholds per category
- [ ] Whitelist for trusted image sources
- [ ] Dashboard for viewing stats