Merge pull request 'Add SearXNG web search for current information' (#8) from feature/internet-access into main
Reviewed-on: MSC/Daemon-Boyfriend#8
This commit is contained in:
12
.env.example
12
.env.example
@@ -49,6 +49,18 @@ BOT_STATUS=for mentions
|
|||||||
# Number of messages to remember per user (higher = more context, more tokens)
|
# Number of messages to remember per user (higher = more context, more tokens)
|
||||||
MAX_CONVERSATION_HISTORY=20
|
MAX_CONVERSATION_HISTORY=20
|
||||||
|
|
||||||
|
# ===========================================
|
||||||
|
# Web Search (SearXNG)
|
||||||
|
# ===========================================
|
||||||
|
# SearXNG instance URL for web search (enables the bot to access current information)
|
||||||
|
SEARXNG_URL=https://search.example.com
|
||||||
|
|
||||||
|
# Enable/disable web search capability (true/false)
|
||||||
|
SEARXNG_ENABLED=true
|
||||||
|
|
||||||
|
# Maximum number of search results to fetch (1-20)
|
||||||
|
SEARXNG_MAX_RESULTS=5
|
||||||
|
|
||||||
# ===========================================
|
# ===========================================
|
||||||
# Logging
|
# Logging
|
||||||
# ===========================================
|
# ===========================================
|
||||||
|
|||||||
10
CLAUDE.md
10
CLAUDE.md
@@ -39,11 +39,21 @@ Cogs are auto-loaded by `bot.py` from the `cogs/` directory.
|
|||||||
### Configuration
|
### Configuration
|
||||||
All config flows through `config.py` using pydantic-settings. The `settings` singleton is created at module load, so env vars must be set before importing.
|
All config flows through `config.py` using pydantic-settings. The `settings` singleton is created at module load, so env vars must be set before importing.
|
||||||
|
|
||||||
|
### Web Search
|
||||||
|
The bot can search the web for current information via SearXNG:
|
||||||
|
- `services/searxng.py` provides `SearXNGService` for web queries
|
||||||
|
- `ai_chat.py` uses a two-step approach: first asks AI if search is needed, then provides results as context
|
||||||
|
- Search is triggered automatically when the AI determines the query needs current information
|
||||||
|
- Configured via `SEARXNG_URL`, `SEARXNG_ENABLED`, and `SEARXNG_MAX_RESULTS` env vars
|
||||||
|
|
||||||
### Key Design Decisions
|
### Key Design Decisions
|
||||||
- `ConversationManager` stores per-user chat history in memory with configurable max length
|
- `ConversationManager` stores per-user chat history in memory with configurable max length
|
||||||
- Long AI responses are split via `split_message()` in `ai_chat.py` to respect Discord's 2000 char limit
|
- Long AI responses are split via `split_message()` in `ai_chat.py` to respect Discord's 2000 char limit
|
||||||
- The bot responds only to @mentions via `on_message` listener
|
- The bot responds only to @mentions via `on_message` listener
|
||||||
|
- Web search uses AI to decide when to search, avoiding unnecessary API calls for general knowledge questions
|
||||||
|
|
||||||
## Environment Variables
|
## Environment Variables
|
||||||
|
|
||||||
Required: `DISCORD_TOKEN`, plus one of `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` depending on `AI_PROVIDER` setting.
|
Required: `DISCORD_TOKEN`, plus one of `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY` depending on `AI_PROVIDER` setting.
|
||||||
|
|
||||||
|
Optional: `SEARXNG_URL` for web search capability.
|
||||||
|
|||||||
14
README.md
14
README.md
@@ -5,6 +5,7 @@ A customizable Discord bot that responds to @mentions with AI-generated response
|
|||||||
## Features
|
## Features
|
||||||
|
|
||||||
- **Multi-Provider AI**: Supports OpenAI, OpenRouter, Anthropic (Claude), and Google Gemini
|
- **Multi-Provider AI**: Supports OpenAI, OpenRouter, Anthropic (Claude), and Google Gemini
|
||||||
|
- **Web Search**: Access current information via SearXNG integration
|
||||||
- **Fully Customizable**: Configure bot name, personality, and behavior
|
- **Fully Customizable**: Configure bot name, personality, and behavior
|
||||||
- **Conversation Memory**: Remembers context per user
|
- **Conversation Memory**: Remembers context per user
|
||||||
- **Easy Deployment**: Docker support included
|
- **Easy Deployment**: Docker support included
|
||||||
@@ -75,6 +76,16 @@ All configuration is done via environment variables in `.env`.
|
|||||||
| `AI_TEMPERATURE` | `0.7` | Response creativity (0.0-2.0) |
|
| `AI_TEMPERATURE` | `0.7` | Response creativity (0.0-2.0) |
|
||||||
| `MAX_CONVERSATION_HISTORY` | `20` | Messages to remember per user |
|
| `MAX_CONVERSATION_HISTORY` | `20` | Messages to remember per user |
|
||||||
|
|
||||||
|
### Web Search (SearXNG)
|
||||||
|
|
||||||
|
| Variable | Default | Description |
|
||||||
|
|----------|---------|-------------|
|
||||||
|
| `SEARXNG_URL` | (none) | SearXNG instance URL |
|
||||||
|
| `SEARXNG_ENABLED` | `true` | Enable/disable web search |
|
||||||
|
| `SEARXNG_MAX_RESULTS` | `5` | Max search results to fetch |
|
||||||
|
|
||||||
|
When configured, the bot automatically searches the web for queries that need current information (news, weather, etc.).
|
||||||
|
|
||||||
### Example Configurations
|
### Example Configurations
|
||||||
|
|
||||||
**Friendly Assistant:**
|
**Friendly Assistant:**
|
||||||
@@ -143,7 +154,8 @@ src/daemon_boyfriend/
|
|||||||
└── services/
|
└── services/
|
||||||
├── ai_service.py # AI provider factory
|
├── ai_service.py # AI provider factory
|
||||||
├── providers/ # AI providers
|
├── providers/ # AI providers
|
||||||
└── conversation.py # Chat history
|
├── conversation.py # Chat history
|
||||||
|
└── searxng.py # Web search service
|
||||||
```
|
```
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|||||||
@@ -7,7 +7,7 @@ import discord
|
|||||||
from discord.ext import commands
|
from discord.ext import commands
|
||||||
|
|
||||||
from daemon_boyfriend.config import settings
|
from daemon_boyfriend.config import settings
|
||||||
from daemon_boyfriend.services import AIService, ConversationManager, Message
|
from daemon_boyfriend.services import AIService, ConversationManager, Message, SearXNGService
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -71,6 +71,9 @@ class AIChatCog(commands.Cog):
|
|||||||
self.bot = bot
|
self.bot = bot
|
||||||
self.ai_service = AIService()
|
self.ai_service = AIService()
|
||||||
self.conversations = ConversationManager()
|
self.conversations = ConversationManager()
|
||||||
|
self.search_service: SearXNGService | None = None
|
||||||
|
if settings.searxng_enabled and settings.searxng_url:
|
||||||
|
self.search_service = SearXNGService(settings.searxng_url)
|
||||||
|
|
||||||
@commands.Cog.listener()
|
@commands.Cog.listener()
|
||||||
async def on_message(self, message: discord.Message) -> None:
|
async def on_message(self, message: discord.Message) -> None:
|
||||||
@@ -138,10 +141,23 @@ class AIChatCog(commands.Cog):
|
|||||||
# Add current message to history for the API call
|
# Add current message to history for the API call
|
||||||
messages = history + [Message(role="user", content=user_message)]
|
messages = history + [Message(role="user", content=user_message)]
|
||||||
|
|
||||||
|
# Check if we should search the web
|
||||||
|
search_context = await self._maybe_search(user_message)
|
||||||
|
|
||||||
|
# Build system prompt with search context if available
|
||||||
|
system_prompt = self.ai_service.get_system_prompt()
|
||||||
|
if search_context:
|
||||||
|
system_prompt += (
|
||||||
|
"\n\n--- Web Search Results ---\n"
|
||||||
|
"Use the following current information from the web to help answer the user's question. "
|
||||||
|
"Cite sources when relevant.\n\n"
|
||||||
|
f"{search_context}"
|
||||||
|
)
|
||||||
|
|
||||||
# Generate response
|
# Generate response
|
||||||
response = await self.ai_service.chat(
|
response = await self.ai_service.chat(
|
||||||
messages=messages,
|
messages=messages,
|
||||||
system_prompt=self.ai_service.get_system_prompt(),
|
system_prompt=system_prompt,
|
||||||
)
|
)
|
||||||
|
|
||||||
# Save the exchange to history
|
# Save the exchange to history
|
||||||
@@ -154,6 +170,64 @@ class AIChatCog(commands.Cog):
|
|||||||
|
|
||||||
return response.content
|
return response.content
|
||||||
|
|
||||||
|
async def _maybe_search(self, query: str) -> str | None:
|
||||||
|
"""Determine if a search is needed and perform it.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
query: The user's message
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Formatted search results or None if search not needed/available
|
||||||
|
"""
|
||||||
|
if not self.search_service:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Ask the AI if this query needs current information
|
||||||
|
decision_prompt = (
|
||||||
|
"You are a search decision assistant. Your ONLY job is to decide if the user's "
|
||||||
|
"question requires current/real-time information from the internet.\n\n"
|
||||||
|
"Respond with ONLY 'SEARCH: <query>' if a web search would help answer the question "
|
||||||
|
"(replace <query> with optimal search terms), or 'NO_SEARCH' if the question can be "
|
||||||
|
"answered with general knowledge.\n\n"
|
||||||
|
"Examples that NEED search:\n"
|
||||||
|
"- Current events, news, recent happenings\n"
|
||||||
|
"- Current weather, stock prices, sports scores\n"
|
||||||
|
"- Latest version of software, current documentation\n"
|
||||||
|
"- Information about specific people, companies, or products that may have changed\n"
|
||||||
|
"- 'What time is it in Tokyo?' or any real-time data\n\n"
|
||||||
|
"Examples that DON'T need search:\n"
|
||||||
|
"- General knowledge, science, math, history\n"
|
||||||
|
"- Coding help, programming concepts\n"
|
||||||
|
"- Personal advice, opinions, creative writing\n"
|
||||||
|
"- Explanations of concepts or 'how does X work'"
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
decision = await self.ai_service.chat(
|
||||||
|
messages=[Message(role="user", content=query)],
|
||||||
|
system_prompt=decision_prompt,
|
||||||
|
)
|
||||||
|
|
||||||
|
response_text = decision.content.strip()
|
||||||
|
|
||||||
|
if response_text.startswith("SEARCH:"):
|
||||||
|
search_query = response_text[7:].strip()
|
||||||
|
logger.info(f"AI decided to search for: {search_query}")
|
||||||
|
|
||||||
|
results = await self.search_service.search(
|
||||||
|
query=search_query,
|
||||||
|
max_results=settings.searxng_max_results,
|
||||||
|
)
|
||||||
|
|
||||||
|
if results:
|
||||||
|
return self.search_service.format_results_for_context(results)
|
||||||
|
|
||||||
|
return None
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning(f"Search decision/execution failed: {e}")
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
async def setup(bot: commands.Bot) -> None:
|
async def setup(bot: commands.Bot) -> None:
|
||||||
"""Load the AI Chat cog."""
|
"""Load the AI Chat cog."""
|
||||||
|
|||||||
@@ -61,6 +61,11 @@ class Settings(BaseSettings):
|
|||||||
20, description="Max messages to keep in conversation memory per user"
|
20, description="Max messages to keep in conversation memory per user"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# SearXNG Configuration
|
||||||
|
searxng_url: str | None = Field(None, description="SearXNG instance URL for web search")
|
||||||
|
searxng_enabled: bool = Field(True, description="Enable web search capability")
|
||||||
|
searxng_max_results: int = Field(5, ge=1, le=20, description="Maximum search results to fetch")
|
||||||
|
|
||||||
def get_api_key(self) -> str:
|
def get_api_key(self) -> str:
|
||||||
"""Get the API key for the configured provider."""
|
"""Get the API key for the configured provider."""
|
||||||
key_map = {
|
key_map = {
|
||||||
|
|||||||
@@ -3,10 +3,12 @@
|
|||||||
from .ai_service import AIService
|
from .ai_service import AIService
|
||||||
from .conversation import ConversationManager
|
from .conversation import ConversationManager
|
||||||
from .providers import AIResponse, Message
|
from .providers import AIResponse, Message
|
||||||
|
from .searxng import SearXNGService
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
"AIService",
|
"AIService",
|
||||||
"AIResponse",
|
"AIResponse",
|
||||||
"Message",
|
"Message",
|
||||||
"ConversationManager",
|
"ConversationManager",
|
||||||
|
"SearXNGService",
|
||||||
]
|
]
|
||||||
|
|||||||
107
src/daemon_boyfriend/services/searxng.py
Normal file
107
src/daemon_boyfriend/services/searxng.py
Normal file
@@ -0,0 +1,107 @@
|
|||||||
|
"""SearXNG search service for web queries."""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from dataclasses import dataclass
|
||||||
|
|
||||||
|
import aiohttp
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class SearchResult:
|
||||||
|
"""A single search result."""
|
||||||
|
|
||||||
|
title: str
|
||||||
|
url: str
|
||||||
|
content: str
|
||||||
|
|
||||||
|
|
||||||
|
class SearXNGService:
|
||||||
|
"""Service for searching the web via SearXNG."""
|
||||||
|
|
||||||
|
def __init__(self, base_url: str, timeout: int = 10) -> None:
|
||||||
|
"""Initialize the SearXNG service.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
base_url: The base URL of the SearXNG instance
|
||||||
|
timeout: Request timeout in seconds
|
||||||
|
"""
|
||||||
|
self.base_url = base_url.rstrip("/")
|
||||||
|
self.timeout = timeout
|
||||||
|
|
||||||
|
async def search(
|
||||||
|
self,
|
||||||
|
query: str,
|
||||||
|
categories: str = "general",
|
||||||
|
max_results: int = 5,
|
||||||
|
) -> list[SearchResult]:
|
||||||
|
"""Search the web using SearXNG.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
query: The search query
|
||||||
|
categories: Search categories (general, images, news, etc.)
|
||||||
|
max_results: Maximum number of results to return
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of search results
|
||||||
|
"""
|
||||||
|
url = f"{self.base_url}/search"
|
||||||
|
params = {
|
||||||
|
"q": query,
|
||||||
|
"format": "json",
|
||||||
|
"categories": categories,
|
||||||
|
}
|
||||||
|
|
||||||
|
logger.debug(f"Searching SearXNG for: {query}")
|
||||||
|
|
||||||
|
try:
|
||||||
|
async with aiohttp.ClientSession() as session:
|
||||||
|
async with session.get(
|
||||||
|
url,
|
||||||
|
params=params,
|
||||||
|
timeout=aiohttp.ClientTimeout(total=self.timeout),
|
||||||
|
) as response:
|
||||||
|
if response.status != 200:
|
||||||
|
logger.error(f"SearXNG returned status {response.status}")
|
||||||
|
return []
|
||||||
|
|
||||||
|
data = await response.json()
|
||||||
|
results = []
|
||||||
|
|
||||||
|
for item in data.get("results", [])[:max_results]:
|
||||||
|
results.append(
|
||||||
|
SearchResult(
|
||||||
|
title=item.get("title", ""),
|
||||||
|
url=item.get("url", ""),
|
||||||
|
content=item.get("content", ""),
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
logger.debug(f"SearXNG returned {len(results)} results")
|
||||||
|
return results
|
||||||
|
|
||||||
|
except aiohttp.ClientError as e:
|
||||||
|
logger.error(f"SearXNG request failed: {e}")
|
||||||
|
return []
|
||||||
|
except TimeoutError:
|
||||||
|
logger.error("SearXNG request timed out")
|
||||||
|
return []
|
||||||
|
|
||||||
|
def format_results_for_context(self, results: list[SearchResult]) -> str:
|
||||||
|
"""Format search results as context for the AI.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
results: List of search results
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Formatted string with search results
|
||||||
|
"""
|
||||||
|
if not results:
|
||||||
|
return "No search results found."
|
||||||
|
|
||||||
|
formatted = []
|
||||||
|
for i, result in enumerate(results, 1):
|
||||||
|
formatted.append(f"[{i}] {result.title}\n URL: {result.url}\n {result.content}")
|
||||||
|
|
||||||
|
return "\n\n".join(formatted)
|
||||||
Reference in New Issue
Block a user