feat: Add @codebot explain-diff command for plain-language PR explanations

Implements code diff explainer that translates technical changes into plain language for non-technical stakeholders (PMs, designers, new team members).

Features:
- Plain-language explanations without jargon
- File-by-file breakdown with 'what' and 'why' context
- Architecture impact analysis
- Breaking change detection
- Perfect for onboarding and cross-functional reviews

Implementation:
- Added explain_diff.md prompt template with plain-language guidelines
- Implemented _handle_explain_diff_command() in PRAgent
- Added _format_diff_explanation() for readable markdown
- Updated PRAgent.can_handle() to route explain-diff commands
- Added 'explain-diff' to config.yml commands list

Workflow Safety (prevents duplicate runs):
- Added '@codebot explain-diff' to ai-comment-reply.yml conditions
- Excluded from ai-chat.yml to prevent duplication
- Only triggers on PR comments (not issues)
- Manual command only (no automatic triggering)

Testing:
- 9 comprehensive tests in TestDiffExplanation class
- Tests command detection, formatting, plain-language output
- Verifies prompt formatting and empty section handling

Documentation:
- Updated README.md with explain-diff command and examples
- Added detailed implementation guide in CLAUDE.md
- Included plain-language rules and use cases

Related: Milestone 2 high-priority feature - code diff explainer
2025-12-29 12:44:54 +00:00
parent 1d468e360e
commit 37f3eb45d0
8 changed files with 680 additions and 3 deletions
@@ -100,7 +100,15 @@ class PRAgent(BaseAgent):
                )
                has_summarize = f"{mention_prefix} summarize" in comment_body.lower()
                has_changelog = f"{mention_prefix} changelog" in comment_body.lower()
-                return is_pr and (has_review_again or has_summarize or has_changelog)
+                has_explain_diff = (
+                    f"{mention_prefix} explain-diff" in comment_body.lower()
+                )
+                return is_pr and (
+                    has_review_again
+                    or has_summarize
+                    or has_changelog
+                    or has_explain_diff
+                )

        return False

@@ -116,6 +124,8 @@ class PRAgent(BaseAgent):
                return self._handle_summarize_command(context)
            elif f"{mention_prefix} changelog" in comment_body.lower():
                return self._handle_changelog_command(context)
+            elif f"{mention_prefix} explain-diff" in comment_body.lower():
+                return self._handle_explain_diff_command(context)
            elif f"{mention_prefix} review-again" in comment_body.lower():
                return self._handle_review_again(context)
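
The routing in the two hunks above relies on case-insensitive substring checks against the comment body, evaluated in a fixed order. A minimal sketch of that pattern follows. It assumes a hypothetical `detect_command` helper, a hypothetical `COMMANDS` tuple, and `@codebot` as the mention prefix; none of these names are part of this commit. Unlike the diff, the sketch also lowercases the prefix.

```python
from typing import Optional

# Hypothetical command list, ordered like the elif chain visible in the hunk above.
COMMANDS = ("summarize", "changelog", "explain-diff", "review-again")


def detect_command(comment_body: str, mention_prefix: str = "@codebot") -> Optional[str]:
    """Return the first known command mentioned in the comment, or None."""
    body = comment_body.lower()
    prefix = mention_prefix.lower()
    for command in COMMANDS:
        # Same substring test as the diff: "<prefix> <command>" anywhere in the body.
        if f"{prefix} {command}" in body:
            return command
    return None
```

Because the first match wins, a comment that mentions several commands is handled by whichever appears earliest in the list, which mirrors how an `elif` chain behaves.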

@@ -1211,3 +1221,193 @@ class PRAgent(BaseAgent):
                lines.append(f"- **Main components:** {', '.join(components)}")

        return "\n".join(lines)
+
+    def _handle_explain_diff_command(self, context: AgentContext) -> AgentResult:
+        """Handle @codebot explain-diff command from PR comments.
+
+        Generates plain-language explanation of code changes for non-technical stakeholders.
+
+        Args:
+            context: Agent context with event data
+
+        Returns:
+            AgentResult with success status and actions taken
+        """
+        issue = context.event_data.get("issue", {})
+        pr_number = issue.get("number")
+        comment_author = (
+            context.event_data.get("comment", {}).get("user", {}).get("login", "user")
+        )
+
+        self.logger.info(
+            f"Generating diff explanation for PR #{pr_number} at user request"
+        )
+
+        try:
+            # Get PR data
+            pr = self.gitea.get_pull_request(context.owner, context.repo, pr_number)
+            pr_title = pr.get("title", "")
+            pr_description = pr.get("body", "")
+
+            # Get PR diff
+            diff = self._get_diff(context.owner, context.repo, pr_number)
+            if not diff.strip():
+                error_msg = (
+                    f"@{comment_author}\n\n"
+                    f"{self.AI_DISCLAIMER}\n\n"
+                    "**⚠️ Diff Explanation Failed**\n\n"
+                    "No changes found in this PR to explain."
+                )
+                self.gitea.create_issue_comment(
+                    context.owner, context.repo, pr_number, error_msg
+                )
+                return AgentResult(
+                    success=False,
+                    message=f"No diff to explain for PR #{pr_number}",
+                )
+
+            # Load explain_diff prompt
+            prompt_template = self.load_prompt("explain_diff")
+            prompt = prompt_template.format(
+                pr_title=pr_title,
+                pr_description=pr_description or "(No description provided)",
+            )
+            prompt = f"{prompt}\n{diff}"
+
+            # Call LLM to generate explanation
+            result = self.call_llm_json(prompt)
+
+            # Format the explanation comment
+            explanation_comment = self._format_diff_explanation(result, pr_number)
+
+            # Post explanation comment
+            self.gitea.create_issue_comment(
+                context.owner, context.repo, pr_number, explanation_comment
+            )
+
+            return AgentResult(
+                success=True,
+                message=f"Generated diff explanation for PR #{pr_number}",
+                actions_taken=["Posted diff explanation comment"],
+            )
+
+        except Exception as e:
+            self.logger.error(f"Failed to generate diff explanation: {e}")
+
+            # Post error message
+            error_msg = (
+                f"@{comment_author}\n\n"
+                f"{self.AI_DISCLAIMER}\n\n"
+                "**⚠️ Diff Explanation Failed**\n\n"
+                f"I encountered an error while generating the explanation: {str(e)}\n\n"
+                "This could be due to:\n"
+                "- The PR is too large to analyze\n"
+                "- The LLM service is temporarily unavailable\n"
+                "- An unexpected error occurred"
+            )
+            self.gitea.create_issue_comment(
+                context.owner, context.repo, pr_number, error_msg
+            )
+
+            return AgentResult(
+                success=False,
+                message=f"Failed to generate diff explanation for PR #{pr_number}",
+                error=str(e),
+            )
+
+    def _format_diff_explanation(self, explanation_data: dict, pr_number: int) -> str:
+        """Format diff explanation data into readable markdown.
+
+        Args:
+            explanation_data: JSON data from LLM containing explanation
+            pr_number: PR number for reference
+
+        Returns:
+            Formatted markdown explanation
+        """
+        lines = [
+            self.AI_DISCLAIMER,
+            "",
+            f"## 📖 Code Changes Explained (PR #{pr_number})",
+            "",
+        ]
+
+        # Overview
+        overview = explanation_data.get("overview", "")
+        if overview:
+            lines.append("### 🎯 Overview")
+            lines.append(overview)
+            lines.append("")
+
+        # Key changes
+        key_changes = explanation_data.get("key_changes", [])
+        if key_changes:
+            lines.append("### 🔍 What Changed")
+            lines.append("")
+            # Status emoji lookup, built once rather than on every iteration
+            status_emoji = {"new": "➕", "modified": "📝", "deleted": "🗑️"}
+            for change in key_changes:
+                file_path = change.get("file", "unknown")
+                status = change.get("status", "modified")
+                explanation = change.get("explanation", "")
+                why_it_matters = change.get("why_it_matters", "")
+
+                emoji = status_emoji.get(status, "📝")
+
+                lines.append(f"#### {emoji} `{file_path}` ({status})")
+                lines.append(f"**What changed:** {explanation}")
+                if why_it_matters:
+                    lines.append(f"**Why it matters:** {why_it_matters}")
+                lines.append("")
+
+        # Architecture impact
+        arch_impact = explanation_data.get("architecture_impact", {})
+        if arch_impact and arch_impact.get("description"):
+            lines.append("---")
+            lines.append("")
+            lines.append("### 🏗️ Architecture Impact")
+            lines.append(arch_impact.get("description", ""))
+            lines.append("")
+
+            new_deps = arch_impact.get("new_dependencies", [])
+            if new_deps:
+                lines.append("**New dependencies:**")
+                for dep in new_deps:
+                    lines.append(f"- {dep}")
+                lines.append("")
+
+            affected = arch_impact.get("affected_components", [])
+            if affected:
+                lines.append("**Affected components:**")
+                for comp in affected:
+                    lines.append(f"- {comp}")
+                lines.append("")
+
+        # Breaking changes
+        breaking = explanation_data.get("breaking_changes", [])
+        if breaking:
+            lines.append("---")
+            lines.append("")
+            lines.append("### ⚠️ Breaking Changes")
+            for change in breaking:
+                lines.append(f"- **{change}**")
+            lines.append("")
+
+        # Technical details
+        tech = explanation_data.get("technical_details", {})
+        if tech:
+            lines.append("---")
+            lines.append("")
+            lines.append("### 📊 Technical Summary")
+
+            files = tech.get("files_changed", 0)
+            additions = tech.get("insertions", 0)
+            deletions = tech.get("deletions", 0)
+            lines.append(f"- **Files changed:** {files}")
+            lines.append(f"- **Lines:** +{additions} / -{deletions}")
+
+            components = tech.get("main_components", [])
+            if components:
+                lines.append(f"- **Components:** {', '.join(components)}")
+
+        return "\n".join(lines)
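
The "What Changed" portion of `_format_diff_explanation()` can be exercised on its own. Below is a reduced, standalone sketch of that logic. The function name `format_key_changes` and the module-level `STATUS_EMOJI` constant are assumptions made for illustration and do not appear in the commit.

```python
from typing import Dict, List

# Same status-to-emoji mapping as in the diff above.
STATUS_EMOJI = {"new": "➕", "modified": "📝", "deleted": "🗑️"}


def format_key_changes(key_changes: List[Dict[str, str]]) -> str:
    """Render the per-file section; returns an empty string for an empty list."""
    if not key_changes:
        return ""
    lines = ["### 🔍 What Changed", ""]
    for change in key_changes:
        status = change.get("status", "modified")
        # Unknown statuses fall back to the "modified" emoji, as in the diff.
        emoji = STATUS_EMOJI.get(status, "📝")
        lines.append(f"#### {emoji} `{change.get('file', 'unknown')}` ({status})")
        lines.append(f"**What changed:** {change.get('explanation', '')}")
        if change.get("why_it_matters"):
            lines.append(f"**Why it matters:** {change['why_it_matters']}")
        lines.append("")
    return "\n".join(lines)
```

Calling it with an entry whose `why_it_matters` is empty omits that line entirely, which is the "empty section handling" behavior the commit message says the tests verify.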
@@ -68,6 +68,7 @@ interaction:
        - security
        - summarize # Generate PR summary (works on both issues and PRs)
        - changelog # Generate Keep a Changelog format entries (PR comments only)
+        - explain-diff # Explain code changes in plain language (PR comments only)
        - triage
        - review-again

@@ -0,0 +1,99 @@
+You are an experienced technical writer explaining code changes to **non-technical stakeholders** (product managers, designers, business analysts).
+
+Your goal is to translate complex code diffs into **clear, plain-language explanations** that anyone can understand, regardless of their technical background.
+
+---
+
+## Requirements
+
+Analyze the PR diff and generate a structured explanation with:
+
+1. **Overview** - High-level summary in 1-2 sentences (what changed and why)
+2. **Key Changes** - File-by-file breakdown in plain language
+3. **Architecture Impact** - How this affects the overall system
+4. **Breaking Changes** - Any changes that affect existing functionality (if applicable)
+5. **Technical Details** - Summary of files, lines, and components (for reference)
+
+---
+
+## Output Format
+
+Return a JSON object with this structure:
+
+```json
+{{{{
+  "overview": "One or two sentence summary of what this PR accomplishes",
+  "key_changes": [
+    {{{{
+      "file": "path/to/file.py",
+      "status": "new" | "modified" | "deleted",
+      "explanation": "Plain language explanation of what changed in this file",
+      "why_it_matters": "Why this change is important or what problem it solves"
+    }}}}
+  ],
+  "architecture_impact": {{{{
+    "description": "How this affects the overall system architecture",
+    "new_dependencies": ["List of new libraries or services added"],
+    "affected_components": ["List of system components that are impacted"]
+  }}}},
+  "breaking_changes": [
+    "List of changes that break backward compatibility or affect existing features"
+  ],
+  "technical_details": {{{{
+    "files_changed": 15,
+    "insertions": 450,
+    "deletions": 120,
+    "main_components": ["List of main directories/components affected"]
+  }}}}
+}}}}
+```
+
+---
+
+## Rules for Plain Language Explanations
+
+1. **Avoid jargon**: Use everyday language, not technical terms
+   - ❌ Bad: "Refactored the authentication middleware to use JWT tokens"
+   - ✅ Good: "Updated the login system to use secure tokens that expire after 24 hours"
+
+2. **Explain the "why", not just the "what"**
+   - ❌ Bad: "Added new function `calculate_total()`"
+   - ✅ Good: "Added calculation logic to automatically sum up order totals, preventing manual errors"
+
+3. **Use analogies and real-world examples**
+   - ❌ Bad: "Implemented caching layer using Redis"
+   - ✅ Good: "Added a memory system that remembers frequently accessed data, making the app load 10x faster"
+
+4. **Focus on user impact**
+   - ❌ Bad: "Optimized database queries"
+   - ✅ Good: "Made the search feature faster by improving how we retrieve data"
+
+5. **Group related changes together**
+   - Instead of listing 10 small files, say "Updated 10 files across the payment system to fix checkout bugs"
+
+6. **Be specific about impact**
+   - "This change affects all users on the mobile app"
+   - "This only impacts admin users"
+   - "This is internal cleanup with no user-visible changes"
+
+7. **Translate technical concepts**
+   - API → "connection point between systems"
+   - Database migration → "updating how data is stored"
+   - Refactoring → "cleaning up code without changing behavior"
+   - Dependency → "external library or tool we use"
+
+8. **Highlight risks clearly**
+   - "This requires a system restart"
+   - "Users will need to log in again"
+   - "This changes how existing features work"
+
+---
+
+## PR Information
+
+**Title:** {pr_title}
+
+**Description:** {pr_description}
+
+**Diff:**
+
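
Because the handler calls `prompt_template.format(...)` on this file, literal braces in the JSON example have to be escaped by doubling. The sketch below shows how `str.format` collapses doubled braces, using a made-up two-line template instead of the real file. Whether `load_prompt` applies a further formatting pass is not shown in this diff, so the second pass here is an assumption.

```python
# Made-up miniature template; the real explain_diff.md is much longer.
template = '{{{{"overview": "..."}}}}\n**Title:** {pr_title}'

# One pass: quadruple braces become double braces and the placeholder is filled.
once = template.format(pr_title="Add login")

# A second pass (only relevant if another layer formats the text again)
# reduces the double braces to single ones.
twice = once.format()

print(once)
print(twice)
```

With a single `format()` call, the model receives the JSON example with doubled braces. A second pass would also try to interpret any braces in the PR title or description, so it is only safe when those values contain none.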