
CodeSentinel

An automated code security audit system powered by AI large language models, helping developers quickly identify security vulnerabilities in code and providing remediation suggestions.

Key Features

  • Three Iron Rules — Eliminate AI hallucinations at the root: no guessing file paths, no fabricating code, no reporting unseen vulnerabilities (see Core Philosophy)
  • Four-Round Challenge Verification — Every vulnerability undergoes four progressive rounds of challenge on reachability, code logic, data flow, and exploitability, with false positives automatically filtered (see Challenge Mechanism)
  • Code Upload & Parsing — Supports ZIP source code packages with automatic extraction, project structure recognition, and tech stack identification
  • Multi-Model Support — Compatible with multiple LLMs via OpenAI API format (Claude, GPT-4o, DeepSeek, Qwen, Hunyuan, etc.)
  • Model Configuration Management — 10 built-in preset templates, supports custom API endpoints and keys, one-click default model selection
  • AI Security Audit — Covers 14 vulnerability types including SQL injection, XSS, hardcoded secrets, command injection, path traversal, SSRF, plus 8 composite function vulnerability patterns
  • Cross-File Semantic Analysis — Automatically parses import/require dependencies, tracks cross-module data flows and composite function vulnerabilities
  • Batch Concurrency + Checkpoint Resume — Supports large-scale projects (1000+ code chunks), automatic batch concurrent processing, saves progress on timeout for later continuation
  • Security Report Generation — Generates structured audit reports (Markdown + JSON) with vulnerability details, risk levels, and remediation suggestions
  • Real-Time Logs — Displays thinking, findings, progress, warnings, and other log types in real time during auditing
  • Task Management — Asynchronous task processing with audit progress tracking and history

Core Philosophy

Three Iron Rules of Auditing

This project enforces three inviolable hard constraints on AI audit behavior, fundamentally eliminating hallucinations and fabrications to ensure every vulnerability report is evidence-based:

| Rule | Prohibited Behavior | Correct Approach | Violation Consequence |
| --- | --- | --- | --- |
| Rule 1: No Guessing File Paths | Referencing file paths from memory or speculation | Only reference code content actually provided to the AI | All analysis based on non-existent files is invalid |
| Rule 2: No Fabricating Code Snippets | Describing code from impression, or referencing code not actually seen | Must reference actual line numbers and content from the provided code | All vulnerability analysis based on fabricated code is invalid |
| Rule 3: No Reporting Vulnerabilities in Unseen Code | Reporting vulnerabilities without actually seeing the code | See code → analyze code → then report vulnerabilities | Vulnerability reports for unseen code are directly marked invalid |

Violating any iron rule will invalidate the audit results.

Design motivation: Large language models are prone to "hallucinations" in code audit scenarios — fabricating non-existent file paths, inventing code snippets that never appeared, or asserting vulnerabilities in files they never read. These three iron rules constrain AI behavior at the highest priority, anchoring the audit process to actual code evidence, thereby ensuring report credibility and traceability.
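As a sketch of how such constraints can be enforced mechanically (a hypothetical post-processing guard, not this repo's actual code), a finding can be kept only if it cites a file that was really sent to the model and quotes a snippet that actually appears in that file:

```javascript
// Hypothetical guard enforcing Rules 1 and 2 after the model responds:
// drop any finding that references an unseen file or fabricated code.
function validateFinding(finding, providedFiles) {
  const fileContent = providedFiles.get(finding.filePath);
  if (fileContent === undefined) {
    return { valid: false, reason: 'Rule 1: file path was never provided' };
  }
  if (!finding.snippet || !fileContent.includes(finding.snippet)) {
    return { valid: false, reason: 'Rule 2: quoted code not found in file' };
  }
  return { valid: true };
}
```

Rule 3 follows from the same check: a finding that cannot point at real, provided code is discarded rather than reported.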

Core Audit Principle

Code defect exists ≠ Vulnerability is exploitable

The system requires the AI to verify the following 9 dimensions based on actual code before reporting each vulnerability:

| # | Verification Dimension | Description |
| --- | --- | --- |
| 1 | Defect Authenticity | Whether the code defect truly exists, and whether overlooked upstream protections exist |
| 2 | Path Reachability | Whether the code path is reachable (excluding dead code, legacy code, unsatisfied conditions) |
| 3 | Input Reachability | Whether user input can actually reach the danger point |
| 4 | Practical Exploitability | Whether an attacker can exploit it in a real environment |
| 5 | Systematic Design | Whether the pattern is a systematic framework design rather than an individual oversight |
| 6 | Source Type | Whether the source is external user input rather than trusted server-side code |
| 7 | Self-Attack Test | Whether the prerequisite privileges already exceed the vulnerability's own capability |
| 8 | Design Intent | Whether the behavior is the framework's intended design rather than a defect |
| 9 | Runtime Feasibility | Whether the theoretical attack is feasible in the actual runtime environment |

Incorrectly labeling secure code as a "vulnerability" is misleading. Accuracy over quantity — one accurate vulnerability report is far better than ten false positives.

Four-Round Challenge Verification Mechanism

Every potential vulnerability detected by AI must pass four progressive rounds of challenge verification before being written to the final report. Failure in any round affects the final determination:

```
Potential Vuln → Round 0 → Round 1 → Round 2 → Round 3 → Final Verdict
                    │          │          │          │
                    ▼          ▼          ▼          ▼
              Reachability  Code Logic  Data Flow  Exploitability
```

Detailed rules for each round:

| Round | Name | Challenge Question | Pass Condition | Elimination Condition |
| --- | --- | --- | --- | --- |
| Round 0 | Reachability & Design Intent | Is the code path reachable? | Path reachable and not design behavior | Dead code / Legacy code / Intended design behavior |
| Round 1 | Code Logic Challenge | Does the dangerous code pattern truly exist? | Confidence is MEDIUM or HIGH | Confidence is LOW → Direct elimination |
| Round 2 | Data Flow Challenge | Can user input reach the danger point? | Clear data flow description exists (>10 chars) | Data flow unclear or non-existent |
| Round 3 | Exploitability Challenge | Can an attacker construct an effective attack? | Confidence HIGH or severity CRITICAL | Insufficient confidence for regular vulns (composite vulns have exemption rules) |

Final verdict criteria:

| Rounds Passed | Verdict Status | Action |
| --- | --- | --- |
| 4/4 passed | passed — Confirmed vulnerability | Written to final report |
| 2-3/4 passed | partial — Under observation | Written to report with pending verification note |
| 0-1/4 passed | failed — False positive | Filtered out, excluded from final report |

Composite function vulnerabilities (those involving cross-file data flows) are subject to a special exemption in Round 3: they pass as long as confidence is not LOW, because cross-file vulnerabilities are inherently harder to confirm yet often more damaging.
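The verdict rules above can be sketched in a few lines (hypothetical helper names, not the repo's actual code):

```javascript
// Round 3 pass condition, including the composite-vulnerability exemption.
function roundThreePasses(finding) {
  if (finding.isComposite) return finding.confidence !== 'LOW'; // exemption
  return finding.confidence === 'HIGH' || finding.severity === 'CRITICAL';
}

// Maps the number of rounds passed (0..4) to the final verdict.
function finalVerdict(roundsPassed) {
  if (roundsPassed === 4) return 'passed';  // confirmed vulnerability
  if (roundsPassed >= 2) return 'partial';  // under observation
  return 'failed';                          // false positive, filtered out
}
```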

Supported Vulnerability Detection Types

Single-Point Vulnerabilities (14 Types)

| Vulnerability Type | CWE ID | Default Level | Description |
| --- | --- | --- | --- |
| SQL Injection | CWE-89 | HIGH | Detects SQL concatenation vulnerabilities, distinguishes parameterized queries (safe) from string concatenation (dangerous) |
| XSS (Cross-Site Scripting) | CWE-79 | HIGH | Detects reflected, stored, and DOM-based XSS, covers innerHTML/v-html/dangerouslySetInnerHTML |
| Hardcoded Secrets | CWE-798 | HIGH | Detects API Keys/Tokens/Passwords/Private Keys/Connection Strings, auto-excludes placeholders and env variable references |
| Command Injection | CWE-78 | CRITICAL | Detects dangerous calls like eval/exec/system/child_process |
| Path Traversal | CWE-22 | HIGH | Detects file path manipulation vulnerabilities |
| SSRF | CWE-918 | HIGH | Detects Server-Side Request Forgery vulnerabilities |
| Insecure Deserialization | CWE-502 | HIGH | Detects pickle.loads/ObjectInputStream/unserialize, etc. |
| Authentication Flaws | CWE-287 | HIGH | Detects authentication/authorization implementation flaws |
| Sensitive Data Exposure | CWE-200 | MEDIUM | Detects error message leakage, sensitive data in logs |
| XXE | CWE-611 | HIGH | Detects XML External Entity injection |
| Insecure Randomness | CWE-330 | LOW | Detects weak random number generators in security contexts |
| Prototype Pollution | CWE-1321 | HIGH | Detects JavaScript prototype chain pollution |
| CSRF | CWE-352 | MEDIUM | Detects Cross-Site Request Forgery |
| IDOR | CWE-639 | MEDIUM | Detects Insecure Direct Object References |
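For illustration of the placeholder exclusion mentioned for hardcoded secrets (the project's actual detection is LLM-driven; this heuristic is only a sketch with assumed patterns):

```javascript
// Flags likely hardcoded secrets while skipping obvious placeholders
// and environment-variable references.
const SECRET_RE = /(api[_-]?key|token|password|secret)\s*[:=]\s*['"]([^'"]+)['"]/i;
const PLACEHOLDER_RE = /^(your[-_]|<|\$\{|process\.env|xxx|changeme)/i;

function looksLikeHardcodedSecret(line) {
  const m = SECRET_RE.exec(line);
  return !!m && !PLACEHOLDER_RE.test(m[2]); // m[2] is the quoted value
}
```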

Composite Function Vulnerabilities (8 Patterns)

The system pays special attention to cross-function, cross-file composite security issues:

PatternDescription
Cross-Function Data Flow TaintFunction A receives user input without sanitization → passes to Function B → Function B uses it in dangerous operations
Privilege Escalation ChainNormal user modifies state via Function A → bypasses Function B's permission checks
Race Condition (TOCTOU)Function A checks permissions → Function B modifies data after check but before operation
Error Handling Leak ChainFunction A's exception is caught by Function B → Function B returns error details to client
Auth/AuthZ Bypass ComboCertain function call combinations skip intermediate authentication/authorization checks
Prototype Pollution PropagationObject merge in Function A is polluted → affects Function B's logic decisions
Second-Order InjectionFunction A stores unsanitized user input to database → Function B reads and uses it in dangerous operations
Callback/Event-Driven VulnerabilityData passing between event handler functions lacks validation
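The second-order injection pattern, for example, looks roughly like this (illustrative example code, not from this repo):

```javascript
// Function A persists raw input; Function B later builds SQL from it.
// The taint crosses a function (and often a file) boundary, which is
// why single-chunk analysis alone would miss it.
const savedComments = [];

function storeComment(userInput) {
  savedComments.push(userInput);   // A: stored without sanitization
}

function buildReportQuery(i) {
  // B: dangerous sink, string concatenation instead of parameterization
  return `SELECT * FROM posts WHERE tag = '${savedComments[i]}'`;
}
```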

Technical Architecture

```
┌──────────────────────────────────────────────────────┐
│                   Nginx (Port 80)                    │
│           Static Assets + API Reverse Proxy          │
├────────────────────┬─────────────────────────────────┤
│ React SPA          │ Express API (Port 3001)         │
│ TypeScript + Vite  │ Node.js 20 + MongoDB Driver     │
│ Tailwind + DaisyUI │ AI Model Calls (OpenAI compat.) │
└────────────────────┴──────────┬──────────────────────┘
                                │
                       ┌────────┴────────┐
                       │    MongoDB 7    │
                       │  Data Storage   │
                       └─────────────────┘
```
| Layer | Technology |
| --- | --- |
| Frontend Framework | React 19 + TypeScript |
| Build Tool | Vite 6 |
| UI Styling | Tailwind CSS 3 + DaisyUI 4 |
| Routing | React Router 6 |
| Backend | Node.js 20 + Express 4 |
| Database | MongoDB 7 |
| AI Models | OpenAI-compatible API (Claude / GPT / DeepSeek / Qwen / Hunyuan, etc.) |
| Deployment | Docker Compose (Nginx + Node.js + MongoDB) |

Directory Structure

```
AI_code_review_agent/
├── src/                          # Frontend source code
│   ├── components/               # Reusable components
│   │   ├── FileUpload.tsx        # File upload (ZIP drag & drop)
│   │   ├── TaskProgress.tsx      # Task progress & real-time logs
│   │   ├── ReportViewer.tsx      # Audit report viewer
│   │   ├── VulnerabilityCard.tsx # Vulnerability detail card
│   │   ├── CodeHighlight.tsx     # Code syntax highlighting
│   │   ├── Navbar.tsx            # Top navigation bar
│   │   └── Footer.tsx            # Footer
│   ├── pages/                    # Page components
│   │   ├── HomePage.tsx          # Home page (upload entry)
│   │   ├── TaskPage.tsx          # Task details (progress/logs/report)
│   │   ├── HistoryPage.tsx       # Audit history
│   │   └── SettingsPage.tsx      # Model configuration management
│   ├── types/audit.ts            # TypeScript type definitions
│   ├── utils/api.ts              # API request utilities
│   ├── App.tsx                   # App routing entry
│   └── index.css                 # Global styles
├── server/                       # Backend API service
│   ├── src/
│   │   ├── index.js              # Express entry + route mounting
│   │   ├── routes/
│   │   │   ├── tasks.js          # Task CRUD + report data
│   │   │   ├── audit.js          # Trigger security audit
│   │   │   ├── report.js         # Trigger report generation
│   │   │   └── model-configs.js  # Model configuration management
│   │   ├── services/
│   │   │   ├── ai.js             # AI model calls (OpenAI compatible)
│   │   │   ├── analyzeCode.js    # ZIP extraction + code chunking
│   │   │   ├── securityAudit.js  # Audit engine (concurrency/retry/resume)
│   │   │   └── generateReport.js # Report generation (Markdown + JSON)
│   │   └── utils/db.js           # MongoDB connection & indexes
│   ├── Dockerfile                # Backend container config
│   ├── .env.example              # Environment variable template
│   └── package.json
├── shared/prompts/               # AI audit prompt templates
├── docker-compose.yml            # Docker Compose orchestration
├── Dockerfile                    # Frontend multi-stage build (Vite → Nginx)
├── nginx.conf                    # Nginx reverse proxy config
├── .env                          # Frontend environment variables
└── package.json                  # Frontend dependencies
```

Quick Start

Prerequisites

  • Docker and Docker Compose
  • At least one AI model API key compatible with OpenAI API format

Docker Deployment (Recommended)

```shell
# Clone the project
git clone <repo-url>
cd AI_code_review_agent

# Build and start all services
docker compose up -d

# View logs
docker compose logs -f
```

After startup, visit http://localhost:8080

Custom port:

```shell
APP_PORT=3000 docker compose up -d
```

Stop services:

```shell
docker compose down
```

Local Development

Use local development when you need to modify code with hot-reload debugging.

1. Start MongoDB

```shell
docker run -d --name mongo -p 27017:27017 mongo:7
```

2. Start Backend

```shell
cd server
npm install

# Create .env (or copy .env.example)
cat > .env << EOF
MONGODB_URI=mongodb://localhost:27017/ai_code_review
PORT=3001
DATA_DIR=./data
EOF

npm run dev
```

3. Start Frontend (new terminal)

```shell
# Return to project root
npm install
npm run dev
```

The frontend dev server starts at http://localhost:5173 by default, with API requests automatically proxied to backend localhost:3001.
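The dev proxy is presumably wired through Vite's `server.proxy` option, along these lines (a sketch; the project's actual `vite.config` may differ):

```javascript
// vite.config.js — forward /api/* from the dev server (5173)
// to the Express backend on port 3001.
import { defineConfig } from 'vite';

export default defineConfig({
  server: {
    proxy: {
      '/api': 'http://localhost:3001',
    },
  },
});
```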

Usage

1. Configure AI Model

Configure the model before first use:

  1. Open the app and click "Model Configuration" in the navigation bar
  2. Select a model from preset templates (or click "Add Custom Configuration")
  3. Enter the API Endpoint and API Key
  4. Click "Set as Default" to designate the model for auditing

2. Upload Code for Audit

  1. Click the upload area on the home page or drag & drop a ZIP file
  2. The system automatically extracts, identifies the tech stack, and chunks the code
  3. AI model audits each chunk automatically, with real-time progress and logs displayed
  4. Report is automatically generated after audit completion

3. View Report

  • View the report directly on the task page after audit completion
  • Report includes: risk summary, vulnerability list (with severity level, CWE ID, code location, remediation suggestions)
  • Supports downloading the report in Markdown format

4. History

View all audit tasks on the "History" page, with pagination and deletion support.

Configuration

Environment Variables

Frontend (.env):

| Variable | Default | Description |
| --- | --- | --- |
| VITE_API_BASE_URL | /api | API request path prefix |

Backend (server/.env):

| Variable | Default | Description |
| --- | --- | --- |
| MONGODB_URI | mongodb://mongo:27017/ai_code_review | MongoDB connection URI |
| PORT | 3001 | Backend service port |
| DATA_DIR | /app/data | Upload files and report storage path |
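The backend presumably reads these variables with fallbacks matching the defaults above (a sketch, not the repo's actual code):

```javascript
// Defaults mirror the backend environment-variable table.
const MONGODB_URI = process.env.MONGODB_URI || 'mongodb://mongo:27017/ai_code_review';
const PORT = Number(process.env.PORT || 3001);
const DATA_DIR = process.env.DATA_DIR || '/app/data';
```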

Docker Compose:

| Variable | Default | Description |
| --- | --- | --- |
| APP_PORT | 8080 | Externally exposed access port |

Audit Engine Parameters

| Parameter | Value | Description |
| --- | --- | --- |
| Batch Concurrency | 2 | Number of code chunks audited simultaneously per batch |
| Single Run Limit | 100 chunks | Maximum code chunks per single execution |
| AI Request Timeout | 150s (increments to 210s) | Initial 150s, +30s per retry |
| Max Retries | 2 | Retry count after individual chunk failure |
| Large Chunk Threshold | 120 lines | Auto-split chunks exceeding this line count |
| Safe Exit Threshold | 540s | Saves progress on timeout for later continuation |
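The retry-timeout schedule implied by the table can be expressed as (an assumed helper, not the repo's code):

```javascript
// 150s on the first attempt, +30s per retry, up to the max of 2 retries.
function requestTimeoutMs(attempt) {  // attempt 0 = first try
  const BASE_MS = 150_000;
  const STEP_MS = 30_000;
  return BASE_MS + Math.min(attempt, 2) * STEP_MS; // 150s, 180s, 210s
}
```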

Supported AI Models

10 built-in preset templates, plus support for any OpenAI API-compatible model:

| Model | Description |
| --- | --- |
| GPT-4o | OpenAI flagship model |
| GPT-4o Mini | OpenAI lightweight model |
| Claude Opus 4 | Anthropic flagship model |
| Claude Sonnet 4 | Anthropic cost-effective model |
| DeepSeek V3 | DeepSeek general-purpose model |
| DeepSeek R1 | DeepSeek reasoning model |
| Qwen Max | Alibaba Qwen flagship model |
| Hunyuan Turbo | Tencent Hunyuan high-performance reasoning model |
| Hunyuan Pro | Tencent Hunyuan best-effect model |
| Custom Model | Any endpoint compatible with OpenAI Chat Completions API |
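Every model above is reached through the same Chat Completions request shape; a minimal sketch (endpoint, key, and the system prompt here are placeholders, not the project's actual prompts):

```javascript
// Build the standard Chat Completions payload for one code chunk.
function buildAuditRequest(model, code) {
  return {
    model,
    messages: [
      { role: 'system', content: 'You are a code security auditor.' },
      { role: 'user', content: code },
    ],
  };
}

// POST it to any OpenAI-compatible endpoint (Node 18+ global fetch).
async function auditChunk(endpoint, apiKey, model, code) {
  const res = await fetch(`${endpoint}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildAuditRequest(model, code)),
  });
  if (!res.ok) throw new Error(`AI request failed: HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```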

MongoDB Collections

| Collection | Purpose |
| --- | --- |
| audit_tasks | Audit task info (status, file paths, tech stack, etc.) |
| audit_results | Audit results (vulnerability lists stored per file) |
| audit_logs | Audit logs (real-time progress, thinking process, etc.) |
| audit_code_files | Extracted code chunks |
| audit_vulnerabilities | Temporary vulnerability data (cleaned after report generation) |
| model_configs | AI model configurations |

Common Commands

```shell
# Docker operations
docker compose up -d            # Start services
docker compose down             # Stop services
docker compose logs -f          # View logs
docker compose up -d --build    # Rebuild and start

# Local development
npm run dev                     # Start frontend dev server
npm run build                   # Build frontend
npm run lint                    # ESLint check
npm run format                  # Prettier formatting
cd server && npm run dev        # Start backend dev server
```

License

This project is licensed under the Apache License 2.0.
