浏览器（带 Cookie）
  → FastAPI 路由（提取 request.headers["cookie"]）
    → LangChain Engine（注入 RunnableConfig.configurable）
      → @tool 函数（从 config 中读取 cookie，附加到 HTTP 请求头）
        → 外部服务（鉴权通过 ✅）

关键代码说明

1. 定义工具 — 通过 `RunnableConfig` 接收额外参数

在 app/tools.py 中，工具函数通过声明 config: RunnableConfig 参数来接收运行时上下文：

from langchain_core.runnables import RunnableConfig
from langchain_core.tools import tool

@tool
async def list_user_databases(
    page: int = 1,
    page_size: int = 20,
    *,
    config: RunnableConfig,        # ← LangChain 自动注入，不会暴露给 LLM
) -> str:
    """获取当前用户创建的数据库列表。"""

    # 从 config 中提取用户鉴权信息
    user_cookie = config.get("configurable", {}).get("user_cookie", "")

    # 将鉴权信息透传到外部服务请求中
    headers = {"Cookie": user_cookie}
    async with httpx.AsyncClient() as client:
        resp = await client.get("http://external-service/api/databases", headers=headers)
        return resp.text

⚠️ 重要：config 参数的类型必须是 RunnableConfig（不能是 Optional[RunnableConfig]）。 LangChain 通过参数类型注解来识别并注入 config，如果用了 Optional 包装，LangChain 将无法识别，config 会是 None。

2. 调用工具时注入用户上下文

在 app/engine.py 中，调用 LLM 和执行工具时，通过 config 参数传递用户信息：

from langchain_core.runnables import RunnableConfig

# 构建包含用户信息的 config
config: RunnableConfig = {
    "configurable": {
        "user_cookie": user_cookie,   # 从 HTTP 请求头中提取
        "user_token": user_token,     # 也可以传递 Bearer Token
    }
}

# 调用 LLM 时传入 config（LLM 本身不使用，但会透传给工具）
response = await llm_with_tools.ainvoke(messages, config=config)

# 执行工具时传入同一个 config → 工具函数就能拿到 user_cookie
tool_result = await tool_func.ainvoke(tool_args, config=config)

3. 路由层提取用户原始请求信息

在 app/routes_chat.py 中，从 FastAPI 的 Request 对象提取浏览器发来的 Cookie：

@router.post("/stream")
async def stream_message(req: SendMessageRequest, request: Request, ...):
    # 提取用户浏览器发来的原始 Cookie 字符串
    user_cookie = request.headers.get("cookie", "")

    # 传递给聊天引擎，最终会到达工具函数
    async for event in chat_stream(
        db=db, session_id=req.session_id,
        user_message=req.message,
        user_cookie=user_cookie,       # ← 透传
    ):
        yield f"data: {json.dumps(event)}\n\n"

工具调用循环流程

LLM 的工具调用不是一次完成的，而是一个多轮循环：

用户: "帮我查一下我创建了哪些数据库"

  [第 1 轮] LLM 返回 tool_calls: [{name: "list_user_databases", args: {}}]
            → 执行工具，得到 JSON 结果
            → 将结果作为 ToolMessage 加入消息列表

  [第 2 轮] LLM 收到工具结果，生成最终文本回复
            → "您目前创建了 5 个数据库，分别是..."

本项目使用 astream 流式接口统一处理工具调用和文本输出，避免重复调用 LLM。

如何扩展自己的工具

在 app/tools.py 中添加：

@tool
async def my_new_tool(
    param1: str,
    param2: int = 10,
    *,
    config: RunnableConfig,
) -> str:
    """工具描述（LLM 会根据这段描述决定何时调用此工具）。"""
    user_cookie = config.get("configurable", {}).get("user_cookie", "")

    async with httpx.AsyncClient() as client:
        resp = await client.get(
            "https://your-service.com/api/xxx",
            headers={"Cookie": user_cookie},
            params={"param1": param1, "param2": param2},
        )
        return resp.text

然后添加到 ALL_TOOLS 列表：

ALL_TOOLS = [
    list_user_databases,
    get_database_detail,
    call_authenticated_api,
    my_new_tool,                # ← 新增
]

项目结构

app/
├── main.py            # FastAPI 入口 + 生命周期管理
├── config.py          # 环境变量配置
├── database.py        # 异步 SQLAlchemy 数据库连接
├── models.py          # 数据模型（User / ChatSession / ChatMessage）
├── auth.py            # JWT 认证 + bcrypt 密码哈希
├── engine.py          # ★ LangChain 聊天引擎（工具调用循环 + 流式输出）
├── tools.py           # ★ 工具定义（鉴权透传的核心实现）
├── routes_auth.py     # 认证 API（注册 / 登录 / 登出）
└── routes_chat.py     # 聊天 API（会话管理 / 消息发送 / SSE 流式）

templates/
├── login.html         # 登录 / 注册页面
└── index.html         # 聊天页面

mock_server.py         # Mock 外部服务（用于测试工具调用 + 鉴权透传）

快速开始

# 1. 安装依赖
pip install -r requirements.txt

# 2. 配置环境变量
cp .env.example .env
# 编辑 .env，填入 OPENAI_API_KEY 等配置

# 3. 启动 Mock 外部服务（可选，用于测试工具调用）
python3 mock_server.py

# 4. 启动主服务
python3 run.py

# 5. 访问 http://localhost:8000

踩坑记录

问题	原因	解决方案
工具函数收不到 config	`config: Optional[RunnableConfig] = None` 类型不对	改为 `config: RunnableConfig`（keyword-only 参数）
passlib + bcrypt 5.x 报错	passlib 与新版 bcrypt 不兼容	弃用 passlib，直接使用 `bcrypt` 原生 API
工具调用场景下响应慢	`chat_stream` 先 `ainvoke` 再 `astream`，多调了一次 LLM	全程只用 `astream`，通过收集 `tool_call_chunks` 判断