06_AI Interface Reference (Initial Version)

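This document describes the AI interfaces that exist today and those that are planned. Currently implemented: POST /ai/chat (smart routing based on Claude intent recognition), which supports both SSE streaming passthrough and non-streaming responses.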

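Global configuration

NEW_API_BASE_URL: address of the new-api aggregation service (default https://www.api.moduoduo.cn)
NEW_API_KEY: access token for new-api
AI_DEFAULT_HUMANITIES_MODEL: temporary default model (default ernie-4.5-turbo-vl)
AI_REQUEST_TIMEOUT: request timeout in seconds (default 600)
AI_LEFT_CHAT_BY_CLAUDE: whether to enable the mode in which Claude handles the left-side chat and outputs a WorkPlan, while a specialist model executes it on the right side (default false)

These settings are read through Settings in app/core/config.py and can be overridden with environment variables.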
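The override behavior can be pictured with a small sketch. It is not the actual contents of app/core/config.py: the `_env` helper and the dataclass layout are assumptions made for illustration, and only the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass, field


def _env(name: str, default: str) -> str:
    """Return the environment variable if set, otherwise the documented default."""
    return os.environ.get(name, default)


@dataclass(frozen=True)
class Settings:
    # Hypothetical stand-in for the Settings class in app/core/config.py.
    NEW_API_BASE_URL: str = field(
        default_factory=lambda: _env("NEW_API_BASE_URL", "https://www.api.moduoduo.cn"))
    NEW_API_KEY: str = field(default_factory=lambda: _env("NEW_API_KEY", ""))
    AI_DEFAULT_HUMANITIES_MODEL: str = field(
        default_factory=lambda: _env("AI_DEFAULT_HUMANITIES_MODEL", "ernie-4.5-turbo-vl"))
    AI_REQUEST_TIMEOUT: int = field(
        default_factory=lambda: int(_env("AI_REQUEST_TIMEOUT", "600")))
    AI_LEFT_CHAT_BY_CLAUDE: bool = field(
        default_factory=lambda: _env("AI_LEFT_CHAT_BY_CLAUDE", "false").lower() == "true")
```

Values are resolved when `Settings()` is constructed, so an environment variable set before startup takes precedence over the default.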

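Aggregated entry point

POST `/ai/chat`

Purpose: chat-style Q&A with smart routing.
- When AI_LEFT_CHAT_BY_CLAUDE=true: Claude alone produces "left-side chat + right-side WorkPlan JSON". The frontend parses the WorkPlan and calls the matching direct endpoint for its type (see below).
- When AI_LEFT_CHAT_BY_CLAUDE=false (default): the server first uses Claude (see the intent service configuration below) to recognize the intent, then routes automatically to the best-suited flow:
  - stem_qa → STEM Q&A (AI_STEM_MODEL)
  - humanities_qa → humanities Q&A (AI_DEFAULT_HUMANITIES_MODEL; also covers general-knowledge questions)
  - vision_qa → image understanding (AI_VISION_MODEL)
  - t2i → text-to-image (calls the /v1/images/generations API)
  - t2v → text-to-video (calls the /doubao/contents/generations/tasks API)
  - codegen → code generation (AI_CODE_MODEL)
Request (JSON):
- messages: [{ role: "system"|"developer"|"user"|"assistant", content: string }]
- stream?: boolean (default true), whether to return Server-Sent Events
Note: if the request contains neither a system nor a developer message, the server injects default prompts to enable the "left-side chat + right-side WorkPlan" dual-channel protocol (see below). If the upstream does not support the developer role, the server automatically merges the developer content into system.
Response:
- stream=true: text/event-stream, one data: {json} event per line, ending with data: [DONE]
- stream=false: the complete JSON in a single response (in the upstream new-api response format)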
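The developer-to-system merge mentioned in the note could look roughly like the sketch below. The function name and the choice to join the parts with a blank line, in their original order, are assumptions; the document only states that developer content is merged into system.

```python
from typing import Dict, List

Message = Dict[str, str]


def merge_developer_into_system(messages: List[Message]) -> List[Message]:
    """Fold system and developer messages into one leading system message."""
    system_parts = [m["content"] for m in messages
                    if m["role"] in ("system", "developer")]
    others = [m for m in messages if m["role"] not in ("system", "developer")]
    if not system_parts:
        return others  # nothing to merge
    merged = {"role": "system", "content": "\n\n".join(system_parts)}
    return [merged] + others
```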

Example (non-streaming):

bash

curl -X POST "$BASE/ai/chat" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "你好！"}
  ],
  "stream": false
}'

Example (streaming, SSE):

python

# test_9mdd_stream.py
import os
import sys
import json
import argparse
import requests


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--base-url", default="http://localhost:8000")
    parser.add_argument("--token", default=os.getenv("TOKEN", ""))
    parser.add_argument("--model", default="ernie-4.5-turbo-vl")
    parser.add_argument("--developer", default="你是一个有帮助的助手。")
    parser.add_argument("--user", default="你好！")
    parser.add_argument("--use-proxy", action="store_true", help="use system proxy from env (default off)")
    parser.add_argument("--insecure", action="store_true", help="disable TLS verification (not recommended)")
    args = parser.parse_args()

    url = f"{args.base_url.rstrip('/')}/ai/chat"
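    headers = {
        "Content-Type": "application/json",
    }
    # Send the JWT as a Bearer token when one is supplied
    # (required when AI_REQUIRE_AUTH is enabled on the server).
    if args.token:
        headers["Authorization"] = f"Bearer {args.token}"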
    payload = {
        "messages": [
            {"role": "user", "content": args.user},
        ],
        "stream": True,
    }

    session = requests.Session()
    if not args.use_proxy:
        session.trust_env = False

    try:
        r = session.post(
            url,
            headers=headers,
            json=payload,
            stream=True,
            timeout=600,
            verify=not args.insecure,
        )
    except Exception as e:
        print("request error:", e)
        sys.exit(1)

    if r.status_code != 200:
        print(f"HTTP {r.status_code}")
        try:
            print(r.text)
        except Exception:
            pass
        sys.exit(1)

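    # requests may fall back to ISO-8859-1 for text/* responses that carry no
    # charset, which would garble Chinese output; decode the stream as UTF-8.
    r.encoding = "utf-8"
    for line in r.iter_lines(decode_unicode=True):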
        if not line:
            continue
        if line.startswith("data: "):
            data = line[6:]
            if data == "[DONE]":
                print("\n[stream end]")
                break
            try:
                obj = json.loads(data)
                choice = (obj.get("choices") or [{}])[0]
                delta = choice.get("delta") or {}
                content = delta.get("content")
                if content:
                    print(content, end="", flush=True)
            except Exception:
                print(data)

if __name__ == "__main__":
    main()

Example SSE data chunks:

json

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{"content":"你好"},"logprobs":null,"finish_reason":null}]}
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini","system_fingerprint":"fp_44709d6fcb","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

Note: the server returns each event as data: {json} and sends data: [DONE] when the stream ends.
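To make the framing concrete, here is a minimal sketch of a helper that rebuilds the assistant text from already-decoded SSE lines. The name `collect_text` is hypothetical; the parsing follows the chunk shape shown in the example above.

```python
import json
from typing import Iterable


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content from SSE lines until data: [DONE]."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or [{}]
        content = (choices[0].get("delta") or {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)
```

The first chunk carries an empty content string and the last carries an empty delta, so both are skipped without special handling.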

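Advantages of the smart-routing architecture

Single entry point: callers use only the /ai/chat endpoint; the system recognizes the intent and picks the handling flow automatically
Specialized handling: each capability has dedicated handling logic tuned for that task
API compatibility: special capabilities such as text-to-image and text-to-video call their dedicated API endpoints
Fallback: when intent recognition fails, the request falls back to humanities Q&A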
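The fallback rule can be sketched as a lookup with a default. The intent labels come from this document; the `ROUTES` values and the function name are placeholders, and the normalization (trimming and lower-casing) is an assumption about how a raw model answer might be cleaned up.

```python
from typing import Optional

# Intent labels from this document; the values are placeholder flow names.
ROUTES = {
    "stem_qa": "stem",
    "humanities_qa": "humanities",
    "vision_qa": "vision",
    "t2i": "image",
    "t2v": "video",
    "codegen": "code",
}
FALLBACK_INTENT = "humanities_qa"


def resolve_intent(raw: Optional[str]) -> str:
    """Return a known intent label, falling back to humanities_qa."""
    if raw is None:
        return FALLBACK_INTENT  # intent recognition produced nothing
    label = raw.strip().lower()
    return label if label in ROUTES else FALLBACK_INTENT
```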

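Dual-channel response protocol (default prompt injection)

When the request supplies none, the server automatically injects two prompts. The prompts themselves are written in Chinese; they are rendered in English here.

system:

You are a learning assistant for primary and secondary school settings. Talk with students naturally, prefer Chinese, and keep a friendly, concise tone. Mind content safety and age appropriateness: do not output violence, pornography, illegal content, or private data, and do not give instructions for dangerous actions. When necessary, ask for clarification before answering.

developer (simplified version; if the upstream does not support the developer role, the server merges it into system):

Answer according to the "dual-channel response protocol": first output the Chat content addressed to the student; if a work needs to be generated on the right side, follow it with exactly one ```json code block whose content is the WorkPlan:
{
  "work": {"need": true|false, "type": "essay|stem|image|video|code|vision|none", "title": "", "engine": "...", "inputs": {...}, "ui": {"preview_markdown": "", "auto_execute": true}}
}
Do not put comments or extra text inside the JSON code block.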

Types and parameter conventions (placed in work.inputs):

essay: { "topic": string, "outline": [string], "length_words": number, "style": string, "constraints": [string] } (long-form text such as essays, explanations, and lesson summaries)
stem: { "problem": string, "outline": [string], "steps_detail": boolean, "format": "markdown"|"text", "constraints": [string] } (STEM explanations and step-by-step solutions)
image: { "prompt": string, "negative_prompt": string, "size": string, "steps": number, "seed": number|null }
video: { "prompt": string, "duration": number, "fps": number, "size": string, "seed": number|null }
code: { "language": string, "task": string, "specs": [string], "tests": string }
vision: { "prompt": string, "images": [ { "type": "image_url"|"image_base64", "url"?: string, "data"?: string } ] }
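On the frontend side, separating the two channels might look like the following sketch. `split_reply` is a hypothetical helper, not part of the service; it assumes the reply contains at most one json code block, as the protocol requires, and treats unparseable JSON as "no WorkPlan".

```python
import json
import re
from typing import Optional, Tuple

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
_BLOCK = re.compile(
    re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE), re.DOTALL)


def split_reply(reply: str) -> Tuple[str, Optional[dict]]:
    """Return (chat_text, work_plan); work_plan is None if absent or invalid."""
    match = _BLOCK.search(reply)
    if match is None:
        return reply.strip(), None
    # Everything outside the json block is chat text for the left side.
    chat = (reply[:match.start()] + reply[match.end():]).strip()
    try:
        plan = json.loads(match.group(1))
    except json.JSONDecodeError:
        return chat, None
    return chat, plan if isinstance(plan, dict) else None
```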

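Implemented direct endpoints

POST /ai/chat/humanities: humanities Q&A (fixed to ernie-4.5-turbo-vl; also covers general-knowledge questions)
POST /ai/chat/stem: STEM Q&A (fixed to qwen3-235b-a22b-thinking-2507)
POST /ai/draw: text-to-image (fixed to Doubao-seedream-3-0-t2i)
POST /ai/video/task: submit a text-to-video task (fixed to Doubao-seedance-1-0-pro)
GET /ai/video/task/{task_id}: query text-to-video task progress
POST /ai/vision: image understanding (fixed to ernie-4.5-turbo-vl)
POST /ai/code: text-to-code (fixed to qwen3-coder-plus)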
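A frontend that has parsed a WorkPlan needs some way to pick one of these endpoints from work.type. The table below is one plausible mapping and not a documented contract; in particular, sending "essay" to /ai/chat/humanities is a guess.

```python
from typing import Optional

# Assumed mapping from WorkPlan type to direct endpoint (not confirmed by
# the service); "essay" -> humanities is the least certain entry.
ENDPOINT_BY_TYPE = {
    "essay": "/ai/chat/humanities",
    "stem": "/ai/chat/stem",
    "image": "/ai/draw",
    "video": "/ai/video/task",
    "code": "/ai/code",
    "vision": "/ai/vision",
}


def endpoint_for(plan: dict) -> Optional[str]:
    """Return the endpoint to call, or None when no work is requested."""
    work = plan.get("work") or {}
    if not work.get("need"):
        return None
    return ENDPOINT_BY_TYPE.get(work.get("type"))
```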

Text-to-image `POST /ai/draw`

Request body:

json

{
  "prompt": "a cat astronaut, 3D style, cinematic lighting",
  "size": "1024x1024",
  "n": 1,
  "response_format": "b64_json"
}

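Behavior: forwards to POST /v1/images/generations with model fixed to Doubao-seedream-3-0-t2i (overridable through an environment variable). On success, returns the upstream JSON; when response_format=b64_json, decode data[0].b64_json to obtain the PNG.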
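Decoding the image on the client side takes only the standard library. The sketch below assumes the response has the data[0].b64_json shape described above; the helper names are illustrative.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def first_image_bytes(response: dict) -> bytes:
    """Decode data[0].b64_json from an image-generation response."""
    return base64.b64decode(response["data"][0]["b64_json"])


def looks_like_png(blob: bytes) -> bool:
    """Cheap sanity check before writing the bytes to a .png file."""
    return blob.startswith(PNG_SIGNATURE)
```

The decoded bytes can then be written with `open(path, "wb")`.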

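Convenience save endpoint:

POST /upload/save-b64-image
- Input: the raw base64 string (without the data:image/png;base64, prefix)
- Returns: {"url":"/static/ai/<file>.png"}
- Usage: send data[0].b64_json from the /ai/draw response as the data field of the request body to this endpoint to obtain a publicly accessible URL.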
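Because the endpoint expects the raw string without a data-URL prefix, a caller may want to normalize its input first. This is a small sketch with a hypothetical helper name; only the `data` field name comes from the description above.

```python
def save_image_body(b64_or_data_url: str) -> dict:
    """Build the JSON body for /upload/save-b64-image, dropping any data-URL prefix."""
    prefix, sep, rest = b64_or_data_url.partition(";base64,")
    is_data_url = bool(sep) and prefix.startswith("data:")
    raw = rest if is_data_url else b64_or_data_url
    return {"data": raw.strip()}
```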

Text-to-video submission `POST /ai/video/task`

Request body (example):

json

{
  "model": "Doubao-seedance-1-0-pro",
  "content": [
    {"type": "text", "text": "A running dog on school playground"},
    {"type": "image_url", "image_url": {"url": "https://example.com/ref.png"}}
  ]
}

Behavior: forwards to POST /doubao/contents/generations/tasks and returns the upstream task_id and related information.

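Text-to-video polling `GET /ai/video/task/{task_id}`

Behavior: forwards to GET /doubao/contents/generations/tasks/{task_id} and returns the upstream JSON unchanged; the frontend can follow the logic in test_doubao.py, polling until the task succeeds and then extracting video_url.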
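A polling loop along these lines is sketched below. test_doubao.py is not reproduced in this document, so the response shape is an assumption: a top-level `status` field with values such as "succeeded" or "failed", and the result under `content.video_url`. The `fetch` callable stands in for the GET request so the loop can be exercised without a network.

```python
import time
from typing import Callable


def poll_video_task(
    fetch: Callable[[], dict],
    interval: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task succeeds and return its video_url."""
    for _ in range(max_attempts):
        task = fetch()  # e.g. GET /ai/video/task/{task_id}, parsed as JSON
        status = task.get("status")  # field name and values are assumed
        if status == "succeeded":
            return task["content"]["video_url"]
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"video task ended with status {status!r}")
        sleep(interval)
    raise TimeoutError("video task did not finish in time")
```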

Image understanding `POST /ai/vision`

Request body:

json

{
  "prompt": "请描述图片中的场景",
  "images": [
    {"type": "image_url", "url": "https://example.com/image.png"}
  ],
  "stream": true
}

Behavior: uses the fixed model ernie-4.5-turbo-vl, assembles multimodal messages, and calls /v1/chat/completions; SSE is supported.
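The assembly step might resemble the sketch below. It assumes the upstream accepts OpenAI-style content parts (a text part followed by image_url parts) and that base64 images are wrapped in a PNG data URL; neither detail is confirmed by this document.

```python
from typing import Dict, List


def build_vision_messages(prompt: str, images: List[Dict[str, str]]) -> List[dict]:
    """Turn the /ai/vision request fields into one multimodal user message."""
    parts: List[dict] = [{"type": "text", "text": prompt}]
    for image in images:
        if image["type"] == "image_url":
            url = image["url"]
        elif image["type"] == "image_base64":
            # Assumes PNG data; a real service may detect the MIME type.
            url = "data:image/png;base64," + image["data"]
        else:
            raise ValueError(f"unsupported image type: {image['type']}")
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return [{"role": "user", "content": parts}]
```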

STEM Q&A `POST /ai/chat/stem`

Request body:

json

{
  "messages": [
    {"role": "system", "content": "你是一个善于解析数学、物理、化学问题的理科助教。"},
    {"role": "user", "content": "请帮我解方程 x^2 - 5x + 6 = 0"}
  ],
  "stream": true
}

Behavior: uses the fixed model qwen3-235b-a22b-thinking-2507 (configurable) and calls /v1/chat/completions directly; SSE is supported.

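Notes

Authentication and rate limiting: when AI_REQUIRE_AUTH=true (the default), /ai/* and /upload/save-b64-image require a Bearer token (JWT), and a minimal rate limit is applied according to AI_RATE_LIMIT_PER_MINUTE; both can be adjusted or disabled in .env.
The service already integrates Prometheus monitoring; AI-specific metrics will be added in a later version.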
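The document does not say how the per-minute limit is enforced. One simple possibility is a fixed-window counter keyed by user, sketched here; the class name, the in-memory storage, and the injectable clock are all assumptions.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class MinuteRateLimiter:
    """Fixed-window limiter: at most `limit_per_minute` calls per user per minute."""

    def __init__(self, limit_per_minute: int,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit_per_minute
        self.clock = clock
        # (user_id, minute index) -> request count; old windows are never
        # purged here, which a long-running service would need to handle.
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        window = int(self.clock() // 60)
        key = (user_id, window)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

A handler would typically answer with HTTP 429 when `allow` returns False.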

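AI Interface Reference (Current Version)

This document covers the AI interfaces currently integrated and how to use them. At this stage only the aggregated chat entry point POST /ai/chat is implemented; it is temporarily hard-routed to humanities_qa (model: ernie-4.5-turbo-vl) and supports both non-streaming and streaming (SSE) responses. Other capabilities (image understanding, text-to-image, text-to-video, text-to-code) and automatic intent recognition will be added over time.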

Environment variables

NEW_API_BASE_URL: base address of the new-api aggregation service (default: https://www.api.moduoduo.cn)
NEW_API_KEY: access key for new-api (Bearer token)
AI_DEFAULT_HUMANITIES_MODEL: default model used by /ai/chat (default ernie-4.5-turbo-vl)
AI_T2I_MODEL: model used by /ai/draw (default Doubao-seedream-3-0-t2i)
AI_T2V_MODEL: model used by /ai/video (default Doubao-seedance-1-0-pro)
AI_CODE_MODEL: model used by /ai/code (default qwen3-coder-plus)
AI_VISION_MODEL: model used by /ai/vision (default ernie-4.5-turbo-vl)
AI_STEM_MODEL: model used by /ai/chat/stem (default qwen3-235b-a22b-thinking-2507)
AI_INTENT_BASE_URL: overseas aggregation address used for intent recognition (default: https://www.moduoduo.pro)
AI_INTENT_KEY: API key used for intent recognition (required)
AI_INTENT_MODEL: intent recognition model (default: claude-sonnet-4-20250514)

Note: if NEW_API_KEY is not configured, calls to /ai/chat return 500.
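The missing-key behavior can be illustrated with a guard that runs before any upstream call. The exception type and function name are hypothetical; a request handler would translate the exception into the HTTP 500 described above.

```python
class MissingUpstreamKey(RuntimeError):
    """Raised when NEW_API_KEY is empty; a handler would map this to HTTP 500."""


def upstream_headers(new_api_key: str) -> dict:
    """Build the headers for a new-api request, refusing to run without a key."""
    if not new_api_key:
        raise MissingUpstreamKey("NEW_API_KEY is not configured")
    return {
        "Authorization": f"Bearer {new_api_key}",
        "Content-Type": "application/json",
    }
```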

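Routes

POST /ai/chat
- Description: aggregated chat entry point; the model is temporarily fixed to ernie-4.5-turbo-vl (corresponding to humanities_qa).
- Behavior:
  - When stream=false (default), the response is forwarded as regular JSON.
  - When stream=true, the upstream event stream is forwarded over SSE (text/event-stream).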

Request body (example)

json

{
  "messages": [
    { "role": "developer", "content": "你是一个有帮助的助手。" },
    { "role": "user", "content": "你好！" }
  ],
  "stream": true
}

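Field descriptions:

messages: list of messages; role supports system | user | assistant | developer, and content is the text content.
stream: whether to enable streaming (SSE) responses; default false.
Optional parameters: temperature, top_p (passed through to the upstream).

The model field can currently be omitted; the server always uses ernie-4.5-turbo-vl. It will be opened up later, governed by intent recognition / forced routing.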
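Taken together, these field rules suggest an upstream payload builder like the sketch below. The function name is an assumption, and so is the choice to ignore a client-supplied model rather than reject it; the passthrough list and the default for stream follow the field descriptions above.

```python
from typing import Any, Dict

PASSTHROUGH_FIELDS = ("temperature", "top_p")


def build_upstream_payload(body: Dict[str, Any],
                           model: str = "ernie-4.5-turbo-vl") -> Dict[str, Any]:
    """Build the new-api request: fixed model, optional sampling parameters."""
    payload: Dict[str, Any] = {
        "model": model,  # any client-supplied model is ignored in this sketch
        "messages": body["messages"],
        "stream": bool(body.get("stream", False)),
    }
    for name in PASSTHROUGH_FIELDS:
        if name in body:
            payload[name] = body[name]
    return payload
```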

Response examples

Non-streaming: returns the new-api JSON directly.
Streaming: passes the SSE event stream through. Example data chunks (from the upstream):

json

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"你好"},"logprobs":null,"finish_reason":null}]}

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

Call examples

PowerShell (streaming, calling new-api directly; only for comparing against upstream behavior):

powershell

$TOKEN = "<YOUR_NEW_API_TOKEN>"
$uri = "https://www.api.moduoduo.cn/v1/chat/completions"
$body = @'
{
  "model": "ernie-4.5-turbo-vl",
  "messages": [
    { "role": "developer", "content": "你是一个有帮助的助手。" },
    { "role": "user", "content": "你好！" }
  ],
  "stream": true
}
'@

$handler = New-Object System.Net.Http.HttpClientHandler
$client  = New-Object System.Net.Http.HttpClient($handler)
$request = New-Object System.Net.Http.HttpRequestMessage 'POST', $uri
$request.Headers.Add('Authorization', "Bearer $TOKEN")
$request.Content = New-Object System.Net.Http.StringContent($body, [System.Text.Encoding]::UTF8, 'application/json')

try {
  $resp = $client.SendAsync($request, [System.Net.Http.HttpCompletionOption]::ResponseHeadersRead).Result
  if (-not $resp.IsSuccessStatusCode) {
    Write-Host "HTTP $($resp.StatusCode)"; Write-Host ($resp.Content.ReadAsStringAsync().Result); exit 1
  }
  $stream = $resp.Content.ReadAsStreamAsync().Result
  $reader = New-Object System.IO.StreamReader($stream)
  while (($line = $reader.ReadLine()) -ne $null) {
    if ($line.StartsWith("data: ")) {
      $data = $line.Substring(6)
      if ($data -eq "[DONE]") { break }
      try {
        $obj = $data | ConvertFrom-Json
        $delta = $obj.choices[0].delta
        $content = $delta.content
        if ($content) { Write-Host -NoNewline $content }
      } catch {
        Write-Host $data
      }
    }
  }
  Write-Host "`n[stream end]"
} catch {
  Write-Host $_.Exception.Message
  exit 1
}

cURL call to this service's /ai/chat (streaming):

bash

curl -N -X POST "$BASE_URL/ai/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "developer", "content": "你是一个有帮助的助手。" },
      { "role": "user", "content": "你好！" }
    ],
    "stream": true
  }'

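If the upstream requires authentication and this service is configured with NEW_API_KEY, the client does not need to send the upstream token to this service; the service accesses the upstream on the client's behalf using the server-side token.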

Roadmap

Introduce Claude for intent recognition, routing automatically to:
- stem_qa → qwen3-235b-a22b-thinking-2507
- humanities_qa → ernie-4.5-turbo-vl
- vision_qa → ernie-4.5-turbo-vl
- t2i → doubao-seedream-3-0-t2i
- t2v → doubao-seedance-1-0-pro
- codegen → qwen3-coder-plus
Expose direct routes (such as /ai/draw, /ai/vision, /ai/video, /ai/code).
