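Document Recognition Reference
This page answers three questions: which modules make up document recognition today, what API it exposes, and how review persistence is represented.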
Main modules
| Path | Purpose |
|---|---|
| `ai_service/document_recognition/domain/` | Document recognition run/review/issue/summary models |
| `ai_service/document_recognition/application/ports/` | Repository / asset store abstractions |
| `ai_service/document_recognition/application/use_cases/create_runs.py` | Runtime metadata resolution and the canonical run creation use case |
| `ai_service/document_recognition/application/use_cases/review_runs.py` | Fusion run projection and field review use cases |
| `ai_service/document_recognition/application/projections.py` | Summary / issue / field review normalization |
| `ai_service/document_recognition/infrastructure/persistence/document_recognition_repository.py` | SQLAlchemy repository adapter |
| `ai_service/document_recognition/infrastructure/persistence/legacy_document_extraction_job_bridge.py` | Legacy review storage bridge |
| `ai_service/document_recognition/infrastructure/storage/minio_document_asset_store.py` | Document asset reads and writes |
| `ai_service/document_recognition/interfaces/http/router.py` | HTTP aggregation entry point |
| `ai_service/document_recognition/interfaces/http/runs.py` | Run/review API |
Current API surface
| Endpoint | Purpose |
|---|---|
| `GET /document-recognition/runtime-agents` | Lists the currently registered, selectable Fusion runtime agents |
| `GET /document-recognition/runtime-agents/{runtime_agent_id}` | Shows the runtime's published version, upload slot resolution, and execution policy summary |
| `POST /document-recognition/runs` | Uploads files and creates a run through the canonical document-recognition entry point |
| `GET /document-recognition/runs` | Lists the document recognition runs that can be projected |
| `GET /document-recognition/runs/{run_id}` | Shows a single run's source, summary, fields, issues, review status, and `workspace_output` |
| `PATCH /document-recognition/runs/{run_id}/field-reviews/{field_id}` | Corrects a field review |
| `GET /document-recognition/runs/{run_id}/field-reviews/{field_id}/revisions` | Reads a single field's baseline and revision timeline on demand |
| `GET /document-recognition/runs/{run_id}/source-document` | Downloads the original source document |
| `GET /document-recognition/runs/{run_id}/source-pdf` | Downloads the PDF source |
| `GET /document-recognition/runs/{run_id}/result` | Downloads the structured result JSON |
| `GET /admin/document-recognition/overview` | Studio overview |
| `GET /admin/document-recognition/runs` | Admin run records |
| `GET /admin/document-recognition/runtime-agents` | Shows the current registry |
| `PUT /admin/document-recognition/runtime-agents/{agent_id}` | Registers a Fusion agent that can be used for document recognition |
| `DELETE /admin/document-recognition/runtime-agents/{agent_id}` | Removes a Fusion agent from the registry |
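Registry `PUT` / `DELETE` use `agent_id` as the stable identifier, require the administrator to hold the `document_recognition.write` permission, and write an admin audit record. This path does not require `x-admin-challenge-token`.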
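As a rough illustration of the registry calls described above, the following sketch builds (but does not send) a `PUT` or `DELETE` request for a given `agent_id`. The host name, the `Bearer` authorization scheme, and the helper name `build_registry_request` are assumptions made for this example and are not taken from the service. The request body schema for `PUT` is not documented here, so the sketch sends none.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical host; replace with the real service address.
BASE_URL = "https://ai-service.example.internal"


def build_registry_request(agent_id: str, method: str, admin_token: str) -> Request:
    """Build a registry PUT/DELETE request keyed by agent_id (not sent)."""
    if method not in ("PUT", "DELETE"):
        raise ValueError(f"unsupported registry method: {method}")
    # agent_id is the stable identifier, so it goes straight into the path.
    path = f"/admin/document-recognition/runtime-agents/{quote(agent_id, safe='')}"
    # Assumed auth scheme. Note that no x-admin-challenge-token header is set,
    # because this path does not require one.
    headers = {"Authorization": f"Bearer {admin_token}"}
    return Request(BASE_URL + path, headers=headers, method=method)


if __name__ == "__main__":
    req = build_registry_request("invoice-agent", "PUT", "example-token")
    print(req.get_method(), req.full_url)
```

The caller still needs the `document_recognition.write` permission on the server side; the client sketch cannot enforce that.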
Run Detail Response
The run detail response returns:
- `runtime_agent_id`
- `runtime_agent_version_id`
- `runtime_agent_type_snapshot`
- `source_document_url`
- `source_pdf_url`
- `summary`
- `field_reviews`
- `issue_list`
- `preview_pages`
- `workspace_output`
Each `field_reviews[]` entry carries only a lightweight revision summary:
- `revision_count`
- `is_changed_from_extracted`
- `last_revised_at`
- `last_revised_by`
The full per-field ledger must be read separately via `GET /document-recognition/runs/{run_id}/field-reviews/{field_id}/revisions`.
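A client might model the lightweight revision summary as below. This is a minimal sketch: it assumes the four summary keys sit directly on each `field_reviews[]` item, that timestamps arrive as strings, and that missing keys fall back to "no revisions". The names `FieldRevisionSummary`, `parse_revision_summary`, and `revisions_path` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class FieldRevisionSummary:
    revision_count: int
    is_changed_from_extracted: bool
    last_revised_at: Optional[str]  # assumed to be a timestamp string
    last_revised_by: Optional[str]


def parse_revision_summary(field_review: Dict[str, Any]) -> FieldRevisionSummary:
    """Read the summary keys from one field_reviews[] item (assumed flat layout)."""
    return FieldRevisionSummary(
        revision_count=int(field_review.get("revision_count", 0)),
        is_changed_from_extracted=bool(field_review.get("is_changed_from_extracted", False)),
        last_revised_at=field_review.get("last_revised_at"),
        last_revised_by=field_review.get("last_revised_by"),
    )


def revisions_path(run_id: str, field_id: str) -> str:
    """Path of the on-demand endpoint that returns the full per-field ledger."""
    return f"/document-recognition/runs/{run_id}/field-reviews/{field_id}/revisions"


if __name__ == "__main__":
    item = {"revision_count": 2, "is_changed_from_extracted": True,
            "last_revised_at": "2024-05-01T08:00:00Z", "last_revised_by": "alice"}
    summary = parse_revision_summary(item)
    if summary.revision_count > 0:
        # Only now is it worth fetching the full ledger.
        print(revisions_path("run-1", "field-9"))
```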
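Field Revision Timeline
The single-field timeline response returns:
- the baseline extracted value
- the current `current_value` / `current_review_status` / `current_reviewer_note`
- `history_status`, whose value is `recorded` or `unrecorded`
- the append-only `revisions[]`
For older runs that were created before this feature shipped and therefore have no ledger, the server returns the baseline snapshot with `history_status=unrecorded`. The frontend should display this as "history not recorded" rather than as a blank.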
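The display rule above can be sketched as a small function. It assumes the timeline payload is a dict whose keys are `history_status` and `revisions`; the function name `describe_history` and the exact label wording are illustrative only.

```python
from typing import Any, Dict


def describe_history(timeline: Dict[str, Any]) -> str:
    """Map a field revision timeline to a label for the history panel."""
    status = timeline.get("history_status")
    revisions = timeline.get("revisions") or []
    if status == "unrecorded":
        # Older runs without a ledger: say so explicitly instead of showing nothing.
        return "History not recorded"
    if status == "recorded":
        if not revisions:
            return "No revisions yet"
        return f"{len(revisions)} revision(s)"
    # Anything else is outside the two documented values.
    raise ValueError(f"unexpected history_status: {status!r}")


if __name__ == "__main__":
    print(describe_history({"history_status": "unrecorded", "revisions": []}))
    print(describe_history({"history_status": "recorded", "revisions": [{}, {}]}))
```

Treating an unknown `history_status` as an error is a design choice of this sketch; a real frontend may prefer a neutral fallback label.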
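Application boundary notes
- The Fusion subsystem creates and executes runs.
- `document_recognition` converts Fusion output into a review projection.
- Run detail returns only the lightweight field revision summary by default; the timeline is read separately on demand so that the default payload stays small.
- Whether a runtime agent belongs to document recognition is declared explicitly in the admin registry.
- Compatibility with legacy review storage is handled only inside the infrastructure bridge.
- `/document-recognition/runtime-agents*` and `/document-recognition/runs*` are both public routes; no additional prefix aliases are maintained.
- `/agents/{agent_id}/extraction-jobs` is no longer registered by the `ai_service/document_recognition` package.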
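The "explicit declaration" rule for runtime agents can be pictured with an in-memory stand-in. This is not the service's registry implementation (which is not shown on this page); the class `RuntimeAgentRegistry` and its method names are hypothetical and only mirror the PUT / DELETE / list semantics keyed by `agent_id`.

```python
from typing import Dict, List, Optional


class RuntimeAgentRegistry:
    """In-memory stand-in: an agent counts as a document recognition
    agent only if it has been registered explicitly."""

    def __init__(self) -> None:
        self._agents: Dict[str, dict] = {}

    def put(self, agent_id: str, metadata: Optional[dict] = None) -> None:
        # PUT semantics: registering the same agent_id again overwrites the entry.
        self._agents[agent_id] = dict(metadata or {})

    def delete(self, agent_id: str) -> bool:
        # Returns True if an entry was removed, False if none existed.
        return self._agents.pop(agent_id, None) is not None

    def is_document_recognition_agent(self, agent_id: str) -> bool:
        return agent_id in self._agents

    def list_ids(self) -> List[str]:
        return sorted(self._agents)


if __name__ == "__main__":
    registry = RuntimeAgentRegistry()
    registry.put("invoice-agent")
    print(registry.is_document_recognition_agent("invoice-agent"))
    print(registry.is_document_recognition_agent("chat-agent"))
```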
Typical entry points when reading the code
To change the mapping from Fusion output to field reviews
See `application/projections.py`.
To change projection persistence
See `application/use_cases/review_runs.py` and `infrastructure/persistence/legacy_document_extraction_job_bridge.py`.
To change the fields returned by the API
See:
- `interfaces/http/schemas.py`
- `interfaces/http/serialization.py`