RAG Reference

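This page answers one question: beyond ingest and retrieve, what debugging, repair, and validation capabilities does the project's knowledge pipeline actually provide?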

Main pipeline modules

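| Path | Role |
| --- | --- |
| `ai_service/services/ingestion.py` | Document ingestion, chunk generation, vector writes, and the main ingestion job flow |
| `ai_service/services/chunking.py` | Chunking strategy registration, capability descriptions, and parameter validation |
| `ai_service/services/embedding.py` | Embedding provider and readiness check |
| `ai_service/services/rag_retrieval.py` | Final retrieval entry point; handles mounted source filtering, top-k, threshold, and retrieval mode |
| `ai_service/services/multi_query.py` | Multi-query expansion |
| `ai_service/services/rerank.py` | Rerank stage |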
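
The responsibilities listed for `rag_retrieval.py` (mounted source filtering, threshold, top-k) can be illustrated with a small pure function. This is a minimal sketch, not the project's actual code: `Candidate`, `select_hits`, and their parameters are hypothetical names, and the order in which the filters are applied is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical scored chunk returned by a vector search."""
    chunk_id: str
    source_id: str
    score: float


def select_hits(candidates, mounted_source_ids, top_k, threshold):
    """Keep candidates from mounted sources that meet the score threshold,
    then return the top_k highest-scoring ones (assumed order of steps)."""
    allowed = [
        c for c in candidates
        if c.source_id in mounted_source_ids and c.score >= threshold
    ]
    # Highest score first, then truncate to top_k.
    allowed.sort(key=lambda c: c.score, reverse=True)
    return allowed[:top_k]
```

Filtering before truncation means an unmounted source can never consume a top-k slot, which is one plausible reason to apply the mount filter first.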

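The debugging, repair, and validation modules listed next are the capability area most easily underestimated in the current project documentation.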

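| Path | Role |
| --- | --- |
| `ai_service/services/rag_debug.py` | Generates a structured retrieval debug report that explains why a given query did or did not recall a result |
| `ai_service/services/manual_inspection.py` | Runs persisted manual inspections; supports two modes, `vector_health` and `chunking_preview` |
| `ai_service/services/document_vector_reconciliation.py` | Verifies the vector health of indexed documents at startup and writes the results back to PostgreSQL |
| `ai_service/services/snapshot_rebuild.py` | Synchronously rebuilds a specified document from its parse snapshot |
| `ai_service/services/knowledge_correction_service.py` | Turns corrections made in conversation into a managed correction source |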
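
The startup reconciliation described for `document_vector_reconciliation.py` amounts to comparing what the database expects with what the vector store holds. The following is a minimal sketch under stated assumptions: the function name, the status strings, and the idea of comparing per-document counts are hypothetical and not taken from the project's code, and plain dicts stand in for the PostgreSQL and Qdrant lookups.

```python
def reconcile_vector_health(expected_chunks, actual_vectors):
    """Compare expected chunk counts with vector counts per document.

    Both arguments map document id -> count. Returns document id -> status.
    The status strings are illustrative placeholders, not the project's enums.
    """
    report = {}
    for doc_id, expected in expected_chunks.items():
        actual = actual_vectors.get(doc_id, 0)
        if actual == expected:
            report[doc_id] = "healthy"
        elif actual == 0:
            report[doc_id] = "missing"
        else:
            report[doc_id] = "partial"
    return report
```

In a real service the returned mapping would be persisted so that later requests can see which documents need a rebuild.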

| Path | Role |
| --- | --- |
| `ai_service/api/routers/knowledge.py` | Knowledge source, document, ingestion, manual inspection, snapshot rebuild, and retrieval test |
| `ai_service/conversations/interfaces/http/router.py` | Knowledge correction and its integration with the conversation side |
| `ai_service/storage/model_domains/documents.py` | Persistence models for document, chunk, ingestion, manual inspection, and parse snapshot |
| `ai_service/storage/model_domains/knowledge.py` | Relationships between sources and agent mounts |
| `ai_service/storage/qdrant_client.py` | Access layer for Qdrant collections, vector counts, and search |

`ai_service/services/rag_debug.py` does more than list the top-k hits; its report also outputs:

This means the project already offers RAG that operators can diagnose, rather than black-box retrieval only.
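
One way a report can explain why a chunk was or was not recalled is to record a decision and a reason for every candidate instead of returning only the survivors. The sketch below is hypothetical: `build_debug_report`, the field names, and the reason strings are illustrative assumptions and do not describe the actual output of `rag_debug.py`.

```python
def build_debug_report(candidates, mounted_source_ids, top_k, threshold):
    """Return one entry per candidate with the reason it was kept or dropped.

    `candidates` is a list of (chunk_id, source_id, score) tuples.
    """
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    report = []
    kept = 0
    for chunk_id, source_id, score in ranked:
        if source_id not in mounted_source_ids:
            reason = "source_not_mounted"
        elif score < threshold:
            reason = "below_threshold"
        elif kept >= top_k:
            reason = "beyond_top_k"
        else:
            reason = "kept"
            kept += 1
        report.append({"chunk_id": chunk_id, "score": score, "reason": reason})
    return report
```

Because dropped candidates stay in the report, a missing answer can be traced to a specific filter rather than guessed at.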

`ai_service/services/manual_inspection.py` currently supports two modes:

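| Mode | Purpose |
| --- | --- |
| `vector_health` | Verifies that indexed documents still have the expected vector points in Qdrant |
| `chunking_preview` | Previews the effect of a different chunking strategy without changing live chunk/vector state |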

This design matters because it cleanly separates inspecting a problem from changing production state.
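
The separation between inspecting and mutating can be sketched as a dispatcher whose two branches only read their input and return a report. Only the mode names `vector_health` and `chunking_preview` come from the text above; `InspectionMode`, `run_inspection`, the toy fixed-size chunker, and the shape of the `document` dict are hypothetical.

```python
from enum import Enum


class InspectionMode(str, Enum):
    """Mode names come from the text; the enum itself is hypothetical."""
    VECTOR_HEALTH = "vector_health"
    CHUNKING_PREVIEW = "chunking_preview"


def preview_chunks(text, chunk_size):
    """Toy fixed-size chunker standing in for a real chunking strategy."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def run_inspection(mode, document, chunk_size=None):
    """Return a report dict; this function never mutates `document`."""
    mode = InspectionMode(mode)  # raises ValueError for unknown modes
    if mode is InspectionMode.VECTOR_HEALTH:
        expected = len(document["chunks"])
        found = document["vector_count"]
        return {
            "mode": mode.value,
            "expected_vectors": expected,
            "found_vectors": found,
            "ok": found == expected,
        }
    if chunk_size is None or chunk_size <= 0:
        raise ValueError("chunking_preview needs a positive chunk_size")
    preview = preview_chunks(document["text"], chunk_size)
    return {
        "mode": mode.value,
        "preview_chunk_count": len(preview),
        "live_chunk_count": len(document["chunks"]),
    }
```

Returning a report instead of writing to the live chunk list is what keeps a preview safe to run against indexed documents.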

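| Path | Role |
| --- | --- |
| `ai_service/utils/settings.py` | Configuration for RAG, embedding, rerank, debug, and related features |
| `ai_service/storage/model_domains/enums.py` | Source of truth for the document status and vector status enums |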