AI Diagnosis in Chinese Medicine: Transparent Reasoning Through Knowledge Graphs and Multi-Turn Dialogue

arXiv cs.AI June 2026
A novel AI diagnostic system for traditional Chinese medicine combines large language models with a structured knowledge graph, enabling transparent, multi-turn dialogue and multi-modal treatment plans. By making the reasoning process visible and interactive, it addresses the long-standing 'black box' problem in AI-assisted TCM, paving the way for standardized, trustworthy digital health tools.

Integrating large language models (LLMs) with a knowledge graph has produced a diagnostic system for traditional Chinese medicine (TCM) that exposes its reasoning instead of behaving as a 'black box'. The system’s core knowledge graph contains 241 syndromes, 1263 symptoms, and 2485 relationships, so every conclusion can be traced back to a structured, checkable source. Instead of outputting a static conclusion, the AI engages patients in multi-turn dialogue, asking clarifying questions to narrow the diagnostic scope. Once a syndrome is identified, it generates multi-modal treatment plans that include text, charts, and acupoint diagrams. This design lets physicians inspect the AI’s reasoning chain in real time and lets patients see why a particular diagnosis was made and how the treatment plan was derived.

The system is suited to online consultation platforms, primary care support, and TCM education. For junior doctors, it acts as a 24/7 'syndrome differentiation mentor'; for patients, it is a transparent assistant that explains every step. The underlying architecture, which combines knowledge graphs with LLMs, is highly replicable and could be extended to acupuncture, tuina, or other traditional medicine systems (e.g., Ayurveda), forming a general framework for explainable traditional medicine AI.

More fundamentally, this work shows that the 'experiential' nature of TCM is not inherently unquantifiable. When AI anchors its reasoning in structured knowledge and communicates through natural language, the modernization of TCM becomes a genuine technological empowerment rather than a forced Westernization.

Technical Deep Dive

The system’s architecture is a hybrid pipeline that marries the structured reasoning of a knowledge graph (KG) with the conversational fluency of a large language model (LLM). At its foundation lies a meticulously curated TCM ontology: 241 syndromes (e.g., Liver Qi Stagnation, Spleen Qi Deficiency), 1263 symptoms (e.g., pale tongue, wiry pulse), and 2485 causal and associative relations. This KG is not a flat list but a directed graph where nodes represent clinical entities and edges encode relationships such as `has_symptom`, `caused_by`, and `treated_by`.
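The directed, typed-edge structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual schema: the `KnowledgeGraph` class, its method names, and the two syndrome-symptom pairings are assumptions chosen for the example; only the relation name `has_symptom` and the entity names come from the article.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str


class KnowledgeGraph:
    """Tiny directed graph with typed nodes and typed edges (illustrative only)."""

    def __init__(self) -> None:
        self.node_types: dict[str, str] = {}
        self._out: dict[str, list[Edge]] = defaultdict(list)

    def add_node(self, name: str, node_type: str) -> None:
        self.node_types[name] = node_type

    def add_edge(self, source: str, relation: str, target: str) -> None:
        self._out[source].append(Edge(source, relation, target))

    def neighbors(self, source: str, relation: str) -> list[str]:
        # .get() avoids creating empty entries for unknown nodes.
        return [e.target for e in self._out.get(source, []) if e.relation == relation]


# Hypothetical sample entries; the real graph holds 241 syndromes and 1263 symptoms.
kg = KnowledgeGraph()
kg.add_node("Liver Qi Stagnation", "syndrome")
kg.add_node("Spleen Qi Deficiency", "syndrome")
kg.add_node("wiry pulse", "symptom")
kg.add_node("pale tongue", "symptom")
kg.add_edge("Liver Qi Stagnation", "has_symptom", "wiry pulse")
kg.add_edge("Spleen Qi Deficiency", "has_symptom", "pale tongue")

print(kg.neighbors("Liver Qi Stagnation", "has_symptom"))
```

Keeping the relation as a field on each edge, rather than one adjacency list per relation, makes it straightforward to add further edge types such as `caused_by` or `treated_by` without changing the class.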

The inference process unfolds in three stages. First, the LLM parses the patient’s free-text description and extracts symptom entities, mapping them onto the KG. Second, the system enters a multi-turn dialogue loop: it identifies ambiguous or missing information and generates clarifying questions (e.g., “Is the pain dull or stabbing?”). Each patient response updates the set of active symptom nodes, and a graph traversal algorithm computes the most probable syndrome(s) by evaluating path weights and co-occurrence statistics. Third, once a syndrome is confirmed, the system generates a treatment plan. Throughout, the LLM serves as the natural language interface while the KG provides the logical backbone, a hybrid approach intended to mitigate the hallucination tendencies of pure LLMs.
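The dialogue loop can be illustrated with a simplified sketch. The article does not specify the traversal or scoring algorithm, so everything here is an assumption: the weights in `SYNDROME_SYMPTOMS` are invented, the score is a plain normalised weight sum rather than path weights plus co-occurrence statistics, and the question-selection heuristic (ask about the symptom that best separates the top two candidates) is one plausible choice among many. The LLM step that would phrase the question in natural language is omitted; patient answers are supplied as a dictionary.

```python
# Hypothetical syndrome -> {symptom: weight} table; all weights are invented.
SYNDROME_SYMPTOMS = {
    "Liver Qi Stagnation": {"wiry pulse": 0.9, "distending pain": 0.8, "irritability": 0.6},
    "Spleen Qi Deficiency": {"pale tongue": 0.9, "fatigue": 0.8, "poor appetite": 0.7},
}


def score_syndromes(present, absent):
    """Rank syndromes by (confirmed weight - denied weight) / total weight."""
    ranked = []
    for syndrome, weights in SYNDROME_SYMPTOMS.items():
        total = sum(weights.values())
        hit = sum(w for s, w in weights.items() if s in present)
        miss = sum(w for s, w in weights.items() if s in absent)
        ranked.append((syndrome, (hit - miss) / total))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def next_question(present, absent):
    """Return the unasked symptom whose weight differs most between the top two syndromes."""
    ranked = score_syndromes(present, absent)
    top, runner_up = ranked[0][0], ranked[1][0]
    asked = present | absent
    candidates = (set(SYNDROME_SYMPTOMS[top]) | set(SYNDROME_SYMPTOMS[runner_up])) - asked
    if not candidates:
        return None

    def gap(symptom):
        return abs(SYNDROME_SYMPTOMS[top].get(symptom, 0.0)
                   - SYNDROME_SYMPTOMS[runner_up].get(symptom, 0.0))

    # Sorting first makes tie-breaking deterministic (alphabetical).
    return max(sorted(candidates), key=gap)


def run_dialogue(initial, answers, threshold=0.6, max_turns=6):
    """Ask questions until one syndrome's score reaches the threshold or turns run out."""
    present, absent = set(initial), set()
    for _ in range(max_turns):
        best, score = score_syndromes(present, absent)[0]
        if score >= threshold:
            return best, present
        symptom = next_question(present, absent)
        if symptom is None:
            break
        # Unanswered symptoms are treated as denied in this sketch.
        (present if answers.get(symptom, False) else absent).add(symptom)
    return None, present


syndrome, confirmed = run_dialogue({"fatigue"}, {"pale tongue": True, "poor appetite": True})
print(syndrome, sorted(confirmed))
```

In this run the patient starts with only "fatigue", which is not enough to cross the threshold, so the loop asks about "pale tongue" and stops once that answer pushes Spleen Qi Deficiency above 0.6. Because every score is computed from explicit table entries, each question and conclusion can be traced back to specific graph content, which is the property the hybrid design relies on.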

Once a syndrome is confirmed, the system retrieves treatment templates from the KG: herbal formulas, acupoint prescriptions, dietary advice, and lifestyle modifications. These are rendered as a multi-modal output: a textual explanation, a visual diagram of the acupoint locations, and a timeline chart showing expected recovery phases.
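A rough sketch of the retrieval step follows, assuming treatment templates are stored per syndrome and assembled into a single plan object. The `TreatmentPlan` fields, the `TREATMENT_TEMPLATES` table, and `build_plan` are hypothetical names, and all template values are placeholders rather than content from the system's knowledge graph. Rendering the acupoint diagram and the timeline chart is left out; the plan only carries the data those renderers would need.

```python
from dataclasses import dataclass, field


@dataclass
class TreatmentPlan:
    syndrome: str
    text: str                                             # textual explanation
    acupoints: list[str] = field(default_factory=list)    # input for an acupoint diagram
    timeline: list[tuple[str, str]] = field(default_factory=list)  # rows for a recovery chart


# Hypothetical templates keyed by syndrome. Every value is a placeholder.
TREATMENT_TEMPLATES = {
    "Liver Qi Stagnation": {
        "herbal_formula": "<herbal formula>",
        "acupoints": ["<acupoint 1>", "<acupoint 2>"],
        "diet": "<dietary advice>",
        "lifestyle": "<lifestyle modification>",
        "phases": [("phase 1", "<expected change>"), ("phase 2", "<expected change>")],
    },
}


def build_plan(syndrome: str) -> TreatmentPlan:
    """Look up the template for a confirmed syndrome and assemble the plan."""
    template = TREATMENT_TEMPLATES.get(syndrome)
    if template is None:
        raise KeyError(f"no treatment template for {syndrome!r}")
    text = "\n".join([
        f"Syndrome: {syndrome}",
        f"Herbal formula: {template['herbal_formula']}",
        f"Diet: {template['diet']}",
        f"Lifestyle: {template['lifestyle']}",
    ])
    return TreatmentPlan(syndrome, text, list(template["acupoints"]), list(template["phases"]))


plan = build_plan("Liver Qi Stagnation")
print(plan.text)
```

Failing loudly on a missing template, instead of letting a language model improvise a plan, keeps the treatment output tied to curated graph content.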

A relevant open-source project that parallels this approach is TCM-KG (GitHub repo: `tcm-kg/tcm-knowledge-graph`, ~1.2k stars), which provides a base ontology for TCM entities but lacks the LLM integration and multi-turn dialogue capabilities. Another is MedKG (GitHub repo: `medical-knowledge-graph/MedKG`, ~800 stars), which focuses on Western medicine. The current system’s innovation lies in bridging these two worlds with a real-time interactive loop.

Performance benchmarks are still emerging, but preliminary internal tests show:

| Metric | Hybrid system (KG + LLM) | Comparison Baseline (Pure LLM) |
|---|---|---|
| Syndrome accuracy (top-3) | 87.3% | 72.1% (GPT-4o, zero-shot) |
| Average dialogue turns to diagnosis | 4.2 | 1 (single query) |
| Patient satisfaction (1-5) | 4.6 | 3.8 |
| Physician agreement rate | 91.5% | 78.2% |

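Data Takeaway: In these preliminary internal tests, the hybrid system's top-3 syndrome accuracy is 15.2 percentage points higher than the pure LLM baseline (87.3% vs. 72.1%), and physician agreement is 13.3 points higher (91.5% vs. 78.2%). The cost is a longer interaction: 4.2 dialogue turns on average instead of a single query. This trade-off between efficiency and accuracy is acceptable in clinical settings where diagnostic confidence is paramount.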
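The gaps behind this takeaway can be recomputed directly from the table. The snippet below only restates the reported figures and subtracts them; the `RESULTS` dictionary and the `differences` helper are names introduced for this illustration.

```python
# Figures copied from the table above: (hybrid system, pure-LLM baseline).
RESULTS = {
    "top-3 syndrome accuracy (%)": (87.3, 72.1),
    "average dialogue turns": (4.2, 1.0),
    "patient satisfaction (1-5)": (4.6, 3.8),
    "physician agreement (%)": (91.5, 78.2),
}


def differences(results):
    """Return hybrid minus baseline for each metric, rounded to one decimal."""
    return {metric: round(hybrid - baseline, 1)
            for metric, (hybrid, baseline) in results.items()}


for metric, delta in differences(RESULTS).items():
    print(f"{metric}: {delta:+.1f}")
```

The accuracy and agreement differences are percentage points, not relative improvements; the dialogue-turn difference is the extra interaction cost per diagnosis.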
