When the Chain Breaks: Interactive Diagnosis of LLM Chain-of-Thought Reasoning Errors
Published in Computer Graphics Forum, Nottingham, UK, 2026
TL;DR. ReasonDiag helps users inspect long Chain-of-Thought traces, detect factual and logical errors, and trace how local mistakes propagate through an LLM's reasoning process.
Why This Matters
Current Large Language Models (LLMs), especially Large Reasoning Models, can produce long Chain-of-Thought (CoT) traces that expose parts of their reasoning process. These traces can help users calibrate their trust in a model's answer, but they are often verbose, nonlinear, and vulnerable to factual or logical errors.
ReasonDiag turns these traces into inspectable visual structures. It combines an automated error-detection pipeline with coordinated visualizations so users can identify suspicious steps, compare local and global reasoning patterns, and trace possible root causes.
Key Contributions
- An error-detection pipeline that combines external fact-checking with symbolic formal logic validation to flag step-level factual and logical errors.
- An arc diagram that summarizes reasoning-step distributions and error-propagation patterns across a CoT trace.
- A hierarchical node-link diagram that reveals high-level reasoning flows and premise-conclusion dependencies.
- An evaluation of ReasonDiag through technical analysis, case studies, and user interviews with 16 participants.
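To make the idea of step-level error flags and error propagation concrete, here is a minimal Python sketch. It is not ReasonDiag's implementation: the `Step` structure, the `Checker` signature, and the toy checkers are hypothetical stand-ins for the external fact-checking and symbolic logic validation components, and the sketch assumes every premise appears earlier in the trace than the step that uses it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Step:
    idx: int                                            # position in the CoT trace
    text: str                                           # raw text of the reasoning step
    premises: List[int] = field(default_factory=list)   # earlier steps it relies on
    errors: Set[str] = field(default_factory=set)       # "factual" and/or "logical"


# Hypothetical checker signature: returns True when the step passes the check.
Checker = Callable[[Step], bool]


def flag_errors(steps: List[Step], fact_ok: Checker, logic_ok: Checker) -> None:
    """Run both checkers on every step and record step-level error labels."""
    for step in steps:
        if not fact_ok(step):
            step.errors.add("factual")
        if not logic_ok(step):
            step.errors.add("logical")


def trace_root_causes(steps: List[Step]) -> Dict[int, Set[int]]:
    """Map each step to the flagged steps it transitively depends on.

    Assumes premises always precede the steps that cite them.
    """
    by_idx = {s.idx: s for s in steps}
    roots: Dict[int, Set[int]] = {}
    for step in sorted(steps, key=lambda s: s.idx):
        inherited: Set[int] = set()
        for p in step.premises:
            inherited |= roots[p]          # errors inherited through the premise
            if by_idx[p].errors:           # the premise itself is flagged
                inherited.add(p)
        roots[step.idx] = inherited
    return roots


if __name__ == "__main__":
    trace = [
        Step(0, "Paris is the capital of France."),
        Step(1, "The Eiffel Tower is in Berlin."),
        Step(2, "So the Eiffel Tower is not in France.", premises=[1]),
        Step(3, "Therefore Paris has no tower.", premises=[0, 2]),
    ]
    # Toy verdicts standing in for real fact-checking and logic validation.
    flag_errors(trace, fact_ok=lambda s: s.idx != 1, logic_ok=lambda s: s.idx != 3)
    for idx, causes in trace_root_causes(trace).items():
        print(idx, sorted(trace[idx].errors), "inherits from", sorted(causes))
```

In this toy trace, step 2 contains no error of its own but inherits the factual error from step 1, which is the kind of propagation pattern the visualizations are meant to surface.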
System Views
ReasonDiag is organized around two complementary levels of inspection:
- A trace-level view that helps users scan the distribution of reasoning steps and error propagation.
- A dependency-level view that helps users follow how premises support conclusions across the reasoning chain.
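As a rough illustration of the data these two views could consume, the sketch below derives arc endpoints for a trace-level arc diagram and a depth per step for a layered dependency layout. The `deps` mapping, the function names, and the depth rule are assumptions made for this example, not the paper's data model or layout algorithm; the sketch again assumes premises have smaller indices than the steps that use them.

```python
from typing import Dict, List, Tuple


def arc_list(deps: Dict[int, List[int]]) -> List[Tuple[int, int, int]]:
    """Return (premise, conclusion, span) triples, one per dependency.

    The span (distance along the trace) could drive the arc height.
    """
    return sorted((p, c, c - p) for c, premises in deps.items() for p in premises)


def step_depths(deps: Dict[int, List[int]]) -> Dict[int, int]:
    """Assign each step a layer: 0 for steps without premises,
    otherwise one more than its deepest premise."""
    depth: Dict[int, int] = {}
    for c in sorted(deps):  # premises are visited before their conclusions
        depth[c] = 1 + max((depth[p] for p in deps[c]), default=-1)
    return depth


if __name__ == "__main__":
    # Hypothetical trace: step -> list of premise steps.
    deps = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [3]}
    print(arc_list(deps))     # input for the trace-level arc diagram
    print(step_depths(deps))  # layers for the dependency-level view
```

Keeping both structures derived from the same dependency mapping is one simple way to keep the two views coordinated: selecting a step in one view identifies the same index in the other.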
Resources
BibTeX
@article{chen2026reasondiag,
  title={When the Chain Breaks: Interactive Diagnosis of LLM Chain-of-Thought Reasoning Errors},
  author={Chen, Shiwei and Sritharan, Niruthikka and Wen, Xiaolin and Zhang, Chenxi and Wang, Xingbo and Wang, Yong},
  journal={arXiv preprint arXiv:2603.21286},
  year={2026}
}
Recommended citation: Chen, S., Sritharan, N., Wen, X., Zhang, C., Wang, X., & Wang, Y. (2026). When the Chain Breaks: Interactive Diagnosis of LLM Chain-of-Thought Reasoning Errors. Computer Graphics Forum, e70439.
