Integrating a 'Critic' Agent to cross-verify traversal decisions against real-time distributed tracing data (Trace-Augmented Verification) eliminates hallucinated causal links in the dependency graph.
Motivation
Static dependency graphs often fail to capture transient states or configuration drift that arise during an incident. A single agent may hallucinate a dependency that exists in the code structure but is not active at runtime. In a multi-agent approach, a 'Navigator' proposes a path and a 'Critic' verifies it against live trace data (e.g., Jaeger or Zipkin spans), which keeps the traversal grounded in observed runtime behavior.
Proposed Method
Implement a dual-agent framework. Agent A (Navigator) suggests the next node based on the static graph. Agent B (Critic) queries the distributed tracing backend for high-latency spans or error codes on calls between the current node and the proposed node. If the Critic finds no such trace evidence, the proposed step is rejected. Evaluate the system's precision in identifying root causes on a dataset comprising both code bugs and configuration errors.
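The Navigator/Critic loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: the `Span` record, the 500 ms latency threshold, the service names, and the functions `navigator_candidates`, `critic_accepts`, and `traverse` are all hypothetical. Spans are held in an in-memory list instead of being fetched from a tracing backend, and the Navigator simply lists static-graph neighbors where a real system would call an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    # Simplified, hypothetical stand-in for a tracing span;
    # real Jaeger/Zipkin/OpenTelemetry spans carry many more fields.
    caller: str
    callee: str
    duration_ms: float
    is_error: bool


def navigator_candidates(static_graph, current):
    """Navigator stand-in: propose the static-graph neighbors of `current`."""
    return static_graph.get(current, [])


def critic_accepts(spans, current, proposed, latency_threshold_ms=500.0):
    """Critic stand-in: accept the edge only if some span on it is slow or errored."""
    return any(
        s.caller == current
        and s.callee == proposed
        and (s.is_error or s.duration_ms >= latency_threshold_ms)
        for s in spans
    )


def traverse(static_graph, spans, start, max_hops=10):
    """Walk the graph, taking only steps that the Critic can back with trace evidence."""
    path = [start]
    current = start
    for _ in range(max_hops):
        next_node = None
        for candidate in navigator_candidates(static_graph, current):
            if candidate not in path and critic_accepts(spans, current, candidate):
                next_node = candidate
                break
        if next_node is None:
            break  # no proposed step survived verification
        path.append(next_node)
        current = next_node
    return path


# Assumed toy data: the static graph lists edges that may be inactive at runtime.
static_graph = {
    "frontend": ["cart", "recommendations"],
    "cart": ["redis", "pricing"],
}
spans = [
    Span("frontend", "cart", 1200.0, False),            # slow call
    Span("frontend", "recommendations", 40.0, False),   # healthy call
    Span("cart", "pricing", 35.0, True),                # errored call
]

result = traverse(static_graph, spans, "frontend")
print(result)
```

In this toy run the static edge from `cart` to `redis` is rejected because no span supports it, while the slow `frontend` to `cart` call and the errored `cart` to `pricing` call are accepted. A fuller version would replace the in-memory list with queries to the tracing backend and would need a policy for edges whose traces were sampled out.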
Expected Contribution
A robust 'Trace-in-the-Loop' agentic architecture that combines static structural knowledge with dynamic runtime evidence, with the aim of significantly reducing false-positive diagnoses.
Required Resources
Access to a microservices testbed with distributed tracing (e.g., OpenTelemetry), LLM API access, and fault injection scripts.
Source Paper
Agentic Structured Graph Traversal for Root Cause Analysis of Code-related Incidents in Cloud Applications