
Integrating a 'Critic' Agent to cross-verify traversal decisions against real-time distributed tracing data (Trace-Augmented Verification) eliminates hallucinated causal links in the dependency graph.

Feasibility: 9 Novelty: 6

Motivation

Static dependency graphs often fail to capture the transient states and configuration drift that arise during an incident. A single agent may hallucinate a dependency from code structure that is never actually exercised at runtime. A multi-agent approach, in which a 'Navigator' proposes a path and a 'Critic' verifies it against live trace data (e.g., Jaeger/Zipkin spans), keeps the traversal grounded in observed behavior.

Proposed Method

Implement a dual-agent framework. Agent A (Navigator) suggests the next node based on the static graph. Agent B (Critic) queries the distributed tracing backend for high-latency spans or error codes between the current node and the proposed node. If no such trace evidence exists, the proposed step is rejected. Evaluate the system's precision in identifying root causes on a dataset comprising both code bugs and configuration errors.
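The propose-then-verify loop can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Span` record, the 500 ms latency threshold, and the function names are hypothetical, the Navigator is reduced to "try the static-graph neighbors in order" rather than an LLM agent, and the spans are an in-memory list rather than a query against a real Jaeger or Zipkin backend.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified view of one traced call between two services."""
    caller: str
    callee: str
    duration_ms: float
    is_error: bool


def critic_accepts(spans: List[Span], current: str, proposed: str,
                   latency_threshold_ms: float = 500.0) -> bool:
    """Critic: accept the edge only if some span on it is slow or failed.

    The threshold is an illustrative assumption, not a value from the source.
    """
    return any(
        s.caller == current and s.callee == proposed
        and (s.is_error or s.duration_ms >= latency_threshold_ms)
        for s in spans
    )


def traverse(static_graph: Dict[str, List[str]], spans: List[Span],
             start: str, max_steps: int = 10) -> List[str]:
    """Walk the static graph, keeping only steps the Critic can back with traces."""
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        # Navigator stand-in: propose unvisited static neighbors in listed order.
        candidates = [n for n in static_graph.get(current, []) if n not in visited]
        accepted = next(
            (n for n in candidates if critic_accepts(spans, current, n)), None
        )
        if accepted is None:
            break  # no proposal has trace evidence, so the traversal stops here
        path.append(accepted)
        visited.add(accepted)
        current = accepted
    return path


# Toy data: 'inventory' is a static dependency with no supporting trace evidence.
static_graph = {
    "gateway": ["auth", "orders"],
    "orders": ["inventory", "payments"],
    "payments": [],
}
spans = [
    Span("gateway", "auth", 12.0, False),     # healthy call, rejected by the Critic
    Span("gateway", "orders", 900.0, False),  # high latency
    Span("orders", "payments", 40.0, True),   # error status
]
root_cause_path = traverse(static_graph, spans, "gateway")
print(root_cause_path)
```

In this toy run the traversal follows gateway, orders, payments: the static edges to auth and inventory are rejected because no slow or failed span supports them. A fuller version would replace the in-memory list with calls to the tracing backend and let an LLM rank the candidates before the Critic checks them.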

Expected Contribution

A robust 'Trace-in-the-Loop' agentic architecture that combines static structural knowledge with dynamic runtime evidence, significantly reducing false-positive diagnoses.

Required Resources

Access to a microservices testbed with distributed tracing (e.g., OpenTelemetry), LLM API access, and fault injection scripts.

Source Paper

Agentic Structured Graph Traversal for Root Cause Analysis of Code-related Incidents in Cloud Applications
