Hallucination Chains: How Multi-Agent Systems Amplify Lies
Single-Agent Hallucinations Are Isolated. Multi-Agent Hallucinations Are Cascades.
One agent hallucinates confidently. In a single-agent system, the user sees the output directly and has a chance to catch the lie. In a pipeline of three agents?
Consider this: Agent A hallucinates a transaction ID (TX-12345) and returns it as fact. Agent B receives this output and uses it to check an account balance—treating TX-12345 as ground truth. Agent C takes that balance and executes a payment decision based on it.
The user gets a decision built on a cascade of unverified claims, starting with a single hallucination that was never caught.
In production fintech systems, this looks like:
- Agent A (fetcher): "I found transaction ID TX-12345" — actually hallucinated, never verified against an API
- Agent B (validator): "Balance for this transaction is $5,000" — built on Agent A's false output
- Agent C (processor): "Executing payment of $5,000" — now the hallucination has consequences
Problem: Nobody checked Agent A's output before passing it to Agent B.
This isn't theoretical. LLM orchestration frameworks like LangChain, CrewAI, and MCP chain agents together because decomposition works: complex tasks break into smaller steps. But breaking tasks into steps creates handoff points, and handoff points without verification are hallucination pipelines waiting to fail.
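The cascade above can be sketched in a few lines. This is a deliberately naive pipeline with hypothetical function names (`agent_a_fetch`, `agent_b_validate`, `agent_c_process` are illustrative stand-ins, not any real framework's API): each stage consumes the previous stage's output as if it were ground truth.

```python
# Naive three-agent pipeline: each stage trusts the previous one blindly.
# All names are illustrative stand-ins, not a real framework's API.

def agent_a_fetch(query: str) -> dict:
    # Imagine the model hallucinated this ID instead of calling the payments API.
    return {"transaction_id": "TX-12345", "verified": True}  # self-reported!

def agent_b_validate(tx: dict) -> dict:
    # B treats A's output as ground truth; nothing is checked against the source.
    return {"transaction_id": tx["transaction_id"], "balance": 5000}

def agent_c_process(record: dict) -> str:
    # C acts on B's output; the original hallucination now has consequences.
    return f"Executing payment of ${record['balance']} for {record['transaction_id']}"

result = agent_c_process(agent_b_validate(agent_a_fetch("find my last transaction")))
print(result)  # the decision rests on an ID no one ever verified
```

Nothing in this pipeline ever asks whether TX-12345 exists; each function only checks the shape of its input, which is exactly the gap the rest of this article is about.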
Logs Aren't Verification: Each Agent Logs Its Own Claim
You might think audit trails catch these cascades. They don't.
Agent A logs: "verified transaction". But a log is self-reporting. There's no independent witness confirming it actually happened.
Agent B logs: "received TX-12345 from A". This claims it got the output, but it doesn't verify that the output is real.
Orchestrator logs: "pipeline executed successfully". This assumes all agents worked correctly.
In a compliance audit, the logs look clean. The pipeline looks fine. Timestamps are in order. But the base claim was never verified—just logged.
Here's the critical distinction:
- Logs = Vendor self-reporting. The agent says what it did.
- Verification = Independent witness. Someone else confirms what actually happened.
Healthcare example: Agent A looks up a patient name in a database. Logs: "found patient John Smith". Agent B prescribes based on that patient record. Agent C dispenses medication. The logs show a clean pipeline, one patient, one prescription, one dispense.
But Agent A hallucinated the patient ID. A patient named John Smith exists, but the ID Agent A confidently used in the lookup belongs to someone else. Agents B and C never knew. The logs look fine. The patient got the wrong medication.
EU AI Act requires accountability for end-to-end agent behavior. Logs don't provide accountability—they're claims. Verification provides proof. If a regulator asks, "Prove that Patient A's ID was actually valid," logs give you nothing. Cryptographic proof of the lookup against the source system gives you everything.
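The log-versus-proof distinction can be made concrete with a toy sketch. All names here are hypothetical, and the "source system" is just an in-memory dict: a self-reported log entry succeeds no matter what the agent claims, while verification catches the bad ID by asking the source directly.

```python
# Hypothetical illustration: a log is self-reporting; verification asks the source.

SOURCE_SYSTEM = {"P-10001": "John Smith"}  # stand-in for the real records database

def log_claim(agent: str, claim: str) -> dict:
    # A log entry records what the agent *says* happened. Nothing checks it.
    return {"agent": agent, "claim": claim, "status": "logged"}

def verify_claim(patient_id: str, name: str) -> bool:
    # Verification compares the claim against the source system itself.
    return SOURCE_SYSTEM.get(patient_id) == name

# Agent A hallucinated the patient ID but logs a confident claim:
entry = log_claim("agent_a", "found patient John Smith (P-98765)")
print(entry["status"])                        # "logged": the audit trail looks clean
print(verify_claim("P-98765", "John Smith"))  # False: the ID isn't in the source
print(verify_claim("P-10001", "John Smith"))  # True: the real record checks out
```

The log succeeds unconditionally; only the source comparison can fail, which is what makes it evidence rather than a claim.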
The Verification Gap: Where Hallucinations Hide
The blind spot is at agent-to-agent handoffs.
Each agent verifies its own logic—"if X then Y"—but nothing verifies the input it received is actually ground truth.
Agent A verifies its logic internally: "if transaction exists then fetch balance". It doesn't verify that the transaction actually exists.
Handoff to Agent B: no verification of A's output.
Agent B verifies its logic: "if balance > threshold then approve". It assumes the input balance is real. Zero inter-agent verification.
Handoff to Agent C: no verification of B's output.
The pipeline has internal logic verification (each agent checks its own reasoning). But zero inter-agent verification (no one checks outputs at boundaries). This is where hallucinations hide.
What doesn't get caught:
- API was never called (agent claims it was, verification would show the proof)
- Database record doesn't exist (agent returns it, inter-agent check would compare against source)
- Number is hallucinated (confident, but unvalidated)
- Tool invocation failed silently (agent claims success, verification would show the proof signature)
Visually:
```
Agent A ✓ (logic OK) ──✗──> Agent B ✓ (logic OK) ──✗──> Agent C ✓ (logic OK) ──✗──> user
      (no boundary check)         (no boundary check)         (no boundary check)
```
Each box verifies its internal logic. Nothing verifies between the boxes.
Inter-Agent Verification Boundaries: Catching Lies at Handoffs
Verification at every handoff prevents false claims from becoming ground truth for downstream agents.
Between Agent A and Agent B: "Did A actually call that API? Show me the cryptographic proof." If there's proof, pass to B. If there's no proof, block the handoff.
Between Agent B and C: "Is this database record real? Compare the claimed record against the source system." Mismatch? Block. Match? Continue.
At system output: "Is this claim falsifiable? Can we validate it against ground truth?" Yes? Validate before the user gets the answer.
How it works in practice:
- Agent A claims: "Transaction TX-12345 verified". Trust Layer checks: is there cryptographic proof of this transaction? Timestamp? Hash? Signature? If yes, pass to B. If no, halt.
- Agent B claims: "Patient ID P-98765 found". Trust Layer checks: does this patient ID exist in the medical records source? Direct comparison. Match? Continue. Mismatch? Escalate.
- Agent C claims: "API returned USD 100". Trust Layer checks: did this API actually execute? Is there a signed proof with the timestamp? If verification fails, the handoff is blocked before anything downstream builds conclusions on it.
This is different from input sandboxing (preventing bad agents from running). This is output verification (proving what actually happened).
Implementation: Where to Put Verification Boundaries
Verification isn't a single checkpoint. It's a fabric across every agent handoff.
Between orchestrator and Agent A: verify A received correct input.
Between Agent A and Agent B: verify A's output before B depends on it.
Between Agent B and Agent C: verify B's output before C depends on it.
At system boundaries: verify final output against ground truth before the user gets it.
Cost: a small latency overhead at each handoff. Gain: hallucinations stop at the boundary instead of cascading.
The pattern:
- Agent A executes → returns output + execution proof (timestamp, hash, signature)
- Trust Layer intercepts at the boundary: "Is this output trustworthy?"
- Verification result: ✓ VALID (pass to Agent B) or ✗ INVALID (halt, escalate)
- Agent B receives the output only after verification
This works with any agent framework—LangChain, MCP, CrewAI, custom orchestrators. The verification layer is framework-agnostic.
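One way to see why the boundary is framework-agnostic: it can be a plain wrapper around each handoff. The sketch below is a hypothetical helper (the verifier callback stands in for whatever proof or source check your system uses); the downstream agent simply never runs on an unverified value.

```python
# Framework-agnostic handoff boundary: a sketch, not a real library's API.
from typing import Any, Callable

class HandoffBlocked(Exception):
    """Raised when an output fails verification at an agent boundary."""

def verified_handoff(output: Any,
                     verifier: Callable[[Any], bool],
                     downstream: Callable[[Any], Any]) -> Any:
    # The downstream agent runs only if the verifier vouches for the output.
    if not verifier(output):
        raise HandoffBlocked(f"unverified output blocked at boundary: {output!r}")
    return downstream(output)

# Usage: Agent A's claimed ID reaches Agent B only if it exists in the source.
KNOWN_TRANSACTIONS = {"TX-90001"}          # stand-in for the source system
in_source = lambda tx: tx in KNOWN_TRANSACTIONS
agent_b = lambda tx: f"balance fetched for {tx}"

print(verified_handoff("TX-90001", in_source, agent_b))  # real ID: passes through
try:
    verified_handoff("TX-12345", in_source, agent_b)     # hallucinated ID
except HandoffBlocked as e:
    print("blocked:", e)
```

Because the wrapper only needs a value and a callback, the same boundary can sit between LangChain tools, MCP calls, or a custom orchestrator's steps.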
Why This Matters in Production
Hallucination cascades aren't theoretical. They're happening now in production multi-agent systems.
Fintech: Agent chain handles payment processing. Agent A fetches account details (hallucinates). Agent B authorizes transfer based on the wrong account. Agent C completes the transfer. No inter-agent verification means the transfer happens to the wrong person. Impact: financial loss, regulatory violation, loss of customer trust.
Healthcare: Agent pipeline: fetch patient record → prescribe medication → dispense. If Agent A hallucinates the patient ID, Agent C dispenses wrong medication. Logs show clean pipeline. But patient got the wrong drug. Impact: patient harm, malpractice liability, compliance violation.
Compliance & Governance: Agent generates audit report claiming all decisions were verified. But inter-agent verification never happened—just logs. Regulator asks: "Prove Agent A's output was actually verified." You have logs. You don't have proof. Impact: regulatory failure, loss of license, inability to prove compliance.
EU AI Act requires accountability for agent behavior across the entire chain. If hallucinations cascade undetected, you can't prove accountability. Verification at every boundary is your proof—the evidence you need when regulators ask questions.
Conclusion: Hallucination Chains Require Verification Chains
Single-agent verification isn't enough. Multi-agent systems need verification at every handoff.
Key points:
- Orchestrators can't see what agents actually did—they only see logs
- Logs are self-reporting, not proof—verification is independent witness
- Each agent-to-agent handoff is an opportunity for unverified claims to become ground truth for downstream agents
- Cascading hallucinations are silent—logs look fine, audit trails look clean, but the output is wrong
- Inter-agent verification boundaries prevent lies from propagating
Trust Layer is the only verification layer that works across agent boundaries, models, and infrastructure. It's not about preventing agents from running (that's sandboxing). It's about proving what actually happened at every handoff—which is what compliance, safety, and accountability require.
If you're building multi-agent systems, the question isn't whether you'll get hallucinations. The question is: are you verifying outputs at every boundary before they cascade?