Governance Frameworks Tell You What to Log. They Don't Prove It Happened.
Microsoft released their agent-governance-toolkit. NIST published the AI RMF. The EU AI Act mandates logging for high-risk systems under Article 12. Every major framework now agrees: AI agents need audit trails.
None of them specify how to make those audit trails tamper-evident.
That's the governance-to-evidence gap. Policy says "log every tool call." Your agent logs every tool call. An auditor asks for proof. You hand over log files that the agent itself wrote. The auditor has no way to verify those logs weren't modified, truncated, or fabricated after the fact.
Governance without evidence is a checkbox exercise.
The problem is structural, not procedural
Consider a typical multi-agent pipeline: an orchestrator delegates tasks to specialist agents, each calling external APIs via MCP. Your governance framework says each call must be logged with timestamp, payload, and response.
So you add logging:
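from datetime import datetime, timezone
import httpx

@server.call_tool()
async def handle_tool(name: str, arguments: dict):
    # Forward the tool call upstream (AsyncClient, because httpx.post is blocking)
    async with httpx.AsyncClient() as client:
        resp = await client.post(upstream_url, json=arguments)
    logger.info("Tool %s called at %s", name, datetime.now(timezone.utc).isoformat())
    return resp.json()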
This satisfies the governance requirement on paper. But three problems remain:
- The logger is controlled by the same process that executed the action. A compromised agent can log whatever it wants.
- Timestamps are self-reported. No external authority certifies when the call happened.
- Log integrity is assumed, not proven. If someone modifies a log entry six months later, nothing in the system detects it.
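The third problem can be made concrete with a small sketch. It is an illustration only, and the names (`digest`, `log_lines`) and log entries are hypothetical: an edited log looks just as valid as the original unless a digest of it was recorded somewhere the agent cannot write.

```python
import hashlib

def digest(lines: list[str]) -> str:
    """SHA-256 over the log lines, joined with newlines."""
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

# Hypothetical log entries written by the agent itself
log_lines = [
    "2026-04-06T09:15:30Z tool=send_email status=200",
    "2026-04-06T09:15:31Z tool=delete_record status=200",
]

# Digest taken at write time; it only helps if stored outside the agent's control
recorded = digest(log_lines)

# Six months later, someone rewrites one entry
tampered = list(log_lines)
tampered[1] = "2026-04-06T09:15:31Z tool=read_record status=200"

# Both versions are well-formed logs; only the recorded digest tells them apart
unchanged = digest(log_lines) == recorded
detected = digest(tampered) != recorded
```

Without `recorded` held by a party other than the agent, nothing distinguishes `tampered` from the original.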
Governance frameworks acknowledge these risks. They just don't solve them at the runtime level.
What the frameworks actually require
The EU AI Act Article 12 mandates "automatic recording of events" for high-risk AI systems. Article 13 requires transparency about system behavior. Article 17 demands quality management systems with audit capabilities.
NIST AI RMF's MEASURE function calls for "mechanisms to track AI system behavior in deployment." ISO 42001 clause 9.1 requires monitoring and measurement of AI management system performance.
Read carefully: every framework requires evidence of what happened. Not just logs of what happened. The distinction matters because logs are claims. Evidence requires independent verification.
Closing the gap with cryptographic receipts
An Agent Action Receipt (AAR) transforms a log entry into independently verifiable evidence. Instead of your agent logging its own actions, a neutral proxy sits between the agent and the upstream API:
# BEFORE: agent calls the API directly and logs itself
async with httpx.AsyncClient() as client:
    resp = await client.post("https://api.example.com/send", json=payload)

# AFTER: agent calls through a verification proxy
async with httpx.AsyncClient() as client:
    resp = await client.post(
        "https://trust.arkforge.tech/v1/proxy",
        headers={"X-Api-Key": API_KEY},
        json={
            "target": "https://api.example.com/send",
            "method": "POST",
            "payload": payload,
        },
    )
proof = resp.json()["proof"]
The proxy does three things the agent cannot do for itself:
- Hashes both request and response (SHA-256) — binding what was sent to what was received
- Signs the receipt with Ed25519 — using a key the agent never holds
- Registers in Sigstore Rekor — a public, append-only transparency log maintained by the Linux Foundation
The receipt also includes an RFC 3161 timestamp from an external Time Stamping Authority. That makes three independent witnesses (the proxy's signature, the Rekor entry, and the timestamp authority), none of which is the agent.
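The hashing step can be sketched with the standard library alone. This is a minimal illustration, not ArkForge's implementation: the function names and the choice to hash raw body bytes are assumptions, while the `sha256:` prefix follows the receipt format shown in this article. Signing and the Rekor entry are left out because they need a private key and a network service.

```python
import hashlib

def sha256_tag(data: bytes) -> str:
    """Return the SHA-256 digest of data in 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

def hash_exchange(request_body: bytes, response_body: bytes) -> dict:
    """Hash both sides of one HTTP exchange (hypothetical helper)."""
    return {
        "request": sha256_tag(request_body),
        "response": sha256_tag(response_body),
    }

# Hypothetical request and response bodies
hashes = hash_exchange(b'{"to": "a@example.com"}', b'{"status": "sent"}')
```

Because the digests are computed by the proxy rather than the agent, the agent cannot later claim it sent or received something different without the hashes failing to match.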
What a receipt looks like
{
  "proof_id": "prf_20260406_091530_a7c3f1",
  "spec_version": "1.2",
  "hashes": {
    "request": "sha256:b159d950...",
    "response": "sha256:e51b41fd...",
    "chain": "sha256:1c90c2a5..."
  },
  "timestamp": "2026-04-06T09:15:30Z",
  "arkforge_signature": "ed25519:tMbiAuME7uToStdm...",
  "transparency_log": {
    "provider": "sigstore-rekor",
    "log_index": 1217489868,
    "verify_url": "https://search.sigstore.dev/?logIndex=1217489868"
  }
}
The chain hash binds all fields together using canonical JSON serialization (Spec v1.2), preventing field-reordering attacks. Anyone can verify the receipt without contacting the proxy — the Sigstore entry and public key are independently accessible.
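To see why canonical serialization matters, here is a minimal sketch of a chain hash. The actual canonicalization rules and the set of fields covered are defined by the proof spec and are not reproduced here; sorted keys with compact separators, the `chain_hash` name, and the placeholder field values are all assumptions for illustration.

```python
import hashlib
import json

def canonical_json(obj: dict) -> bytes:
    # Sorted keys and fixed separators: the same fields always yield the same bytes
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

def chain_hash(fields: dict) -> str:
    """SHA-256 over the canonical form of the receipt fields (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(canonical_json(fields)).hexdigest()

# Placeholder values, not real digests
fields = {
    "request": "sha256:request-digest-placeholder",
    "response": "sha256:response-digest-placeholder",
    "timestamp": "2026-04-06T09:15:30Z",
}

# Same fields in a different order, and one field altered
reordered = {key: fields[key] for key in ("timestamp", "response", "request")}
altered = {**fields, "timestamp": "2026-04-07T09:15:30Z"}

same = chain_hash(fields) == chain_hash(reordered)
different = chain_hash(fields) != chain_hash(altered)
```

Reordering the fields leaves the hash unchanged, while altering any value changes it. A verifier relies on exactly this property when recomputing the hash from a receipt.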
Mapping receipts to governance requirements
Here's where the governance gap closes. Each framework requirement maps to a concrete receipt property:
| Framework Requirement | Receipt Property |
|---|---|
| EU AI Act Art. 12 — automatic event recording | One receipt per tool call, generated at execution time |
| EU AI Act Art. 13 — transparency | Receipt includes full request/response hashes, shareable with users |
| NIST MEASURE — track behavior in deployment | Receipt chain provides complete execution history |
| ISO 42001 §9.1 — monitoring and measurement | Receipts are queryable, countable, auditable |
| Record retention (7+ years) | Sigstore Rekor entries are permanent and publicly searchable |
This isn't a theoretical mapping. You can generate a compliance report from actual receipts:
curl -X POST https://trust.arkforge.tech/v1/compliance-report \
  -H "X-Api-Key: $KEY" \
  -d '{"framework": "eu_ai_act", "date_from": "2026-01-01", "date_to": "2026-12-31"}'
The response shows per-article coverage (covered, partial, gap) with evidence summaries tied to specific proof IDs.
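For readers calling the endpoint from Python rather than curl, the same request can be built with the standard library. This sketch only constructs the request and does not send it. The `build_report_request` helper and the `Content-Type` header are assumptions, not documented ArkForge behavior; the URL, API key header, and body fields are taken from the curl command above.

```python
import json
import urllib.request

REPORT_URL = "https://trust.arkforge.tech/v1/compliance-report"

def build_report_request(
    api_key: str, framework: str, date_from: str, date_to: str
) -> urllib.request.Request:
    """Build (but do not send) the compliance report request."""
    body = json.dumps(
        {"framework": framework, "date_from": date_from, "date_to": date_to}
    ).encode("utf-8")
    return urllib.request.Request(
        REPORT_URL,
        data=body,
        headers={
            "X-Api-Key": api_key,
            "Content-Type": "application/json",  # assumption: the API expects JSON
        },
        method="POST",
    )

# "YOUR_KEY" is a placeholder for a real API key
req = build_report_request("YOUR_KEY", "eu_ai_act", "2026-01-01", "2026-12-31")
```

Passing `req` to `urllib.request.urlopen` would send it; the shape of the response body is not specified here, so parsing is left to the reader.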
The cost of not closing the gap
Most EU AI Act obligations for high-risk systems are scheduled to apply from August 2026. Organizations deploying high-risk AI systems need to demonstrate compliance, not describe it. The difference between "we have a logging policy" and "here are 47,000 cryptographic receipts covering every agent action in Q1" is the difference between an audit finding and an audit pass.
Governance toolkits are necessary. They define what compliance looks like. But they're the map, not the territory. The territory is what your agents actually did, provably, with evidence that survives scrutiny from parties who have every reason to be skeptical.
Try it
The ArkForge Trust Layer generates receipts for any HTTP transaction. Free tier: 500 proofs/month, no card required. Point your MCP server at the proxy endpoint, and every tool call produces a receipt that satisfies the logging requirements your governance framework already defines.
Proof spec (open source) — verify the cryptographic claims yourself.