Your AI agent says it completed the task. How do you verify that?

April 04, 2026 mcp agents verification python trust

Your agent just ran send_email. It returned: "Email sent to alice@company.com at 14:03."

You trust this. You move on. But here is the uncomfortable question: on what basis?

The agent produced a string. That string came from a tool call that ran on a server you may not control. Between "agent invoked the tool" and "task complete", there is a gap: nothing independent confirms that the reported action actually happened, with the arguments you expected, at the time claimed.

This is not a hypothetical edge case. It surfaces as real problems:

  • A customer disputes an automated action. Your logs say it happened. Their system says it didn't.
  • A pipeline runs store_record twice due to a retry. The agent reports success once. You don't know which version is canonical.
  • An auditor asks for proof that your agent ran action X before action Y. Your logs are self-attested.

The self-reporting problem

Most MCP integrations work like this:

your code
    → calls agent
    → agent calls tools/call
    → tool executes on remote server
    → server returns result
    → agent returns "Done."

The agent's "Done." is the only feedback you get as the caller. The agent isn't lying—but it's reporting based on the tool's return value. If the tool said it worked, the agent says it worked. If the tool's return value was wrong (partial execution, optimistic response, network retry), the agent's report is wrong too.

You, as the client, have no receipt.

What a receipt gives you

An MCP receipt is a signed record of what actually happened at the transport layer—not what the tool claimed happened. It captures:

  • the exact request payload sent to the upstream API
  • the exact response received
  • a timestamp from an independent source
  • a signature you can verify without contacting the server that executed the action
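The four items above can be pictured as a small record. The sketch below is only an illustration: the `Receipt` class, its field names, and the sample values are assumptions, not the proxy's actual schema, and it stores hashes of the payloads rather than the payloads themselves. The signature is treated as opaque and is not verified here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """Hypothetical receipt shape; not the proxy's actual schema."""
    request_hash: str   # SHA-256 hex digest of the exact request bytes sent upstream
    response_hash: str  # SHA-256 hex digest of the exact response bytes received
    timestamp: str      # time attested by a source independent of the MCP server
    signature: str      # proxy signature over the fields above (opaque in this sketch)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def matches_request(receipt: Receipt, request_bytes: bytes) -> bool:
    """Return True if the receipt was issued for exactly these request bytes."""
    return receipt.request_hash == sha256_hex(request_bytes)


# Illustrative values only.
request = b'{"to":"alice@company.com"}'
receipt = Receipt(
    request_hash=sha256_hex(request),
    response_hash=sha256_hex(b'{"status":"accepted"}'),
    timestamp="2026-04-04T14:03:00Z",
    signature="<opaque signature>",
)
unchanged = matches_request(receipt, request)
tampered = matches_request(receipt, request + b" ")
```

Even a single extra byte in the request changes the digest, so a receipt cannot be reused for a different payload.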

The key distinction: a receipt is created by a neutral proxy that sits between your MCP server and the upstream API. The MCP server cannot issue its own receipt for its own actions—that would be self-attestation again. The receipt comes from infrastructure the MCP server doesn't control.

your code (MCP client)
    → agent
    → MCP server
        → [Trust Layer proxy]  ← issues receipt here
        → upstream API
    → receipt returned alongside response

Verifying a receipt from the client side

When you use a proxy like ArkForge Trust Layer, each tool call generates a proof stored under a prf_ ID. Here is how to consume and verify it in Python:

import hashlib
import json

import httpx

TRUST_BASE = "https://trust.arkforge.tech/v1/proof"

def canonical_json(data: dict) -> str:
    # Sorted keys and compact separators give a stable string to hash.
    return json.dumps(data, sort_keys=True, separators=(",", ":"))

def verify_receipt(proof_id: str, original_payload: dict) -> bool:
    """
    Verify that a receipt matches what you sent.
    Returns True only if the receipt exists, its integrity is verified,
    and the recorded request hash matches the hash of original_payload.
    """
    # Step 1: integrity check (no auth required)
    check = httpx.get(f"{TRUST_BASE}/{proof_id}/verify", timeout=10.0)
    if not check.is_success or not check.json().get("integrity_verified"):
        return False

    # Step 2: payload hash comparison. Was this the request I actually sent?
    proof = httpx.get(f"{TRUST_BASE}/{proof_id}", timeout=10.0)
    if not proof.is_success:
        return False
    recorded = proof.json().get("hashes", {}).get("request", "").removeprefix("sha256:")
    # Assumes the recorded hash was computed over the same canonical JSON form.
    expected = hashlib.sha256(canonical_json(original_payload).encode()).hexdigest()

    return recorded == expected

You don't need the MCP server's cooperation for this verification. Anyone who holds the proof ID can run it: neither endpoint requires authentication, and neither goes through the MCP server. You can call them from anywhere, at any time, days or months later.
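The hash comparison only works if both sides serialize the payload identically, which is why the payload is canonicalized before hashing. A minimal sketch of the effect, with a hypothetical `payload_hash` helper and made-up payloads:

```python
import hashlib
import json


def canonical_json(data: dict) -> str:
    # Sorted keys and compact separators give a stable string to hash.
    return json.dumps(data, sort_keys=True, separators=(",", ":"))


def payload_hash(data: dict) -> str:
    return hashlib.sha256(canonical_json(data).encode()).hexdigest()


a = {"to": "alice@company.com", "subject": "Hi"}
b = {"subject": "Hi", "to": "alice@company.com"}   # same content, different key order
c = {"to": "alice@company.com", "subject": "Hi "}  # trailing space in the subject

same_content_same_hash = payload_hash(a) == payload_hash(b)
changed_content_changes_hash = payload_hash(a) != payload_hash(c)
```

Key order does not affect the hash, but any change to a value does. If the party recording the hash uses a different serialization, the comparison will fail even for an identical payload, so treat a mismatch as a reason to investigate rather than as proof of tampering.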

Practical example: verifying an email send

Here is a concrete workflow. Your agent uses an MCP tool that routes through a certifying proxy:

async def agent_sends_email(to: str, subject: str, body: str):
    # Your agent calls the MCP tool (which internally routes through the proxy)
    result = await mcp_client.call_tool("send_email", {
        "to": to,
        "subject": subject,
        "body": body
    })

    # The proxy sets X-ArkForge-Proof-ID on its HTTP response.
    # An MCP server author surfaces this in the tool response JSON as "_proof_id".
    proof_id = result.get("_proof_id")

    if proof_id:
        store_proof(
            action="send_email",
            recipient=to,
            proof_id=proof_id,
            timestamp=result.get("_proof_ts")
        )

    return result
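The `store_proof` call above is left to the application. One possible sketch uses SQLite from the standard library; the table layout and the `open_proof_store` and `proofs_for` helpers are assumptions for illustration, and unlike the call above this version takes an explicit connection argument.

```python
import sqlite3
from typing import Optional


def open_proof_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a hypothetical proofs table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS proofs (
               proof_id  TEXT PRIMARY KEY,
               action    TEXT NOT NULL,
               recipient TEXT NOT NULL,
               timestamp TEXT
           )"""
    )
    return conn


def store_proof(conn: sqlite3.Connection, *, action: str, recipient: str,
                proof_id: str, timestamp: Optional[str]) -> None:
    # INSERT OR IGNORE keeps a retried call from creating a duplicate row.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO proofs (proof_id, action, recipient, timestamp) "
            "VALUES (?, ?, ?, ?)",
            (proof_id, action, recipient, timestamp),
        )


def proofs_for(conn: sqlite3.Connection, recipient: str) -> list:
    rows = conn.execute(
        "SELECT proof_id FROM proofs WHERE recipient = ? ORDER BY rowid",
        (recipient,),
    ).fetchall()
    return [row[0] for row in rows]
```

Keying the table on the proof ID means a retry that reports the same proof twice is recorded once, while two distinct sends to the same recipient remain distinguishable.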

Later, if a recipient disputes receiving the email:

def audit_email_action(proof_id: str) -> dict:
    response = httpx.get(
        f"https://trust.arkforge.tech/v1/proof/{proof_id}/verify",
        timeout=10.0,
    )
    # Fail loudly on an unknown proof ID or a server error instead of
    # returning a dict full of None values.
    response.raise_for_status()
    check = response.json()

    return {
        "integrity_verified": check.get("integrity_verified"),
        "timestamp": check.get("timestamp"),
        "transparency_log": check.get("transparency_log", {}).get("status"),
        "verification_url": check.get("verification_url"),
    }

The transparency_log.status field indicates whether the chain hash has been anchored in Sigstore Rekor, a public, append-only transparency log. When status is verified, the record exists outside your infrastructure and outside the MCP server's infrastructure. It acts as a third, independent witness.
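When handling a dispute, it can help to collapse the audit result into a single verdict. The sketch below assumes the dictionary shape returned by audit_email_action above; the `summarize_audit` function and its three labels are hypothetical.

```python
def summarize_audit(audit: dict) -> str:
    """Classify an audit result dict (as built by audit_email_action)."""
    if not audit.get("integrity_verified"):
        # The proof failed its integrity check; do not rely on it.
        return "invalid"
    if audit.get("transparency_log") == "verified":
        # Intact and anchored in the public transparency log.
        return "anchored"
    # Intact, but not (or not yet) confirmed in the transparency log.
    return "intact-unanchored"


example = summarize_audit({
    "integrity_verified": True,
    "timestamp": "2026-04-04T14:03:00Z",  # illustrative value
    "transparency_log": "verified",
    "verification_url": None,
})
```

Only the "anchored" case gives you a record that exists outside both your infrastructure and the MCP server's.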

What this doesn't solve

Receipts prove that a specific HTTP request was sent to a specific endpoint and a specific response was received. They don't prove:

  • That the upstream service actually processed the request correctly (the email service might have accepted and then silently dropped the message)
  • That the agent's interpretation of the result was correct
  • That the tool did the right thing semantically

What receipts do establish: the exact bytes sent, the exact bytes received, the certified time, and an independent record. That's enough to settle disputes over what was sent and when, and enough to satisfy audit requirements for the transport layer.

For semantic verification—did the agent do the right thing, not just a thing—you still need application-level checks. Receipts are transport-layer proof, not correctness proof.

When to use client-side verification

Not every tool call needs independent verification. The overhead is real (an extra HTTP round-trip per call). Use receipts for:

  • Irreversible actions: email sends, payment initiations, record deletions
  • Cross-party handoffs: where another team or company will consume the output
  • Compliance-sensitive operations: anything that falls under logging requirements in your jurisdiction
  • Debugging multi-agent chains: when an orchestrator delegates to sub-agents and you need to trace causality

For read-only or idempotent operations (queries, lookups, summaries), receipts add cost with minimal benefit.
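One way to apply these criteria is a small policy function that decides per call whether to collect and verify a receipt. This is a sketch under assumptions: apart from send_email, the tool names are invented, and the two flags are hypothetical inputs your application would have to supply.

```python
# Tools whose effects cannot be undone. Only send_email comes from this
# article; the other names are placeholders.
IRREVERSIBLE_TOOLS = frozenset({"send_email", "initiate_payment", "delete_record"})


def needs_receipt(tool_name: str, *, cross_party: bool = False,
                  compliance_sensitive: bool = False) -> bool:
    """Return True when a tool call is worth the extra verification round-trip."""
    return (
        tool_name in IRREVERSIBLE_TOOLS
        or cross_party
        or compliance_sensitive
    )
```

Calling `needs_receipt("search_docs")` returns False, so read-only lookups skip the overhead, while `needs_receipt("search_docs", cross_party=True)` returns True because another party will consume the output.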

Setting up client-side receipt collection

If your MCP server already routes through a Trust Layer proxy, receipts are generated automatically; the only server-side work is exposing the proof ID in the tool result. To collect and use receipts from the client:

  1. Configure your MCP server to surface X-ArkForge-Proof-ID (returned by the proxy as a response header) in the tool call result JSON as _proof_id
  2. Store proof IDs alongside the action record in your application database
  3. Verify on demand: GET /v1/proof/{proof_id}/verify — no auth, always free
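Step 1 can be sketched as a small helper inside the MCP server. The header name and the _proof_id key come from the steps above; the `attach_proof_id` function and the way the server obtains the proxy's response headers are assumptions.

```python
from typing import Mapping

PROOF_HEADER = "X-ArkForge-Proof-ID"


def attach_proof_id(tool_result: dict, proxy_headers: Mapping[str, str]) -> dict:
    """Return a copy of tool_result with the proxy's proof ID under "_proof_id"."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in proxy_headers.items()}
    proof_id = lowered.get(PROOF_HEADER.lower())
    if proof_id is None:
        # No proof was issued; return the result unchanged.
        return dict(tool_result)
    return {**tool_result, "_proof_id": proof_id}
```

The helper returns a new dictionary rather than mutating its input, so the original tool result stays available for logging.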

Free tier: 500 proofs/month, no card required. The verification endpoint is always free—there's no charge to verify an existing proof.

# Check proof endpoint (no auth required for verification)
curl https://trust.arkforge.tech/v1/proof/prf_20260303_161853_4d0904

The response includes a human-readable HTML badge you can share with clients or auditors.

