The forces converging on AI provenance
"Provenance" used to be a museum word. In 2026 it's an infrastructure requirement — because three forces hit at the same time:
- C2PA: the Coalition for Content Provenance and Authenticity standard, backed by Adobe, Microsoft, OpenAI, Sony, BBC, Nikon, Canon, and a growing number of AI labs. C2PA defines a manifest format for content provenance: who made it, with which tool, from which source. It does not define where that manifest is anchored. That's where chain-anchoring comes in.
- EU AI Act: Article 50 requires that synthetic content be marked in a machine-readable, detectable way and that deepfakes be disclosed. Meeting that obligation credibly is hard without verifiable provenance. Anchored hashes are one of the simplest substrates.
- Deepfake disputes: defamation cases, election interference, and financial fraud are increasingly fought over whether content was tampered with. A SHA-256 hash, a Merkle proof, and an on-chain timestamp together form an evidence package that any party, including a court-appointed expert, can check independently.
The technical problem
AI labs already record what they generate. Most providers log prompts, model versions, and output hashes. But that log lives in their own database. When a dispute arises a year later (*"was this image really generated by your model in March 2026?"*), there is a fundamental trust problem:
- The AI lab could have modified its logs.
- The lab could be acquired, change policies, or shut down entirely.
- The verifier has no way to check without trusting the lab's word.
The fix is to put the hash somewhere the lab doesn't control. That's blockchain anchoring. The lab still keeps the original metadata, but the verifiable fingerprint lives on an immutable, third-party-readable ledger.
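As a rough sketch of what "the verifiable fingerprint" could look like on the lab's side, the snippet below hashes one log entry deterministically. The `GenerationLogEntry` shape and both function names are hypothetical, and sorted-key JSON is only one possible canonical form; the point is that the same entry must always produce the same hash, or later verification cannot work.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical shape of one generation-log entry; the field names are illustrative.
interface GenerationLogEntry {
  model: string;
  promptHash: string;
  outputHash: string;
  generatedAt: string; // ISO 8601 timestamp
}

// Serialize with keys in sorted order so the same entry always yields the same bytes.
function canonicalize(entry: GenerationLogEntry): string {
  return JSON.stringify(entry, Object.keys(entry).sort());
}

// The value that would be anchored outside the lab's own database.
function fingerprint(entry: GenerationLogEntry): string {
  return createHash('sha256').update(canonicalize(entry), 'utf8').digest('hex');
}
```

Only the fingerprint leaves the lab. The entry itself stays private, yet any later edit to it produces a different hash than the one on the ledger.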
Why C2PA alone isn't enough
C2PA gives you the manifest format: a structured record of the asset's provenance, serialized as a JUMBF container holding CBOR-encoded claims and assertions. It also gives you cryptographic signing: every step of the content's life can be signed by an identity. That's necessary, but it's not sufficient. Consider the gaps:
1. Signing trust expires
Every C2PA signature relies on a certificate. Certificates expire. CAs revoke. If you try to verify a 5-year-old image, the signing cert may no longer be valid even though the content is genuine. Anchoring the manifest hash on a public chain creates a timestamp that does not depend on any certificate staying valid.
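To make the timing argument concrete, here is a minimal sketch of the check a verifier might run years later. `ValidityWindow` and `wasCertValidAtAnchor` are hypothetical names, and the inputs are assumed to have been extracted already: the certificate's validity dates and the block timestamp of the anchoring transaction.

```typescript
// Hypothetical inputs: the signing certificate's validity window and the
// block timestamp of the transaction that anchored the manifest hash.
interface ValidityWindow {
  notBefore: Date;
  notAfter: Date;
}

// True when the anchor was recorded while the certificate was still valid,
// even if the certificate has expired by the time someone verifies.
function wasCertValidAtAnchor(cert: ValidityWindow, anchoredAt: Date): boolean {
  const t = anchoredAt.getTime();
  return t >= cert.notBefore.getTime() && t <= cert.notAfter.getTime();
}
```

This covers expiry only. Whether the certificate had been revoked before the anchor time is a separate question that this check does not answer.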
2. The manifest itself can be lost
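C2PA manifests are stored alongside the asset: in image metadata, in a sidecar file, in cloud storage. They get stripped by image-optimization pipelines, lost in transit, removed by uploaders. An anchored hash survives the loss of the manifest: as long as the bytes that were hashed are unchanged, any verifier can re-hash them and check the result against the chain. A pipeline that re-encodes the content changes those bytes, so hash the exact artifact you distribute.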
3. No public attestation
A C2PA manifest signed by an AI lab is only as trustworthy as that lab. A regulator or court wants third-party-verifiable proof. The Merkle root on a public chain provides exactly that — anyone can independently check it.
C2PA tells you what the manifest contains. Anchoring tells you the manifest existed at this exact moment. You need both.
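For readers who have not worked with Merkle proofs, the independent check can be sketched in a few lines. This assumes a plain binary tree in which each parent is the SHA-256 of its two children concatenated; the `ProofStep` shape and `verifyInclusion` are illustrative names, not Anchora's actual proof format.

```typescript
import { createHash } from 'node:crypto';

const sha256 = (data: Buffer): Buffer => createHash('sha256').update(data).digest();

// One step of an inclusion proof: the sibling hash and the side it sits on.
interface ProofStep {
  sibling: Buffer;
  position: 'left' | 'right';
}

// Recompute the path from the leaf up to the root and compare it with the
// root that was published on-chain.
function verifyInclusion(leaf: Buffer, proof: ProofStep[], expectedRoot: Buffer): boolean {
  let node = leaf;
  for (const step of proof) {
    node = step.position === 'left'
      ? sha256(Buffer.concat([step.sibling, node]))
      : sha256(Buffer.concat([node, step.sibling]));
  }
  return node.equals(expectedRoot);
}
```

The verifier needs only the leaf hash, the proof, and the on-chain root. Nothing in the check requires trusting whoever built the tree.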
The Anchora pattern
Anchoring an AI output looks the same as anchoring any record. The point isn't novelty — it's that the same primitive works for AI:
```typescript
import OpenAI from 'openai';
import { AnchoraClient } from '@anchora/sdk';
import { sha256 } from 'js-sha256';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const anchora = new AnchoraClient({
  apiKey: process.env.ANCHORA_API_KEY,
  projectId: 'proj_...'
});

// 1. Generate an AI output
const prompt = '...';
const generated = await openai.images.generate({ model: 'dall-e-3', prompt });
const imageUrl = generated.data?.[0]?.url;
if (!imageUrl) throw new Error('Image generation returned no URL');
const imageBytes = await fetch(imageUrl).then(r => r.arrayBuffer());

// 2. Build the provenance record (C2PA-shaped manifest)
const manifest = {
  asset_hash: sha256(imageBytes),
  model: 'dall-e-3',
  prompt_hash: sha256(prompt),
  generated_at: new Date().toISOString(),
  c2pa_compatible: true
};

// 3. Anchor the manifest hash on-chain
const result = await anchora.anchor({
  data: manifest,
  hashOnly: true
});
// → { hash, recordId, status: 'QUEUED' }

// Now anyone holding the manifest can re-hash the image, compare it with
// asset_hash, look up the on-chain proof, and confirm:
// "yes, this exact byte sequence was registered at this exact time."
```
Three steps matter: hash the asset, build a manifest, anchor it. The manifest itself can include whatever provenance fields the standard demands: C2PA today, whatever comes next tomorrow. What's anchored is its hash, which cannot be altered after the fact.
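The verifier's half of that flow can be sketched just as briefly. The snippet below covers only the local step: re-hashing the bytes in hand and comparing them with the manifest's `asset_hash`. Looking up the on-chain proof is left out because the lookup call is not shown here, and `sha256Hex` and `matchesManifest` are illustrative names.

```typescript
import { createHash } from 'node:crypto';

function sha256Hex(bytes: Uint8Array): string {
  return createHash('sha256').update(bytes).digest('hex');
}

// True when the bytes in hand hash to the value recorded in the manifest.
// Hex digests are compared case-insensitively.
function matchesManifest(assetBytes: Uint8Array, manifest: { asset_hash: string }): boolean {
  return sha256Hex(assetBytes) === manifest.asset_hash.toLowerCase();
}
```

A single changed byte in the image makes the comparison fail, which is what lets the hash stand in for the asset.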
What chain should anchor it?
This is where Anchora's multi-chain story matters for AI specifically:
| Strategy | Best fit for |
|---|---|
| Public (Polygon) | Consumer AI tools, anyone-can-verify use cases, news media, journalist tools. Permissionlessly verifiable by courts, regulators, and the public. |
| Private (Fabric BYO) | Internal AI tooling at regulated enterprises (banks running internal LLMs, hospitals using AI on PHI). Provenance stays within the trust boundary. |
| Hybrid | Regulated AI workflows where both internal audit and external attestation are required. Pharma AI tooling, gov AI systems, large-scale clinical AI. |
| Custom | Existing AI consortiums on their own chain (Ethereum mainnet, Besu, Quorum). |
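One way to read the table is as a small decision function. The sketch below is just that reading written down; `Requirements` and `pickStrategy` are hypothetical names, not part of any Anchora API, and real deployments will weigh more factors than three booleans.

```typescript
type Strategy = 'public' | 'private' | 'hybrid' | 'custom';

// Hypothetical requirement flags derived from the table rows.
interface Requirements {
  externalVerification: boolean;    // must outsiders be able to verify?
  internalOnlyData: boolean;        // must provenance stay inside the trust boundary?
  existingConsortiumChain: boolean; // is there already a chain to anchor on?
}

function pickStrategy(r: Requirements): Strategy {
  if (r.existingConsortiumChain) return 'custom';
  if (r.externalVerification && r.internalOnlyData) return 'hybrid';
  if (r.internalOnlyData) return 'private';
  return 'public';
}
```

For example, a bank running internal LLMs with no outside attestation requirement lands on `'private'`, while a newsroom tool with no internal constraints lands on `'public'`.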
Why this is the right time to build on this layer
Three concrete signals that AI provenance is moving from "research project" to "production requirement":
- OpenAI's C2PA rollout: DALL-E 3 outputs include C2PA manifests by default. Sora video is following. Watch for ChatGPT text watermarking.
- Adobe's "Content Credentials": rolled out across Creative Cloud. Every Photoshop Generative Fill output can carry a provenance chain.
- Camera makers shipping C2PA hardware: Sony, Nikon, and Canon are shipping camera bodies that sign images at capture time. Once the camera signs and your AI pipeline signs, the only thing missing is a permanent, verifiable anchor.
What this doesn't solve
Honest framing. Anchoring an AI output's hash does NOT:
- Prove the model wasn't manipulated. A bad-faith operator could anchor a forged manifest. Trust still flows from identity + signing + timing constraints — not the chain alone.
- Detect deepfakes that were never anchored. Anchoring is opt-in. Bad actors won't anchor their forgeries. What anchoring does is let you prove the genuine outputs ARE genuine — making the unanchored forgery stand out.
- Stop AI misuse. Cryptographic provenance is a forensics tool, not a content filter.
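Since trust flows from identity and signing as well as from the anchor, it can help to see the signing half next to it. The sketch below signs a manifest hash with Ed25519 using Node's built-in `crypto` module. The function names, the placeholder manifest, and the throwaway key pair are all illustrative; a real signer would load a managed key tied to a known identity.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from 'node:crypto';

function hashManifest(manifestJson: string): Buffer {
  return createHash('sha256').update(manifestJson, 'utf8').digest();
}

// Ed25519 has no separate digest step, so the algorithm argument is null.
function signManifestHash(hash: Buffer, privateKey: KeyObject): Buffer {
  return sign(null, hash, privateKey);
}

function verifyManifestHash(hash: Buffer, signature: Buffer, publicKey: KeyObject): boolean {
  return verify(null, hash, publicKey, signature);
}

// Throwaway key pair for illustration only.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const manifestJson = JSON.stringify({ asset_hash: 'placeholder', model: 'dall-e-3' });
const signature = signManifestHash(hashManifest(manifestJson), privateKey);
```

The signature says who vouched for the manifest, and the anchor says when. Neither one alone shows that the manifest's contents are true.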
The pitch isn't *"blockchain solves AI"*. The pitch is *"if your generative AI product cares about long-term verifiability, anchoring is the cheapest infrastructure for it"*.
How to start
Practically, if you're building an AI product right now:
- Decide which outputs need provenance. Not all do. Marketing copy probably doesn't. Compliance-critical outputs (medical notes, financial summaries, legal drafts, news content, evidence imagery) definitely do.
- Hash everything at the model boundary. Before the asset leaves your serving layer, SHA-256 it. Cheap, deterministic, infinite future utility.
- Anchor what matters. The Anchora free tier covers 1,000 records/month. That's enough to anchor a meaningful slice of high-value outputs without budget approval.
- Add C2PA later. Anchoring first, manifest standardization second. The hash is the substrate; C2PA is the wrapper.
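The first two steps above can be sketched as one function at the serving layer. Everything here is an assumption for illustration: the `OutputKind` categories, which of them count as compliance-critical, and the `atModelBoundary` name. The actual policy is yours to define.

```typescript
import { createHash } from 'node:crypto';

type OutputKind = 'marketing' | 'medical' | 'financial' | 'legal' | 'news' | 'evidence';

// Assumed policy: every kind except marketing copy gets anchored.
const ANCHOR_KINDS: ReadonlySet<OutputKind> = new Set<OutputKind>([
  'medical', 'financial', 'legal', 'news', 'evidence',
]);

interface ServedOutput {
  bytes: Uint8Array;
  sha256: string;        // hex digest of the exact bytes being served
  shouldAnchor: boolean; // whether this output is queued for anchoring
}

// Hash every output before it leaves the serving layer; flag only the
// compliance-critical ones for anchoring.
function atModelBoundary(bytes: Uint8Array, kind: OutputKind): ServedOutput {
  const sha256 = createHash('sha256').update(bytes).digest('hex');
  return { bytes, sha256, shouldAnchor: ANCHOR_KINDS.has(kind) };
}
```

Hashing everything while anchoring selectively keeps the option open: an output that was not anchored at generation time still has a recorded hash if the policy changes later.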
Anchoring an AI workflow?
Free tier covers 1,000 records/month on Polygon Amoy, Polygon's public testnet. Three lines of code. No credit card.