Proving Your Work Predates AI Training Data

AI models train on data ingested up to a cutoff date. A blockchain-anchored timestamp predating that cutoff is verifiable evidence that your work existed before the model could have seen it.

The temporal question in AI training disputes

When a creator alleges that an AI model was trained on their work without permission — a photographer whose images appear in a generative image dataset, a writer whose text is in a language model’s training corpus, a musician whose tracks were ingested into an audio model — the central evidentiary question is temporal. Did the work exist before the model’s training data was collected?

The answer to that question determines whether ingestion was even possible. If the work demonstrably existed before the cutoff, ingestion is plausible. If it didn’t, ingestion is impossible. So “when did your work exist?” is the first question — and it’s harder to answer cleanly than you’d expect.

Why standard evidence is weak

Without a tamper-evident anchor, the available evidence of when a digital file existed is surprisingly thin: filesystem timestamps can be edited freely, platform upload dates depend entirely on the platform's own records, and screenshots or self-addressed emails are easy to fabricate.

A blockchain-anchored SHA-256 hash addresses these weaknesses directly. The hash binds to the exact file bytes. The on-chain timestamp is recorded by network consensus and cannot be altered after the fact. Verification is public and adversarial: anyone can check the timestamp without trusting you.
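
A minimal hashing sketch in Python, using only the standard library; the file path is illustrative:

    import hashlib

    def sha256_of_file(path: str) -> str:
        """Compute the SHA-256 digest of a file's exact bytes."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            # Chunked reads so large master files never load fully into memory.
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    # Changing a single byte yields a completely different digest, which is
    # why the hash binds to the exact file rather than "roughly this content".
    print(sha256_of_file("portfolio/photo-final.tif"))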

How anchoring fits an AI training timeline

Most major AI models have published or inferable training-data cutoff dates. For image models, training data is typically scraped before a specific date and then frozen; the same holds for language and audio models. The cutoff is a fixed point in time after which new data cannot have been included.

If your work was anchored before that cutoff, the on-chain timestamp establishes prior existence beyond dispute. If it was anchored after, it doesn’t address the training data question — but it still establishes a useful evidentiary baseline for any future disputes.
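
The comparison itself is just a date check. A sketch with illustrative values, where both dates are assumptions rather than real cutoffs:

    from datetime import datetime, timezone

    # Illustrative values: a model's published training cutoff and the
    # block timestamp returned when the anchor is verified on-chain.
    training_cutoff = datetime(2023, 4, 1, tzinfo=timezone.utc)
    anchor_timestamp = datetime(2022, 11, 15, tzinfo=timezone.utc)

    if anchor_timestamp < training_cutoff:
        print("Anchor predates the cutoff: prior existence is established.")
    else:
        print("Anchor postdates the cutoff: no bearing on this model's")
        print("training data, but still a baseline for future disputes.")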

Practical workflow

Anchor before publishing or sharing publicly

The strongest position is to anchor immediately upon creating a finished file, before any web upload or share. The on-chain timestamp then predates any opportunity for the work to be scraped or ingested.
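
A sketch of the hash-then-anchor step, assuming a hypothetical HTTP anchoring API; the endpoint, payload shape, and receipt format are placeholders, not a real service:

    import hashlib
    import json
    import urllib.request

    ANCHOR_URL = "https://api.example-anchor.com/v1/anchors"  # placeholder

    def anchor_file(path: str) -> dict:
        """Hash the finished file and submit the digest for anchoring.
        Only the 64-character digest is sent; the file itself stays local."""
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        body = json.dumps({"sha256": digest}).encode("utf-8")
        req = urllib.request.Request(
            ANCHOR_URL,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    # Anchor the master before any upload or public share.
    receipt = anchor_file("track-master.wav")
    print("Anchor receipt:", receipt)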

Anchor retroactively if you have original files

If you have files that already exist and were created before any specific AI model's training cutoff, anchor them now. The timestamp won't go back in time, but the hash will match the original file bytes — and your file's modification dates (combined with the matching hash) provide additional context.
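
A sketch of gathering that context, pairing the hash with filesystem dates; the path is illustrative, and the mtime alone is weak evidence, which is exactly why it is paired with the hash:

    import hashlib
    import os
    from datetime import datetime, timezone

    def file_evidence(path: str) -> dict:
        """Hash an original file and record its filesystem dates.
        The mtime is trivially editable and proves nothing alone, but it
        corroborates a hash that matches the anchored digest."""
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        mtime = os.stat(path).st_mtime
        return {
            "sha256": digest,
            "modified": datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat(),
        }

    print(file_evidence("archive/2021-shoot/IMG_0042.CR2"))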

Anchor at portfolio-level granularity

For working professionals, anchor liberally — drafts, finished pieces, anything that might end up commercially valuable. Cost is cents per file. Coverage matters more than per-file analysis at the moment of anchoring.
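
A batch sketch for portfolio-level coverage; the directory name is illustrative, and the resulting digests would be submitted through whatever bulk interface your plan provides:

    import hashlib
    from pathlib import Path

    def hash_portfolio(root: str) -> dict[str, str]:
        """Hash every file under a directory tree for bulk anchoring."""
        digests = {}
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                digests[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
        return digests

    for path, digest in hash_portfolio("portfolio").items():
        print(digest, path)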

Keep the master files

Anchoring only proves the hash existed at the timestamp. To use that proof later, you need the original file. Back up master files separately from your active workflow.
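
Verification later is a byte-for-byte comparison: recompute the hash of the retained master and check it against the anchored digest. A sketch, with a placeholder where the receipt's digest would go:

    import hashlib

    def matches_anchor(path: str, anchored_digest: str) -> bool:
        """True only if the file is byte-identical to what was anchored."""
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest() == anchored_digest

    # The anchored digest comes from your on-chain receipt (placeholder here).
    anchored = "paste-the-digest-from-your-receipt-here"
    print(matches_anchor("masters/photo-final.tif", anchored))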

What anchoring does and does not do

Does: establish verifiable proof that a specific file existed at a specific time, which is useful evidence in any dispute about training data provenance, authorship, or prior publication.

Does not: prevent your work from being scraped, opt your work out of training datasets, establish authorship by itself, or constitute legal action against an AI provider. Anchoring is a proof system, not a prevention system or a legal claim. It pairs with other tools (no-AI metadata, content provenance signals like C2PA, copyright registration) to build a full defensive stack.

Pricing

Free tier: 5 lifetime anchors. Pro: $9.99/month for 50 anchors. Business: $49.99/month for 500 anchors with API access. See pricing.
