As with many tech developments, regulation lags behind the revolution.
The EU Generative AI Outlook Report 2025 is full of promise: a fair data economy, harmonised copyright rules, and “trustworthy” AI that respects creators.
But read between the lines and the cracks begin to appear. The report itself warns of “a lack of consistency being exacerbated by differing opinions of courts in EU Member States” and admits that “existing technologies, such as robots.txt… may not be well-suited for GenAI applications.”
In other words: the rules aren’t ready for the machines.
Shocker.
AI models have already absorbed decades of journalism, from local archives to national dailies, without consent, compensation, or clear liability. Publishers are being told to “trust the process,” while that very process is being rewritten by technology companies that don’t ask permission.
The gap between the policy ideal and market reality is widening, and it’s publishers, journalists, artists and creators who fall through it.
Consequence: The invisible invoice
When AI models train on publisher content, the intellectual property value doesn’t vanish; it migrates elsewhere.
Every scraped article, every unlicensed dataset, is a quiet transfer of wealth from those who create knowledge to those who automate it.
Yet the legal landscape offers little recourse. Liability is murky: if AI-generated output infringes on a third party’s copyright, who’s responsible? The model provider? The user? The data source?
Personally, I would say it’s the provider, as they build and train the models, but it remains up for debate.
The report calls for “intensified efforts toward standardisation”, but until that happens, publishers are left footing the bill for innovation they didn’t agree to fund.
The reality is that licensing content is tedious, difficult and an all-round pain in the arse.
Anthropic called traditional licensing a “business slog”, given the extra effort involved in legally acquiring content from across the web (hard to blame them).
Meanwhile, the old defensive tools publishers use, like robots.txt, once the digital equivalent of a “no trespassing” sign, are all but useless in the age of large-scale scraping and fine-tuned models.
The result is a paradox: publishers are more visible than ever, and yet less in control of their own visibility.
How to win: Turn compliance into commercial leverage
Here’s the flip side.
The same regulation that feels heavy-handed today can become a monetisation framework tomorrow.
The European Strategy for Data envisions “Common European Data Spaces” — structured, secure environments where data flows with provenance, auditability, and consent.
That’s infrastructure people need, rather than additional bureaucracy, which people hate. It means every dataset, including publisher archives, could soon move through a system where ownership and usage are verifiable.
In other words: attribution and revenue.
If you can prove your data’s provenance, package it as a clean dataset, and reserve your rights for text and data mining (TDM), you morph from a passive participant in AI training to a reliable supplier of a desired product: your content.
Why is this important? Well, simply put: suppliers get paid.
To get there, publishers need to build three key capabilities:
Rights posture
Make ownership visible and machine-readable; reserve TDM rights explicitly.
Data packaging
Structure archives so they can be licensed, not just read.
Provenance & observability
Tag, trace, and verify content across the AI supply chain.
These aren’t compliance chores; they’re commercial positions.
Done right, they turn policy overhead into competitive advantage.
The Three New Plays for Publishers
1️⃣ Blockchain-Based Licensing: The proof layer
When the report calls for “harmonised approaches to reserving rights”, it’s describing a technical vacuum that blockchain can fill.
Thinking of blockchain as nothing more than crypto speculation is a mistake. Instead, recognise it as a proof layer for ownership and authenticity.
Writers’ Bloc creates cryptographic digital fingerprints for your content and logs them to the blockchain for provenance and ownership.
Other projects, like the Coalition for Content Provenance and Authenticity (C2PA) and Content Credentials, use similar cryptographic signatures to verify where and when a piece of content was created.
For publishers, this means your archive can carry its own passport: proof of origin, usage rights, and permitted contexts.
As AI regulation demands traceability, the ability to attach verifiable provenance becomes a commercial differentiator.
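To make the "fingerprint" idea concrete, here is a minimal sketch in Python. It is not how Writers’ Bloc or C2PA actually work (both involve signing and, in the first case, a ledger); it only shows the hashing step that such systems build on. The record fields, the `usage` label and the example values are my own illustrative assumptions, not a standard.

```python
import hashlib
import json
import unicodedata


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the text after light normalisation."""
    # Normalise Unicode and collapse whitespace so that trivial formatting
    # differences (extra spaces, line breaks) do not change the fingerprint.
    normalised = unicodedata.normalize("NFC", text)
    normalised = " ".join(normalised.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def provenance_record(title: str, body: str, publisher: str,
                      published: str, usage: str) -> dict:
    """Build a hypothetical 'passport' record for one article."""
    return {
        "title": title,
        "publisher": publisher,
        "published": published,  # ISO 8601 date supplied by the caller
        "usage": usage,          # free-text label, not a standard vocabulary
        "sha256": fingerprint(body),
    }


# Illustrative values only.
record = provenance_record(
    title="Council approves housing plan",
    body="The council approved the new housing plan on Tuesday night.",
    publisher="Example Gazette",
    published="2025-01-15",
    usage="TDM reserved; licence required for AI training",
)
print(json.dumps(record, indent=2))
```

Anyone holding the same text can recompute the hash and compare it with the logged one. Proving who logged it, and when, is the part that signatures or a ledger would have to add.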
So what:
Publishers who embed provenance metadata now will own the trust premium later. When models are required to disclose training sources, only the verifiable get paid.
2️⃣ GenAI Rights Management: The defensive engine
The next battlefield is detection: seeing where your work shows up in places it shouldn’t. And doing something about it.
AI now rewrites, paraphrases, and blends content at scale. Without automation, monitoring infringements is impossible.
Enter GenAI-assisted rights management: a new class of tools that use machine learning to find derivative or paraphrased copies of your work across the web.
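For a rough feel of what "finding copies" means in practice, here is a deliberately simple stand-in in Python. It uses word shingles and Jaccard similarity rather than machine learning, so it catches reordered or lightly edited copies but not true paraphrases; the real tools go well beyond this. The function names and sample sentences are made up for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Split text into overlapping word n-grams ('shingles')."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = ("The council approved the new housing plan "
            "after a heated debate on Tuesday night.")
suspect = ("After a heated debate on Tuesday night, "
           "the council approved the new housing plan.")
unrelated = "Local bakery wins a national award for its sourdough bread."

# A reordered copy scores far higher than an unrelated sentence.
print(round(jaccard(original, suspect), 2))
print(round(jaccard(original, unrelated), 2))
```

A score above some threshold would flag a page for human review. Where that threshold sits is a judgment call that depends on your archive.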
Writers’ Bloc makes it easy to see where your protected articles and content are appearing on the web without attribution or payment.
Pair that with standardised digital rights metadata (like IPTC or schema.org licensing attributes), and you can turn rights protection into a scalable, semi-automated workflow.
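As a sketch of what that metadata could look like, the Python snippet below builds a schema.org `NewsArticle` description with its `license`, `acquireLicensePage` and `copyrightHolder` properties, then wraps it as JSON-LD for a page's HTML. The helper name, publisher and URLs are placeholders I have assumed for the example.

```python
import json


def article_jsonld(headline: str, url: str, publisher: str,
                   published: str, license_url: str, acquire_url: str) -> dict:
    """Describe one article using schema.org licensing properties."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "url": url,
        "datePublished": published,
        "copyrightHolder": {"@type": "Organization", "name": publisher},
        "license": license_url,             # the terms that apply to this item
        "acquireLicensePage": acquire_url,  # where a licence can be obtained
    }


# Placeholder values for illustration.
metadata = article_jsonld(
    headline="Council approves housing plan",
    url="https://example.com/news/housing-plan",
    publisher="Example Gazette",
    published="2025-01-15",
    license_url="https://example.com/licence",
    acquire_url="https://example.com/licensing",
)

# Embed the description in the article's HTML as a JSON-LD script tag.
script_tag = ('<script type="application/ld+json">\n'
              + json.dumps(metadata, indent=2)
              + "\n</script>")
print(script_tag)
```

Metadata like this does not stop anyone from copying. It makes your terms discoverable by software, which is what lets the rest of the workflow be automated.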
So what:
This isn’t about chasing lawsuits. That only adds more legal admin work that no one wants (except the lawyers).
Instead, it’s about being able to quantify the value of your IP. When you can see who’s using your content and how, you shift from reactive defence to measurable asset management, with data that strengthens every future negotiation.
3️⃣ Data-Mining Licensing: The monetisation frontier
Under current EU copyright law (Article 4 of the 2019 Copyright in the Digital Single Market Directive), AI developers can use your data for text and data mining (TDM) unless you explicitly say otherwise.
This “opt-out by default” system is the quiet loophole fuelling the GenAI boom.
It’s pretty extortionate, if you ask me. If anything it should be opt-in. But here we are.
The GenAI Outlook Report flags this issue directly: “Solutions must be sought to fairly compensate creators whose works are used in the AI training process.”
That solution starts with rights reservation. Once you tag your content as “TDM-protected,” you can offer authorised access through controlled APIs or licensing agreements, effectively creating premium, rights-cleared datasets.
Private, bilateral deals that OpenAI has struck with News Corp and Axel Springer show this model is already emerging: licensing archives not as journalism, but as training data.
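One existing convention for that tag is the TDM Reservation Protocol (TDMRep), published by a W3C community group. The Python sketch below generates the three forms as I understand them: HTTP headers, HTML meta tags, and a `/.well-known/tdmrep.json` file. Treat the field names as something to verify against the current specification; the policy URL and paths are placeholders.

```python
import json


def tdm_headers(policy_url: str) -> dict[str, str]:
    """HTTP response headers that declare TDM rights as reserved."""
    return {"tdm-reservation": "1", "tdm-policy": policy_url}


def tdm_meta_tags(policy_url: str) -> str:
    """Equivalent declaration for the <head> of an HTML page."""
    return (
        '<meta name="tdm-reservation" content="1">\n'
        f'<meta name="tdm-policy" content="{policy_url}">'
    )


def tdmrep_json(paths: list[str], policy_url: str) -> str:
    """Site-wide rules, intended to be served at /.well-known/tdmrep.json."""
    rules = [
        {"location": path, "tdm-reservation": 1, "tdm-policy": policy_url}
        for path in paths
    ]
    return json.dumps(rules, indent=2)


# Placeholder policy URL and paths for illustration.
POLICY = "https://example.com/tdm-policy.json"
print(tdm_headers(POLICY))
print(tdm_meta_tags(POLICY))
print(tdmrep_json(["/archive/*", "/news/*"], POLICY))
```

A declaration like this does not enforce anything by itself. It puts your reservation on record in a machine-readable form, which is what any later licensing conversation leans on.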
So what:
If you’re a publisher, this is the new syndication.
Your words now do more than inform readers: they also train algorithms. The difference between being scraped and being paid is whether your data is reserved, structured, and licensable.
From red tape to runway
The EU’s AI legislation may be slow, but it’s moving toward one inevitable principle: data traceability equals accountability.
And accountability is monetisable.
The winners in this new ecosystem won’t necessarily be those who publish the most, or the most often, but those whose content is the most traceable.
Attribution is the name of the game.
Start by clarifying your rights posture.
Then package your data so it can travel with attribution.
Finally, ensure your content carries a verifiable signature wherever it goes.
You don’t need to out-innovate or out-litigate AI companies (hint: you won’t do either).
You just need to out-verify them.
In the coming data economy, at least in the EU, the players who can prove their provenance will also be the ones who profit from it.
What next?
Publishers have always been the backbone of information. It’s time to become the infrastructure too.
If you want to understand how to turn your archives into revenue-generating data assets, and actually get paid for your content in the AI era, let’s talk.
I’ll walk you through how to make regulation work for you, not against you.
Book a call, or reply to this email, and we can have a brief chat to see where you’re at.
