Web Summit Lisbon last week didn’t change my mind about AI and publishing. It did something more important: it validated the 450-odd conversations I’ve had with journalists and editors since 2022.
Suddenly I feel a lot less crazy.
For the last three years I’ve been asking publishers the same question: “What is your archive really worth?” At Web Summit, I finally heard that question echoed back from the stage.
Behind the keynotes and the hype, a quieter story ran through the event this year:
AI labs and publishers need each other more than either side wants to admit.
I went to Lisbon with a clear hypothesis from those 450+ interviews, which I’ve banged on about before: publishers are, in reality, data companies; their archives are wildly undervalued; and AI labs and media are in a co-dependent relationship they haven’t fully owned yet.
Here are the top 3 insights I took away, and what actually matters if you work in media.
1. Publishers are data companies (even if you don’t introduce yourself that way)
My favourite talk was “Publishers in the age of AI: from traffic to clicks” with Brooke Hartley Moy, CEO and co-founder of Infactory. She clearly articulated something I’ve been trying to tell publishers for years: the old web was built for humans; the new web is being built for machines.
Paywalls are for people.
The old web cared about clicks and dwell time. The new web is built for bots: embeddings, training sets, queryable datasets. In an AI-first world, a reader may never visit your homepage. Their assistant will query an underlying knowledge base and surface an answer. Whether that answer is grounded in your work depends on one thing: is your content structured in a way a machine can understand and trust?
Most publishers already have what AI systems want: decades of reporting and verification on specific beats. It’s often dismissed as “old content”, yet in fact it’s training data and reference data of the highest quality.
The subtle but important shift Brooke made was to say this out loud: you’re not just producing stories, you’re maintaining a living record of a domain. Records can be licensed and queried. Pages can only be scrolled.
This is precisely why I’m building Writers’ Bloc: a marketplace where publishers and AI labs can trade data, stories, and archives in a simple, legal, win-win way.
No scraping. No dark-pattern “partnerships”. Simply: here’s what we’ve built, here’s how you can use it, here’s how everyone gets paid.
Make trade, not war.
You don’t need a data science team to move in this direction. Start by:
Listing what you actually have: long-running series, evergreen explainers, archives, analogue material.
Marking what is truly evergreen: if it will still help someone in three years, it’s disproportionately valuable.
Giving someone informal ownership of “the archive” as an asset, not a dump.
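To make the first step concrete, here’s a minimal sketch of what “listing what you actually have” could look like as structured data. The field names and sample entries are invented for illustration; the point is simply that a catalogue with an explicit evergreen flag lets you surface the disproportionately valuable slice of your archive in one line.

```python
from dataclasses import dataclass

# A hypothetical record for one archive asset; the fields are
# illustrative, not a standard schema.
@dataclass
class ArchiveAsset:
    title: str
    beat: str             # the domain this asset covers
    first_published: int  # year
    evergreen: bool       # will it still help someone in three years?

catalogue = [
    ArchiveAsset("Port wine industry series", "trade", 2015, True),
    ArchiveAsset("2019 election live blog", "politics", 2019, False),
    ArchiveAsset("How EU VAT works for freelancers", "finance", 2021, True),
]

# The disproportionately valuable slice: evergreen reference material.
evergreen = [asset.title for asset in catalogue if asset.evergreen]
print(evergreen)
```

Even a spreadsheet with these four columns gets you most of the way; the structure matters more than the tooling.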
Even that small reframing changes how you think about value, partnerships, and eventually deals with AI companies.

Brooke Hartley Moy, CEO and co-founder, Infactory
2. Big broadcasters are still paying for humans on the ground
Another talk that stuck with me was “Airwaves of Influence” with David Rhodes (Sky News), Rachel Corp (ITN), and Pedro Vargas David (Euronews), moderated by Dominic Ponsford (Press Gazette).
There was plenty of platform talk: TikTok numbers, product experiences, younger audiences. But the line that mattered most to me was simple: these organisations are still spending serious money to put reporters on the ground.
War zones, natural disasters, fragile political situations; the sort of reporting that is slow, risky, and expensive. In a year where you could very easily hide behind agency footage and AI-generated explainers, they’re still writing cheques so humans can stand in the middle of the story.
I don’t agree with every editorial decision any of these outlets make, and you probably don’t either. But that commitment to on-the-ground journalism deserves respect and a shoutout. It’s a reminder that high-trust journalism still comes from people who are physically there, able to verify, question, and witness.
For freelancers and smaller publishers, that’s the emotional takeaway: your moat is not “we also have opinions”. Your moat is proximity to the story and the trust you build over time. AI can help you translate, clip, and repurpose your work. It cannot do the work on the ground, replicate the decision to show up, or build the credibility that accumulates when you keep doing it.

3. Why “graph-based RAG” should be on your radar
The last piece came from a more technical angle: a talk by Jamie Hutton, co-founder and CTO of Quantexa, about how AI actually improves decisions.
His line was: be decision-driven, not data-driven. Start from the decision you care about, and let that dictate how you use AI. For publishers, that’s where graph-based retrieval-augmented generation (graph RAG) becomes interesting.
Traditional RAG treats your archive like a pile of documents. Ask a question, and the system fetches a few articles that look similar and asks a model to stitch together an answer. It’s fine for FAQs, but it struggles with journalism: long timelines, recurring characters, companies that merge or rebrand, investigations that unfold over years.
Graph-based RAG models your world as a network: people, organisations, places, cases, dates, and the links between them. When a question comes in, it doesn’t just look for matching wording; it walks the graph:
Who is connected to this minister?
What else involves this company?
How has this situation evolved across your coverage, not just in one story?
For a newsroom, that means better timelines on demand, stronger background in seconds, and fewer hallucinations, because the system is anchored to a graph of your own verified reporting rather than a fuzzy guess over random web text.
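To make the “walking the graph” idea tangible, here’s a toy sketch. Every entity and relationship below is invented for illustration (a real system would extract them from your archive with NLP and store them in a proper graph database), but it shows how “who is connected to this minister?” becomes a traversal over your own verified reporting rather than a keyword search.

```python
# A toy knowledge graph over verified reporting: keys are entities,
# values are (relationship, target) edges extracted from articles.
# All names here are invented for illustration.
graph = {
    "Minister Silva": [("chairs", "Energy Committee"), ("met", "Acme Corp")],
    "Acme Corp":      [("acquired", "BetaGrid"), ("lobbied", "Energy Committee")],
    "BetaGrid":       [("operates", "Northern Pipeline")],
}

def walk(entity, depth=2):
    """Collect every (source, relation, target) triple reachable
    from an entity within `depth` hops of the graph."""
    found, frontier = [], [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation, target in graph.get(node, []):
                found.append((node, relation, target))
                next_frontier.append(target)
        frontier = next_frontier
    return found

# "Who is connected to this minister?" is now a two-hop walk that also
# surfaces second-order links, like the acquisition behind the meeting.
for triple in walk("Minister Silva"):
    print(triple)
```

Note what falls out for free: the walk surfaces that the company the minister met has acquired another firm, a second-order connection a similarity search over individual articles would likely miss.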
You don’t have to deploy graph RAG next quarter. But you can start asking graph-shaped questions:
Are we capturing entities and relationships, or just headlines and tags?
Which editorial or commercial decisions would be better if we could see connections across years of coverage instantly?
Those questions nudge you toward treating your archive as critical infrastructure, not a relic of the past.
4. So what? How can you use this?
Coming home from Lisbon, what I felt most was relief. And excitement.
It’s nice to have your ideas independently validated by others (it’s how I know I’m not totally insane!)
The things I’ve been hearing in one-on-one conversations (the anxiety, the sense that something big is shifting, the frustration at IP being strip-mined by AI labs) are real. And the people building the next layer of the web are starting to say the quiet part out loud: they need what publishers already have.
The headline, for me, is this:
AI labs and publishers are co-dependent.
AI systems get better with high-quality, structured, verified data.
Publishers need new ways to monetise that data and reach audiences without being swallowed by intermediaries.
Fighting each other is less interesting than figuring out how to build on top of each other.
Staying stuck in “block everything and hope the lawyers win” mode won’t get us there. Neither will handing over the archive for a one-off cheque and a vague promise. The path forward looks more like: understand what you have, structure it, and negotiate from there.
Start small. Pick one series, one beat, one dusty box. Treat it as the beginning of a data business, not the end of a publishing cycle.
Remember: make trade, not war.
I’m going to keep sharing what I’m learning at the intersection of rights, licensing, and AI. If there’s a specific question you want answered, reply and tell me (I read every one).
If you know an editor, publisher, or reporter who’s quietly worrying about AI and their archive, share this with them. These are conversations we shouldn’t be having alone.
PS: if you got to the end of this, bravo! We ran slightly longer this week, but it was such an action-packed week in Lisbon that it was hard to distill everything into a 5-minute read.