Collective Licensing Wants to Help Publishers Scrape Back Revenue From AI Companies

By Bron Maher

Nearly three years after ChatGPT sparked a panic over the future of publisher copyright, the machinery of the existing copyright regime is starting to grind into gear. 

This year a group of UK collective licensing services—bodies that collect royalties on behalf of a given industry—are set to unveil a new licensing service tailored to dealing with the use of copyrighted text by generative AI applications. The U.S.’s Copyright Clearance Center, likewise, promises its AI training license will be unveiled before the end of 2025.

One of those UK organizations, Publishers’ Licensing Services, says the “pioneering” collective license will give publishers a way to seek remuneration when their content is used in the training and fine-tuning of AI models.

But don’t make that car down payment just yet: the sums involved, if they’re anything like those currently being disbursed, look unlikely to be transformational.

Collective licensing services in the UK are government-regulated, generally non-profit organizations, but are otherwise analogous to the U.S.’s for-profit Copyright Clearance Center. They sell blanket licenses that allow customers to legally use copyrighted materials belonging to any of their members.

The PLS, for example, has 4,500 member publishers including Thomson Reuters, Springer Nature and the Royal College of Physicians, and allows British, US and EU publishers alike to sign up to receive licensing revenue originating in the UK. Customers who pay for the right to use short extracts from PLS members’ publications are most often businesses and academic institutions. (British newspapers license through a related body named NLA media access.)

The service redistributes around £50 million ($67 million) to publishers each year—on average a little over £10,000 ($14,000) annually for each member. Copyright owners are paid based on how much their work is used by customers, something PLS ascertains through usage reports and surveys.

But Will Crook, the head of policy and communications at the PLS, told AMO that collective licensing generates “material revenues for the ‘long tail’ of publishing,” with B2B and specialist titles receiving “a significant share” of the total. He said he wouldn’t be able to share specific revenue totals, but recommended any interested publishers speak “to the PLS team directly to get a better sense of what to expect.”

Perhaps the strongest argument for joining a collective licensing service is that it’s free: in the PLS’ case, its operations are funded by taking a little over 5% off the money it handles for members. The services are also non-exclusive, so any rights-holders who sign on to one of them can still license their content directly to partners.

The creation of a collective license for AI applications, currently under development by licensing umbrella organization the Copyright Licensing Agency, comes as media organizations grapple with the widespread ingestion of their copyrighted materials for the training of AI models like those behind ChatGPT. While some publishers have struck agreements with AI companies governing how their data can be used going forward, most currently receive no compensation for the use of their material. Some media companies have filed lawsuits against AI companies; those cases are ongoing.

The success of the new AI license will rely on AI companies signing up to the service, which is not a given. OpenAI, for example, maintains that training the models underpinning ChatGPT on content from the internet falls within the bounds of fair use. Meta, meanwhile, is alleged to have flouted copyright law altogether by accessing a library of pirated books while training its Llama system.

But Crook said the deals already struck with publishers indicated “some caution to the risk of copyright infringement” on the part of AI companies.

The AI license brandishes both a carrot and a stick toward firms like OpenAI, Google and Meta. Crook said signing up “provides them with legal access to high quality, curated content essential to the reliability of a generative AI model.” But it also offers “some protection from claims of infringement, and should give confidence to a developer, be they large or small, to innovate using that content.”

To that end he felt it would be “wrong to paint all AI developers with the same brush.” Some developers want to establish “a more ethical and sustainable relationship with the industries who will be supplying the content for their models for years to come.”

Meanwhile, the PLS has also seen “a significant and growing” licensing market among companies that develop AI tools for internal use, Crook said, “for which the sourcing of dependable, high-quality, rights-cleared content is a key component.”

While the non-profits develop their response to AI, the private sector sees an opportunity. The Financial Times reported earlier this month that private content licensing and data marketplace startups have received $215 million since 2022, while one of those services, music licensing startup Vermillio, told the paper it expects the AI licensing market to expand more than six-fold to $67.5 billion in the next five years.

PIP Labs, which has developed a blockchain protocol that it says can track the use of intellectual property by AI, received $80 million from investors led by Andreessen Horowitz last year, while startup ProRata, which wants to share revenue with publishers each time their content is used, was reportedly valued at $130 million in November following an investment from DMG Media.

Crook said the PLS “welcome the work of organizations who are helping to facilitate licensing between rights-holders and AI developers, and with our strong connection to rights-holders, we are open to exploring potential cooperation with them.”

The advantage of the non-profits’ approach is that Copyright Licensing Agency licenses are approved not just by the PLS but also by sister agencies representing authors, designers and artists, and photographers, he said.

“This means that each license is backed by the rights-holders and creators who have produced the content used, and therefore are able to provide a broad grant of rights, which in turn enables CLA to offer a broad licensed repertoire to licensees. This unique and inclusive model is one of the reasons we’ve used the word ‘pioneering’ in describing our AI licensing development.”