Many, if not most, producers of AI technologies argue that fair use gives them the right to train AI models on copyrighted material scraped from the internet — even without permission from the rights holders. But some vendors, such as OpenAI, are hedging their bets — perhaps wary of the outcome of pending lawsuits on the matter.
OpenAI today announced that it has reached an agreement with Axel Springer, the Berlin-based owner of publications including Business Insider and Politico, to train its generative AI models on the publisher’s content and surface recent Axel Springer articles in ChatGPT, OpenAI’s viral AI-powered chatbot.
It’s OpenAI’s second such deal with a news organization, after the startup said it would license some of the Associated Press’ archives for model training.
From now on, ChatGPT users will receive summaries of “selected” articles from Axel Springer publications — including stories that are normally gated behind a paywall. Excerpts will be accompanied by attribution and links to the full articles.
In return, Axel Springer will receive payments of unspecified size and frequency from OpenAI. The agreement runs for several years and — while not binding either side to exclusivity — Axel Springer says it will support the publisher’s existing AI-driven ventures that are “based on OpenAI’s technology.”
“We are excited to have formed this global partnership between Axel Springer and OpenAI — the first of its kind,” Axel Springer CEO Mathias Döpfner said in a canned statement. “We want to explore the opportunities of AI-empowered journalism — to take the quality, societal relevance and business model of journalism to the next level.”
Setting aside publishers tapping generative AI for questionable content strategies, publishers and AI vendors have a fraught relationship: the former allege copyright infringement and are increasingly concerned about generative models cannibalizing their traffic. For example, Google’s new AI-powered search experience, SGE, pushes links that would appear in traditional search further down the results page — potentially reducing traffic to those links by up to 40%.
Publishers also object to vendors training their models on content without compensation agreements in place — especially in light of reports that tech giants including Google are experimenting with AI tools to summarize the news. According to one recent survey, hundreds of news organizations now use code to block OpenAI, Google and others from crawling their sites for training data.
In August, several media organizations — including Getty Images, The Associated Press, the National Press Photographers Association and The Authors Guild — published an open letter calling for more transparency and copyright protection in AI. In the letter, the signatories urged policymakers to consider regulations that require transparency in training data sets and allow media companies to negotiate with AI model operators, among other proposals.
“[Current practices] undermine the media industry’s core business models, which are based on readership and viewership (such as subscriptions), licensing and advertising,” the letter reads. “In addition to violating copyright law, the resulting impact is to meaningfully reduce media diversity and undermine the financial viability of companies to invest in media coverage, further reducing the public’s access to high-quality and trustworthy information.”