AP Business

The New York Times sues OpenAI and Microsoft for using its stories to train chatbots

FILE - A sign for The New York Times hangs above the entrance to its building, Thursday, May 6, 2021, in New York. The New York Times filed a federal lawsuit against OpenAI and Microsoft on Wednesday, Dec. 27, 2023, seeking to end the practice of using published material to train chatbots. (AP Photo/Mark Lennihan, File)

NEW YORK (AP) — The New York Times is striking back against the threat that artificial intelligence poses to the news industry, filing a federal lawsuit Wednesday against OpenAI and Microsoft seeking to end the practice of using its stories to train chatbots.

The Times says the companies are threatening its livelihood by effectively stealing billions of dollars’ worth of work by its journalists, in some cases spitting out Times material verbatim to people who seek answers from generative artificial intelligence like OpenAI’s ChatGPT. The newspaper’s lawsuit was filed in federal court in Manhattan and follows what appears to be a breakdown in talks, which began in April, between the newspaper and the two companies.

The media has already been pummeled by a migration of readers to online platforms. While many publications — most notably the Times — have successfully carved out a digital space, the rapid development of AI threatens to significantly upend the publishing industry.

Web traffic is an important component of the paper’s advertising revenue and helps drive subscriptions to its online site. But the outputs from AI chatbots divert that traffic away from the paper and other copyright holders, the Times says, making it less likely that users will visit the original source for the information.

“These bots compete with the content they are trained on,” said Ian B. Crosby, partner and lead counsel at Susman Godfrey, which is representing The Times.

An OpenAI spokesperson said in a prepared statement that the company respects the rights of content creators and is “committed” to working with them to help them benefit from the technology and new revenue models.

“Our ongoing conversations with the New York Times have been productive and moving forward constructively, so we are surprised and disappointed with this development,” the spokesperson said. “We’re hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers.”

Microsoft did not respond to requests for comment.

Artificial intelligence companies scrape information available online, including articles published by news organizations, to train generative AI chatbots. The large language models are also trained on a huge trove of other human-written materials, which helps them to build a strong command of language and grammar and to answer questions correctly.

But the technology is still under development and gets many things wrong. In its lawsuit, for example, the Times said OpenAI’s GPT-4 falsely attributed product recommendations to Wirecutter, the paper’s product reviews site, endangering its reputation.

OpenAI and other AI companies, including rival Anthropic, have rapidly attracted billions of dollars in investments since public and business interest in the technology exploded, particularly this year.

Microsoft has a partnership with OpenAI that allows it to capitalize on the company’s AI technology. The Redmond, Washington, tech giant is also OpenAI’s biggest backer and has invested at least $13 billion in the company since the two began their partnership in 2019, according to the lawsuit. As part of the agreement, Microsoft’s supercomputers help power OpenAI’s AI research and the tech giant integrates the startup’s technology into its products.

The paper’s complaint comes as the number of lawsuits filed against OpenAI for copyright infringement is growing. The company has been sued by several writers — including comedian Sarah Silverman — who say their books were ingested to train OpenAI’s AI models without their permission. In June, more than 4,000 writers signed a letter to the CEOs of OpenAI and other tech companies accusing them of exploitative practices in building chatbots.

As AI technology develops, growing fears over its use have also fueled labor strikes and lawsuits in other industries, including Hollywood. Different stakeholders are realizing the technology could disrupt their entire business model, but the question will be how to respond to it, said Sarah Kreps, director of Cornell University’s Tech Policy Institute.

Kreps said she agrees that The New York Times is facing a threat from these chatbots. But she also argued that solving the issue completely is going to be an uphill battle.

“There’s so many other language models out there that are doing the same thing,” she said.

The lawsuit filed Wednesday cited examples of OpenAI’s GPT-4 spitting out large portions of news articles from the Times, including a Pulitzer Prize-winning investigation into New York City’s taxi industry that took 18 months to complete. It also cited outputs from Bing Chat, now called Copilot, that included verbatim excerpts from Times articles.

The Times did not list specific damages that it is seeking, but said the legal action “seeks to hold them responsible for the billions of dollars in statutory and actual damages that they owe” for copying and using its work. It is also asking the court to order the tech companies to destroy AI models or data sets that incorporate its work.

The News/Media Alliance, a trade group representing more than 2,200 news organizations, applauded Wednesday’s action by the Times.

“Quality journalism and GenAI can complement each other if approached collaboratively,” said Danielle Coffey, alliance president and CEO. “But using journalism without permission or payment is unlawful, and certainly not fair use.”

In July, OpenAI and The Associated Press announced a deal for the artificial intelligence company to license AP’s archive of news stories. This month, OpenAI also signed a similar partnership with Axel Springer, a media company in Berlin that owns Politico and Business Insider. Under the deal, users of OpenAI’s ChatGPT will receive summaries of “selected global news content” from Axel Springer’s media brands. The companies said the answers to queries will include attribution and links to the original articles.

The Times has compared its action to a copyright lawsuit more than two decades ago against Napster, when record companies sued the file-sharing service for unlawful use of their material. The record companies won and Napster was soon gone, but the service has had a major impact on the industry. Industry-endorsed streaming now dominates the music business.

___

AP Technology Writer Matt O’Brien contributed to this story.