OpenAI claims New York Times not ‘telling the full story’ in suit

The suit claims OpenAI and its largest investor Microsoft Corp relied on copyrighted articles to train the start-up’s popular ChatGPT chatbot and other AI features. PHOTO: REUTERS

CALIFORNIA – OpenAI publicly addressed a lawsuit from the New York Times in a strongly worded blog post on Jan 8, saying that the newspaper’s complaint was “not telling the full story” about its use of Times data.

The suit, filed last December, claims that OpenAI and its largest investor, Microsoft Corp, relied on copyrighted articles to train the start-up’s popular ChatGPT chatbot and other artificial intelligence (AI) features. The complaint pointed to examples of the chatbot reproducing chunks of text pulled almost verbatim from New York Times stories.

OpenAI said the sort of “regurgitation” the paper referred to in its recent lawsuit is a “rare bug” that the company is “working to drive to zero”. OpenAI also said the Times may have “intentionally manipulated prompts” and “cherry-picked their examples from many attempts”. 

In response to the blog post, a lawyer for the New York Times said the start-up had used the Times’ journalism to build its products.

“The blog concedes that OpenAI used the Times’ work, along with the work of many others, to build ChatGPT,” wrote Mr Ian Crosby, a Susman Godfrey partner and lead counsel for the Times, in a statement.

Citing the complaint, he said: “‘Defendants seek to free-ride on the Times’ massive investment in its journalism by using it to build substitutive products without permission or payment.’ That’s not fair use by any measure.” 

The generative AI technology behind products like OpenAI’s chatbot is powered by large language models – massive AI systems that suck up enormous volumes of digital text from news articles, social media posts or other Internet sources.

The programmes analyse that written material to become adept at generating new text, like summaries of current events, in response to a few words of prompting from a user. 

Though companies and academic researchers have long used online data as a common practice during Silicon Valley’s AI boom, such systems have recently come under fire from artists and other content creators over compensation for the use of their work to create the technology. The AI products have already spurred numerous other lawsuits.

In its post, OpenAI said that the systems sometimes memorise chunks of text, an issue it called “a rare failure of the learning process that we are continually making progress on”. BLOOMBERG
