OpenAI is pushing back against a court order that requires the company to hand over 20 million complete user chats to The New York Times and other news organizations. The Times had requested the chats, which users allegedly used to circumvent the newspaper's paywall, as part of its copyright infringement case against OpenAI.
However, OpenAI maintains that producing the entire dataset is not necessary and could compromise user privacy. The company claims that only a small subset of these logs would be relevant to the case, while the majority of conversations have nothing to do with the lawsuit. OpenAI argues that revealing complete conversations, which include multiple prompt-output pairs, could expose more private information than individual log entries.
In its filing, OpenAI pointed out that it had previously offered 20 million user chats as part of its defense against the copyright infringement claims, but that the Times rejected the offer. The company also claimed that producing the full dataset would set a precedent for other companies to demand the production of tens of millions of conversations without first narrowing them down for relevance.
OpenAI is seeking permission from the court to identify which logs are relevant to the case and provide those instead of handing over the entire dataset. The company says it has already implemented security measures, such as client-side encryption, to protect user data.
The Times, on the other hand, claims that OpenAI's proposal would not allow it to analyze how real-world users interact with OpenAI's product and how that product delivers news content. The newspaper maintains that the court order is necessary to hold OpenAI accountable for allegedly stealing millions of copyrighted works to create competing products.
A hearing in this case is scheduled for February 26, 2026, and could determine whether the court order stands or whether OpenAI is allowed to produce a smaller subset of relevant logs.