NVIDIA is being sued for copyright infringement in AI training

By: Bohdan Kaminskyi | 13.03.2024, 15:48


NVIDIA is facing a new copyright class action lawsuit over the unauthorised use of books to train its NeMo AI language model.

Here's What We Know

Authors Abdi Nazemian, Brian Keene, and Stewart O'Nan accuse NVIDIA of illegally using the Books3 dataset, which contains their books, to train its NeMo Megatron AI systems. They claim the dataset consists of 196,640 books taken from the pirate library Bibliotik.

The plaintiffs are seeking a jury trial, damages from NVIDIA, and the destruction of all copies of Books3 used to create NeMo's large language models.

"NVIDIA has admitted training its NeMo Megatron models on a copy of The Pile dataset. [...] Books3 is part of The Pile. [...] NVIDIA necessarily trained its NeMo Megatron models on one or more copies of the Infringed Works, thereby directly infringing the copyrights of the Plaintiffs", the lawsuit states.

However, NVIDIA claims that NeMo was created in strict compliance with copyright law and that the company respects the rights of all content creators.

This is not the first lawsuit against a tech giant over the use of copyrighted works to train AI systems. Similar claims have previously been brought against OpenAI, Microsoft and other companies.

Source: Engadget