
AI companies scraping copyrighted content has become an alarming trend. NVIDIA is the latest to come under fire for allegedly scraping “a human lifetime” worth of videos to train AI.
A report by 404 Media claims that NVIDIA asked its employees to download massive amounts of video data for use in commercial AI projects like its “digital human” initiatives, self-driving car systems, and Omniverse 3D world generator.
The report also describes NVIDIA’s alleged content theft from Netflix. Former staff members say they were ordered to scrape full-length films and television series from the streaming giant. Employees who raised concerns were reportedly assured by their managers that the practice had been green-lit at the highest levels of the organization.
404 Media reports that NVIDIA got around YouTube’s anti-scraping protections by using machine learning techniques and open-source video downloaders. The company reportedly downloaded almost 80 years’ worth of video content every day using up to 30 virtual machines on Amazon Web Services.
Beyond YouTube and Netflix videos, NVIDIA reportedly instructed employees to train on the movie trailer database MovieNet, internal libraries of video game footage, and the GitHub-hosted video datasets WebVid (which was removed following a cease-and-desist order) and InternVid-10M (a dataset with 10 million YouTube video IDs).
In a statement, NVIDIA defended its actions by citing fair use and claiming compliance with copyright law. However, YouTube has already made it clear that using its content to train AI models would violate its terms of service. The report also claims that NVIDIA used datasets designated exclusively for academic and non-commercial use in its commercial AI products, disregarding their usage restrictions.
This is just one of the many troubling instances where AI companies forgo rules and ethics as they compete with each other in a frenzied race to establish dominance within the industry.