Dutch authors accuse Meta of stealing books to train chatbot; 600 complaints filed
Dutch authors are expressing outrage at tech giant Meta, accusing the company of illegally using their books to train its chatbot without consent. The Authors’ Guild of the Netherlands has already received over 600 complaints from writers, following revelations that Meta used an enormous database of pirated books to develop its language model, NOS reports.
According to court documents from the United States, Meta accessed tens of terabytes of texts from Library Genesis (LibGen), an online database known for hosting copyrighted materials without permission. The database includes approximately 7.5 million books and 81 million academic articles. Among them are thousands of titles written by Dutch authors.
Bestselling author and journalist Sacha Bronwasser told NOS, “I feel robbed.” Author and director Philip Huff called Meta’s actions “absurd,” adding, “Because they are so big, they get away with it. It’s just classic colonialism: a massive entity takes what it wants.”
An analysis by NOS Nieuwsuur shows that more than half of the 49 Dutch bestsellers from 2023 were available in LibGen at the time Meta trained its model. For some authors, such as Arthur Japin, up to 20 Dutch-language titles were listed.
Anja Sicking, chair of the Literary Authors section of the Authors’ Guild, condemned the practice as “mafia tactics.” She urged writers to report violations and confirmed that hundreds had already come forward.
“You cannot use copyrighted work without permission,” Charlotte Meindersma, a Dutch lawyer specializing in AI and copyright law, told NOS.
Popular chatbots like Meta's—now available in Dutch via Facebook and WhatsApp—are trained using massive amounts of text. These systems use AI to recognize patterns and predict language. However, companies developing such tools typically do not request permission from the original authors.
While a U.S. judge recently ruled that companies like Meta and Anthropic are protected by fair use laws, Meindersma emphasized that no such legal shield exists in the Netherlands or the European Union. She added that Meta’s use of copyrighted work does not qualify for exceptions such as scientific research because its purpose is commercial.
Meta responded to NOS, claiming its practices are legal and that its language models “promote incredible innovation, productivity, and creativity.” However, the company did not answer specific questions about consent or compensation.
Authors remain unconvinced. “Innovation is not an excuse for immoral activity,” Huff told NOS. “And you have to wonder how innovative these models really are if they can’t function without eight million books.” The Authors’ Guild is now exploring possible legal action.
The controversy coincides with the upcoming AI Act, set to take effect next week across the European Union. The legislation will require AI developers to seek permission and be transparent about the data used to train their models. A related code of conduct outlines these obligations in more detail, but Meta has already announced it will not sign the code.
The European Commission confirmed to NOS Nieuwsuur that all companies operating in the EU must comply with the AI Act starting August 2. If Meta refuses to follow the code, it may face heightened scrutiny and enforcement from Brussels.
