180,000 books were used without permission or payment to make AI systems smarter, but who pays the price?
Spoiler: it is not tech companies like Meta and Bloomberg
This is a longer unedited version of my Dutch piece on this matter for Belgian magazine Knack Weekend entitled: “Free AI software, authors are paying the price”
On Monday, numerous authors, including myself, woke up to a disturbing message about our intellectual property. The American magazine The Atlantic published a series of articles showing that 180,000 books were used without permission by Meta, Bloomberg and other giants to train generative AI systems. Alex Reisner wrote in The Atlantic about how the dataset, known as 'Books3', was based on a collection of pirated e-books. The Atlantic also created a search function so that authors could look up their work. Almost all my English-speaking colleagues found one or more of their books in it (only English-language books were stolen). I also found my debut book, Pride and Pudding, in the list.
Writing a book takes time. In my case, more than a year of historical research and a year of writing, developing recipes and photography. There is also a source reference at the back of my books, because you must credit the information that you use and process, and sometimes you have to pay for the use of pieces of text that are subject to copyright. As an author, you usually receive an advance on your royalties for the time it takes to create a book. In the case of this book, the Belgian publisher thought that as a debut author I did not have any claim to payment. So I worked seven days a week for almost three years for no pay. It was an investment in myself that would give me and my readers great satisfaction and allow me to demand reasonable compensation for my next books. The book came out in 2015 and was also translated; after two years, I finally started getting royalties. Now, after book seven, I have a good mix of advances on royalties for new books, royalties on sales of previous books, and income from the licenses for translations. Royalties and licenses are, therefore, integral to an author's income. And authors don't make heaps of money.
Author Sathnam Sanghera wrote in his opinion piece on 'Books3' in The Times that the average income from writing is now just £7,000 (€8,064) a year and that the real income of authors had fallen by 60 percent over the past fifteen years. The picture is similar in Australia, where, according to ABC News, the average annual income of an author is $18,200 (€11,062). That is a monthly wage of approximately €900. I have not been able to find the figures for Belgium at short notice, but given that Belgium is a tiny market compared to Australia and Britain, I suspect that the amounts are even lower. The fact that authors' work is used without payment is therefore criminal.
Tech giants versus small authors
This story is also about inequality. Goldman Sachs estimates that annual investments in AI could reach $200 billion worldwide by 2025, while there is virtually no investment in those who create the books in your bookcase. It is also impossible for authors to litigate against these tech giants. Professor Toby Walsh, Chief Scientist at UNSW's AI Institute, is one of Australia's leading AI experts. His book was also added to 'Books3'. He told ABC News: "It's typical of the cavalier way people in Silicon Valley treat people's intellectual property."
The tech giants had a unique opportunity to get this right. They could have trained their systems to use only works in the public domain (which fall outside of copyright law). They could also have applied for licenses from the copyright owners of the works. The aim would not be to make authors rich (let's be honest, that would never happen) but to provide realistic compensation for the costs associated with the creation of a work. The cost of paying for the licenses would be just a drop in the ocean for the tech giants. They chose to use authors' works through a loophole, without permission, without attribution and without payment. This is copyright infringement. AI software is free for now, but someone is footing the bill, and it is the people who created the works that make the software smarter.
Technology is currently advancing faster than legislation can be designed around it. That is why tech giants such as Meta and Bloomberg can operate unchecked.
AI, or Artificial Intelligence, still sounds a bit like a term from a science fiction movie, and that's why it has been dismissed as something harmless for too long and too often. Yet science fiction makers such as Canadian film director James Cameron, of the 1984 film 'The Terminator', have repeatedly expressed concerns about the dangers of the rapid advance of AI. 'I warned you in 1984,' Cameron said about AI in an interview with CTV News.
Today we do not know to what extent AI systems have already been fed with material that was obtained illegally. We also don't know to what extent AI can take our jobs. On the Belgian political talk show “De Afspraak” of September 7, Chris Umé, from a well-known Belgian AI company, stated that there is absolutely no threat. He said this in the context of the strike of film and television writers in the US, who are concerned about AI. They were supported by actors, who in turn are concerned about the use of their likeness via AI, which could make them wholly or partially replaceable. That is precisely what Umé's company does, among other things, in the field of 'deepfakes'; after all, the company is among the world's best in AI facial imitation (although they do this with permission). It was a strange, short testimony meant to convince us that it is not all that bad. There was also no one there to speak for those whose jobs might be affected by AI, which is odd for a political talk show.
This is of course in the interest of AI companies: as long as the disadvantages of AI are not taken seriously by public opinion, they can continue to feed the AI monster. In contrast to what the man on “De Afspraak” dismissed as harmless, after a 148-day strike, American film and television writers did secure a deal that guarantees that AI software (for example, ChatGPT) is not allowed to write or rewrite literary material and that a writer may not be forced to use AI software when performing writing services. A clause also prohibits using authors' materials to train AI, as was done with 'Books3'.
The world needs to wake up and realize that in addition to the wonderful things that AI brings (in the medical world, AI can do a lot of good in detecting diseases and finding solutions), there is also a dark side to the story. Two years ago, our social media channels were flooded with illustrations that people had made of their faces with the AI software Midjourney. Artists all over the world reacted with disbelief and tried to make it clear that by feeding your image to Midjourney, you are feeding AI and compromising artists' jobs. AI “artists” were on the rise, and the traditional press liked to write about them. Non-artists dismissed the concerns as nonsense; it was a far-from-their-own-bed show.
When this weekend, of all times in this eventful week, a piece appeared in Knack Weekend (a respected Belgian magazine) entitled “Cooking with ChatGPT as a sous chef” (read: use AI instead of buying a cookbook), I saw that the traditional press has not yet understood that AI also threatens their livelihood. Unfortunately, for many it is only when your own livelihood is at risk that you realize what the threat is and how important it is that laws are introduced to restrict AI, and especially to require AI companies to pay for their sources of information.
For the journalist in question it was a nice experiment, but it was especially painful for me to see how someone who also earns a living from his writing shows people the way to use temporarily free AI software instead of buying a book. The piece was behind a paywall, so fortunately the journalist is still being paid. The sources of the recipes that ChatGPT used to create this article, however, will not be paid. And that's the problem.
Meanwhile, the American Authors' Guild has sent an open letter to the leading companies in AI to limit the damage to our profession by taking the following steps:
1. Obtain permission to use our copyrighted material in your generative AI programs.
2. Compensate writers fairly for past and ongoing use of our works in your generative AI programs.
3. Compensate writers fairly for the use of our works in AI production, regardless of whether the output violates current law.
Link to my shorter edited piece on Knack: “Free AI software, authors are paying the price”
https://weekend.knack.be/human-interest/regula-ysewijn-ai-software-is-gratis-maar-wij-auteurs-betalen-de-rekening/
Further reading and consulted pages
https://www.theatlantic.com/technology/archive/2023/08/books3-ai-meta-llama-pirated-books/675063/
https://www.theatlantic.com/technology/archive/2023/09/books3-ai-training-meta-copyright-infringement-lawsuit/675411/
https://actionnetwork.org/petitions/authors-guild-open-letter-to-generative-ai-leaders/?source=twitter&
https://www.wgacontract2023.org/the-campaign/summary-of-the-2023-wga-mba
On a better note, Cheltenham Literature Festival is throwing me an Afternoon Tea event with bakes from my book Dark Rye and Honey Cake and a chat with food historian Dr. Annie Gray on October 11 from 3:30 to 5 PM. Tickets are limited for this intimate event and cost £32.
You know it is going to be a hoot when Dr. Annie Gray and I get together, so if you missed our British Library event, this is your chance! Apart from savoury finger sandwiches and scones from my book Oats in the North, Wheat from the South, the chefs are making the Potsuikervlaai Nigella Lawson loved and featured when my book came out, Peperkoek (honey cake), Almond speculaas and Pain a la grecque (a crisp sugar biscuit). Attendees of this event will also receive a bonus recipe card.
I hope to see some of you there. Do bring along books you already own to get signed!
Chilling Regula. It’s a very sobering read and impossible to fully imagine what this will translate to in the future.
If I was in the U.K. on the 11th I would stay at my Mum's (I was born and brought up in Cheltenham) and come to see you. What a hoot. Sadly I’ll be in Italy.