In this week's Self-Publishing News, ALLi News Editor Dan Holloway takes a look at Amazon's response to Jane Friedman's discovery of fake books claiming her as author.
Do have a listen to the new self-publishing news podcast. Howard and I have been talking about the legal cases being brought against OpenAI, as well as the FTC investigation. We've also been considering whether Meta's new Threads social media platform will be a viable alternative to Twitter.
Amazon Says It Will Take Action on AI Book Catfishing
There is a satisfying symmetry to the big stories in this week’s news, which are about AI, copyright, and AI and copyright. And as an aside for the doubters, try writing that sentence without an Oxford comma!
First the AI. Jane Friedman featured in last week’s column with her updated version of the pathways to publishing. This week she is right at the heart of the news. The story starts with Friedman discovering that several books had appeared on Amazon claiming her as author. And they had the kind of titles that would be plausible for her to write. And they were attractive to readers: advice on self-publishing, from one of the best advisors on the subject there is. She knew, of course, without looking any deeper, that she hadn’t written the books. It was equally clear that the content was junk. Probably, she thought, generated by AI like ChatGPT.
Books that aren’t what they seem are hardly new on Amazon. In the time I’ve been reporting, I’ve covered all kinds of Scamazons as one might call them. There has been brushing (setting up fake accounts to leave fake reviews). Then there was the content stuffing scandal of the early Kindle Unlimited days. People would “write” books thousands of pages long and use dodgy content tables to send people straight to the back and claim the page reads. And then there was the catfishing episode in which people created fake pen name accounts and produced multiple copies of utterly fake books.
Catfishing Mark 2
This latest, er, chapter is a variant on the last of these. It’s a catfishing scam that adds a second level of fraud because it passes the book off not just as being by some random non-existent writer but by a real and respected writer, Jane Friedman.
What’s interesting is the response from Amazon. Amazon takes such things very seriously. That’s part of its customer-first philosophy. And it has led them to deal with other issues as well and as quickly as they could. And genuine writers have often been on the wrong side of overzealous moderators removing titles pending proof of authorship. But, at first at least, they left these offending titles up. If there is a worst time to fail to act when an author points out a scam, it is probably when that author is Jane Friedman. Sure enough, with the Authors Guild about to wade in, Amazon relented. But the high-profile nature of the incident has revealed just how widespread the practice is.
Internet Archive and Association of American Publishers File Proposed Settlement in Copyright Case
Next comes the copyright story. And it will be no surprise that it features Internet Archive and the judgment against Open Library. In March this year, courts ruled in favour of publishers who claimed that Open Library had infringed copyright repeatedly. The latest development, as reported by Publishing Perspectives, is the submission of a jointly prepared proposal by the Association of American Publishers and the Internet Archive. If the courts accept the proposal, it would bring the case to a close.
The proposal includes a statement that both Open Library’s controlled digital lending policy and its activity during Covid as the National Emergency Library constitute copyright infringement. It contains a commitment not to reproduce books or create other formats of books in its possession. And there is also an undisclosed compensation arrangement for all members of the AAP affected.
Prosecraft Website Removed After Accusations of Non-consensual AI Training
And then we have a story that brings AI and copyright together. It’s not exactly breaking news that creators are worried about their work being used to train AI. But first, it rounds out the segment nicely. And second, the story of Prosecraft raises some interesting and important questions that we need to address.
Prosecraft is a site that launched in 2017. It offered authors the chance to compare their own writing with that of their icons. It did this by comparing the words you enter with a database of millions of words from other authors’ text. The result was a statistical analysis of similarities and differences. The site, especially its paid adjunct Shaxpir, was always controversial. Largely at first because people cast doubt on the value of statistics alone when applied to the craft of writing.
But last week things blew up when Holden Sheppard realised his book, The Brink, was part of Prosecraft’s comparative database. Within days, Prosecraft’s founder, Benji Smith, had taken the site down amid a storm over the use of creative work without consent to train AI.
I thoroughly recommend this piece in The Conversation which outlines some of the legal ramifications. But for me, this is a really interesting story with implications we need to address. As the Conversation piece notes, the central issue here is the existence and use of “shadow libraries.” These are collections of books that are accessed by computers for various purposes. While the establishing of such collections may well be illegal, their use may be a different matter.
What is and what isn't AI?
But I want to raise two points. Not as a commentary, but because they should be borne in mind. First, writers as a community can come together, rally round, and fight for good. But it's also possible for stories to gather steam and for large groups to create huge pressure. This site has been going for six years. The outcry happened when someone said “AI.” Which brings me to the second point. Words, as we know as writers, have power. And some words really press buttons. AI is one such. As several commentators have now pointed out, it's far from clear that what Prosecraft did had any of the hallmarks of large language model machine learning, or generative AI. Just like “blockchain” came to be used lazily as shorthand for “database” so AI might end up being used for “big data.” When people do bad things with copyright, we need to blow the whistle. Misusing a hot catchword to amplify that whistle does no one any favours. Which isn't a comment on this case. But a note to scrutinise what we read, and to arm ourselves with enough knowledge to get these things right.