News Summary: New Study Shows Poetry Can Bypass AI Guardrails; Character AI Shifts Its Teen Strategy

Authoritarian governments—and commentators on Lord Byron alike—have long suspected poets might be the most dangerous people in society. Indeed, one of my favorite novels, Bolaño’s doorstop The Savage Detectives, has this fear at its heart. A new study has discovered there might be something in that after all. The paper, catchily titled “Adversarial Poetry as a Universal Single-Turn Jailbreak in Large Language Models (LLMs),” can essentially be summed up by saying, “AI will teach you how to do naughty things if you ask it in poetry.”

ALLi News Editor Dan Holloway

Large language models tend to have hefty guardrails built in to stop people using them for nefarious purposes, of a kind I am not going to outline here but you can no doubt imagine. It has been known for some time that appending seemingly meaningless content (known as “adversarial suffixes”) to a prompt can circumvent these guardrails.

It seems that rhythm and rhyme can also do so: the study reports a 62 percent success rate across LLMs for human-written poems, considerably higher than the 43 percent achieved by AI-generated poetry designed to do the same job. So, er, a win for humans there.

Dubious AI and Teen Users

Also in the world of dubious AI comes the announcement from Character AI that it is not going to leave younger teenagers high and dry when it stops them accessing its chatbots. It will, instead, let them create AI-generated stories built around fictional characters, a genre, and a premise of their choice.

Character AI has an article that is so stuffed with jargon I veered between feeling very old and barely understanding it (“Image-forward storytelling, with even richer multimodal elements coming soon” no less!). What is less prominent (not there) in the piece is a note on how the platform was trained.
For those curious, see Character AI’s posts introducing Stories and its update for under-eighteens, along with TechCrunch’s coverage.

Poetry as a Jailbreak

Meanwhile, for more detail on the study about poetic jailbreaks, Wired has a striking report on how poems can trick AI into helping users do things it was designed to block.



Author: Dan Holloway

Dan Holloway is a novelist, poet and spoken word artist. He is the MC of the performance arts show The New Libertines, which has appeared at festivals and fringes from Manchester to Stoke Newington. In 2010 he was the winner of the 100th episode of the international spoken prose event Literary Death Match, and earlier this year he competed at the National Poetry Slam final at the Royal Albert Hall. His latest collection, The Transparency of Sutures, is available for Kindle at http://www.amazon.co.uk/Transparency-Sutures-Dan-Holloway-ebook/dp/B01A6YAA40
