In this episode of the ALLi Self-Publishing Advice Podcast, Anna Featherstone speaks with Phil Marshall, co-founder of Spoken, about the fast-changing world of audiobook production and how authors can now record audiobooks more affordably with AI assistance. They discuss using AI versions of an author’s own voice, as well as creating multi-voice audiobooks. Spoken is a newer ALLi partner member whose platform is designed specifically for authors, offering natural-sounding narration through custom-generated voices and licensed voice clones from human narrators. The conversation explores practical ways authors can bring both new titles and backlist books to life in audio without prohibitive costs.
Listen to the Podcast: Producing Affordable Audiobooks with AI Assistance
Sponsor
This podcast is proudly sponsored by Gatekeeper Press — your partner in premium independent publishing. Empowering authors with expert guidance, 100% rights, 100% royalties, and global distribution. From editing to marketing, their all-inclusive services help you publish professionally and confidently. Gatekeeper Press — Where Authors Are Family.
About the Host
Anna Featherstone is ALLi’s nonfiction adviser and an author advocate and mentor. A judge of The Australian Business Book Awards and Australian Society of Travel Writers awards, she’s also the founder of Bold Authors and presents author marketing and self-publishing workshops for organizations, including Byron Writers Festival. Anna has authored how-to books and memoirs, and her book Look-It’s Your Book!, about writing, publishing, marketing, and leveraging nonfiction, is on the Australian Society of Authors recommended reading list. When she’s not being bookish, Anna’s into bees, beings, and the big issues of our time.
About the Guest
Phil Marshall is a technologist, former surgeon, entrepreneur, and hard science fiction author. He is the founder and CEO of Spoken, an AI-assisted audiobook platform that helps authors create single-, dual-, and multi-voice audiobooks affordably. His debut novel, Taming the Perilous Skies, explores the implications of his original Theory of Persistence, a speculative physics framework linking quantum mechanics, gravity, and time.
Read the Transcript
Anna Featherstone: Welcome and thanks for tuning in from your special part of this precious, complex, or inspiring planet. I'm Anna Featherstone and I'm recording in Sydney, Australia today. I'd like to pay my respects to the traditional owners and storytellers of this land, the Gadigal people of the Eora Nation. Today we're going to be discovering a platform that offers a variety of creation and narration options for audiobooks. Joining us is debut sci-fi author, CEO, and founder of Spoken — Phil Marshall. Welcome, Phil.
Phil Marshall: It's great to be with you.
Anna Featherstone: Where in the world are you today?
Phil Marshall: Just outside of Portland, Oregon.
Anna Featherstone: Oh, beautiful. How's the weather over there?
Phil Marshall: It's crisp, it's clear, it's fall, and we're getting ready for our big Thanksgiving holiday next week — lots of turkey and pumpkin pie and all the sorts of things that you probably should do in moderation, but none of us will.
Anna Featherstone: Down here we have a very sunny day — beach weather for sure. We're heading into a hot summer. So Phil, before we talk tech, can you share a little about your background as both a debut author and as part of the Spoken team? What drew you into this intersection between storytelling and audio technology?
Phil's Background: From Surgery to Startup
Phil Marshall: I was actually trained as a surgeon, believe it or not. I've been in technology for 25 years — healthcare tech first and foremost. I sold my conversational AI company, Conversa, in 2021. And ever since around 2001 I had notions about a story I wanted to tell, centered around an idea in gravitational physics. I knew the parts of the story and the world I wanted to write about. So I devoted myself full time to that, met some fantastic people along the way, and because I'm an audio-only reader myself, the audio modality of bringing my work to life was very important.
When I studied under Nancy Kress and Walter Jon Williams at Taos Toolbox, I met people who were very influential in showing me where AI tech stood in helping authors bring their stories to life. I used the tools available at the time (this was more than two years ago) and I knew I could build a better mousetrap. So I founded Spoken. My short story ‘Killer Darlings,’ which I wrote at Taos Toolbox, was the very first story on the platform, with six different characters and different accents. Multi-voice, the notion of full cast where every character has their own voice, was central to what I wanted to build.
Anna Featherstone: So basically you wrote a short story first, then you wanted to bring a novel to life, and so you created a whole way to do that yourself with AI narration.
Phil Marshall: That's right. And so Taming the Perilous Skies is my debut sci-fi techno thriller. It hit the virtual shelves on September 12th, and in probably another month or so the audiobook — which has over 100 speaking voices, voice-designed around every speaking character — is going to be brought to market. Pretty exciting.
What Spoken Does and How It Differs
Anna Featherstone: Spoken is a fairly new ALLi partner. Most people won't know what it is. How would you describe what Spoken does and how it differs from other audiobook creation options?
Phil Marshall: You have your manuscript, and we help you create the audiobook, automating as much of that as possible. If you want a single narrator, you can choose from a library of voice actor voices (real voice actors who get paid for every use, very high quality), or you can design your own narrator based on a prompt: a custom voice that's never been heard before, created just for you. You can do that for single-narrator works, for dual narration (usually male/female, common in romance), or, as I did, you can generate a custom voice for every single speaking character.
For nonfiction, for memoirs, and even for fiction writers, authors can also record their own personal voice on Spoken and use that for any of the narration. And if you choose multi-voice or duet narration, where each character has their own speaking voice, we automatically attribute every single passage to the right speaking character. Once you've decided on a voice, it carries through the entire story for that character.
Accents and Languages
Anna Featherstone: How is it with accents? We have listeners and writers from all around the world — Australians, Kiwis, Canadians, Brits, Irish, Scottish, Italian…
Phil Marshall: We've got you covered. Accents are amazing; it's really impressive what the voices do with accents. We use two service providers: ElevenLabs and Hume. ElevenLabs has been in the market for a few years now and has gained a lot of acclaim for its great voices and very reliable pronunciations, translations, and accents. ElevenLabs supports thirty-two languages for each of the voices you choose. Hume is a little more limited, usually just the core Romance languages, but does a very nice job with accents. Most of the voices in my book are Hume voices and I've got 12 different accents, which gives you a sense of the variety you're able to achieve.
Personal Voice Clone vs. Full Narration
Anna Featherstone: For nonfiction — very close to my heart — would I narrate my entire book, or is it more about getting my voice cloned so I don't have to do the whole thing? And can you have both options?
Phil Marshall: That's exactly what I was saying — yes, both options. One is called actual voice and one is called personal voice clone. Most people choose the personal voice clone. You read a short script that goes through a variety of emotional tones, and then that voice model is on the platform for your exclusive use for whatever works you want. You can do entire works as a single narrator in your own voice. Or you can do as I did and just make yourself the villain that pops up every once in a while. Lots of variety: custom voices, voice actor voices, your own personal voice clone. And if you want to narrate something directly as a studio recording of a passage — like your own intro, which is very popular — you absolutely can.
Ensuring Consistency, Pacing, and Emotion
Anna Featherstone: How does Spoken ensure consistency of pacing, tone, and emotion so the listening experience is enjoyable?
Phil Marshall: Each one of what we call passages is done as an object of narration. We're able to pass the prior passage and the next passage for context, which helps ensure contextual intonation. We also automate how much padding there is between each passage. In Taming the Perilous Skies I have over 10,000 passages — 10,000 different spoken pieces by either a character speaking aloud or by a narrator — so timing, emotional context, and narrative cohesion are critically important.
We also handle emotional cues. Hume is particularly adept at this, and the new ElevenLabs V3 also handles it. That's the whisper, the shout, the crying, the laughing — those emotions that get woven in when you pass emotional cues become really important. And if a passage isn't delivered the way you want, you can just regenerate it. But what most people do now is use a good mic — a Blue Yeti or even AirPods — click the microphone on that passage in Spoken, speak it the way you want, and it comes out in the character's voice exactly as you spoke it.
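The context-passing idea Phil describes, where each passage is narrated with the prior and next passage supplied as context, can be illustrated with a minimal Python sketch. This is not Spoken's implementation: the class names, field names, and sample text below are hypothetical, chosen only to show how a list of passages might be paired with its neighbors before being handed to a narration service.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Passage:
    speaker: str  # a character name, or "Narrator" (hypothetical field)
    text: str


@dataclass
class NarrationRequest:
    passage: Passage
    previous_text: Optional[str]  # None for the first passage
    next_text: Optional[str]      # None for the last passage


def build_requests(passages: List[Passage]) -> List[NarrationRequest]:
    """Pair every passage with the text of its immediate neighbors."""
    requests = []
    for i, passage in enumerate(passages):
        previous_text = passages[i - 1].text if i > 0 else None
        next_text = passages[i + 1].text if i < len(passages) - 1 else None
        requests.append(NarrationRequest(passage, previous_text, next_text))
    return requests


# Made-up sample passages, for illustration only.
passages = [
    Passage("Narrator", "The hatch sealed behind her."),
    Passage("Mara", "Did you hear that?"),
    Passage("Narrator", "Nobody answered."),
]
requests = build_requests(passages)
```

In this sketch the middle request carries both neighbors, so a narration engine could in principle use them to judge that the question is spoken in a tense moment. How the real services use that context is not described in the interview.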
Anna Featherstone: That's really interesting. Even professional narrators sometimes miss that you were joking or being sarcastic — and you'd need to tell them. With this, the author just goes in and speaks it the way they meant it.
Phil Marshall: Exactly. You can put things in all caps with asterisks, add two question marks so it knows to end on a question, or you can just click the microphone and speak it the way you want. And it does that in the voice you've programmed.
Anna Featherstone: So authors can either just press and leave, or be quite hands on.
Phil Marshall: You're definitely going to want to proof your work. It covers the gamut on how much work it'll take. I went all the way to the extreme — using mostly Hume voices, all different accents — and I did this before Hume came out with their Speak It functionality, which means I was just regenerating over and over. That can be labor intensive and time intensive. But now with Speak It, voice to voice or speech to speech, it simplifies everything enormously. For single narrator with ElevenLabs, most passages are going to come out pretty acceptable. You'll still proof and alter them occasionally. And we have a lexicon feature — if you have unusual words or names, you can change them once at the top to the phonetic replacement and that's what goes into text-to-speech every time that alien name comes up again.
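The lexicon feature Phil mentions, where an unusual name is swapped for a phonetic spelling before the text reaches text-to-speech, can be sketched roughly as follows. This is an illustrative assumption about how such a substitution might work, not Spoken's code: the function name, the matching rules (case-sensitive, whole words only), and the sample names are all hypothetical.

```python
import re
from typing import Dict


def apply_lexicon(text: str, lexicon: Dict[str, str]) -> str:
    """Replace each lexicon term with its phonetic spelling, whole words only."""
    if not lexicon:
        return text
    # Longest terms first, so a longer name wins over a shorter one it contains.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(lambda match: lexicon[match.group(1)], text)


# Hypothetical alien names and phonetic replacements.
lexicon = {"Xhal'tor": "ZAL-tor", "Qiri": "KEE-ree"}
narration_text = apply_lexicon("Qiri bowed to Xhal'tor.", lexicon)
print(narration_text)
```

Defining the replacement once and applying it to every passage is what keeps the pronunciation consistent across a long book. The manuscript itself stays unchanged, and only the text sent for narration is altered.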
Spoken vs. ElevenLabs and Hume: What's the Difference?
Anna Featherstone: What is the key difference between ElevenLabs, Hume, and Spoken?
Phil Marshall: ElevenLabs and Hume serve the same role for us — they're both narration partners that we pass everything into. But we're providing the chassis around it, the studio around it, designed by writers for writers. We're exclusively for this market and the entire workflow is designed around it — from manuscript through analysis to voice selection, to passages, to publishing. Once you've got your finished work, your files can go to Voices by Authors Republic in audio or as MP3 zip files, to Spotify, or — increasingly — authors are posting their audiobooks to YouTube. It's the writer-focused workflow. That's what it's all about.
Time Investment and Pricing
Anna Featherstone: Indie authors are juggling limited time and tight budgets. What's the typical learning curve and workflow, from uploading your manuscript to the final audio download? What are we going to have to invest in terms of time and headspace?
Phil Marshall: If it's single narrator with ElevenLabs, it's on the lesser side. If it's multi-voice with Hume, that's the other extreme. But with Speak It now available for Hume, that reduces the work considerably. Generally, for a 15-hour audiobook with multi-voice or duet, authors are finding that somewhere between 30 and 40 hours of proofing, tweaking, and adjusting is what you'd want to spend. Most of the work on our network right now is multi-voice or duet because it's become so popular and is so inaccessible except through a tool like Spoken.
Anna Featherstone: What if an author wants to narrate their own book versus using their voice clone?
Phil Marshall: If you want to narrate your own book yourself, go buy a Blue Yeti, start narrating, use Audacity or Audition or any audio software to layer in those files. You don't need Spoken for that, you never have. It's really about blending that human voice with the AI, and that's where we help. The personal voice clone is key to that.
Anna Featherstone: What does the pricing model look like?
Phil Marshall: We started with a free beta, which ran for more than a year, and then went to a paywall at the narration stage. We learned that most people are new to this and don't know what to expect, so we moved to a fixed price at the end of the process: you pay when you're done and you know exactly what you have. It's based on the word count of your work. The subscriber price is $10 per 5,000 words. If you don't subscribe, it's $20 per 5,000 words. The subscription is $50 and gives you 50% off that fixed price. So for a 50,000-word work, which is 10 blocks of 5,000, that's $100 at the subscriber rate. We say: free to use, pay when perfect.
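The pricing arithmetic above can be worked through in a short Python sketch. The rates ($10 or $20 per 5,000 words, $50 subscription) come from the interview. Two things are assumptions for illustration: that a partial block of words is billed as a full block, and that a single subscription payment is enough to finish one book. Current pricing should be checked with Spoken directly.

```python
import math

BLOCK_WORDS = 5_000
STANDARD_RATE = 20     # dollars per block without a subscription
SUBSCRIBER_RATE = 10   # dollars per block with a subscription
SUBSCRIPTION_FEE = 50  # dollars; assumed here to be paid once for the project


def narration_cost(word_count: int, subscriber: bool) -> int:
    """Fixed publishing price based on word count."""
    # Assumption: a partial block is rounded up to a full block.
    blocks = math.ceil(word_count / BLOCK_WORDS)
    rate = SUBSCRIBER_RATE if subscriber else STANDARD_RATE
    return blocks * rate


def total_cost(word_count: int, subscriber: bool) -> int:
    """Publishing price plus the subscription fee, if subscribed."""
    fee = SUBSCRIPTION_FEE if subscriber else 0
    return narration_cost(word_count, subscriber) + fee


for words in (25_000, 50_000, 100_000):
    print(words, total_cost(words, subscriber=False), total_cost(words, subscriber=True))
```

Under these assumptions, a 50,000-word book costs $200 without a subscription and $150 in total with one ($100 for narration plus the $50 fee). The two options break even at 25,000 words, and subscribing is cheaper for anything longer.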
Anna Featherstone: So someone with just one book could subscribe for one month, complete the book, cancel, and still own their files?
Phil Marshall: Yes, absolutely. Once you pay, you're publishing — we stitch and master that final work for you and it's yours. No constraints, no rights that we take on your work in any case. It's yours to distribute how you wish.
Mastering and Distribution
Anna Featherstone: What does the mastering process involve, and what platforms can the output be uploaded to?
Phil Marshall: We match ACX standards — fixed bit rate, floor and ceiling leveling, background noise removal. With multi-voice or duet narration, we have to make sure everything blends and gets mastered together beautifully. We guide the author through the cover art requirements as well, because when you're distributing through something like Voices by Authors Republic, which goes to all the different services, they require certain things and we enforce those on our side. The platform is available worldwide. Most of our users are US, UK, or Canadian, but there's no geographic constraint — 32 languages through ElevenLabs helps make sure whatever you put in comes out nicely.
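To make "matching ACX standards" more concrete, here is a minimal Python sketch that checks already-measured audio statistics against the thresholds ACX publishes in its audio submission requirements: RMS level between -23 dB and -18 dB, peaks no higher than -3 dB, noise floor no higher than -60 dB, and MP3 at a constant bit rate of 192 kbps or higher. The class and function names are hypothetical, the sketch does not measure audio itself, and it says nothing about how Spoken performs its mastering. Verify the thresholds against the current ACX documentation before relying on them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioStats:
    rms_db: float          # average loudness of the file
    peak_db: float         # loudest sample
    noise_floor_db: float  # loudness of "silent" sections
    bitrate_kbps: int
    constant_bitrate: bool


def acx_problems(stats: AudioStats) -> List[str]:
    """Return a list of reasons the file would likely fail ACX checks."""
    problems = []
    if not -23.0 <= stats.rms_db <= -18.0:
        problems.append("RMS level outside -23 dB to -18 dB")
    if stats.peak_db > -3.0:
        problems.append("peak above -3 dB")
    if stats.noise_floor_db > -60.0:
        problems.append("noise floor above -60 dB")
    if stats.bitrate_kbps < 192 or not stats.constant_bitrate:
        problems.append("needs a constant bit rate of 192 kbps or higher")
    return problems


# Made-up measurements for two chapter files.
good_chapter = AudioStats(-20.0, -3.5, -65.0, 192, True)
bad_chapter = AudioStats(-25.0, -1.0, -50.0, 128, False)
print(acx_problems(good_chapter))
print(acx_problems(bad_chapter))
```

The "floor and ceiling leveling" Phil mentions corresponds to the RMS and peak limits here, and background noise removal to the noise floor limit.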
Genres and Back Catalogs
Anna Featherstone: What kinds of books are you seeing coming through? Any particular genres?
Phil Marshall: Speculative fiction (fantasy and science fiction) is still our top genre, but romance has been held back because we've only just this week launched duet narration. So that's going to skyrocket. To explain the difference: dual narration is where the female narrator takes the female character's chapters and the male narrator takes the male character's chapters, narrating those chapters entirely. Duet narration is where the female voice covers every female character and the male voice covers every male character. In our world it goes even further than that, because ours is multi-voice: every character has their own voice within those chapters. So our duet is, as I may have just coined it, Fancy Pants Duet.
Beyond that: plenty of thriller, plenty of horror, and a really good dose of nonfiction — memoirs especially, where it's so easy to use your own personal voice clone and get the work out. And a lot of authors are using us for their back catalogs. Some of our authors have 50, 100, or more than 100 titles in their back list. The economics of producing audio for all of those didn't make sense before, but now they do.
Marketing: Reader Magnets and Cinematic Trailers
Anna Featherstone: Is there a way people are using Spoken to help with their marketing?
Phil Marshall: Jesse Quack, who is a great author and coach for authors, just put out a video about creating your story magnet — your reader magnet, that first chapter or short story in the same universe. You can publish that right on Spoken, which is a streaming network, and share it with all of your followers. That's become very popular.
We've also started producing cinematic trailers. My colleague Joshua is a film director and leads our trailer development efforts. We're going to be mechanizing that to help authors with discovery, using the audio we've already produced for them.
How to Get Started
Anna Featherstone: For authors thinking this sounds interesting, what's the best way for them to start experimenting safely, affordably, and without taking over their entire brain?
Phil Marshall: Take just a chapter or a short story. It's free to use — you only pay when you're ready to publish. So you can play around and get really familiar with it. You'll be able to do your actual voice on passages, your personal voice clone, draw from a big library of voice actor voices, or create a custom voice around each character. It's fun to hear those characters come to life. I mean, for mine it was a thrill. No barrier whatsoever — just go on in and test it out.
Anna Featherstone: What should we prepare before uploading our manuscript? Any tips to make the process smoother and the end result more professional?
Phil Marshall: Not really. We try to do a lot of the work for you. If you want to test with a chapter, just take out any administrative front matter — your copyright page and so on — and upload just your story. But as far as changing your story itself, nothing at all. Having said that, I as an author have learned some interesting lessons from being in this position of doing my own audio. I've been able to take out about a third of my dialogue tags because I'm doing multi-voice and the listener knows who's speaking. Long-term, that will impact how people write for this medium versus others — it becomes an audio edit.
Environmental Considerations
Anna Featherstone: When I think about hundreds of thousands of authors and AI use, that can chew up a lot of energy and water. What does Spoken do in that space — how do you mitigate some of those demands?
Phil Marshall: The main thing we do is try to get as close as we can to one shot being right. If you're regenerating over and over, that's using those NVIDIA GPUs over and over. Getting the number of regenerations down as low as possible really reduces that. And things like the Speak It functionality — being able to just speak it the way you want and not regenerate repeatedly — transforms this into a much lower-energy endeavor.
Human Voice Professionals and What's Coming
Anna Featherstone: Are human narrators able to audition with you, or do they go through other channels?
Phil Marshall: By the time your readers see this, we will have a whole different story for the voice professional. I'll just say stay tuned — it'll be very evident on our website. Spoken will become, I think, a very powerful tool for the voice professional. And there are a lot of things the voice professional community can do for authors, and we're going to be bringing them together like never before on Spoken.
Anna Featherstone: I was doing research before our interview and on your website there were media releases with updates coming out every few weeks — you guys are progressing fast.
Phil Marshall: Yes. This week's release includes duet narration, digital signature for Voices by Authors Republic, and Speak It. People ask me what this is going to look like in five years and I say, are you kidding me? I can't tell you what it's going to look like in six months. But the team is good, the team builds a high-quality product, and I'm not going to be beat on product.
Phil's Next Novel
Anna Featherstone: Are you working on another novel yourself?
Phil Marshall: I need to make sure Taming the Perilous Skies and the audiobook get the attention they need, and I'm leading a very quickly moving startup. So the second book, which I've tentatively titled If Memory Serves, is set 50 years after the first novel. The first novel is set in 2076, after the invention of anti-gravity following a notion of unified physics that has changed the world. Fifty years after the tragedy of 2076, when the anti-gravity grid fails, there's another invention that transforms life just as massively: the ability to retain all memories crystal clear, and the kind of impact that has on society. I'm about a third of the way through drafting it, but it stalled a couple of years ago. Book two will happen though. Book two will happen.
Closing and Where to Find Spoken
Anna Featherstone: Well Phil, thank you so much. It has been fascinating speaking with you. Where can people find out more?
Phil Marshall: SpokenPress.com — that's the place to find out all about what we're doing. And drphilmarshall.com is my personal site. We're moving fast and we're here for authors exclusively, so if you see anything you don't like, you can just pick up the phone and call me. I'll pick it up.
Anna Featherstone: I really appreciate learning more about what you're doing, and I look forward to seeing all the new updates that come out. Thank you so much, Phil.
Phil Marshall: My pleasure, Anna. And just for your audience — I said Tasmania, not Tanzania, earlier. I'm American. I don't know my world geography and that's been bothering me ever since I said it.
Anna Featherstone: I think we'll leave this in because it's kind of fun. And to all our listeners out there, thank you for tuning in and being a part of this beautiful ALLi podcast family. Here's wishing you lots of time to write and create. Thank you.




