The Future Of Short-Form Audio — Part I: How Spoken Is Helping Authors Turn Stories Into Streamable Content

Short-form audio and AI-generated audiobooks are emerging as a major new frontier for authors. As listening habits shift and production tools become more accessible, new platforms are stepping in to meet the demand.

One of those platforms is Spoken, which helps authors create, distribute, and eventually monetize short-form audio using AI or human narration. The interview below was prompted by Dan Holloway’s recent news report on Spotify’s new short audiobook initiative, which highlights just how quickly the space is evolving.

This is Part I of a two-part series on the future of short-form audio. Next week, we’ll check in with Spotify directly about their new program. For now, I spoke with Phil Marshall and Andrew Wallner of Spoken Press to learn more about how this shift is opening up new opportunities for a wider range of authors.

Howard Lovy: I think I’m doing everything wrong. I’ve got a 75,000-word book coming out at the end of the month, and I’m narrating it myself—no AI involved. It’s not short-form, and it’s just my voice, croaking it out. Maybe I’m behind the times.

Phil Marshall: Well, I’ll tell you what, Howard—we’ve got a solution for you. As of, I think, Monday, VO 8.6 will include personal voice. You can upload audio of your own voice and use it as your own custom voice.

Howard Lovy: So… will it do it with feeling?

Phil Marshall: How much feeling is in the file you upload?

Howard Lovy: That’s cool—something like that could definitely save a lot of time for a lot of authors. Let me back up a bit. I’m doing a general story prompted by Spotify’s announcement about its new focus on short-form audiobooks. I wanted to get your perspective as an expert, but also hear more about your company. Can you tell me a little about the history of Spoken Press and how you got into this space?

Phil Marshall: Sure, happy to. Andrew and I, along with our third co-founder, our chief technology officer, sold our conversational AI company in 2021. After that, I went on to finish my hard sci-fi novel. During that process, I also wrote short-form pieces, and one particular story had six interesting characters with different accents. It was a piece that really called for a multicast approach.

Because I’m an AI entrepreneur and an audio-only reader, audio is at the top of the marquee for me. I got really deep on how AI was going to help indie authors self-publish. I wanted to create great audio of my own work using AI, but the workflow at that point, especially for multicast, was just impossible. And once I actually had the work, which was 40 minutes in length, I had nowhere to put it—certainly nowhere that could have made me any money off it as a piece of art.

Spoken was born out of that exact need: to be able to create and self-publish great audio, and then be able to stream, socialize, and monetize that audio—to create that sort of studio-plus-network marriage that benefits both author and listener.

We started in a free beta last July. We’ve now got more than 3,200 registered users, over 2,000 projects on the network, and over 350 authors, including New York Times bestselling authors. So from breakthrough to bestsellers, as we say. And we’re growing pretty rapidly.

Howard Lovy: So this really came from your own dilemma—you had something a little too short to find a home for. Do you see this as a larger trend? I mean, I do a lot of driving, and I listen to audiobooks constantly. But sometimes I don’t want to commit to eight hours. I’ll listen to a short podcast instead. Is this move toward shorter listening content something bigger?

Andrew Wallner: I think this is more than a phase. There’s this huge, massive world of words that are sort of below the self-published author waterline. I’m sure you have a pile of short stories in a shoebox under your bed. And for the vast world of creative writers out there—like the people you’d find on Reddit, Wattpad, Tumblr, or AO3—there are entire worlds of content. A lot of these folks don’t necessarily consider themselves authors, but they’ve been creating really high-quality, great narrative work for a long time, and there’s an audience for it. Now that the barrier to voice access has been lowered, it meets this broader global trend of listening becoming the new form of reading. With half of Americans listening to spoken word media every day, we’re going to see a lot more short and medium-form audio narrative, and I think that’s going to become a mainstay for content.

Howard Lovy: Obviously, Spotify saw something in this space. Does their recent announcement feel like validation that you’re headed in the right direction?

Andrew Wallner: It’s incredibly validating. We started working on Spoken about two years ago, putting these pieces together. Along the way, we’ve gotten direct feedback from authors that this is exactly the kind of thing they were looking for—now they can platform work they otherwise wouldn’t have been able to. So when an ElevenLabs or a Spotify comes around with similar marketplace and generative capabilities and distribution, it shows we’re not working in a vacuum. This is a broader shift toward supporting authors and creative writers of all kinds, whether they’re breakthrough or bestseller.

Howard Lovy: Can you briefly explain the ecosystem? You’re partnering with ElevenLabs, but they also partner with Spotify. How does that all fit together?

Phil Marshall: We help authors create great audio and also serve as the network for streaming, sharing, socializing, and ultimately monetizing that work. Right now, we’re in a free beta, but we’ll soon allow authors to charge for their work—whether that’s free, subscription-based, or direct sale.

From a technical standpoint, we’ve built an ecosystem around best-of-breed narration services, driven by deep analysis of the text to prepare it for high-quality narration, including multicast, which is a specialty of ours and one of the reasons I started the company.

We currently use five text-to-speech providers, including ElevenLabs, and we’re about to add a sixth with a really strong, emotive narration style. Our users can choose voices from ElevenLabs, OpenAI, Google, Microsoft, or Amazon. It’s our preparation plus the voice tech that delivers the final product our authors really value.

Andrew Wallner: At the end of the day, we’re agnostic about the text-to-speech provider. It’s all about the story and the storyteller—whatever works best for them.

Howard Lovy: I also produce audiobooks for clients. It sounds like that job won’t disappear, but it’s definitely going to change. Instead of recording everything myself, it might become more about helping authors choose a voice—whether it’s their own or from a menu of options.

Phil Marshall: What we’re seeing as a trend is that while we have almost 400 independent authors on the platform, we’re now getting real interest from people who represent them—people like you, who work with a number of authors.

Here’s why: just because you’re an author doesn’t mean you have what we call “audio literacy.” In the same way Instagram taught us visual literacy—like knowing what a great morning latte photo looks like—we believe there’s going to be a real upswing in audio literacy. But most authors don’t have that yet. Working with someone who can help them optimize their work on the platform will be very valuable.

Whether authors use their own voice, a custom voice, or a voice actor, we support all of that. Our voice library includes about 300 voice actors, but you can also add your own voice clone or generate a custom voice for each character. If you go that route with us, we’ll analyze every character, create a prompt, and generate a brand-new voice that’s exclusive to you as the author.

When you combine that with emotive narration, blending of different voices, and mastering, it becomes a very different kind of audio experience. Right now, we’re strongest on short form. That’s where we started. About a third of our works are full, multi-chapter novels, but two-thirds are still short form.

I do want to respond to your comment about your job evolving. What we’re starting to see is a kind of tectonic shift, where the typical audio storytelling experience is beginning to fragment.

You may have heard of GraphicAudio or Sound Booth Theater—what some call a “movie in your mind.” Those productions include bespoke music, Foley, sound effects, and multiple characters interacting. They’re incredibly premium experiences and very high-ticket items—out of reach for most people.

Then you have what we still consider the gold standard: human performance and narration. But for a vast majority of independent, self-published authors, that’s either out of reach or an afterthought, as they try to manage everything else in their entrepreneurial domain.

And then there’s what a recent connection of ours called “the paperback of audio performance”—AI narration. It’s accessible to the masses, almost like print-on-demand. We’ve built a really great experience for anyone—whether you’re a self-published author who really cares about audio literacy or a hobbyist writer on Wattpad—so you can come in easily, create something, share it with the world, and maybe make some money with it.

Howard Lovy: Interesting. And the way I see it, you’re not really taking anybody’s job, because these audiobooks would never have been made otherwise.

Phil Marshall: That’s just it. When Andrew talks about what’s below the waterline, meaning what normally would or wouldn’t be made, we’re talking about material that hasn’t been made into audio because it simply hasn’t been accessible. That includes all the short form, which wouldn’t have been worth the time, hassle, workflow, and cost for, say, 18 or 20 or 500 short stories you might have lying around.

And if you want to support voice actors, our entire library is made up of voice actors who get paid for every use. What we’re seeing, though, is that authors are more focused on making sure the delivery of their story is great—and sometimes, custom voices are what it takes.

Howard Lovy: Who’s going to accept this for distribution? Right now, Audible doesn’t accept any AI voices, right?

Phil Marshall: That’s right—they don’t accept AI voices. Findaway Voices is starting to, and we’re working with them to get that through their distribution. So, increasingly, the reins are loosening a little on that.

But for us, the truth is that because our authors can publish directly to our network—and we have a growing user base—we’re surpassing sixteen hours of listen time for every one hour on the network, which is great. We’ve got a strong ratio of listeners and return users to new users or authors, so it’s all very favorable for us.

We believe in authors having access to their work. When they publish it and make it public, they can also download it and take it with them. We believe in writers having ownership over their work. We don’t take any rights whatsoever.

Whether it’s the rights issue, the revenue split we’ll offer when we enable direct sale, or the subscription model we’re building, it will all be very transparent. Our economic model and our belief in writers and their ownership over works is just very different from the rest of the industry.

Howard Lovy: At ALLi, there is a range of opinions on AI, from “AI is bad, and if you even say the word, I’ll quit,” to people like Joanna Penn, who is fully on board. What do you say to the skeptics?

Phil Marshall: Joanna is, first of all, just an incredible mind on this stuff. She’s a powerhouse. She’s really taken us to task—asking why this, why that, why not ElevenLabs, why would I use you? So we’ve got an ongoing conversation with Joanna, which we greatly value.

Skeptics come in all forms. And when you’re a startup like we are, there are ten skeptics for every block you walk. It’s just something we navigate. But honestly, it’s becoming less and less about “anti-AI.”

We used to run into a lot of resistance. Most authors didn’t want to use AI at all. They didn’t understand how it was built, and they thought it was stealing directly from their work and plagiarizing. They also didn’t want to take business away from voice actors. But then we’d say, “You can choose from our library of 300 voice actors to narrate your work, and we’ll even recommend the ones that best match your content.” That’s when they started to warm up.

In the end, what authors really want is a great conveyance of their work. And custom voices are becoming more and more accepted. For those who are open to AI—like Joanna—there are different levels. Joanna is a bestseller. She’s got her distribution, her newsletter, her Patreon—her entire process is dialed in. She has an outlet to reach thousands of readers with every release, and those readers want good audio. She’s able to create that with AI.

But then you move beyond that, to authors who are just getting started—what we call “breakthrough” authors. At that level, they need more than just tools—they need help marrying discovery with the work itself.

Andrew Wallner: We’ve built a studio experience where we can take the narrative text that a creator brings and run a lot of analysis on it—not only to support the narration process, but also to help with content moderation, safety, summarizing the work, and finding comps. We also analyze the work to identify which micro subgenre or social trend it might fit into, helping the author place it in the network on the digital bookshelf where it’s most likely to be picked up.

Phil Marshall: It’s not just genre or category—it’s mood, narrative style, setting, character mix. We actually take eight dimensions of every story and turn them into a vector to profile it. Based on a listener’s habits and responses, we can then personalize recommendations to match those dimensions. That kind of discovery is pretty remarkable.
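Editor’s note: the profiling idea described above can be illustrated with a minimal sketch. This is not Spoken’s implementation. It assumes each story is scored from 0 to 1 on eight dimensions, that a listener profile is a response-weighted average of the stories they have heard, and that matching uses cosine similarity. The dimension list is hypothetical: the interview mentions genre, category, mood, narrative style, setting, and character mix, while `pace` and `length` are placeholders, and all function names are invented for illustration.

```python
import math

# Hypothetical dimension names. The interview mentions the first six;
# "pace" and "length" are placeholders added to reach eight.
DIMENSIONS = [
    "genre", "category", "mood", "narrative_style",
    "setting", "character_mix", "pace", "length",
]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def listener_profile(history):
    """Average the story vectors a listener has heard, weighted by response.

    `history` is a list of (story_vector, weight) pairs. The weight is an
    assumed non-negative signal such as the fraction of the story finished.
    """
    total = sum(weight for _, weight in history)
    if total == 0:
        return [0.0] * len(DIMENSIONS)
    return [
        sum(vector[i] * weight for vector, weight in history) / total
        for i in range(len(DIMENSIONS))
    ]


def recommend(profile, catalog, k=3):
    """Return the titles of the k catalog stories closest to the profile."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine(profile, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]
```

In this sketch, a listener who finishes mostly dark, fast-paced stories ends up with a profile vector close to those stories, so `recommend` ranks similar catalog entries first. A production system would likely learn the dimensions and weights rather than hand-assign them.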

We’re also taking it a step further. For authors, we’re beginning to automate the creation of visual trailers—like TikTok-style videos—that use actual audio from key moments in the story. These visuals will let people quickly browse stories to find something they connect with. We’re going to automate this for authors so they can share on their socials.

Howard Lovy: That’s really important. Discovery is everything, especially for indie authors who often feel like they’re just one grain of sand in a vast landscape.

Andrew Wallner: That’s a huge thing we’re helping to address—this idea of dimensionalizing stories. Today, finding the perfect story as a listener or reader is hard. You rely on word of mouth or the next book in a series. But as AI-generated content becomes more prolific, the sheer volume—the flood—of material entering networks like Spotify and Audible is going to explode. We’re working hard to solve the problem of helping people find the story that fits their taste like a glove, while still serving the story and the storyteller.


    Author: Howard Lovy

    Howard Lovy is an author, book editor, and journalist. He is also the Content and Communications Manager for the Alliance of Independent Authors, where he hosts and produces podcasts and keeps the blog updated. You can find more of his work at https://howardlovy.com/
