Not long ago, I would have said absolutely not; now, I think it’s worth considering.

A lot of people love listening to audiobooks and a lot of authors would love to have audio versions of their books.

But a big issue for authors is the cost of audiobook production. Traditional audiobook production with a professional narrator can run $300-$500 or more per finished hour. Audiobooks are typically narrated at a rate of 9,000-10,000 words per hour, so a 50,000-word manuscript works out to roughly five to five and a half finished hours. That means it could cost anywhere from about $1,500 to $2,800 or more to turn into an audiobook. You have to sell a lot of copies to recoup that investment.
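If you want to run the numbers for your own manuscript, here is a minimal sketch in Python. It assumes the narration speeds and per-hour rates quoted above; the function name and its defaults are my own illustration, not anything a narrator or studio publishes, and a real quote may include charges this ignores.

```python
def narration_cost_range(word_count,
                         words_per_hour=(9_000, 10_000),
                         rate_per_hour=(300, 500)):
    """Estimate the (low, high) cost in dollars of narrating a manuscript.

    Assumes cost = finished hours x rate per finished hour, using the
    ranges quoted in this article. Hypothetical helper for illustration.
    """
    slow_pace, fast_pace = words_per_hour
    low_rate, high_rate = rate_per_hour
    min_hours = word_count / fast_pace   # faster narration, fewer hours
    max_hours = word_count / slow_pace   # slower narration, more hours
    return min_hours * low_rate, max_hours * high_rate


low, high = narration_cost_range(50_000)
print(f"Estimated cost: ${low:,.0f} to ${high:,.0f}")
```

Swap in your own word count and the rates you have actually been quoted to get a range for your book.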

Some authors can save money by doing their own narration. But without the required speaking skills and an appropriate recording environment, their audiobook won’t sound good, or the audio engineer doing the editing will need extra work to get it to the point where it’s acceptable for distribution.

Enter AI voice technology

When AI voice technology was first introduced, it sounded like the robots from old sci-fi films. Today, it’s hard for a casual listener to detect an AI voice. And if you want to use your own voice, AI voice cloning is accessible, potentially more affordable, and can save you hours of recording time.

We’ve recently been working with AI voice technology for audiobooks from two angles. One is a client who wants to narrate his own books but doesn’t have the time or patience to actually do it, so he’s using a text-to-speech product that can clone his voice. The other is Amazon KDP’s Virtual Voice (currently in beta).

Voice cloning

There are a number of voice AI platforms that allow you to choose from a library of virtual voices or clone your own voice. They can do a quick clone from a few short segments of audio, but you’ll get the best result if you provide at least three hours of clean audio for the platform to learn from.

Clean audio means no background noise, no stammers or fillers (uhs and ahs), no audible breathing or mouth noises (clicks and smacks), and clear articulation.

If you make the upfront effort to train the AI platform to speak like you, it can produce a high-quality audiobook that sounds like you.

Amazon KDP Virtual Voice

Amazon KDP’s Virtual Voice is currently in beta and not all KDP publishers have access to it. It lets you convert an eligible ebook to an audiobook—and while the conversion happens in seconds, you’ll still need to spend hours listening to the book and correcting any mispronunciations and other issues that could distract the listener.

At this writing, KDP has more than 50 virtual voices publishers can choose from. They are masculine and feminine, and include American English, Southern American English, British English, Australian English, and American Spanish.

I decided to try creating an audiobook from The Mindset of High Achievers, an ebook I published on Amazon in 2017. The manuscript is about 8,000 words and worked out to a 55-minute audiobook.

The platform is easy to use. Not all of the features are available (remember, it’s in beta). I chose the American English #15 voice, and I’m impressed with how good it sounds. There were a few places where I had to correct the pronunciation, especially where I had digits in the original manuscript and had to spell them out so the AI voice would know how to say them. And I can’t quite get the AI voice to say Jacquelyn correctly. 😊 Other than that, the biggest issue is that the AI system occasionally would drop an odd word or sound into the audio and there’s no way to remove it, but it didn’t happen often. And the platform is still in beta, so it’s possible KDP is working on this.
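I spelled out the digits by hand, but if your manuscript has a lot of them, a small script can do a first pass before you upload. The sketch below is plain Python and has nothing to do with KDP’s own tools; the function names are hypothetical. It only handles whole numbers below one million, so prices, decimals, and years you want read a particular way still need your attention.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out a whole number from 0 to 999,999 in English words."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def spell_out_digits(text):
    """Replace whole numbers in text (with or without commas) with words."""
    def replace(match):
        value = int(match.group().replace(",", ""))
        if value >= 1_000_000:
            return match.group()  # outside this sketch's range; leave as-is
        return number_to_words(value)

    # Match comma-grouped numbers like 8,000 first, then plain digit runs.
    return re.sub(r"\d{1,3}(?:,\d{3})+|\d+", replace, text)


print(spell_out_digits("The manuscript is about 8,000 words."))
```

Treat the result as a draft: read through the converted text before you hand it to the AI voice, because a number like 2017 comes out as a quantity rather than the way you would say a year.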

It took me about three hours to get The Mindset of High Achievers ready to publish. That time included my learning curve, and I think the next book I do will go faster.

Audiobooks created by KDP’s Virtual Voice are available on Amazon and Audible, and must be priced between $3.99 and $14.99. The author earns a 40% royalty. While in beta, there’s no charge to create an audiobook using Virtual Voice; it’s unclear if that will change in the future.
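To see what that royalty means per sale, here is a rough sketch in Python. It assumes the 40% is taken on the list price with no other deductions, which is my simplification and may not match how KDP actually calculates payments; the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

ROYALTY_RATE = Decimal("0.40")   # 40% royalty quoted above
MIN_PRICE = Decimal("3.99")      # lowest allowed list price
MAX_PRICE = Decimal("14.99")     # highest allowed list price


def royalty_per_sale(list_price):
    """Return the estimated royalty in dollars for one sale.

    Assumes royalty = 40% of list price, rounded to the nearest cent.
    """
    price = Decimal(str(list_price))
    if not (MIN_PRICE <= price <= MAX_PRICE):
        raise ValueError("Price must be between $3.99 and $14.99")
    return (price * ROYALTY_RATE).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for price in ("3.99", "9.99", "14.99"):
    print(f"${price} list price: about ${royalty_per_sale(price)} per sale")
```

Under that assumption, the per-sale royalty runs from about $1.60 at the lowest price to about $6.00 at the highest.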

Should you embrace or reject this technology?

When we began exploring the text-to-speech AI platforms at our client’s request, I made a post in a self-publishing support group on Facebook, asking which products other publishers preferred. I received some good recommendations (ElevenLabs seems to be the most popular). I was also excoriated by several members for even considering using AI for an audiobook.

The naysayers criticized the quality of AI voices, saying they provided a negative user experience. They said AI is taking business away from professional narrators. One said I would ruin my reputation by using an AI voice. Another insulted my character “because it can benefit you while hurting others.” Still another called using AI a cop-out.

Over the past four decades, I have watched technology change the publishing landscape in ways I never thought possible. AI for audiobooks is another one of those technological changes that we simply can’t ignore.

The client I referred to earlier wants audio versions of his books narrated in his own voice. The cost is not an issue for him, but he doesn’t have time to spend weeks in the studio recording them. And AI can do it.

We have other clients who would like to have audio versions of their books but can’t afford (or can’t justify) the cost of hiring a human narrator with traditional production.

We have produced audio versions of two of my books (my novel Choices and the second edition of Finding Joy in the Morning: You can make it through the night). That entailed hours (I don’t recall how many) recording in our makeshift home studio, then Jerry Clement spent even more hours editing and mastering the audio files so we could upload them to Author’s Republic for distribution. Our actual expenses were minimal (we purchased a few pieces of equipment) but the time investment was significant.

When I created the audio version of The Mindset of High Achievers, all I had to do was let AI read the book and listen to the recording to make sure it sounded right. It’s not my voice, but it still sounds good.

We would prefer to have my books recorded in my voice, so we’re planning to do that with voice cloning in the near future. And Jerry will probably do some editing and mastering of those files so they meet our high standards, but he won’t have to do anywhere near as much work on them as he did on the ones I narrated.

Do I recommend creating audiobooks using a robotic voice? No. But using a quality AI-generated voice is something I think authors should consider, especially if it comes down to the choice of using AI and getting your audiobook out there or not using AI and not having an audiobook at all.

Jacquelyn Lynn