When My AI Voice Told the Story Better Than I Could
A planned day in London turned into unexpected chaos (with a happy ending), which led me to experiment with AI voice cloning. Using ElevenLabs, I created an audio version of myself telling the story – and the result surprised me. Could this be part of the future of how we share stories?

On a rainy Saturday in mid-July, my wife and I spent the day in London visiting family over from Costa Rica. It turned out to be quite an eventful day, marked by torrential rain, major traffic disruptions, and a joyful reunion that made all the chaos worthwhile.
However, while the day itself was memorable, it also became the catalyst for something I had been meaning to try for a while: cloning my voice using AI.
Why I Decided to Clone My Voice
I’ve had a paid account with ElevenLabs – a leading text-to-speech and voice cloning platform – for a while. But apart from some initial testing, I hadn’t done much with it.
Until now.
Inspired by this particular story and its strong narrative shape, I decided to see how well ElevenLabs could create an AI-generated version of me – telling this story in my voice.
How I Did It
The process was surprisingly straightforward:
- I recorded a short WAV clip of myself reading a sample text and uploaded it to ElevenLabs to train a voice model. It asked for a minimum of 10 seconds of audio; I gave it 27.
- I then recorded myself reading the narrative, based on what I’d originally posted to friends on Facebook.
- I ran that WAV recording through Adobe’s excellent Enhance Speech feature, part of its Adobe Podcast toolset, to clean up the audio and improve its quality.
- Finally, I gave the transcript of my recording to ElevenLabs, which generated a new text-to-speech MP3 audio file of “me” reading it. (If you’d rather script these steps than use the web app, see the sketch below.)
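For the curious, here’s a minimal sketch of how the two ElevenLabs steps – cloning the voice and generating the narration – could be scripted against its REST API rather than done through the web app, which is what I actually used. The endpoint paths and fields are from ElevenLabs’ public API documentation as I understand it; the filenames, voice name, and model ID are placeholders, so treat the details as assumptions to check against the current docs.

```python
# Sketch: clone a voice and generate narration via the ElevenLabs REST API.
# Assumes an API key is set in the ELEVENLABS_API_KEY environment variable;
# filenames and the voice name below are placeholders.
import os
import requests

API_KEY = os.environ["ELEVENLABS_API_KEY"]
BASE = "https://api.elevenlabs.io/v1"
HEADERS = {"xi-api-key": API_KEY}

# Step 1: create an instant voice clone from a short audio sample
# (the web app asks for at least 10 seconds; I gave it 27).
with open("voice_sample.wav", "rb") as sample:
    resp = requests.post(
        f"{BASE}/voices/add",
        headers=HEADERS,
        data={"name": "My cloned voice"},  # placeholder name
        files={"files": ("voice_sample.wav", sample, "audio/wav")},
    )
resp.raise_for_status()
voice_id = resp.json()["voice_id"]

# Step 2: send the transcript to the text-to-speech endpoint, which
# returns the audio bytes (MP3 by default) in the cloned voice.
with open("transcript.txt", encoding="utf-8") as f:
    transcript = f.read()

resp = requests.post(
    f"{BASE}/text-to-speech/{voice_id}",
    headers=HEADERS,
    json={"text": transcript, "model_id": "eleven_multilingual_v2"},
)
resp.raise_for_status()

with open("narration.mp3", "wb") as out:
    out.write(resp.content)
```

Running something like this should leave you with an MP3 of the cloned voice reading the transcript; in practice you’d also tune the voice settings (stability, similarity and so on) to taste.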
The result is remarkable. It really does sound like me – arguably better than if I’d recorded it myself. It’s smooth, natural, and pleasant to listen to. If I’d recorded it directly, I’m not sure I could have delivered it as clearly or consistently.
Listen to the Result
Here’s the audio clip generated by ElevenLabs, based on my voice model and the text below. Have a listen – what do you think?

(Download the MP3 if the player doesn’t load.)
Transcript of the Audio:
We enjoyed a terrific day in London yesterday, despite the phenomenal rainstorm that went on for hours and dumped torrents of rain across the capital. It’s at moments like this that you really see the consequences of road drains that can’t handle huge volumes of water all at once, whether due to poor maintenance, bad planning, or simply an overwhelmed system (so yes, let’s blame Thames Water).
Our enjoyment was also in spite of the severe disruption caused by groups of people exercising their democratic right to protest and support Palestine, no matter the consequences. (They have their rights, after all, but seemingly no responsibilities.) We discovered this when taking a taxi from Waterloo Station to Covent Garden. The police had just closed all roads south of the Thames from Westminster Bridge to Blackfriars – and then closed all the bridges, too, apart from Blackfriars.
What should have been a relatively quick trip turned into more than an hour, much of it spent stuck in a massive traffic jam. Sat navs were pretty useless at this point as everywhere was jammed. Our driver, who clearly had The Knowledge, did his best to find a route, but eventually we bailed out somewhere near Holborn as the meter was running huge numbers by this stage and the roads remained jammed.
Still, none of that really mattered. As we neared our lunch venue – the Lamb & Flag pub near Covent Garden – over an hour late, the rain cleared, the sun came out, and we enjoyed a wonderful few hours catching up with family visiting from Costa Rica before they returned home to that delightful paradise in Central America.
Reflections on the Technology
There’s something fascinating – and slightly uncanny – about hearing your own voice say something you didn’t actually record.
This wasn’t just a robotic reading. It had intonation, pacing, and nuance that made it feel personal. And that’s the thing: it was personal, even if I didn’t perform it myself.
There are clear benefits here. For one, I could publish narrated versions of my blog posts without needing to record each one manually. The result might even be better than a live reading, at least in consistency. There’s also real value for accessibility: not everyone wants to read, and some prefer to listen – something I know well as a podcaster.
But it’s not without risks. Using a cloned voice, especially without disclosure, raises ethical and trust questions. In my case, I’m using it transparently, as a personal experiment – but what happens when these AI-generated voices are used for persuasion, marketing or, worse, deception?
Is This the Future of Voice?
We’re already at a point where your voice – like your image or your writing – can be replicated by AI. For communicators, this presents opportunities and dilemmas in equal measure, a topic that Shel and I have discussed in episodes of our For Immediate Release podcast.
It also makes me wonder how voice cloning might help those who’ve lost their voices or want to preserve them for posterity.
I’m not sure I’d want my AI voice to read a eulogy or deliver a heartfelt message to someone close to me. But for storytelling, blogging, podcasting, or education, I can see real potential.
And maybe this is how I’ll “read” my next post – or newsletter – to you.
Over to You
What do you think?
Would you recognise this as AI if I hadn’t told you?
Would you listen to more stories or posts read in this way?
Share your thoughts in the comments. I’ll be experimenting further.
Related Reading:
- From Blog Post to Podcast: An Experiment in AI Audio (3 April 2025)