
My AI voice clone project for a client went totally off the rails last Tuesday

I was in my home office in Austin, finishing a custom voice model for a local author's audiobook. I used ElevenLabs to clone his voice from a 30-minute sample. The training seemed fine, but when I ran the first test script, the AI started adding these weird, aggressive coughs and throat-clearing sounds between sentences. It was nothing like the clean sample. I spent three hours checking the audio files, adjusting the training settings, and even re-uploading everything. The problem turned out to be a single 2-second clip in the training data where the author took a sip of water and cleared his throat. The model latched onto that tiny noise and amplified it. I had to manually edit that clip out and retrain the whole model from scratch, which took another 6 hours. Has anyone else had a voice clone pick up on a random audio artifact and run with it? What's your fix?
3 comments

riley43 · 25d ago
Ever think about how the AI might be learning the wrong thing from background noise, not just voice clips? I had a model pick up the hum of a fridge in a quiet studio sample and bake it into the voice tone itself, so it always sounded a bit buzzy. My fix now is to run everything through a noise removal tool before I even start training, just to strip out any room tone or random sounds. It adds a step but saves so much headache later when the clone sounds like it's recorded in a kitchen.
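For anyone who wants to check a sample before training, here is a rough Python sketch of a room-tone check. It estimates the noise floor from the quietest frames of a clip and flags the clip if that floor is too loud. The function names, the 480-sample frame (about 30 ms at an assumed 16 kHz sample rate), and the -60 dBFS cutoff are all illustrative assumptions, not settings from ElevenLabs or any noise removal tool. It only measures the problem; it does not remove the noise.

```python
import math

def frame_rms(samples, frame_len):
    """RMS level of each non-overlapping frame of float samples in [-1, 1]."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return levels

def noise_floor_dbfs(samples, frame_len=480, quiet_fraction=0.1):
    """Estimate room tone as the mean RMS of the quietest frames, in dBFS."""
    levels = sorted(frame_rms(samples, frame_len))
    if not levels:
        raise ValueError("clip is shorter than one frame")
    count = max(1, int(len(levels) * quiet_fraction))
    quiet = sum(levels[:count]) / count
    # Clamp so digital silence does not hit log10(0).
    return 20 * math.log10(max(quiet, 1e-9))

def needs_cleanup(samples, threshold_dbfs=-60.0, **kwargs):
    """True if the estimated noise floor is louder than the (assumed) cutoff."""
    return noise_floor_dbfs(samples, **kwargs) > threshold_dbfs

if __name__ == "__main__":
    # Synthetic stand-in for a fridge hum: a quiet 60 Hz tone, one second long.
    rate = 16000
    hum = [0.01 * math.sin(2 * math.pi * 60 * n / rate) for n in range(rate)]
    print("hum flagged:", needs_cleanup(hum))
    print("silence flagged:", needs_cleanup([0.0] * rate))
```

A steady hum shows up here because even the "silent" gaps between sentences stay above the cutoff. A one-off sound like a throat clear will not, so that kind of clip still needs a listen-through.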
3
the_henry · 25d ago
riley43, that fridge hum story is way too real. My first ever voice model had a weird echo because I recorded near an air vent. The AI learned my voice plus this faint whooshing sound, so every output had this ghostly wind tunnel effect. Now I'm paranoid and clean my audio like a crazy person before any training run. It's a pain, but you're right: it stops the model from picking up on random room sounds as part of the actual voice. That extra step is totally worth it to avoid a clone that sounds like it's living inside a dishwasher.
1
martinez.kim
My Austin project had a similar hiccup with a car horn, but I disagree with riley43 about over-cleaning audio. Sometimes those tiny flaws make the clone sound more human.
1