Freesound Forums

AI voice training Scripts and techniques

Started April 10th, 2023 · 4 replies · Latest reply by Appricot 1 year, 5 months ago

Sadiquecat

1,501 sounds

184 posts

1 year, 7 months ago

Hello.

I'm looking for techniques, tips and scripts to read in order to train an AI text-to-speech voice.

I can't find much info other than "go on that website, throw audio at it and voila".

I'd love to have a more technical approach!

I want to record myself, and maybe some family members, to sort of immortalise what we sound like.

So does anyone know if there's a script to read that makes all the different sounds, maybe covering the common ones a few times? Is saying each letter and making each "sound" a thing, or is that unnatural and noisy training data?
I'm sure reading books would be a great start, but people can be monotonous while reading, or exaggerate punctuation or articulation, so in the end would the result be "natural"?
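
For the "covering the common ones a few times" part, a rough way to check a candidate script is to count how often certain letters and letter pairs show up in it. The sketch below is only that: the target list is a made-up example, and spelling is a crude stand-in for actual speech sounds (a proper check would need a pronunciation dictionary, which this does not use).

```python
from collections import Counter
import string

# Hypothetical target list: single letters plus a few common English digraphs.
# Spelling is only a crude proxy for real speech sounds (phonemes).
TARGETS = list(string.ascii_lowercase) + ["th", "sh", "ch", "ng", "oo", "ee"]


def count_targets(script, targets=TARGETS):
    """Count how often each target letter or letter pair appears in the script."""
    text = script.lower()
    return Counter({t: text.count(t) for t in targets})


def undercovered(script, min_count=3, targets=TARGETS):
    """Return the targets that appear fewer than min_count times, sorted."""
    counts = count_targets(script, targets)
    return sorted(t for t in targets if counts[t] < min_count)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog."
    # Lists the targets the sample sentence never uses.
    print(undercovered(sample, min_count=1))
```

Running it on the text you plan to read, with a higher `min_count`, gives a list of spellings the script rarely or never uses, which hints at where to add sentences.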

I'd also presume something like this would need different specific words for an angry tone, a calm tone, quick, casual, etc.

Then there's probably the technical side of mic placement. Is a "close" mic preferable, or would it sound too "podcasty", so that something 30cm or 1m away would be better? I presume recording at a few distances at once would be best.

If you have resources to point to, or experience, or ideas, let me know!

Many thanks <3

CC0 Be a hero.

Appricot

0 sounds

2 posts

1 year, 6 months ago

I'm not sure if you've tried Microsoft's Clipchamp. It looks like most of the functions you described are there. At least for voiceover it works well. No scripts, special prompts, API, etc. are required. I've tried the text-to-speech and it was really good. You can use the built-in voices or train it with your own recordings. The settings aren't very rich yet, and adjusting tone or mood seems hardly possible; at least, I didn't try. Perhaps you'll find a way to do that too. Still, changing the speed and level of the voice works really well. Honestly, I didn't expect it to be so good. It's worth a try.

Sadiquecat

1,501 sounds

184 posts

1 year, 5 months ago

Sorry for the delay, I didn't see any notification of your message ^^'
Thanks, I appreciate it and will have a deeper look.
The "speaker coach" sounds interesting, I didn't think a thing like that would exist.

The other day I found the thing I was looking for: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/record-custom-voice-samples

There's a rather in-depth explanation of the process, and scripts in a few languages!

Cheers!


Appricot

0 sounds

2 posts

1 year, 5 months ago

You've found really useful instructions. The process looks a bit sophisticated, but I assume it can result in a really good customized voice. I'll have to find time to check it out. Thanks a lot for sharing.
