Microsoft’s new AI “VALL-E” can imitate any voice with just a 3-second sample
Following its recent move to integrate OpenAI’s ChatGPT into its products, Microsoft has released VALL-E, a new language model for text-to-speech (TTS) synthesis that uses audio codec codes as intermediate representations. (via Aitopics) The technology generates speech in a particular voice from a 3-second sample of that voice, after being trained on 60,000 hours of English […]