Speech Synthesis Module

Google Docs can turn long documents into audio summaries in latest Workspace update

The new feature will roll out across Google Workspace over the next two weeks. It will appear under Tools > Audio > Listen to document summary, ...

Opinion

India's deepfake fight is a tricky battle

Easier said than done: As we see, India’s two- and three-hour deadlines figure among the most stringent globally. While the ...

IEEE

Emo-DiT: Emotional Speech Synthesis With a Diffusion Model Approach to Enhance Naturalness and Emotional Expressiveness

Abstract: Current emotional text-to-speech tasks have achieved high-quality emotional speech by incorporating emotion modules into text-to-speech models. However, there has been limited in-depth ...

Slator

Voice Cloning Meets Emotional Speech Synthesis With Alibaba’s Marco-Voice Model

Alibaba researchers have unveiled Marco-Voice, a new text-to-speech (TTS) system that brings together voice cloning and emotional speech synthesis in a single framework. With Marco-Voice, Alibaba aims ...

MIT Technology Review

AI text-to-speech programs could “unlearn” how to imitate certain people

New research shows models can be directly edited to hide selected voices, even when users specifically ask for them. A technique known as “machine unlearning” could teach AI models to forget specific ...

Ars Technica

A neural brain implant provides near instantaneous speech

Stephen Hawking, a British physicist and arguably the most famous man suffering from amyotrophic lateral sclerosis (ALS), communicated with the world using a sensor installed in his glasses. That ...

IEEE

Neural TTS-Based Dynamic Data Augmentation for Improved Speech Separation

Abstract: Text-to-speech (TTS) synthetic data augmentation has been widely used in various speech processing tasks, but its effectiveness in speech separation remains understudied. In this paper, we ...

Geeky Gadgets

ElevenLabs Launches Eleven v3 (alpha): New Expressive Text to Speech Model

ElevenLabs has launched Eleven v3 (alpha), a new Text to Speech model designed to deliver highly expressive and realistic speech generation. This version introduces advanced features like ...

Geeky Gadgets

Gemini TTS Native Audio Out: The Future of Human-Like Audio Content

What if your audiobook could whisper secrets, your podcast could laugh with its audience, or your virtual assistant could interrupt with perfect timing—just like a real conversation? With the advent ...
