Overview

API.audio gives you easy access to the latest in AI speech synthesis. You can create natural-sounding speech from text in seconds, and you can also record, upload, and manipulate human speech for use in your audio experiences.

We include the following features.

Multi-Voice speech - Stitch together speech from your own uploaded recordings and any of our hundreds of voices. Make your own voice talk to Einstein, or make your audio content more personal by using your own recorded speech.
Voice Effects - Use voice effects to alter the sound of your speaker's voice. Make them sound like a famous cartoon character, an alien, or a chipmunk. With the immersive sound feature you can take your speaker from a loud underground bar to the ambience of a Parisian café.
Voice Upload - Upload your own recorded speech files, mix them with sound design, and render directly through our API for professional-sounding audio.
Voice Cloning - Clone a voice for your brand through our dedicated voice capture app. With at least 30 minutes of data you can get a clone of your voice to use through the API.
Voice Library - API.audio's voice library includes over 600 voices from more than 8 providers, including our own in-house cloned voices. Find a voice for your use case in our library frontend at library.api.audio.
Voice Discoverability - Our intelligent filtering system makes it easy to find voices across different languages, genders, accents, and age groups. This lets you offer your users a simple way to find their favourite voice.
Visemes and Facial landmarks - Visemes are visual representations of the face and mouth as they form a word and its component sounds. Use visemes to sync your speech with a virtual avatar.
Real-time Text-to-Speech rendering - Use any of the voices in API.audio's library to create speech from text in milliseconds. Best for conversational, real-time use cases.
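
To make the Voice Discoverability idea concrete, here is a minimal local sketch of attribute-based voice filtering. It does not call API.audio: the `Voice` record, the `CATALOGUE` entries, and the `find_voices` helper are all hypothetical and only illustrate filtering by language, gender, accent, and age group.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Voice:
    """Hypothetical voice record; not the API.audio data model."""
    name: str
    language: str
    gender: str
    accent: str
    age_group: str


# Made-up catalogue entries, for illustration only.
CATALOGUE: List[Voice] = [
    Voice("voice_a", "english", "female", "british", "adult"),
    Voice("voice_b", "english", "male", "american", "senior"),
    Voice("voice_c", "french", "female", "parisian", "adult"),
]


def find_voices(voices: List[Voice], **criteria: str) -> List[Voice]:
    """Return the voices whose attributes match every given criterion."""
    allowed = {f.name for f in fields(Voice)}
    unknown = set(criteria) - allowed
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")
    return [
        v for v in voices
        if all(getattr(v, key) == value for key, value in criteria.items())
    ]


if __name__ == "__main__":
    for voice in find_voices(CATALOGUE, language="english", gender="female"):
        print(voice.name)
```

Combining criteria narrows the result, so a listener-facing picker could expose each attribute as its own dropdown and pass the selected values straight through as keyword arguments.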