Voice availability and selection
The most important choice when producing speech is most likely the selection of the speaker.
At the moment api.audio offers speech models from the following providers:
AWS Polly, Google Text-To-Speech, Microsoft Azure, IBM, Messner, Cerevoice, vocalID, Resemble and Deepzen, for a total of 600+ voices.
Further, api.audio allows you to clone your own voice, as well as the voices of your users. Cloned voices then become available in your organisation for speech creation.
You can retrieve a list of all the speakers that are available in your organisation:
import apiaudio

# Remember to always set your API key before making any calls
apiaudio.api_key = "your-key"

# Get all available voices and print them
all_voices = apiaudio.Voice().list()
print(all_voices)
Some voice models allow additional parameter tuning beyond standard SSML annotation.
Speaker selection can get unwieldy quickly. Hence api.audio offers additional information for each speaker which allows you to filter, sort, assess and choose. This is especially useful when you are building a frontend that allows your users to make such a choice.
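As a rough illustration of the filtering and sorting described above, here is a minimal sketch that works on plain Python dictionaries. The sample records and the field names (`alias`, `provider`, `language`, `gender`) are assumptions made for this example and may not match what the API actually returns; `filter_voices` and `sort_voices` are hypothetical helpers, not part of the apiaudio SDK.

```python
from typing import Any, Dict, List

# Hypothetical voice records. The field names are assumptions for
# illustration only; inspect the real response to see what is available.
sample_voices: List[Dict[str, Any]] = [
    {"alias": "voice_c", "provider": "provider_y", "language": "english", "gender": "female"},
    {"alias": "voice_a", "provider": "provider_x", "language": "English", "gender": "female"},
    {"alias": "voice_b", "provider": "provider_x", "language": "german", "gender": "male"},
    {"alias": "voice_d", "provider": "provider_z", "language": "english", "gender": "male"},
]


def filter_voices(voices: List[Dict[str, Any]], **criteria: str) -> List[Dict[str, Any]]:
    """Return the voices whose fields match every criterion (case-insensitive)."""
    def matches(voice: Dict[str, Any]) -> bool:
        return all(
            str(voice.get(field, "")).lower() == str(wanted).lower()
            for field, wanted in criteria.items()
        )
    return [voice for voice in voices if matches(voice)]


def sort_voices(voices: List[Dict[str, Any]], key: str = "alias") -> List[Dict[str, Any]]:
    """Return the voices sorted alphabetically by the given field."""
    return sorted(voices, key=lambda voice: str(voice.get(key, "")).lower())


# Narrow the list down to English female voices, then sort by alias.
english_female = sort_voices(
    filter_voices(sample_voices, language="english", gender="female")
)
for voice in english_female:
    print(voice["alias"], voice["provider"])
```

In a frontend, the same helpers could feed a dropdown or search box, with the criteria taken from whatever the user selects.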
Listen to all our voices
Resemble and Msnr Provider Voices
These voices are limited to paid plans.