Quickstart

API.audio is an API-first platform for adding beautiful audio to your applications. It's 99x faster than using a voice artist and recording studio.

Anywhere you need to add audio to an application, our technology lets you produce scalable and dynamic audio assets. Examples include an audio advertisement created and deployed to an ad server or a demand-side platform, an audio message produced from your email campaign, or a voiceover plus soundtrack in a different language to attach to a video.

Get started:

npm install --save apiaudio
pip install apiaudio -U
# Note: some installations need pip3 instead of pip

Script

One of our core units is the Script. You create a script in the same way you'd write a script for a movie. Afterwards you'll get a scriptId, which you can use to refer to the script from then on.

import apiaudio

script = apiaudio.Script.create(scriptText="""Hey what do you think about this""")

Scripts also consist of subunits such as sections.
The example below has an intro, a main section and an outro. You can also apply more in-depth sectionProperties, for example adjusting the end of the sound. You can look at a great example here for more details.

text = """
<<soundSegment::intro>>
<<sectionName::intro>>
Welcome to the NewsCast of the day. April 1st 2022. 
<<soundSegment::main>>
<<sectionName::main>>
The European Commission and the United States open new chapter in their energy cooperation
State aid: Commission approves €200 million Italian scheme to support the retail trade sector in the context of the coronavirus pandemic
<<soundSegment::outro>>
<<sectionName::outro>>
That's all for today 
"""

Sections matter because if you’re building an audio advertisement (let’s say for a podcast network) you’ll often have different sound designs included in the audio asset. Scripts can also contain other parts, such as effects, which are covered in another part of the documentation.
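The tagged script text shown earlier can also be assembled programmatically. The following is a minimal sketch, assuming each section uses the same name for its soundSegment and sectionName tags, as in the example. build_script_text is a hypothetical helper written for illustration, not part of the apiaudio SDK.

```python
def build_script_text(sections):
    """Build tagged script text from (name, text) pairs.

    Assumes each section's soundSegment and sectionName share one name,
    mirroring the quickstart example. Hypothetical helper, not an SDK function.
    """
    parts = []
    for name, text in sections:
        parts.append(f"<<soundSegment::{name}>>")
        parts.append(f"<<sectionName::{name}>>")
        parts.append(text.strip())
    # Leading and trailing newlines match the triple-quoted layout used in the docs.
    return "\n" + "\n".join(parts) + "\n"


text = build_script_text([
    ("intro", "Welcome to the NewsCast of the day. April 1st 2022."),
    ("main", "The European Commission and the United States open new chapter in their energy cooperation"),
    ("outro", "That's all for today"),
])
```

The resulting string can then be passed as scriptText to apiaudio.Script.create, in the same way as the single-line script earlier.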

Speech

Speech is exactly what it sounds like. It lets you create beautiful speech and specify things like the speed of the voice. If you just want to use the best text-to-speech voices on the market with an easy-to-use API, you've come to the right place!

Each script has a scriptId, which lets you version scripts and create more of them. You'll need to specify the scriptId for Speech and Mastering.

For speech, the key variable is voice. We have a rapidly increasing number of voices (over 500 at the time of writing); you can view the list of voices here.

response = apiaudio.Speech.create(scriptId=script.get("scriptId"), voice="Ryan")

Alternatively, if you want to hear an example of a female voice, here is a good one.

response = apiaudio.Speech.create(scriptId=script.get("scriptId"), voice="jenny")

Some voices aren't available on all plans

You might get an error message for some voices. If so, either upgrade your plan or try a different voice.
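One way to handle an unavailable voice is to try a list of voices in order and keep the first that works. This is a sketch under assumptions: create_speech_with_fallback is a hypothetical helper, and because this page does not say which exception the SDK raises for a voice outside your plan, the code assumes the error surfaces as a Python exception and catches Exception broadly.

```python
def create_speech_with_fallback(create_speech, script_id, voices):
    """Try each voice in order; return (voice, response) for the first success.

    create_speech is any callable accepting scriptId and voice keyword
    arguments, for example apiaudio.Speech.create.
    """
    errors = []
    for voice in voices:
        try:
            return voice, create_speech(scriptId=script_id, voice=voice)
        except Exception as exc:  # assumption: plan/voice errors are raised as exceptions
            errors.append(f"{voice}: {exc}")
    raise RuntimeError("No voice succeeded: " + "; ".join(errors))
```

For example, calling create_speech_with_fallback(apiaudio.Speech.create, script.get("scriptId"), ["Ryan", "jenny"]) would try Ryan first and fall back to jenny. Passing the create function in as an argument keeps the helper easy to test without network access.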

Production

Production means that we create a full audio asset: text to audio, not simply text to speech. This lets you improve the quality of your speech, change the output format (mp3, alexa, wav), and adjust other aspects of your audio.

To run Production, you create a Mastering resource and specify your scriptId. Optionally, you can also specify a soundTemplate. You can view a list of soundTemplates here. We’ll pick parisianmorning for this example.

A soundTemplate is a way to add background sound to your audio asset. Some soundTemplates have more advanced features, which you'll discover as you explore the API.

response = apiaudio.Mastering.create(scriptId=script.get("scriptId"), soundTemplate="parisianmorning")
file = apiaudio.Mastering.download(scriptId=script.get("scriptId"))
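Putting the three steps together, the whole flow from text to downloaded audio can be wrapped in one function. This is a minimal sketch that uses only the calls shown in this quickstart. produce_audio is a hypothetical helper, and it assumes your API key has already been configured, which this page does not cover.

```python
def produce_audio(client, script_text, voice, sound_template=None):
    """Run Script -> Speech -> Mastering -> download and return the download result.

    client is the imported apiaudio module, or any object exposing the same
    Script, Speech and Mastering resources (which makes the function testable).
    """
    script = client.Script.create(scriptText=script_text)
    script_id = script.get("scriptId")

    # Speech must be created before Mastering can combine it with sound.
    client.Speech.create(scriptId=script_id, voice=voice)

    mastering_args = {"scriptId": script_id}
    if sound_template is not None:  # soundTemplate is optional
        mastering_args["soundTemplate"] = sound_template
    client.Mastering.create(**mastering_args)

    return client.Mastering.download(scriptId=script_id)
```

With the SDK installed, a call might look like produce_audio(apiaudio, "Hey what do you think about this", "Ryan", "parisianmorning").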
