Architecture

What is API.audio?

API.audio is organised along the lines of a “traditional” audio production process and has five parts:

  • Script:
    This is the text that would normally be spoken and recorded by a voice. One of the strengths of creating audio synthetically is its scalability. API.audio therefore offers a variety of tools to organise such content efficiently, version it (e.g. for personalisation or to make it dynamic) and arrange it so it is ready for production.

  • Speech:
    The API.audio speech functionality allows you to generate speech from text (text-to-speech, aka TTS) using a wide range of synthetic voices. You can also clone a voice and make it available through the API, or upload voice snippets recorded by a voice actor for use elsewhere in the production process.

  • Sound:
    Audio production usually involves adding music and various sound elements to recorded speech, as well as applying audio effects. This makes the result more lively and pleasant to listen to. API.audio allows you to choose from a variety of sound templates, or to create your own, which can then be combined with a speech track.

  • Production:
    Finally, a file (e.g. an .mp3 or a .wav file) is created. To make it sound like a professional audio production, various effects usually need to be applied. This is commonly referred to as post-processing or mastering. With API.audio you can programmatically produce large numbers of professional-sounding files without needing any audio production knowledge.

  • Scale:
    To make it easy to get content (text) into API.audio and to distribute audio files into channels, we offer convenient integrations. These allow you to pick up content (e.g. from your CMS, a blog, a spreadsheet or even Twitter) and broadcast it to various destinations (e.g. your app, your IoT device or your Slack channel).
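
The first four stages above can be pictured as a simple pipeline. The sketch below is purely illustrative: it does not call API.audio, and every name in it (`Script`, `Track`, `render_speech`, `mix_sound`, `master`, and the example voice and template values) is hypothetical, chosen only to mirror the Script, Speech, Sound and Production stages.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Script:
    """Text to be spoken; `version` is a hypothetical field for variants."""
    name: str
    text: str
    version: str = "v1"


@dataclass
class Track:
    """Stand-in for an audio artefact; records which stages touched it."""
    source: str
    steps: List[str] = field(default_factory=list)


def render_speech(script: Script, voice: str) -> Track:
    # Speech stage: turn the script text into a speech track (hypothetical).
    return Track(source=f"{script.name}:{script.version}", steps=[f"speech({voice})"])


def mix_sound(track: Track, template: str) -> Track:
    # Sound stage: combine the speech track with a sound template.
    return Track(track.source, track.steps + [f"sound({template})"])


def master(track: Track, fmt: str = "mp3") -> Track:
    # Production stage: post-processing and export to a file format.
    return Track(track.source, track.steps + [f"master({fmt})"])


script = Script(name="welcome", text="Hello and welcome!")
speech = render_speech(script, voice="example_voice")
final = master(mix_sound(speech, template="example_template"))
print(final.source, final.steps)
```

Because each stage takes and returns a plain value, a stage can be skipped: for instance, calling `master(speech)` directly would model a production with no sound template.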

👍

Did you know?

You do not need to use the entire production chain described above. You can pick and choose the components you need. For example, it can make sense to use only API.audio’s speech capabilities, or only the audio enhancement engine, as each component is available via its own API.

📘

API.audio is RESTful

API.audio is organised around REST. Our API has predictable resource-oriented URLs, returns JSON-encoded responses, and uses standard HTTP response codes, authentication, and verbs.
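
As a rough illustration of what "resource-oriented URLs, JSON and standard HTTP verbs" means in practice, the sketch below builds (but does not send) two requests using Python's standard library. Everything specific here is an assumption made for the example: the host `api.example.com`, the `script` resource path, the `text` payload field and the bearer-token header are placeholders, not the documented API.audio interface.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com/v1"  # placeholder host, not the real API
API_KEY = "YOUR_API_KEY"                 # placeholder credential


def build_request(method: str, resource: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a JSON request for a resource-oriented URL without sending it."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=f"{BASE_URL}/{resource}",
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# POST creates a resource; GET reads one back by its (hypothetical) id.
create = build_request("POST", "script", {"text": "Hello world"})
fetch = build_request("GET", "script/123")
print(create.get_method(), create.full_url)
print(fetch.get_method(), fetch.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would send it. The response's `status` attribute then carries the HTTP status code, and the body can be decoded with `json.loads`.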