Build thousands of audio files with one line of code - 'Birdcache'

Birdcache is a caching and scalability service provided by API.audio. It acts as the caching layer for your application by storing produced audio on API.audio servers for future use. It also lets you specify personalisation parameters and scale your audio production quickly. With the caching service you can retrieve your audio files on the fly, which is useful in chatbot-like interfaces, gaming applications, and anywhere else latency matters.

  • Available for all api.audio voices across all voice providers - all external (asynchronous and synchronous) voices
  • Produce thousands of different audio files with one line of code
  • Produce dynamic audio files and personalise them for your audience - for short audio content, for marketing messages and also for advertising
  • Caching for mastering service so you can retrieve high quality audio files
  • Option to flush the cache - in case something goes wrong
  • Low latency audio file delivery for those cases where speed matters

What are the benefits?

  • Almost real time delivery from a third party API → better latency
  • Cheaper than hitting the API for each individual request

How to use?

POST request: https://v1.api.audio/birdcache

Arguments: JSON body with the fields type, voice, text, audience and soundTemplate

Authentication: API key in the headers (x-api-key)

Parameters Defined

type (mandatory, string) - 'speech' or 'mastering'

voice (mandatory, string) - the name of your chosen voice. See all available voices at https://library.api.audio/

text (mandatory, string) - the text that will be converted to speech

audience (key-value pair object) - personalisation parameters as defined in the text

soundTemplate (string) - the sound template name. See all available sound templates at https://library.api.audio/sounds

Example Request

curl --location --request POST 'https://v1.api.audio/birdcache' \
  --header 'x-api-key: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data-raw '{ "type": "mastering", "soundTemplate": "parisianmorning", "voice": "linda", "text": "Hello {{username|friend}}, {{city|tallinn}} is {{weather|rainy}} today.", "audience": {"username": ["linda"], "city": ["istanbul"], "weather": ["sunny"]} }'
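The same request can be sketched in Python using only the standard library. This is an illustration rather than official client code: the helper names build_birdcache_request and send are hypothetical, and only the endpoint, header name and body fields come from the description above.

```python
import json
import urllib.request

# Endpoint as given in the "How to use?" section.
BIRDCACHE_URL = "https://v1.api.audio/birdcache"


def build_birdcache_request(api_key, type_, voice, text, audience=None, sound_template=None):
    """Build (but do not send) a POST request for the Birdcache endpoint.

    Hypothetical helper: the function name and argument names are illustrative.
    """
    body = {"type": type_, "voice": voice, "text": text}
    # audience and soundTemplate are not marked mandatory, so add them only when given.
    if audience is not None:
        body["audience"] = audience
    if sound_template is not None:
        body["soundTemplate"] = sound_template
    return urllib.request.Request(
        BIRDCACHE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_birdcache_request(
    api_key="YOUR_API_KEY",
    type_="mastering",
    voice="linda",
    text="Hello {{username|friend}}, {{city|tallinn}} is {{weather|rainy}} today.",
    audience={"username": ["linda"], "city": ["istanbul"], "weather": ["sunny"]},
    sound_template="parisianmorning",
)
# result = send(request)  # uncomment once a real API key is in place
```

Building the request separately from sending it makes the payload easy to inspect before any call is made.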

Response

{
  "text": "the produced speech text",
  "ready": "a boolean that shows if the file is ready to be served. If true, the url parameter should be used to obtain the file",
  "hashed": "unique request id",
  "isFallback": "a boolean that shows if the text is the fallback track or not",
  "url": "the location of the file",
  "isNewRequest": "a boolean that shows if the request is new",
  "isInProgress": "a boolean that shows if the request was made previously and is now in production"
}

See the video for reference

Example Response

{
   "text": "Hello friend, Tallinn is sunny today.",
   "ready": true,
   "hashed": 4559956267709449845867081431541503943, 
   "isFallback": true, 
   "url":"https://ms-file-speech.s3.amazonaws.com/173e62af-0767-484a-a279-a024fd8b4b05.wav?AWSAccessKeyId=ASIASLKHN6Q...."
}
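A client usually needs to decide what to do with such a response. The sketch below shows one possible way to read the flags. The function name describe_response is hypothetical, and how the flags combine is an assumption based only on the field descriptions above.

```python
def describe_response(response):
    """Return a short status string for a Birdcache response dictionary.

    Hypothetical helper: the mapping from flags to states is an assumption.
    """
    ready = bool(response.get("ready"))
    is_fallback = bool(response.get("isFallback"))
    if ready and not is_fallback:
        return "personalised track ready"
    if ready and is_fallback:
        return "fallback track ready"
    if response.get("isInProgress"):
        return "in production"
    return "not ready"


example_response = {
    "text": "Hello friend, Tallinn is sunny today.",
    "ready": True,
    "isFallback": True,
}
status = describe_response(example_response)
```

When the status is one of the two "ready" states, the url field of the response is the place to fetch the audio from.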

SDK Usage

API.audio offers Birdcache in its Python and JavaScript SDKs. See the examples below:

# pip install apiaudio
import apiaudio

# Set your API key before making any calls
apiaudio.api_key = "YOUR_API_KEY"

birdcache = apiaudio.Birdcache.create(
    type="speech or mastering",
    voice="voice name",
    text="Creating apiaudio speech from cache",
    audience={"username": ["salih", "sam", "timo"]},
    soundTemplate="name of soundtemplate"
)

// npm install apiaudio
import { apiaudio } from "apiaudio";
const birdcache = await apiaudio.Birdcache.create({
  type: "speech or mastering", voice: "voice name", text: "Creating apiaudio speech from cache",
  audience: { "username": ["salih", "sam", "timo"] }, soundTemplate: "name of soundtemplate"
});

End to End Example

See the example here → https://github.com/aflorithmic/birdcache_examples

Steps:

  1. Clone the repository: git clone https://github.com/aflorithmic/birdcache_examples
  2. Go into the folder: cd birdcache_examples
  3. Create a virtual env: python3 -m venv venv
  4. Activate the virtual env: source venv/bin/activate
  5. Install the Python dependencies: pip3 install -r requirements.txt
  6. Open the birdcache.py file and put in your API key and your voice name
  7. Run the code: python3 birdcache.py
  8. You may edit the sentences in the sentences.txt file
  9. The next time you run the code for the same sentences, they will be retrieved immediately from the cache

The text parameter may contain personalisation parameters.

For example, if you want to create a speech track 'Hello username, have a nice day.' for 200 different usernames, then username should be converted into a variable. To do that, wrap it in {{ and }} and assign a fallback value after a | character, as in {{username|friend}}. The fallback value is the default value of the variable and is mandatory. Fallback text is needed because it's the convention for variable-based production engines.

So, in this case, the text parameter will be 'Hello {{username|friend}}, {{city|location}} is {{weather|sunny}} today. Have a good {{day|day}}', where the default username is friend, the default city is location, and so on.

The fallback value is mandatory because, when a new request has not yet been produced or when audience items are missing, the fallback track is served instead. Fallback values help you serve meaningful content to your users in minimum time while the files are getting ready.
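To make the {{name|fallback}} convention concrete, here is a small local sketch of how such a text resolves when some values are missing. It only illustrates the substitution rule described above and is not the service's implementation. The function name render and the regular expression are assumptions.

```python
import re

# Matches {{name|fallback}} and captures the variable name and its fallback value.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\|\s*([^}]*?)\s*\}\}")


def render(text, values=None):
    """Replace each placeholder with the supplied value, or with its fallback if none is given."""
    values = values or {}
    return PLACEHOLDER.sub(lambda match: values.get(match.group(1), match.group(2)), text)


template = "Hello {{username|friend}}, {{city|location}} is {{weather|sunny}} today. Have a good {{day|day}}"

fallback_text = render(template)
partial_text = render(template, {"username": "maria", "city": "istanbul"})
```

Here fallback_text uses every default, while partial_text uses the two supplied values and falls back to the defaults for weather and day.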

STEP 1:

Create a text with audience parameters:

{
	"text": "hello {{username|friend}}, {{city|location}} is {{weather|sunny}} today. Have a good {{day|day}}"
}

STEP 2:

Pass the audience:

{
	"audience": {"day": ["sunday", "monday"], "city": ["istanbul", "london"], "weather": ["rainy"], "username": ["maria", "alex"]}
}

For these text and audience parameters, the following combinations will be produced:

  1. hello friend, location is sunny today. Have a good day
  2. hello friend, istanbul is sunny today. Have a good day
  3. hello friend, london is sunny today. Have a good day
  4. hello alex, location is sunny today. Have a good day
  5. hello alex, istanbul is sunny today. Have a good day
  6. hello alex, london is sunny today. Have a good day

and so on...

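In this case, a total of 54 combinations will be produced. Each variable can take any of its audience values or its fallback value, which gives 3 (username) × 3 (city) × 2 (weather) × 3 (day) = 54.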
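The count can be checked locally by enumerating the combinations. The sketch below assumes that every variable may take its fallback value or any of its audience values, and that each variable appears once in the text. The function name expand is hypothetical and this is not the service's own code.

```python
import itertools
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\|([^}]*)\}\}")


def expand(text, audience):
    """List every text produced by combining fallback and audience values."""
    variables = PLACEHOLDER.findall(text)  # [(name, fallback), ...] in order of appearance
    # Each variable can be its fallback or any of its audience values.
    options = [[fallback] + list(audience.get(name, [])) for name, fallback in variables]
    results = []
    for combination in itertools.product(*options):
        values = iter(combination)
        # Placeholders are replaced left to right, matching the order of the combination.
        results.append(PLACEHOLDER.sub(lambda match: next(values), text))
    return results


text = "hello {{username|friend}}, {{city|location}} is {{weather|sunny}} today. Have a good {{day|day}}"
audience = {
    "day": ["sunday", "monday"],
    "city": ["istanbul", "london"],
    "weather": ["rainy"],
    "username": ["maria", "alex"],
}
combinations = expand(text, audience)
print(len(combinations))  # 54
```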

STEP 3:

Get it ready:

curl --location --request POST 'https://v1.api.audio/birdcache' \
  --header 'x-api-key: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data-raw '{ "type": "speech", "voice": "linda", "text": "hello {{username|friend}}, {{city|location}} is {{weather|sunny}} today. Have a good {{day|day}}", "audience": {"day": ["sunday", "monday"], "city": ["istanbul", "london"], "weather": ["rainy"], "username": ["maria", "alex"]} }'

The first time you hit the API, a fallback track is served immediately while the rest of the combinations are produced asynchronously. The next time you make the same request, you will see that the tracks are either in progress or ready. Those files are cached and ready to be served in the future.

Note! Fallback tracks are always prioritised and are ready to be served even if the rest of the files are still being created.
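One way a client might act on this behaviour is to repeat the same request until a non-fallback track is ready, using the fallback track in the meantime. The sketch below is an assumption about client-side usage and not a documented pattern. The name wait_for_track is hypothetical, and send_request stands for any function that posts the payload to the endpoint and returns the decoded JSON response.

```python
import time


def wait_for_track(send_request, payload, attempts=5, delay_seconds=2.0):
    """Repeat the same request until a non-fallback track is ready, or attempts run out.

    Returns the last response received, which may still be the fallback track.
    """
    response = send_request(payload)
    for _ in range(attempts - 1):
        if response.get("ready") and not response.get("isFallback"):
            break
        time.sleep(delay_seconds)
        response = send_request(payload)
    return response


# Stand-in for a real HTTP call: returns canned responses in order.
canned = iter([
    {"ready": True, "isFallback": True, "url": "fallback-url"},
    {"ready": False, "isInProgress": True},
    {"ready": True, "isFallback": False, "url": "final-url"},
])
final = wait_for_track(lambda payload: next(canned), {"type": "speech"}, delay_seconds=0)
```

Passing the sender in as an argument keeps the retry logic independent of any particular HTTP library or SDK.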

For full technical documentation refer to: