Hybrid Voice Feature

User Guide for Hybrid Voice Feature - mix and match real and synthetically produced speech

The Hybrid Voice feature lets you create hyper-realistic speech by combining real recorded speech files with synthetic speech produced through api.audio.


Step 1: Upload your media files to your api.audio organisation

Media files are audio files. The supported upload formats are mp3, ogg, flac and wav, with a sampling rate of 44.1 kHz.
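Before uploading, it can help to check locally that a file has one of the formats listed above. The sketch below is only an illustration: is_supported_media_file is a hypothetical helper, not part of the apiaudio SDK, and it checks the file extension only, not the sampling rate.

```python
from pathlib import Path

# Upload formats listed in this guide.
SUPPORTED_EXTENSIONS = {".mp3", ".ogg", ".flac", ".wav"}


def is_supported_media_file(file_path: str) -> bool:
    """Return True if the file extension is one of the listed upload formats.

    Hypothetical helper: it looks at the extension only and does not
    inspect the audio data or its sampling rate.
    """
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_media_file("./my_file.mp3"))  # True
print(is_supported_media_file("./my_file.aac"))  # False
```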

When uploading your files, you can assign tags to each file. This allows you to better organise and retrieve your content.

Each uploaded media file is assigned a mediaId. The mediaId is unique to each file and is later used in the mastering process to identify the media file.

See example below:

# Upload your media files
response = apiaudio.Media.upload(
  file_path="./my_file.mp3",
  tags="tag1,tag2,tag3"
)
print(response)

// JavaScript: to keep the SDK usable on every platform,
// no file-related resources will be added to the JavaScript SDK

List all files available to your organisation:

# lists all files
files = apiaudio.Media.list()

# lists files with tag="tag1"
files = apiaudio.Media.list(tags="tag1")

# lists file with specific id
files = apiaudio.Media.list(mediaId="some_mediaId")

# lists files with tag="tag1" and with a downloadurl
files = apiaudio.Media.list(tags="tag1", downloadUrl=True)

// JavaScript
// lists all files
let files = await apiaudio.Media.list()

// lists files with tag="tag1"
files = await apiaudio.Media.list({tags:"tag1"})

// lists file with specific id
files = await apiaudio.Media.list({mediaId:"some_mediaId"})

// lists files with tag="tag1" and with a downloadurl
files = await apiaudio.Media.list({tags:"tag1", downloadUrl:true})

Read more in the Media Resource documentation.


Step 2: Create a script!

api.audio offers tools to help you organise your content. Instead of creating speech directly from text with a call to the text-to-speech API, we recommend creating scripts.

Scripts let you organise content efficiently, version it (e.g. for personalisation or to make it dynamic) and arrange it so it is ready for production. Read more in the scripts documentation.

To use your speech media files together with synthetic speech, you first need to create a script.
Here is an example of a script with a media tag:

text = "Here is my recording: <<media::something>> and some more text."
script = apiaudio.Script.create(scriptText=text)

// JavaScript
let text = "Here is my recording: <<media::something>> and some more text."
let script = await apiaudio.Script.create({scriptText:text})

Use the <<media::something>> tag to place a media file in the script. Here media (the key) means "a media file" and something is the value of the media tag. The value can be anything you like. In the mastering call, you assign this value to the mediaId of a file you uploaded in the first step. This way you can produce new audio by switching only the media files, without creating the same script again.
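Since every media tag value has to be assigned a mediaId later, it can be useful to list the values a script contains. The following is a minimal sketch using a regular expression; media_tag_values is a hypothetical helper written for this guide, not an apiaudio SDK function.

```python
import re
from typing import List

# Matches <<media::value>> and captures the value part.
MEDIA_TAG = re.compile(r"<<media::(.+?)>>")


def media_tag_values(script_text: str) -> List[str]:
    """Return media tag values in order of first appearance, without duplicates."""
    values: List[str] = []
    for value in MEDIA_TAG.findall(script_text):
        if value not in values:
            values.append(value)
    return values


text = "Here is my recording: <<media::something>> and some more text."
print(media_tag_values(text))  # ['something']
```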

You may also want to add a sound template over your speech and media file.
In the example below, the sound file will play throughout the whole section, including the media file.

text = "<<soundSegment::intro>> <<sectionName::intro>> Welcome to today's class. Get ready <<media::countdown>> go!"
script = apiaudio.Script.create(scriptText=text)

// JavaScript
let text = "<<soundSegment::intro>> <<sectionName::intro>> Welcome to today's class. Get ready <<media::countdown>> go!"
let script = await apiaudio.Script.create({scriptText:text})

Here is a full script example following the sound design formatting and using a media file in the main section.

text="""
<<soundSegment::intro>> 
<<sectionName::intro>>
Hello, this is your daily update from My News.
<<soundSegment::main>> 
<<sectionName::main>> 
Only the relevant. Delivered directly to you 
<<media::something>> 
<<soundSegment::outro>>
<<sectionName::outro>>
Delivered to you by My news.
"""
script = apiaudio.Script.create(scriptText=text)

// JavaScript
let text=`
<<soundSegment::intro>> 
<<sectionName::intro>>
Hello, this is your daily update from My News.
<<soundSegment::main>> 
<<sectionName::main>> 
Only the relevant. Delivered directly to you 
<<media::something>> 
<<soundSegment::outro>>
<<sectionName::outro>>
Delivered to you by My news.
`
let script = await apiaudio.Script.create({scriptText:text})

For more information on how to use sound segments and effects in your script, see the sound design documentation.
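If you assemble scripts programmatically, the section layout used in the examples above can be generated from a list of sections. This is a sketch under one assumption: each section uses the same name for its soundSegment and sectionName tags, as the examples in this guide do. build_script_text is a hypothetical helper, not part of the apiaudio SDK.

```python
from typing import List, Tuple


def build_script_text(sections: List[Tuple[str, str]]) -> str:
    """Join (section name, section text) pairs into tagged script text.

    Assumption: soundSegment and sectionName share the same name,
    mirroring the examples in this guide.
    """
    parts = []
    for name, body in sections:
        parts.append(f"<<soundSegment::{name}>>")
        parts.append(f"<<sectionName::{name}>>")
        parts.append(body)
    return "\n".join(parts)


text = build_script_text([
    ("intro", "Hello, this is your daily update from My News."),
    ("main", "Only the relevant. Delivered directly to you <<media::something>>"),
    ("outro", "Delivered to you by My News."),
])
print(text)
```

The resulting text can be passed as scriptText when creating a script, in the same way as the hand-written examples.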


Step 3: Create speech and mastering!

The next step is to bring the synthetic speech together with your media files. To do that, create speech from your script and then master it.

First create speech by defining the scriptId and a voice of your choice.
Then master your speech file. It is in this step that you will define the media file by assigning the value something to the specific mediaId of the file you uploaded in the first step.

response = apiaudio.Speech().create(scriptId=script.get("scriptId"), voice="Aria")
response = apiaudio.Mastering().create(
    scriptId=script.get("scriptId"),
    mediaFiles=[{"something": "some_mediaId"}],
)

print(response)

file = apiaudio.Mastering().retrieve(scriptId=script.get("scriptId"))
print(file)

// JavaScript
let response = await apiaudio.Speech.create({scriptId: script["scriptId"], voice: "Aria"})
response = await apiaudio.Mastering.create({
    scriptId: script["scriptId"],
    mediaFiles: [{"something": "some_mediaId"}],
})

console.log(response)

let file = await apiaudio.Mastering.retrieve(script["scriptId"])
console.log(file)
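A mastering call needs a mediaId for every media tag value in the script. The sketch below builds the mediaFiles argument and fails early if a tag has no mediaId assigned. build_media_files is a hypothetical helper, not an SDK function, and placing all assignments in a single dict inside the list is an assumption based on the one-tag example above.

```python
import re
from typing import Dict, List


def build_media_files(script_text: str, media_ids: Dict[str, str]) -> List[Dict[str, str]]:
    """Map every <<media::value>> tag in the script to a mediaId.

    Raises ValueError if a tag in the script has no mediaId assigned.
    """
    tags = re.findall(r"<<media::(.+?)>>", script_text)
    missing = sorted(set(tags) - set(media_ids))
    if missing:
        raise ValueError(f"No mediaId assigned for media tag(s): {', '.join(missing)}")
    # Assumption: all assignments go into one dict inside the list,
    # mirroring the one-tag example in this guide.
    return [{tag: media_ids[tag] for tag in dict.fromkeys(tags)}]


text = "Here is my recording: <<media::something>> and some more text."
media_files = build_media_files(text, {"something": "some_mediaId"})
print(media_files)  # [{'something': 'some_mediaId'}]
```

The returned value has the same shape as the mediaFiles argument in the mastering example, so swapping in a different recording only means changing the mediaId in the mapping.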

End to End Example

import apiaudio
apiaudio.api_key = "<<apiKey>>"

#Upload your media
response = apiaudio.Media.upload(
  file_path="./my_file.mp3",
  tags="tag1,tag2,tag3"
)
print(response) 

#Create a script 
script = apiaudio.Script().create(scriptText=
"""
<<soundSegment::intro>> 
<<sectionName::intro>>
Hello, this is your daily update from My news.
<<soundSegment::main>> 
<<sectionName::main>> 
Only the relevant. Delivered directly to you 
<<media::something>> 
<<soundSegment::outro>>
<<sectionName::outro>>
Delivered to you by My news.
""")
print(script)

#Create Speech 
response = apiaudio.Speech().create(scriptId=script.get("scriptId"), voice="Linda")

#Master it. Remember to define your media files 
template="copacabana"
response = apiaudio.Mastering().create(
    scriptId=script.get("scriptId"),
    soundTemplate=template,
    mediaFiles=[{"something": "some_mediaId"}],
    )

print(response)

#Retrieve your file 
file = apiaudio.Mastering().retrieve(scriptId=script.get("scriptId"))

// JavaScript
import apiaudio from "apiaudio";
apiaudio.configure({ apiKey: "<<apiKey>>"});

// Uploading is not enabled in the JavaScript SDK

const script = await apiaudio.Script.create({ scriptText: `<<soundSegment::intro>>
<<sectionName::intro>>
Hello, this is your daily update from My news.
<<soundSegment::main>> 
<<sectionName::main>> 
Only the relevant. Delivered directly to you 
<<media::something>> 
<<soundSegment::outro>>
<<sectionName::outro>>
Delivered to you by My news.` });

const speech = await apiaudio.Speech.create({scriptId:script["scriptId"], voice:"Linda"})

// Master it. Remember to define your media files 
let template="copacabana"
const mastering = await apiaudio.Mastering.create({
    scriptId: script["scriptId"],
    soundTemplate: template,
    mediaFiles: [{"something": "some_mediaId"}],
})

// Retrieve your file
const file = await apiaudio.Mastering.retrieve(script["scriptId"])
