Pronunciation Dictionaries

Make our AI systems speak more like a human :)

When working with TTS, models often fail to pronounce specific words accurately; brands, names and locations are commonly mispronounced. As a first attempt to fix this we have introduced our lexi flag, which works in a similar way to SSML. For example, writing <!peadar> instead of Peadar (who is one of our founders) in your script will cause the model to produce an alternative pronunciation of this name. This is particularly useful where a word has multiple pronunciations, for example the cities Reading and Nice. Here, writing <!nice> ensures that the city name is pronounced correctly, given the script:

"The city of <!nice> is a really nice place in the south of France."

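As a minimal sketch of how a flagged script like the one above could be assembled in Python: the helper name lexi_flag is hypothetical (it is not part of the apiaudio SDK), and lower-casing the word is an assumption based only on the <!peadar> and <!nice> examples.

```python
def lexi_flag(word: str) -> str:
    """Wrap a word in the lexi flag syntax, e.g. 'Nice' -> '<!nice>'.

    Hypothetical helper: lower-casing is an assumption taken from the
    examples in this page, not a documented requirement.
    """
    return f"<!{word.strip().lower()}>"


# Flag only the city name; the adjective "nice" is left untouched.
script_text = (
    f"The city of {lexi_flag('Nice')} is a really nice place "
    "in the south of France."
)
print(script_text)
```

Flagging words one at a time, rather than with a blanket search and replace, keeps ordinary uses of the same spelling out of the dictionary lookup.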
If this solution does not work for you, you can instead make use of our custom (self-serve) lexi feature.

This can be used to achieve one of two things: correcting single words or expanding acronyms. For example, you can replace all occurrences of the word Aflorithmic with “af low rhythmic”, or all occurrences of ‘BMW’ with “Bayerische Motoren Werke”. Replacement words can be supplied as plain text or as an IPA phonemisation.

Pronunciation dictionary methods are:

  • list() Lists the publicly available dictionaries and their words.

    • Parameters:

      • none

    • Example:

# returns a list of public dictionaries
dictionaries = apiaudio.Lexi.list()

  • list_custom_dicts() Lists the custom dictionaries and their respective words.

    • Parameters:

      • none

    • Example:

# returns a list of custom dictionaries
dictionaries = apiaudio.Lexi.list_custom_dicts()

  • register_custom_word() Adds a new word to a custom dictionary.

    • Parameters:

      • lang [required] (string) - Language family, e.g. en or es. Use global to register a word globally.
      • word [required] (string) - The word that will be replaced.
      • replacement [required] (string) - The replacement token. Can be either a plain string or an IPA token.
      • contentType [optional] (string) - The content type of the supplied replacement: either basic (default) or ipa for phonetic replacements.
      • specialization [optional] (string) - By default the supplied replacement applies regardless of the voice, language code or provider in use. Edge cases can be handled by supplying a specialization, which can be a valid provider name, language code (e.g. en-gb) or voice name.

    • Example:

# correct the word sapiens
r = apiaudio.Lexi.register_custom_word(word="sapiens", replacement="saypeeoons", lang="en")
print(r)
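The optional parameters can be combined with the required ones in the same call. The sketch below prepares several entries and passes each to a registration callable. The helper register_all and the ENTRIES list are hypothetical, the replacement strings (including the IPA one) are illustrative rather than verified pronunciations, and combining contentType with specialization in one call is an assumption. Only the keyword names lang, word, replacement, contentType and specialization come from the parameter list above.

```python
from typing import Any, Callable, Dict, List

# Illustrative entries only: the replacement strings are made up for this sketch.
ENTRIES: List[Dict[str, str]] = [
    # Default replacement for the "en" language family.
    {"lang": "en", "word": "sapiens", "replacement": "saypeeoons"},
    # Edge case for a single dialect (language code specialization).
    {"lang": "en", "word": "sapiens", "replacement": "sappy ens",
     "specialization": "en-gb"},
    # Phonetic replacement for one voice (assumed IPA string).
    {"lang": "en", "word": "sapiens", "replacement": "ˈseɪpiənz",
     "contentType": "ipa", "specialization": "sara"},
    # Acronym expansion registered globally.
    {"lang": "global", "word": "BMW", "replacement": "Bayerische Motoren Werke"},
]

REQUIRED = ("lang", "word", "replacement")


def register_all(entries: List[Dict[str, str]],
                 register: Callable[..., Any]) -> List[Any]:
    """Validate each entry locally, then pass it to `register` as keyword arguments."""
    results = []
    for entry in entries:
        missing = [key for key in REQUIRED if key not in entry]
        if missing:
            raise ValueError(f"entry {entry!r} is missing {missing}")
        if entry.get("contentType", "basic") not in ("basic", "ipa"):
            raise ValueError(f"unsupported contentType in {entry!r}")
        results.append(register(**entry))
    return results
```

With the SDK configured, this would be called as register_all(ENTRIES, apiaudio.Lexi.register_custom_word). Passing the callable in keeps the validation testable without network access.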

❗️

Beware of our precedence

For each language, only a single entry per word is permitted. However, each word can have multiple specializations. When a word is first registered, a default specialization is always created from the values passed in. Subsequent calls with a different specialization update only that specialization.

The exact replacement that will be used is determined by the following order of preference:

voice name > language dialect > provider name > default

For example, a replacement specified for voice name sara will be picked over a replacement specified for provider azure.
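The order of preference can be illustrated with a small local lookup. This is a sketch of the rule as stated above, not the service's actual implementation: the function resolve_replacement, the dictionary layout and the replacement strings are all hypothetical.

```python
from typing import Dict, Optional


def resolve_replacement(specializations: Dict[str, str],
                        voice: Optional[str] = None,
                        dialect: Optional[str] = None,
                        provider: Optional[str] = None) -> str:
    """Pick a replacement using: voice name > language dialect > provider name > default."""
    # Check the most specific key first and fall through to less specific ones.
    for key in (voice, dialect, provider):
        if key is not None and key in specializations:
            return specializations[key]
    return specializations["default"]


# Hypothetical specializations registered for the word "sapiens".
sapiens = {
    "default": "saypeeoons",
    "azure": "say pee ens",   # provider name
    "en-gb": "sappy ens",     # language dialect
    "sara": "sah pee ens",    # voice name
}

# The voice-level entry wins over the provider-level one.
print(resolve_replacement(sapiens, voice="sara", provider="azure"))
```

A request whose voice, dialect and provider all lack a matching specialization falls back to the default entry.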

  • list_custom_words() Lists all the words contained in a custom dictionary.

    • Parameters:

      • lang [required] - Language family, e.g. en or es. Use global to list language-agnostic words.

    • Example:

# lists all words in the dictionary along with their replacements
words = apiaudio.Lexi.list_custom_words(lang="en")
