FAQ
What does Api.Audio do?
Api.Audio lets you build audio experiences quickly and seamlessly. You can use it to add voice-overs to videos, add engaging audio to your application, create audio ads, enhance your smart speaker skill, let users of your creative project dynamically create audio, version podcasts with our mastering engine, or experiment with various state-of-the-art text-to-speech APIs. You do not need any audio expertise to create high-quality audio.
How long does it take to learn Api.Audio?
It generally takes an engineer around ten minutes to run the most basic use case. It probably takes another hour to understand the concepts of Api.Audio and for it to feel a bit more natural. If you have questions, we are happy to help.
Who uses Api.Audio? Engineers or non-technical people?
Most people who know some Python or JavaScript pick up how to create what they want with Api.Audio pretty easily. At this point, a good chunk of Api.Audio users are not very technical, and we are always working on making it easier and more accessible to everyone while giving more advanced developers all the controls they need.
What kind of software do people build with Api.Audio?
Api.Audio is currently used to personalise children's audio stories and audio toys, to add audio personalisation features to fitness, wellbeing, and meditation apps, to enrich podcast audio, to add voice-overs to sales videos, to make conversational AI avatars speak, to make smart speaker skills more engaging, and to produce audio advertisements, to name just a few.
What SDKs do you provide?
Api.Audio currently offers a JavaScript SDK (Node.js) and a Python SDK. More coming soon!
What can you not do with Api.Audio?
The goal of Api.Audio is to enable everyone with little or no audio expertise to programmatically create high-quality audio that can compete with a professional production. This also means that it is not designed for the full control you might want for a high-end production that is supposed to win a Grammy. However, if you are struggling to realize a project, we would love to hear from you, whether to help or to hear about your feature requests.
What if I want a component that Api.Audio doesn't yet have?
If you have a use case that is not handled by Api.Audio's built-in components, you can build your own custom component to solve that use case. We welcome contributions and you can also submit a feature request.
Is Api.Audio secure? Where's my data stored?
We take security seriously at Api.Audio. If we don't answer your question in this dedicated section, let us know.
How does it compare to, e.g., Google Text-To-Speech (TTS)?
Api.Audio offers audio-as-a-service and is created by audio enthusiasts who know all about making audio sound good. The cloud technology giants provide amazing Text-To-Speech voices, which we also offer through Api.Audio. However, the focus of Api.Audio is to make it easy for developers to use such voices (among others) to build innovative audio solutions. This includes a wide voice catalogue (including voices for niche use cases), guardrails that handle text content so it sounds good when converted to speech, and audio processing that makes the difference between a pure speech track (think GPS or smart speaker) and an engaging, fully produced audio piece (think radio ad or podcast). It also includes connectors that make it easy to integrate audio creation into an existing product or project and to make audio creation scalable.
We have also already thought through and solved some very complicated problems that developers tend to encounter when building audio, and we offer tools for them that developers love us for. Hence Api.Audio often makes it possible to build in days what takes a development team multiple months when plugging directly into a Text-To-Speech provider.
How many languages do you support?
We currently support three human languages: English, Spanish, and German. More are coming soon.
How easy is it to implement?
You can get up and running with our API in less than 30 minutes. We are continuously focusing on making that experience easier.
Is the audio output in real-time? How fast and how scalable is it?
Depending on what you want to do, it can take from a few seconds to several minutes to produce a piece of audio. Combined with the strategy of pre-producing certain parts of the creation process, this is normally quick enough for the vast majority of the use cases we encounter. For cases where this is not fast enough, such as conversational use cases (e.g., AI chatbots) or cases where users dynamically create audio and need really quick feedback, we offer a synchronous Text-To-Speech API with response times under one second, as well as some other tools. In terms of scaling, even your most ambitious use case should not be an issue.
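To illustrate the idea of pre-producing parts of the creation process, here is a minimal Python sketch. It does not use the Api.Audio SDK: `SegmentCache`, `fake_synthesize`, and `build_greeting` are hypothetical names invented for this example, and `fake_synthesize` merely stands in for whatever slow speech or production call your application makes. The point is only that static segments are rendered once and reused, so just the personalised part is produced on demand.

```python
import hashlib
from typing import Callable, Dict, List


class SegmentCache:
    """Caches rendered audio segments, keyed by a hash of their text."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self._synthesize = synthesize
        self._store: Dict[str, bytes] = {}
        self.misses = 0  # how many times the slow synthesizer was called

    def render(self, text: str) -> bytes:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._synthesize(text)
        return self._store[key]


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a slow speech or production call."""
    return text.encode("utf-8")


def build_greeting(cache: SegmentCache, name: str) -> List[bytes]:
    # The static parts are identical for every user, so they are served
    # from the cache after the first render; only the name is new.
    return [
        cache.render("Welcome back,"),
        cache.render(name),
        cache.render("Here is your workout for today."),
    ]


cache = SegmentCache(fake_synthesize)
build_greeting(cache, "Ada")
build_greeting(cache, "Grace")
print(cache.misses)  # 4: two static parts once, plus one per distinct name
```

In a real project the cached values would be audio files or URLs rather than raw bytes, and the cache would typically live in persistent storage so that pre-produced segments survive restarts.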
How does "clone your own voice" work? Should the customer record their voice and send it to you?
We have developed a process that allows your users to record their voices in a way that we can use to build a voice model. The resulting model is then available through Api.Audio so you can use it in your audio creation process.
Who is behind Api.Audio?
Api.Audio is an audio creation environment developed by Aflorithmic Labs. You can learn all about the company here: https://www.aflorithmic.ai/company.
I have another question
Great! We'd love to answer it. Send us an email (to [email protected]), or chat with us in the bottom right corner.