Audio Instructions

This tutorial shows how to handle audio instructions.

Get started

Audio instructions may come from alert events, turn-by-turn navigation prompts, and user-requested prompts. To enable audio prompts for alert events and turn-by-turn navigation, all three components (alert, navigation, and audio guidance) must be enabled when the navigation service is created. Refer to the alert detection tutorial and the turn-by-turn navigation tutorial for how to enable the alert and navigation components. The following sample code shows how to enable the audio guidance component.

// enable audio guidance component
tn::drive::NavigationServiceOptions options;
options.enable_audio = true;
options.enable_alert = true;
// enable other components if needed ...

// This enables audio prompts for alert events in free mode
const auto navigationService = tn::drive::api::NavigationServiceFactory::createNavigationService(
    options, system, settings, mapContent, directionService);

If audio prompts are needed for alert events and turn-by-turn navigation in navigation mode, create a navigation session with both the alert and audio components enabled.

// enable both alert and audio components in navigation session
auto* navigationSession = navigationService->startNavigation(route);

There are two styles of audio prompt content: text, which may carry optional information about current road conditions, and a plain tone with no such information.

Before playing a text-style audio instruction, assemble the sentence by replacing each placeholder in the text with either orthographic or phonemic content.

void playAudioInstruction(const tn::drive::models::v1::AudioInstruction& instruction)
{
    if (instruction.style == tn::drive::models::v1::AudioInstruction::PromptStyle::Text)
    {
        const auto& tokens = instruction.tokens();
        auto sentence = instruction.sentence;
        uint32_t index = 1;
        for (const auto& token : tokens)
        {
            if (token.phoneme.empty())
            {
                // use orthography if there's no phoneme
                sentence = replaceTokenWithOrthography(sentence, index, token.orthography, token.orthography_code);
            }
            else
            {
                // use phoneme for better speech result
                sentence = replaceTokenWithPhoneme(sentence, index, token.phoneme, token.phoneme_code);
            }
            index++;
        }
        ttsEngine.playText(sentence);
    }
    else if (instruction.style == tn::drive::models::v1::AudioInstruction::PromptStyle::Tone)
    {
        ttsEngine.playTone();
    }
}

// The following sample code shows how to replace a token with orthography.
// Replacing it with a phoneme works similarly, except that escape sequences and language tags are usually needed.
std::string replaceTokenWithOrthography(
    const std::string& sentence, uint32_t tokenIndex, const std::string& orthography, const std::string& orthographyCode)
{
    std::ostringstream oss;
    oss << "%" << tokenIndex << "%";
    const auto token = oss.str();

    const auto pos = sentence.find(token);
    if (pos != std::string::npos)
    {
        const auto firstHalf = sentence.substr(0, pos);
        const auto secondHalf = sentence.substr(pos + token.size(), sentence.size() - pos - token.size());
        // NOTICE: how to assemble the sentence so that the TTS engine recognizes orthography content with a language code
        // depends on the specification of the engine you use. Here we simply ignore the language code for example purposes.
        // NOTICE: the sentence is given in the locale set by tn::foundation::System::updateSystemLocale() or
        // tn::drive::api::NavigationService::overrideLocale(), which may differ from the language of the orthography content.
        // Refer to the documentation of the TTS engine you use to see whether it supports playing two different languages in one sentence.
        return firstHalf + orthography + secondHalf;
    }
    else
    {
        return sentence;
    }
}
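
The phoneme case can be sketched in the same way. The block below is only an illustration, not the SDK's implementation: it assumes the TTS engine accepts SSML and that the phoneme string is written in IPA, neither of which is stated in this tutorial. The helper escapeXmlAttribute is hypothetical, and the phoneme code is ignored just as the language code is ignored in the orthography sample above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: escape characters that are special inside an XML attribute value.
std::string escapeXmlAttribute(const std::string& value)
{
    std::string escaped;
    escaped.reserve(value.size());
    for (const char c : value)
    {
        switch (c)
        {
            case '&': escaped += "&amp;"; break;
            case '<': escaped += "&lt;"; break;
            case '>': escaped += "&gt;"; break;
            case '"': escaped += "&quot;"; break;
            default: escaped += c; break;
        }
    }
    return escaped;
}

// Replace the placeholder "%<tokenIndex>%" with an SSML <phoneme> element.
// ASSUMPTION: the engine understands SSML and the phoneme is written in IPA;
// check the specification of the TTS engine you use for the alphabet and markup it expects.
// The phoneme code is ignored here for example purposes.
std::string replaceTokenWithPhoneme(
    const std::string& sentence, std::uint32_t tokenIndex, const std::string& phoneme, const std::string& /*phonemeCode*/)
{
    const std::string token = "%" + std::to_string(tokenIndex) + "%";

    const auto pos = sentence.find(token);
    if (pos == std::string::npos)
    {
        // placeholder not found, leave the sentence untouched
        return sentence;
    }

    std::string result = sentence.substr(0, pos);
    result += "<phoneme alphabet=\"ipa\" ph=\"" + escapeXmlAttribute(phoneme) + "\"/>";
    result += sentence.substr(pos + token.size());
    return result;
}
```

With an engine that takes SSML, the assembled sentence would typically also need to be wrapped in a <speak> element, and the plain-text parts of the sentence escaped as XML text, before it is passed to the engine.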

Optimize orthography prompt for abbreviation

Whether to use orthography or phoneme content for audio instructions depends on several factors. Each approach has its advantages and disadvantages.

	orthography	phoneme
advantages	easy to use: just concatenate the strings	very accurate, especially for words with multiple pronunciations
disadvantages	less accurate: abbreviations may not be prompted correctly	harder to use: depends on the TTS engine and usually requires inserting escape sequences

When using orthography, you can turn on abbreviation optimization, which is disabled by default.

// turn on abbreviation optimization
std::string audioPromptSettings = R"json(
{
    "OrthographyOptimize": 
    {
        "Enabled": true
    }
}
)json";

const auto settings = tn::foundation::Settings::Builder()
    .setString(tn::drive::api::SettingConstants::SETTING_AUDIO_JSON_CONTENT, audioPromptSettings)
    .build();

// enable the audio component when creating the navigation service so that the customized audio prompt settings take effect
tn::drive::NavigationServiceOptions options;
options.enable_audio = true;

const auto navigationService = tn::drive::api::NavigationServiceFactory::createNavigationService(
    options, system, settings, mapContent, directionService);

Override system locale for audio instructions

tn::foundation::System::updateSystemLocale() updates the system locale shared by all TA SDK components. In some cases you may want to show visual content in one language while playing audio content in another, for example showing a map in English while playing audio prompts in German.

In this case, the navigation service provides a way to override the system locale for audio instructions only. Note: the system locale here has nothing to do with the LC_xxx macros.

// set system locale to US English
system->updateSystemLocale("en_US");

// override system locale in navigation service to German
navigationService->overrideLocale("de_DE");
