FlightGear: Festival Voice Interface

This page describes how to use FlightGear's voice interface to the Festival speech synthesis system, so that messages from ATC, the pilot, etc. can be made audible. These messages are normally only displayed at the top of the screen. A raw socket mode allows the messages to be sent to arbitrary servers instead.

Quick instructions (assuming that you have Festival installed)

$ festival --server &
$ fgfs --aircraft=j3cub --airport=KSQL --prop:/sim/sound/voices/enabled=true
Now, in FlightGear, enable ATC (in the menu under "ATC"->"Options"), press the ' key (apostrophe) and send a message to ATC. You should hear "your" voice, then that of ATC, and some time later those of the AI planes.
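If nothing is audible, it can help to check the Festival server on its own, without FlightGear. The following is a minimal sketch, assuming that "festival --server" is listening on localhost port 1314 (the port used in the configuration on this page) and that it evaluates Scheme commands such as Festival's SayText. The function names are made up for this example.

```python
import socket


def festival_say_command(text: str) -> str:
    """Build a Festival Scheme command that speaks `text`."""
    # Escape backslashes and double quotes so the text stays one Scheme string.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'(SayText "{escaped}")\n'


def say(text: str, host: str = "localhost", port: int = 1314) -> None:
    """Send one SayText command to a running Festival server.

    Assumption: the server accepts plain Scheme expressions over TCP.
    Non-ASCII characters are replaced by '?' before sending.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(festival_say_command(text).encode("ascii", "replace"))
```

Calling say("Festival is reachable") while the server is running should produce speech on the machine that runs Festival. A refused connection suggests that the server is not running or is listening on a different port.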

Installing the Festival system

Installing more voices

I'm afraid this is a bit tedious. You can skip it if you are happy with the default voice. First find the Festival data directory. All Festival data goes into a common file tree, as in FlightGear. On Unix-like systems this may be /usr/local/share/festival/. We'll call that directory $FESTIVAL for now.

Running FlightGear with voice support

Configuration & Internals

The voice subsystem only offers the common subsystem functions to the rest of FlightGear. There's no built-in function to let it send data to the socket. The only way is to write to the respective speech properties. The number of available voices, or rather "channels", isn't hard-coded. The number of <voice> groups in "/sim/sound/voices" decides how many channels are opened. A typical set of interface properties looks like this. The aliases at the end have nothing to do with the subsystem, but are handy shortcuts:
<sim>
    <sound>
        <voices>
            <host type="string">localhost</host>
            <port type="string">1314</port>
            <enabled type="bool">false</enabled>

            <voice>
                <desc>Pilot</desc>
                <text type="string"></text>
                <volume type="double">1.0</volume>
                <pitch type="double">100.0</pitch>
                <speed type="double">1.0</speed>
                <preamble type="string">(voice_us3_mbrola)</preamble>
                <festival type="bool">true</festival>
            </voice>

            <voice>
                ...
            </voice>

            <!-- handy aliases, not part of the interface: -->

            <atc alias="/sim/sound/voices/voice[0]/text"/>
            <approach alias="/sim/sound/voices/voice[0]/text"/>
            <ground alias="/sim/sound/voices/voice[0]/text"/>
            <pilot alias="/sim/sound/voices/voice[1]/text"/>
            <copilot alias="/sim/sound/voices/voice[2]/text"/>
            <ai-plane alias="/sim/sound/voices/voice[3]/text"/>
        </voices>
    </sound>
</sim>
The <enabled> property decides at init time whether the subsystem should be activated or not. There's currently no way to change this at runtime. Each <voice> group defines one channel. <text> is the output property: every value that's written to it will be spoken by this channel. If <festival> is true, then the channel sets up <pitch> and <speed> (<volume> does not currently work and has to be 1) and puts Festival markup around the text. If <festival> is false, then all text is written verbatim to the socket. <preamble> is always written to the socket once, as the last step of the socket creation. In "festival" mode it's used to set the voice, while in raw mode it could be used to identify the channel (assuming that the server knows what to do with it).
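To see what a raw-mode channel (<festival> set to false) actually writes, a small stand-in server can take the place of Festival. This is only a sketch: it assumes that each channel opens its own TCP connection to <host>:<port>, and it makes no assumption about message framing, so it prints each received chunk as it arrives. The class and function names are made up for this example.

```python
import socketserver


def decode_chunk(data: bytes) -> str:
    """Decode bytes received from a voice channel for display."""
    return data.decode("utf-8", errors="replace").strip()


class RawVoiceHandler(socketserver.BaseRequestHandler):
    """Print whatever one voice channel writes to its socket."""

    def handle(self) -> None:
        while True:
            data = self.request.recv(4096)
            if not data:  # the peer closed the connection
                break
            # The first chunk should contain the channel's <preamble>, if any.
            print(f"{self.client_address}: {decode_chunk(data)}")


def serve(host: str = "localhost", port: int = 1314) -> None:
    """Accept connections until interrupted, one thread per connection."""
    with socketserver.ThreadingTCPServer((host, port), RawVoiceHandler) as server:
        server.serve_forever()
```

Run serve() instead of "festival --server" before starting FlightGear. A distinct <preamble> per channel then makes it easy to tell the connections apart in the output.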

Usage

The design principle is that message generators (e.g. the ATC subsystem) write to a message property (e.g. /sim/messages/pilot). A listener ($FG_ROOT/Nasal/screen.nas) watches this property and decides what to do with it. For pilot and ATC messages it writes the message to screen.log and copies it to the /sim/sound/voices/pilot property, which is an alias to the real voice channel /sim/sound/voices/voice[1]/text. This allows the most control and makes all steps user-configurable from Nasal scripts. Message generators should not write to a voice's <text> property directly, and should write to the /sim/sound/voices/* aliases only if a message should not be displayed by the system.
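The message path can also be exercised from outside the simulator. The sketch below rests on assumptions that are not part of this page: that FlightGear was started with its telnet property server enabled (for example with --telnet=5401), and that this server accepts lines of the form "set <path> <value>" terminated by CR LF. The function names and the port number are illustrative only.

```python
import socket


def props_set_command(path: str, value: str) -> bytes:
    """Build one `set` line for the (assumed) telnet property server."""
    # Collapse all whitespace runs, including line breaks, to single spaces,
    # because one command must fit on exactly one line.
    one_line = " ".join(value.split())
    return f"set {path} {one_line}\r\n".encode("ascii", "replace")


def send_message(text: str,
                 path: str = "/sim/messages/pilot",
                 host: str = "localhost",
                 port: int = 5401) -> None:
    """Write `text` to a message property so the listener can handle it."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(props_set_command(path, text))
```

Writing to /sim/messages/pilot rather than to a voice's <text> property keeps the message on the path described above, so that it is both displayed and spoken.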