first version of a voice README. Because of the links I decided to write it as HTML, although I prefer raw text otherwise.
2006-02-11 17:23:17 +00:00 · 7ae98578f3
commit 7ae98578f3
parent 0bed47d554
1 changed files with 196 additions and 0 deletions
--- a/Docs/README.voice.html
+++ b/Docs/README.voice.html
@@ -0,0 +1,196 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
 <html>
 	<head>
 		<title>FlightGear: Festival Voice Interface</title>
 		<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
 	</head>
 	<body>
 <h1>FlightGear: Festival Voice Interface</h1>
 This page describes how to use FlightGear's voice interface to the Festival speech synthesis system, so that
 ATC, Pilot, etc. messages can be made audible. These messages are normally only displayed at the top of the screen.
 A raw socket mode allows sending the messages to arbitrary servers.
 <h2>Quick instructions (assuming that you have Festival installed)</h2>
 <blockquote><pre>
 $ festival --server &amp;
 $ fgfs --aircraft=j3cub --airport=KSQL --prop:/sim/sound/voices/enabled=true</pre></blockquote>
 Now, in FlightGear, enable ATC (in the menu under "ATC"-&gt;"Options"), press the '-key (apostrophe key) and
 send a message to the ATC. You should hear "your" voice, then that of the ATC, and some time later those of AI-planes.
 <h2>Installing the Festival system</h2>
 <ul>
 <li>
 	Make sure Festival is installed, or download it from here:
 	<a href="http://www.cstr.ed.ac.uk/projects/festival/">http://www.cstr.ed.ac.uk/projects/festival/</a>
 </li><li>
 	Check if Festival works. Only the relevant lines are shown here. Note the parentheses!
 	<blockquote><pre>
 $ festival
 festival> (SayText "FlightGear")
 festival> (quit)</pre></blockquote>
 </li><li>
 	Check if MBROLA is installed, or download it from here:
 	<a href="http://tcts.fpms.ac.be/synthesis/mbrola/">http://tcts.fpms.ac.be/synthesis/mbrola/</a> -> "Downloads"
 	-> "MBROLA binary and voices" (link at the bottom; hard to find). Choose the binary for your platform.
 	Unfortunately, there's no source code available. If you don't like that, then you can skip the whole MBROLA
 	setup. But then you can't use the more realistic voices. You can also install further MBROLA voices from
 	this page. (See below)
 </li><li>
 	Run MBROLA and marvel at the help screen. That's just to check if it's in the path and executable.
 	<blockquote><pre>
 $ mbrola -h</pre></blockquote>
 </li>
 </ul>
 <h2>Installing more voices</h2>
 I'm afraid this is a bit tedious. You can skip it if you are happy with the default voice. First find the
 Festival data directory. All Festival data goes to a common file tree, like in FlightGear. This can be
 <tt>/usr/local/share/festival/</tt> on Unices. We'll call that directory <tt>$FESTIVAL</tt> for now.
 <ul>
 <li>
 	Check which voices are available. You can test them by prepending <tt>voice_</tt>:
 	<blockquote><pre>
 $ festival
 festival> (print (mapcar (lambda (pair) (car pair)) voice-locations))
 (kal_diphone rab_diphone don_diphone us1_mbrola us2_mbrola us3_mbrola en1_mbrola)
 nil
 festival> (voice_us3_mbrola)
 festival> (SayText "I've got a nice voice.")
 festival> (quit)</pre></blockquote>
 </li><li>
 	Festival voices and MBROLA wrappers can be downloaded here:
 	<a href="http://festvox.org/packed/festival/1.95/">http://festvox.org/packed/festival/1.95/</a>
 	The "don_diphone" voice isn't the best, but it's comparatively small and well suited for "ai-planes".
 	If you install it, it should end up as directory <tt>$FESTIVAL/voices/english/don_diphone/</tt>. You also need
 	to install "festlex_OALD.tar.gz" for it as <tt>$FESTIVAL/dicts/oald/</tt> and run the Makefile in this
 	directory. (You may have to add "<tt>--heap 10000000</tt>" to the festival command arguments in the Makefile.)
 </li><li>
 	Quite good voices are "us2_mbrola", "us3_mbrola", and "en1_mbrola". For these you need to install
 	MBROLA (see above) as well as these wrappers: <tt>festvox_us2.tar.gz</tt>, <tt>festvox_us3.tar.gz</tt>,
 	and <tt>festvox_en1.tar.gz</tt>. They create directories <tt>$FESTIVAL/voices/english/us2_mbrola/</tt> etc.
 	The voice <em>data</em>, however, has to be downloaded separately from another site:
 </li><li>
 	MBROLA voices can be downloaded from the MBROLA download page (see above). You want the
 	voices labeled "us2", "us3", and "en1". Unpack each of them in the directory that its wrapper has created:
 	<tt>$FESTIVAL/voices/english/us2_mbrola/</tt> for "us2", and likewise for "us3" and "en1".
 </li>
 </ul>
 <h2>Running FlightGear with voice support</h2>
 <ul>
 <li>First start the festival server:
 	<blockquote><pre>
 $ festival --server</pre></blockquote>
 </li><li>
 	Start FlightGear with the voice subsystem enabled, for example with
 	<blockquote><pre>
 $ fgfs --aircraft=j3cub --airport=KSQL --prop:/sim/sound/voices/enabled=true</pre></blockquote>
 	Of course, you can put this option into your personal configuration file. This doesn't mean that
 	you then <em>always</em> have to use FlightGear together with Festival. If no Festival server is
 	running, you'll just get a few error messages in the terminal window, but that's it. Note that you
 	currently can<em>not</em> enable the voice subsystem at runtime!
 </li><li>
 	Open the property browser to <tt>/sim/sound/voices/voice[0]/</tt> and write some text to the
 	<tt>text</tt> property. You should now hear this spoken with the default voice ("voice_kal_diphone").
 	You can try the same with <tt>voice[1]/</tt> etc. and should hear different voices if they
 	are installed, or the default voice again otherwise.
 </li><li>
 	Contact the KSFO ATC via the '-key dialog (apostrophe key). You should hear "your" voice first (and see the
 	text in yellow at the top of the screen), then you should hear ATC answer with a different voice (and see
 	its text in light green).
 </li><li>
 	You can edit the voice parameters in the <tt>preferences.xml</tt> file, and select different
 	screen colors and voice assignments in <tt>$FG_ROOT/Nasal/voice.nas</tt>. The messages aren't written
 	to the respective <tt>/sim/sound/voices/voice[*]/text</tt> properties directly, but rather to aliases
 	<tt>/sim/sound/voices/{atc,approach,ground,pilot,ai-plane}</tt>. (BTW: I've never heard anything from
 	<tt>ground</tt> and <tt>approach</tt> yet.)
 </li>
 </ul>
 <h2>Configuration &amp; Internals</h2>
 The <em>voice</em> subsystem only offers the common subsystem functions to the rest of FlightGear.
 There's no built-in function to let it send data to the socket. The only way is to write to the
 respective speech properties. The number of available voices, or rather "channels", isn't hard-coded.
 It's the number of &lt;voice&gt; groups in "/sim/sound/voices" that decides how many channels are
 opened. This is a typical set of interface properties; the aliases at the end have
 nothing to do with the subsystem, but are handy shortcuts:
 <blockquote><pre>
 &lt;sim&gt;
     &lt;sound&gt;
         &lt;voices&gt;
             &lt;host type="string"&gt;localhost&lt;/host&gt;
             &lt;port type="string"&gt;1314&lt;/port&gt;
             &lt;enabled type="bool"&gt;false&lt;/enabled&gt;
             &lt;voice&gt;
                 &lt;desc&gt;Pilot&lt;/desc&gt;
                 &lt;text type="string"&gt;&lt;/text&gt;
                 &lt;volume type="double"&gt;1.0&lt;/volume&gt;
                 &lt;pitch type="double"&gt;100.0&lt;/pitch&gt;
                 &lt;speed type="double"&gt;1.0&lt;/speed&gt;
                 &lt;preamble type="string"&gt;(voice_us3_mbrola)&lt;/preamble&gt;
                 &lt;festival type="bool"&gt;true&lt;/festival&gt;
             &lt;/voice&gt;
             &lt;voice&gt;
                 ...
             &lt;/voice&gt;
             &lt;!-- handy aliases, not part of the interface: --&gt;
             &lt;atc alias="/sim/sound/voices/voice[0]/text"/&gt;
             &lt;approach alias="/sim/sound/voices/voice[0]/text"/&gt;
             &lt;ground alias="/sim/sound/voices/voice[0]/text"/&gt;
             &lt;pilot alias="/sim/sound/voices/voice[1]/text"/&gt;
             &lt;copilot alias="/sim/sound/voices/voice[2]/text"/&gt;
             &lt;ai-plane alias="/sim/sound/voices/voice[3]/text"/&gt;
         &lt;/voices&gt;
     &lt;/sound&gt;
 &lt;/sim&gt;
 </pre></blockquote>
 The &lt;enabled&gt; property decides at init time whether the subsystem should
 be activated or not. There's currently no way to change this at runtime.
 Each &lt;voice&gt; group defines one channel. &lt;text&gt; is the output
 property. Every value that's written to it will be spoken by this channel.
 If &lt;festival&gt; is true, then the channel sets up &lt;pitch&gt; and
 &lt;speed&gt; (&lt;volume&gt; currently doesn't work and has to be <tt>1</tt>)
 and puts Festival markup around the text. If &lt;festival&gt; is false,
 then all text is written verbatim to the socket. &lt;preamble&gt; is always
 written to the socket once, as the last step of socket creation. In "festival"
 mode it's used to set the voice, while in raw mode it could be used to identify
 the channel (assuming that the server knows what to do with it).
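 As a rough sketch of a raw-mode channel, an additional &lt;voice&gt; group under
 <tt>/sim/sound/voices</tt> might look like the following. The description and the preamble string are
 made-up placeholders, not FlightGear defaults, and the remaining entries simply mirror the group shown
 above. With &lt;festival&gt; set to false, the preamble and every value written to &lt;text&gt; go to
 the socket verbatim, so whatever server listens at &lt;host&gt;:&lt;port&gt; has to interpret them itself:
 <blockquote><pre>
 &lt;voice&gt;
     &lt;desc&gt;Raw channel&lt;/desc&gt;                                &lt;!-- hypothetical description --&gt;
     &lt;text type="string"&gt;&lt;/text&gt;
     &lt;volume type="double"&gt;1.0&lt;/volume&gt;
     &lt;pitch type="double"&gt;100.0&lt;/pitch&gt;
     &lt;speed type="double"&gt;1.0&lt;/speed&gt;
     &lt;preamble type="string"&gt;channel raw-example&lt;/preamble&gt;  &lt;!-- hypothetical identifier for the server --&gt;
     &lt;festival type="bool"&gt;false&lt;/festival&gt;                   &lt;!-- raw mode: no Festival markup --&gt;
 &lt;/voice&gt;
 </pre></blockquote>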
 <h2>Usage</h2>
 The design principle is that message generators (e.g. the ATC subsystem) write
 to a message property (e.g. <tt>/sim/messages/pilot</tt>). A listener ($FG_ROOT/Nasal/screen.nas)
 watches this property and decides what to do with it. For pilot and ATC it writes the message
 to the screen.log and copies it to the <tt>/sim/sound/voices/pilot</tt> property. This
 is an alias to the real voice channel <tt>/sim/sound/voices/voice[1]/text</tt>.
 This design allows the most control and makes all steps user-configurable from Nasal
 scripts. Message generators should <em>not</em> write to the voice's &lt;text&gt;
 property directly, and should write to the <tt>/sim/sound/voices/*</tt> aliases only if a
 message should not be displayed by the system.
 </body>
 </html>