<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>
<title>FlightGear: Festival Voice Interface</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>

<body>


<h1>FlightGear: Festival Voice Interface</h1>

This page describes how to use FlightGear's voice interface to the Festival speech synthesis system, so that
ATC, pilot, and other messages can be made audible. Normally these messages are only displayed at the top of the screen.
A raw socket mode allows the messages to be sent to arbitrary servers instead.


<h2>Quick instructions (assuming that you have Festival installed)</h2>

<blockquote><pre>
$ festival --server &amp;
$ fgfs --aircraft=j3cub --airport=KSQL --prop:/sim/sound/voices/enabled=true</pre></blockquote>

Now, in FlightGear, enable ATC (in the menu under "ATC"->"Options"), press the apostrophe key (') and
send a message to ATC. You should hear "your" voice, then that of ATC, and some time later those of the AI planes.



<h2>Installing the Festival system</h2>

<ul>
<li>
Make sure Festival is installed, or download it from here:
<a href="http://www.cstr.ed.ac.uk/projects/festival/">http://www.cstr.ed.ac.uk/projects/festival/</a>
</li><li>
Check that Festival works. Only the relevant lines are shown here. Note the parentheses!
<blockquote><pre>
$ festival
festival> (SayText "FlightGear")
festival> (quit)</pre></blockquote>
</li><li>
Check whether MBROLA is installed, or download it from here:
<a href="http://tcts.fpms.ac.be/synthesis/mbrola/">http://tcts.fpms.ac.be/synthesis/mbrola/</a> -> "Downloads"
-> "MBROLA binary and voices" (link at the bottom; hard to find). Choose the binary for your platform.
Unfortunately, no source code is available. If you don't like that, you can skip the whole MBROLA
setup, but then you can't use the more realistic voices. You can also install further MBROLA voices from
this page (see below).
</li><li>
Run MBROLA and marvel at the help screen. This just checks that it is in the path and executable.
<blockquote><pre>
$ mbrola -h</pre></blockquote>
</li>
</ul>
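
The two "is it installed?" checks above can also be scripted. The following is only a minimal sketch in Python; the helper name <tt>missing_tools</tt> is made up for this example, and it only tests whether the programs are found in the path, not whether Festival can actually produce sound.

```python
import shutil


def missing_tools(tools=("festival", "mbrola")):
    """Return the names from `tools` that cannot be found in the PATH."""
    return [name for name in tools if shutil.which(name) is None]


def report():
    """Print one line per tool saying whether it was found."""
    missing = missing_tools()
    for name in ("festival", "mbrola"):
        state = "missing" if name in missing else "found"
        print(f"{name}: {state}")
```

Calling <tt>report()</tt> prints the state of both programs. A missing <tt>mbrola</tt> is not fatal; it only means that the MBROLA voices can't be used.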


<h2>Installing more voices</h2>

I'm afraid this is a bit tedious. You can skip it if you are happy with the default voice. First find the
Festival data directory. All Festival data goes into a common file tree, as in FlightGear. On Unix-like systems
this can be <tt>/usr/local/share/festival/</tt>. We'll call that directory <tt>$FESTIVAL</tt> for now.

<ul>
<li>
Check which voices are available. You can test one by prepending <tt>voice_</tt> to its name:
<blockquote><pre>
$ festival
festival> (print (mapcar (lambda (pair) (car pair)) voice-locations))
(kal_diphone rab_diphone don_diphone us1_mbrola us2_mbrola us3_mbrola en1_mbrola)
nil
festival> (voice_us3_mbrola)
festival> (SayText "I've got a nice voice.")
festival> (quit)</pre></blockquote>
</li><li>
Festival voices and MBROLA wrappers can be downloaded here:
<a href="http://festvox.org/packed/festival/1.95/">http://festvox.org/packed/festival/1.95/</a>
The "don_diphone" voice isn't the best, but it's comparatively small and well suited for "ai-planes".
If you install it, it should end up as directory <tt>$FESTIVAL/voices/english/don_diphone/</tt>. You also need
to install "festlex_OALD.tar.gz" for it as <tt>$FESTIVAL/dicts/oald/</tt> and run the Makefile in this
directory. (You may have to add "<tt>--heap 10000000</tt>" to the festival command arguments in the Makefile.)
</li><li>
Quite good voices are "us2_mbrola", "us3_mbrola", and "en1_mbrola". For these you need to install
MBROLA (see above) as well as these wrappers: <tt>festvox_us2.tar.gz</tt>, <tt>festvox_us3.tar.gz</tt>,
and <tt>festvox_en1.tar.gz</tt>. They create directories <tt>$FESTIVAL/voices/english/us2_mbrola/</tt> etc.
The voice <em>data</em>, however, has to be downloaded separately from another site:
</li><li>
MBROLA voices can be downloaded from the MBROLA download page (see above). You want the
voices labeled "us2", "us3", and "en1". Unpack them in the directories that the wrappers have created:
<tt>$FESTIVAL/voices/english/us2_mbrola/</tt> and likewise for "us3" and "en1".
</li>
</ul>


<h2>Running FlightGear with voice support</h2>

<ul>
<li>First start the Festival server:
<blockquote><pre>
$ festival --server</pre></blockquote>
</li><li>
Start FlightGear with the voice subsystem enabled, for example with
<blockquote><pre>
$ fgfs --aircraft=j3cub --airport=KSQL --prop:/sim/sound/voices/enabled=true</pre></blockquote>
Of course, you can put this option into your personal configuration file. This doesn't mean that
you then <em>always</em> have to use FlightGear together with Festival. If Festival isn't running, you'll
just get a few error messages in the terminal window, but that's it. Note that you can<em>not</em> currently
enable the voice subsystem at runtime!
</li><li>
Open the property browser at <tt>/sim/sound/voices/voice[0]/</tt> and write some text to the
<tt>text</tt> property. You should now hear this spoken with the default voice ("voice_kal_diphone").
You can try the same with <tt>voice[1]/</tt> etc. and should hear different voices if they
are installed, or the default voice again otherwise.
</li><li>
Contact the KSFO ATC via the apostrophe key (') dialog. You should hear "your" voice first (and see the
text in yellow at the top of the screen), then you should hear ATC answer with a different voice (and see
the answer in light green).
</li><li>
You can edit the voice parameters in the <tt>preferences.xml</tt> file, and select different
screen colors and voice assignments in <tt>$FG_ROOT/Nasal/voice.nas</tt>. The messages aren't written
to the respective <tt>/sim/sound/voices/voice[*]/text</tt> properties directly, but rather to aliases
<tt>/sim/sound/voices/{atc,approach,ground,pilot,ai-plane}</tt>. (BTW: I've never heard anything from
<tt>ground</tt> and <tt>approach</tt> yet.)
</li>
</ul>
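
Because the voice subsystem can only be enabled at startup, the Festival server has to be up before FlightGear starts. The following Python sketch shows one way to enforce that order. It assumes that Festival listens on port 1314 on localhost (the port used in the configuration on this page); the function names <tt>fgfs_args</tt>, <tt>wait_for_port</tt>, and <tt>launch</tt> are made up for this example.

```python
import socket
import subprocess
import time


def fgfs_args(aircraft="j3cub", airport="KSQL", voices=True):
    """Build the fgfs command line used in the steps above."""
    args = ["fgfs", f"--aircraft={aircraft}", f"--airport={airport}"]
    if voices:
        args.append("--prop:/sim/sound/voices/enabled=true")
    return args


def wait_for_port(host="localhost", port=1314, timeout=10.0):
    """Return True as soon as something accepts TCP connections on host:port."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # server not ready yet; try again shortly
    return False


def launch():
    """Start Festival in server mode, wait for it, then run FlightGear."""
    festival = subprocess.Popen(["festival", "--server"])
    try:
        if not wait_for_port():
            raise RuntimeError("Festival server did not come up on port 1314")
        subprocess.run(fgfs_args(), check=False)
    finally:
        festival.terminate()  # stop the server when FlightGear exits
```

Calling <tt>launch()</tt> does the same as the two commands above, but only starts <tt>fgfs</tt> once the server port is reachable.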



<h2>Configuration &amp; Internals</h2>

The <em>voice</em> subsystem only offers the common subsystem functions to the rest of FlightGear.
There's no built-in function to let it send data to the socket. The only way is to write to the
respective speech properties. The number of available voices, or rather "channels", isn't hard-coded.
It's the number of &lt;voice&gt; groups in "/sim/sound/voices" that decides how many channels are
opened. This is a typical set of interface properties; the aliases at the end have
nothing to do with the subsystem, but are handy shortcuts:

<blockquote><pre>
&lt;sim&gt;
    &lt;sound&gt;
        &lt;voices&gt;
            &lt;host type="string"&gt;localhost&lt;/host&gt;
            &lt;port type="string"&gt;1314&lt;/port&gt;
            &lt;enabled type="bool"&gt;false&lt;/enabled&gt;

            &lt;voice&gt;
                &lt;desc&gt;Pilot&lt;/desc&gt;
                &lt;text type="string"&gt;&lt;/text&gt;
                &lt;volume type="double"&gt;1.0&lt;/volume&gt;
                &lt;pitch type="double"&gt;100.0&lt;/pitch&gt;
                &lt;speed type="double"&gt;1.0&lt;/speed&gt;
                &lt;preamble type="string"&gt;(voice_us3_mbrola)&lt;/preamble&gt;
                &lt;festival type="bool"&gt;true&lt;/festival&gt;
            &lt;/voice&gt;

            &lt;voice&gt;
                ...
            &lt;/voice&gt;

            &lt;!-- handy aliases, not part of the interface: --&gt;

            &lt;atc alias="/sim/sound/voices/voice[0]/text"/&gt;
            &lt;approach alias="/sim/sound/voices/voice[0]/text"/&gt;
            &lt;ground alias="/sim/sound/voices/voice[0]/text"/&gt;
            &lt;pilot alias="/sim/sound/voices/voice[1]/text"/&gt;
            &lt;copilot alias="/sim/sound/voices/voice[2]/text"/&gt;
            &lt;ai-plane alias="/sim/sound/voices/voice[3]/text"/&gt;
        &lt;/voices&gt;
    &lt;/sound&gt;
&lt;/sim&gt;
</pre></blockquote>

The &lt;enabled&gt; property decides at init time whether the subsystem is
activated or not. There's currently no way to change this at runtime.

Each &lt;voice&gt; group defines one channel. &lt;text&gt; is the output
property. Every value that's written to it will be spoken by this channel.
If &lt;festival&gt; is true, then the channel sets up &lt;pitch&gt; and
&lt;speed&gt; (&lt;volume&gt; currently does not work and has to be <tt>1</tt>)
and puts Festival markup around the text. If &lt;festival&gt; is false,
then all text is written verbatim to the socket. &lt;preamble&gt; is always
written to the socket once, as the last step of socket creation. In "festival"
mode it's used to set the voice, while in raw mode it could be used to identify
the channel (assuming that the server knows what to do with it).
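
The difference between the two modes can be illustrated with a small Python client that behaves roughly like one channel. This is only a sketch and not the subsystem's actual code: it wraps the text in a plain <tt>(SayText "...")</tt> command as used earlier on this page, leaves out the pitch and speed setup, and assumes newline-terminated lines and Latin-1 encoding. The function names are made up for this example.

```python
import socket


def festival_command(text):
    """Wrap plain text in a Festival SayText command, escaping quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'(SayText "{escaped}")'


def channel_payload(text, festival=True):
    """Return the bytes one channel would send for a single message."""
    line = festival_command(text) if festival else text  # raw mode: verbatim
    return (line + "\n").encode("latin-1", errors="replace")


def speak(text, preamble="(voice_kal_diphone)", host="localhost", port=1314,
          festival=True):
    """Open a connection, send the preamble once, then send one message."""
    with socket.create_connection((host, port), timeout=5.0) as sock:
        sock.sendall((preamble + "\n").encode("latin-1", errors="replace"))
        sock.sendall(channel_payload(text, festival))
```

With a Festival server running, <tt>speak("FlightGear")</tt> should make it say the word with the default voice. With <tt>festival=False</tt> and a preamble such as <tt>"ATC"</tt>, the same function sends the identifier and then the bare text, which is what a raw-mode server receives.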



<h2>Usage</h2>

The design principle is that message generators (e.g. the ATC subsystem) write
to a message property (e.g. <tt>/sim/messages/pilot</tt>). A listener (<tt>$FG_ROOT/Nasal/screen.nas</tt>)
watches this property and decides what to do with it. For pilot and ATC messages it writes the message
to the screen.log and copies it to the <tt>/sim/sound/voices/pilot</tt> property. This
is an alias for the real voice channel <tt>/sim/sound/voices/voice[1]/text</tt>.
This gives the most control and makes all steps user-configurable from Nasal
scripts. Message generators should <em>not</em> write to a voice's &lt;text&gt;
property directly, and should write to the <tt>/sim/sound/voices/*</tt> aliases only if a
message should not be displayed by the system.
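
The message flow described above can be modeled in a few lines of Python. The <tt>PropertyTree</tt> class below is a toy stand-in written for this illustration, not FlightGear's property API, and the real listener is a Nasal script; only the property paths are taken from this page.

```python
class PropertyTree:
    """Toy property tree with one-level aliases and change listeners."""

    def __init__(self):
        self._values = {}
        self._aliases = {}
        self._listeners = {}

    def alias(self, path, target):
        """Make `path` refer to the property stored at `target`."""
        self._aliases[path] = target

    def listen(self, path, callback):
        """Call `callback(value)` whenever `path` is written."""
        self._listeners.setdefault(path, []).append(callback)

    def _resolve(self, path):
        return self._aliases.get(path, path)

    def set(self, path, value):
        real = self._resolve(path)
        self._values[real] = value
        for callback in self._listeners.get(real, []):
            callback(value)

    def get(self, path):
        return self._values.get(self._resolve(path))


def wire_pilot_messages(tree, screen_log):
    """Set up the alias and the listener for pilot messages."""
    tree.alias("/sim/sound/voices/pilot", "/sim/sound/voices/voice[1]/text")

    def on_pilot_message(text):
        screen_log.append(text)                    # show it on screen
        tree.set("/sim/sound/voices/pilot", text)  # and have it spoken

    tree.listen("/sim/messages/pilot", on_pilot_message)
```

A generator then only calls <tt>tree.set("/sim/messages/pilot", "Request taxi")</tt>. The listener decides that the text goes both to the screen log and, through the alias, to the text property of <tt>voice[1]</tt>.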



<h2>Backward compatibility</h2>

The new voice subsystem is functionally compatible with the old one that
was part of the ATC subsystem. You just need to turn the &lt;festival&gt;
bool properties off and set the server address correctly. The subsystem then sends only
the messages, without any Festival syntax added:

<blockquote><pre>
&lt;sim&gt;
    &lt;sound&gt;
        &lt;voices&gt;
            &lt;host type="string"&gt;192.168.2.15&lt;/host&gt;
            &lt;port type="string"&gt;7100&lt;/port&gt;
            &lt;enabled type="bool"&gt;true&lt;/enabled&gt;
            &lt;voice&gt;
                &lt;desc&gt;ATC/Approach/Ground&lt;/desc&gt;
                &lt;text type="string"&gt;&lt;/text&gt;
                &lt;preamble type="string"&gt;ATC&lt;/preamble&gt;
                &lt;festival type="bool"&gt;false&lt;/festival&gt;
            &lt;/voice&gt;
            &lt;voice&gt;
                &lt;desc&gt;Pilot&lt;/desc&gt;
                &lt;text type="string"&gt;&lt;/text&gt;
                &lt;preamble type="string"&gt;Pilot&lt;/preamble&gt;
                &lt;festival type="bool"&gt;false&lt;/festival&gt;
            &lt;/voice&gt;
            ...
        &lt;/voices&gt;
    &lt;/sound&gt;
&lt;/sim&gt;
</pre></blockquote>

&lt;volume&gt;, &lt;pitch&gt;, and &lt;speed&gt; have no meaning here and can
be left out. Note that in this mode, too, the preamble is sent first.
It can be used to identify the channel. Of course, all messages could also be
sent to just one channel.



<h2>Multichannel server</h2>

Raw mode does, of course, require a different server than Festival. Here's
a small Perl example of a multichannel server. Note how the &lt;preamble&gt;
is used as channel identification:


<blockquote><pre>
#!/usr/bin/perl -Tw
# License: GPL V2
# Modified after Example from perlipc.pod ($ man perlipc)

use strict;
BEGIN {
    $ENV{PATH} = '/usr/ucb:/bin';
}

use Socket;
use Carp;
my $EOL = "\015\012";

sub spawn; # forward declaration
sub logmsg {
    print "$0 $$: @_ at ", scalar localtime, "\n";
}


my $port = shift || 1314;
my $proto = getprotobyname('tcp');


($port) = $port =~ /^(\d+)$/ or die "invalid port";


socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) || die "setsockopt: $!";
bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
listen(Server,SOMAXCONN) || die "listen: $!";


logmsg "server started on port $port";


my $waitedpid = 0;
my $paddr;

use POSIX ":sys_wait_h";
sub REAPER {
    my $child;
    while (($waitedpid = waitpid(-1,WNOHANG)) > 0) {
        logmsg "reaped $waitedpid" . ($? ? " with exit $?" : '');
    }
    $SIG{CHLD} = \&amp;REAPER; # loathe sysV
}


$SIG{CHLD} = \&amp;REAPER;

for ($waitedpid = 0;
    ($paddr = accept(Client,Server)) || $waitedpid;
    $waitedpid = 0, close Client) {
    next if $waitedpid and not $paddr;
    my($port,$iaddr) = sockaddr_in ($paddr);
    my $name = gethostbyaddr($iaddr,AF_INET);

    logmsg "connection from $name [", inet_ntoa($iaddr), "] at port $port";

    spawn sub {
        $|=1;
        print "Hello there, $name, it's now ", scalar localtime, $EOL;
        exec '/usr/bin/fortune' # XXX: `wrong' line terminators
            or confess "can't exec fortune: $!";
    };
}


sub spawn
{
    my $coderef = shift;

    unless (@_ == 0 &amp;&amp; $coderef &amp;&amp; ref($coderef) eq 'CODE') {
        confess "usage: spawn CODEREF";
    }

    my $pid;
    if (!defined($pid = fork)) {
        logmsg "cannot fork: $!";
        return;
    } elsif ($pid) {
        logmsg "creating child $pid";
        return; # I'm the parent
    }
    # else I'm the child -- go spawn

    # print header
    my $id;
    while (&lt;Client&gt;) {
        s/^\s+//;
        s/\s+$//;

        # first line is voice channel id = "&lt;preamble&gt;"
        if (not defined $id) {
            $id = $_;
            next;
        }

        print "\033[32m$id: \033[m$_\n";
        last unless /\S/;
    }

    open(STDIN, "&lt;&amp;Client") || die "can't dup client to stdin";
    open(STDOUT, "&gt;&amp;Client") || die "can't dup client to stdout";
    ## open(STDERR, "&gt;&amp;STDOUT") || die "can't dup stdout to stderr";
    exit &amp;$coderef();
}
</pre></blockquote>
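
For comparison, here is a minimal Python sketch of the same idea: a server that treats the first line of each connection as the channel id and every following non-empty line as a message. It is not a port of the Perl script. It assumes one connection per channel, uses threads instead of <tt>fork</tt>, skips blank lines instead of stopping at the first one, and uses port 7100 only because the raw-mode configuration above does. The class and function names are made up for this example.

```python
import socketserver


class ChannelHandler(socketserver.StreamRequestHandler):
    """Handle one connection, assumed to carry one voice channel."""

    def handle(self):
        channel = None
        for raw in self.rfile:
            line = raw.decode("latin-1").strip()
            if channel is None:
                # The first line is the channel id, i.e. the preamble.
                channel = line
                continue
            if line:
                self.server.on_message(channel, line)


class ChannelServer(socketserver.ThreadingTCPServer):
    """TCP server that passes (channel, text) pairs to a callback."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, on_message):
        super().__init__(address, ChannelHandler)
        self.on_message = on_message


def serve(port=7100):
    """Print every message, prefixed with its channel id in green."""
    def show(channel, text):
        print(f"\033[32m{channel}: \033[m{text}")

    with ChannelServer(("", port), show) as server:
        server.serve_forever()
```

Running <tt>serve()</tt> and pointing the &lt;host&gt; and &lt;port&gt; properties at that machine should print lines such as "ATC: ..." and "Pilot: ..." as the messages arrive.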

</body>
</html>