Add documentation for the embedded resources system and related classes
This document (Docs/README.embedded-resources) explains how to use the embedded resources system and presents SimGear's CharArrayStream and ZlibStream families of classes, as well as the ResourceProxy class.
parent 2fbc1dc491
commit 1131ddd38f
1 changed file with 474 additions and 0 deletions
Docs/README.embedded-resources (normal file)
@@ -0,0 +1,474 @@
-*- coding: utf-8; fill-column: 72; -*-

The Embedded Resources System
=============================

This document gives an overview of FlightGear's embedded resources
system and related classes. For specific information on the C++
functions, the reference documentation is in the corresponding header
files.


Contents
--------

1. The CharArrayStream and ZlibStream classes
2. The “embedded resources” system
3. About the XML resource declaration files
4. The ResourceProxy class


Introduction
------------

The embedded resources system allows FlightGear to use data from files
without relying on FG_ROOT to be set. This can be used, for instance, to
grab the contents of XML files at FG build time, from any repository[1],
and use said contents in the C++ code. The term “embedded” is used to
avoid confusion with the ResourceProvider and ResourceManager classes
provided by SimGear, which have nothing to do with the system described
here.

The embedded resources system relies on classes present in
simgear/io/iostreams/{zlibstream.cxx,CharArrayStream.cxx}, which were
implemented to address the concern that embedding a few XML files in
the fgfs binary could use precious memory. The resource compiler
(fgrcc) compresses resources before writing them in C++ form---except
for some file extensions; this is configurable on a per-resource basis
anyway. Then, the EmbeddedResourceManager instance, which lives in the
fgfs process, can decompress them on the fly, incrementally and
transparently. So, there is really no reason to worry about memory
consumption, even for several dozen XML files.

fgrcc is the resource compiler: it turns arbitrary files into C++ code
the EmbeddedResourceManager can make use of, in order to “serve” the
files' contents at runtime. It is named this way because it fulfills
the same role as Qt's rcc tool. It supports a thin superset of the
XML-based format used by rcc for declaring resources[2][3].
The output of 'fgrcc --help' describes its options and input format in
detail.


1) The CharArrayStream and ZlibStream classes
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The CharArrayStream* files in simgear/io/iostreams/ implement
CharArrayStreambuf and related IOStreams classes for working with char
arrays, namely:
  - CharArrayStreambuf    subclass of std::streambuf       stream buffer
  - ROCharArrayStreambuf  subclass of CharArrayStreambuf   stream buffer
  - CharArrayIStream      subclass of std::istream         input stream
  - CharArrayOStream      subclass of std::ostream         output stream
  - CharArrayIOStream     subclass of std::iostream        input/output stream

(all in the 'simgear' namespace)

CharArrayStreambuf is a stream buffer class that allows reading from and
writing to char arrays (std::strstream has been deprecated since C++98).
Contrary to std::strstream, this class does no dynamic allocation: it is
very simple, strictly staying for both reads and writes within the
bounds of the buffer specified in its constructor. Contrary to
std::stringstream, CharArrayStreambuf allows one to work on an array of
char (that could be static data, on the stack, whatever) without having
to make a whole copy of it.
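
As a rough illustration of the idea, here is a minimal sketch that uses
only the standard library. FixedArrayBuf, readAll() and writeBounded()
are hypothetical names chosen for this example; this is not SimGear's
implementation, only one possible way to expose a caller-owned char
array through the std::streambuf interface without allocating or
copying.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical illustration (NOT SimGear's CharArrayStreambuf): a stream
// buffer whose get and put areas are a caller-owned char array.
class FixedArrayBuf : public std::streambuf {
public:
    FixedArrayBuf(char* buf, std::size_t size) {
        setg(buf, buf, buf + size);  // get area: the whole array
        setp(buf, buf + size);       // put area: the whole array
    }

    // Number of chars stored through the put area so far
    std::size_t written() const {
        return static_cast<std::size_t>(pptr() - pbase());
    }
    // The inherited underflow() and overflow() return EOF, so reads and
    // writes stop at the end of the array instead of going past it.
};

// Read the whole array through an std::istream.
std::string readAll(char* data, std::size_t size) {
    FixedArrayBuf sb(data, size);
    std::istream in(&sb);
    std::string result;
    char c;
    while (in.get(c)) {
        result += c;
    }
    return result;
}

// Try to write 'text' into 'dest'; returns how many chars were stored.
std::size_t writeBounded(char* dest, std::size_t capacity,
                         const std::string& text) {
    FixedArrayBuf sb(dest, capacity);
    std::ostream out(&sb);
    out << text;  // the stream enters an error state if 'text' doesn't fit
    return sb.written();
}
```

With this sketch, writing a six-character string into a four-char array
stores the first four characters and puts the stream in an error state
rather than writing past the end of the array.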

ROCharArrayStreambuf is a read-only subclass of CharArrayStreambuf
(useful for const-correctness). CharArrayIStream, CharArrayOStream and
CharArrayIOStream are very simple convenience stream classes using
either CharArrayStreambuf or ROCharArrayStreambuf as their associated
stream buffer class.

While these classes can be of general-purpose usefulness, the particular
reason they were written is to make the embedded resources system clean
and memory-friendly. Concretely, this system supports both compressed
and uncompressed resources, all of which can be read from their
respective static arrays like this (think pipelines):

  static char array
    (uncompressed    --------------->  data available via an std::istream
      resource)      CharArrayIStream  or std::streambuf interface
                  or ROCharArrayStreambuf

  static char array
     (compressed     --------------->  compressed data  -------------------> ditto
      resource)      CharArrayIStream                   ZlibDecompressorIStream
                                                     or ZlibDecompressorIStreambuf

  where ditto = uncompressed data available via an std::istream or
                std::streambuf interface

So, whether the resource data stored in static arrays by fgrcc is
compressed or not, end-user code can read it in uncompressed form using
an std::istream or std::streambuf interface, which means the resource
never needs to be copied in memory a second time. This is particularly
interesting with compressed resources, because:

  1) The in-memory static data is much smaller in general than the
     uncompressed contents, and it's the only one we really have to
     “pay” for if one uses these stream-based interfaces.

  2) The data is transparently decompressed on demand as the end-user
     code reads from the ZlibDecompressorIStream or
     ZlibDecompressorIStreambuf instance.
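
The second pipeline can be pictured with a toy decoding stage. The
sketch below is only a stand-in built on assumptions: it uses a trivial
run-length encoding instead of zlib, and ArraySource, RleDecodingBuf and
decodeRle() are hypothetical names that do not exist in SimGear. What it
has in common with the pipeline above is the shape: a stream buffer that
pulls encoded bytes from another stream buffer and decodes them only
when the reader asks for more data.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Read-only view of a char array (hypothetical name). The const_cast is
// safe here because no put area is ever set, so nothing can write.
struct ArraySource : std::streambuf {
    ArraySource(const char* data, std::size_t size) {
        char* p = const_cast<char*>(data);
        setg(p, p, p + size);
    }
};

// Toy decoder standing in for a zlib decompressor: the source holds
// (count, value) byte pairs; each pair expands to 'count' copies of
// 'value'. One run is decoded per underflow() call, i.e. on demand.
class RleDecodingBuf : public std::streambuf {
public:
    explicit RleDecodingBuf(std::streambuf& src) : src_(src) {}

protected:
    int_type underflow() override {
        std::size_t n = 0;
        char value = 0;
        while (n == 0) {  // skip runs whose count is zero
            const int_type count = src_.sbumpc();
            const int_type v = src_.sbumpc();
            if (traits_type::eq_int_type(count, traits_type::eof()) ||
                traits_type::eq_int_type(v, traits_type::eof())) {
                return traits_type::eof();  // source exhausted
            }
            n = static_cast<unsigned char>(traits_type::to_char_type(count));
            value = traits_type::to_char_type(v);
        }
        std::fill_n(run_, n, value);
        setg(run_, run_, run_ + n);  // expose the decoded run to readers
        return traits_type::to_int_type(run_[0]);
    }

private:
    std::streambuf& src_;
    char run_[255];  // a run holds at most 255 chars (count is one byte)
};

// Static array -> ArraySource -> RleDecodingBuf -> std::istream
std::string decodeRle(const char* data, std::size_t size) {
    ArraySource source(data, size);
    RleDecodingBuf decoder(source);
    std::istream in(&decoder);
    std::string out;
    char c;
    while (in.get(c)) {
        out += c;
    }
    return out;
}
```

The only working memory the decoder needs is the small run_ buffer; the
encoded data stays where it is and the decoded data is produced piece by
piece as it is consumed.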

In other words, these CharArrayStream classes complement the ones in
zlibstream.cxx and make it easy to implement all kinds of pipelines to
incrementally read or write, and possibly on-the-fly compress or
decompress data from or to in-memory buffers (cf.
writeCompressedDataToBuffer() in
simgear/simgear/embedded_resources/embedded_resources_test.cxx, or
ResourceCodeGenerator::writeEncodedResourceContents() in
flightgear/src/EmbeddedResources/fgrcc.cxx for examples).

Since all of these provide standard IOStreams interfaces, they can be
easily plugged into existing code. For instance, readXML() in
simgear/simgear/xml/easyxml.cxx and readProperties() in
simgear/props/props_io.cxx can incrementally read and parse data from an
std::istream instance, and thus are able to directly read from a
resource containing the compressed version of an XML file.

This incremental processing is of course most interesting with large
resources... which probably won't be used in FlightGear, in order not to
waste RAM[4][5]. The EmbeddedResourceManager also has a getString()
method to simply get an std::string when you don't care about the fact
that this operation, by std::string design, will necessarily make a copy
of the whole resource contents (in uncompressed form in the case of a
compressed resource). This getString() method should be convenient and
quite acceptable for reasonably-sized resources.

Finally, all of these classes---CharArray*Stream*, the classes in
zlibstream.cxx, the EmbeddedResourceManager and related classes---can
handle text and binary data in exactly the same way (std::string doesn't
care, and neither do the other classes).


2) The “embedded resources” system
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The embedded resources system works this way:

(1) The fgrcc resource compiler reads an XML file which has almost the
    same syntax[2] as Qt's .qrc files[3] and writes a .cxx file
    containing:
      - static char arrays initialized with resource contents
        (possibly compressed; this is decided automatically unless
        explicitly specified in the XML file);
      - a function definition containing calls to
        EmbeddedResourceManager::addResource() that register each of
        these resources with the EmbeddedResourceManager instance.

    If you pass the --output-header-file option to fgrcc, it also
    writes a header file that goes with the generated .cxx file. For
    other options, see the output of 'fgrcc --help'.

    It is quite possible to call fgrcc several times, each time with a
    different (XML input file, .cxx/.hxx output files) tuple: for
    instance, one call for resources present in the FlightGear repo,
    and possibly another call for resources in FGData. The point of
    this is that paths in the XML input file should be relative to
    avoid being system-dependent, and fgrcc accepts a --root option to
    indicate what you want them to be relative to, in order to let it
    find the real files. Thus, on a first invocation of fgrcc, one can
    make --root point to a path to the FlightGear repository when
    building, and on the second call use it to indicate a path to the
    FGData repository. Other variations are possible, of course.

    Notes:

      1) The example given here with FGData would *not* freeze the
         FGData location at FG compile time; this is only to allow
         files from FGData to be turned into generated .cxx files
         inside the FG source tree, that will make their contents
         available as embedded resources at runtime.

      2) At the time of this writing, resources from the FlightGear
         repository are compiled at build time, and resources from the
         FGData repository are compiled offline using the
         'rebuild-fgdata-embedded-resources' script[6] (a
         convenience wrapper for fgrcc), before being committed to the
         FlightGear repository.

(2) SimGear contains an EmbeddedResourceManager class with, among
    others, createInstance() and instance() methods similar to the
    ones of NavDataCache. See [7] for the corresponding code.

    FlightGear creates an EmbeddedResourceManager instance at startup
    and calls the various init functions generated by fgrcc, each of
    which registers the resources present in its containing .cxx file
    (using EmbeddedResourceManager::addResource()).
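
The startup sequence described above can be sketched with a toy
registry. Everything in the block below is an assumption made for
illustration: ToyResourceManager, its simplified addResource() and
getString() signatures, the array name and initToyEmbeddedResources()
are hypothetical, and real fgrcc output and the real
EmbeddedResourceManager differ in detail (see the headers referenced in
[7]). The sketch only shows the general shape: static arrays plus an
init function that registers them with a single manager instance.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Toy stand-in for a resource manager with a single global instance.
class ToyResourceManager {
public:
    static ToyResourceManager& createInstance() {
        instance_.reset(new ToyResourceManager);
        return *instance_;
    }
    static ToyResourceManager& instance() {
        if (!instance_) {
            throw std::logic_error("createInstance() was not called");
        }
        return *instance_;
    }

    // Register a resource: only the pointer and size are stored.
    void addResource(const std::string& virtualPath,
                     const char* data, std::size_t size) {
        resources_[virtualPath] = Entry{data, size};
    }

    // Return a copy of the resource contents; throws std::out_of_range
    // if nothing was registered under 'virtualPath'.
    std::string getString(const std::string& virtualPath) const {
        const Entry& e = resources_.at(virtualPath);
        return std::string(e.data, e.size);
    }

private:
    struct Entry {
        const char* data;
        std::size_t size;
    };
    ToyResourceManager() = default;

    static std::unique_ptr<ToyResourceManager> instance_;
    std::map<std::string, Entry> resources_;
};

std::unique_ptr<ToyResourceManager> ToyResourceManager::instance_;

// What a resource compiler could emit, in spirit: a static array...
static const char toyResource1_data[] = {'<', 'a', '/', '>'};

// ...and an init function registering it under a virtual path.
void initToyEmbeddedResources() {
    ToyResourceManager::instance().addResource(
        "/path/to/a.xml", toyResource1_data, sizeof toyResource1_data);
}
```

A program using this toy would call ToyResourceManager::createInstance()
once at startup, then each init function, and only afterwards fetch
resources by virtual path.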

    End-user FG code can then use EmbeddedResourceManager methods such
    as getResource(), getString(), getStreambuf() and getIStream()
    to access resource contents:
      - getResource() returns an
        std::shared_ptr<const AbstractEmbeddedResource>
      - getString() returns an std::string
      - getStreambuf() returns an std::unique_ptr<std::streambuf>
      - getIStream() returns an std::unique_ptr<std::istream>

    AbstractEmbeddedResource is an abstract base class that you can
    think of as a resource descriptor: it points to (not contains!)
    the resource data (which is normally of static storage class), and
    contains and gives access to metadata such as the compression type
    and resource size (compressed and uncompressed).

    AbstractEmbeddedResource currently has two derived concrete
    classes: RawEmbeddedResource for resources stored as-is
    (uncompressed) and ZlibEmbeddedResource for resources compressed by
    fgrcc. It's quite easy to add new subclasses if wanted, e.g. for
    LZMA compression or other things.
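
To make the “descriptor” idea concrete, here is a small sketch under
stated assumptions. ResourceDescriptor and RawDescriptor are
hypothetical names, and their member functions are invented for this
example; they are not the interface of AbstractEmbeddedResource or
RawEmbeddedResource. The point is only that the base class stores a
pointer and a size, and that each subclass decides how the stored bytes
are turned into a readable stream buffer.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <memory>
#include <streambuf>
#include <string>

// Hypothetical descriptor base class: it points to the data and never
// owns or copies it.
class ResourceDescriptor {
public:
    ResourceDescriptor(const char* data, std::size_t size)
        : data_(data), size_(size) {}
    virtual ~ResourceDescriptor() = default;

    // Metadata: how the stored bytes are encoded
    virtual std::string compressionDescr() const = 0;
    // Stream buffer yielding the decoded contents
    virtual std::unique_ptr<std::streambuf> streambuf() const = 0;

    // Convenience accessor; unlike streambuf(), this copies everything.
    std::string str() const {
        const auto sb = streambuf();
        std::istream in(sb.get());
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }

protected:
    const char* data_;
    std::size_t size_;
};

// Descriptor for bytes stored as is: decoding is the identity.
class RawDescriptor : public ResourceDescriptor {
public:
    using ResourceDescriptor::ResourceDescriptor;

    std::string compressionDescr() const override { return "none"; }

    std::unique_ptr<std::streambuf> streambuf() const override {
        return std::unique_ptr<std::streambuf>(new ArrayBuf(data_, size_));
    }

private:
    // Read-only view of the array; no put area is set, so the
    // const_cast never leads to a write.
    struct ArrayBuf : std::streambuf {
        ArrayBuf(const char* data, std::size_t size) {
            char* p = const_cast<char*>(data);
            setg(p, p, p + size);
        }
    };
};
```

In this sketch, a subclass for a compressed format would override
streambuf() to return a decoding stream buffer layered on top of the
array view, and nothing else would need to change for callers.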

    Resource fetching requires two things:

      - an std::string key (fgrcc manipulates them with SGPath, but the
        EmbeddedResourceManager code in SimGear is so far completely
        agnostic of the kind of data stored in keys; this could be
        changed, though, if we wanted for example to be able to query
        at runtime all available resources in a given “virtual
        directory”);

      - a “locale” name, similar to what FlightGear's XML translation
        files and FGLocale use. We used double quotes here, because
        fgrcc and the EmbeddedResourceManager expect “locale” names to
        be of one of these forms:
          * empty string: default locale, typically but not necessarily
            English (it is “engineering English” in FlightGear, i.e.,
            English written by programmers in the code, before
            translators possibly fix it up :)
          * en, fr, de, es, it...
          * en_GB, en_US, fr_FR, fr_CA, de_DE, de_CH, it_IT...

        There is no encoding part, contrary to POSIX locales, hence the
        use of double quotes around the term “locale” in this context.

    The FGLocale::getPreferredLanguage() method returns the preferred
    “locale” in the form described above, according to user choice
    (from fgfs' --language option) and/or settings (system locale).
    This allows FG to tell the EmbeddedResourceManager the preferred
    “locale” for resource fetching (same syntax as in Qt's rcc tool for
    declaration in the XML file, using the 'lang' attribute on
    'qresource' elements).

    [ Regarding the default locale, the way things are currently set
      up, I would use no 'lang' attribute for resources suitable for
      English in the XML input file for fgrcc, except when a
      country-specific variant is desired (en_GB, en_US, en_AU...). In
      such a case, there should also be a generic variant with no
      'lang' attribute declared for the same resource virtual path.
      This matches what I did for FGLocale::getPreferredLanguage(),
      that maps unset locales and locales such as C and C.UTF-8 to the
      default locale for the EmbeddedResourceManager, which is the
      empty string. This is a matter of policy, of course, and could be
      changed if desired. ]

    The EmbeddedResourceManager class has getLocale() and
    selectLocale() methods to manage the _selected locale_. Each
    resource-fetching method of this class (getResourceOrNullPtr(),
    getResource(), getString(), getStreambuf() and getIStream()) has
    two overloads:
      - one taking only a virtual path (the key mentioned above);
      - one taking a virtual path and a “locale” name.

    (we'll write “locale” without enclosing double quotes from now on,
    otherwise it gets too painful to read; but we're *not* talking
    about POSIX-style locales ending with an encoding part)

    The first kind of overload uses the selected locale to look up the
    resource, whereas the second kind uses the explicitly specified
    locale. Then resource lookup behaves as one could expect. For
    instance, assuming a resource is looked up in the "fr_FR" locale,
    then the EmbeddedResourceManager tries in this order:
      - "fr_FR";
      - if no resource has been registered for "fr_FR" with the provided
        virtual path, it then tries with the "fr" locale;
      - if this is also unsuccessful, it finally tries with the default
        locale: "";
      - if this third attempt fails, the resource-fetching method
        throws an sg_exception, except for getResourceOrNullPtr(),
        which returns a null
        std::shared_ptr<const AbstractEmbeddedResource> instead.
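
The lookup order above can be sketched as follows. This is not the
EmbeddedResourceManager code: localeCandidates(), Registry,
registerResource() and lookup() are hypothetical names, and the registry
is a plain std::map keyed by (virtual path, locale) chosen only to keep
the example self-contained. A null pointer plays the role of “not
found”, as getResourceOrNullPtr() does.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Locales to try, most specific first: "fr_FR" -> "fr_FR", "fr", "".
std::vector<std::string> localeCandidates(const std::string& locale) {
    std::vector<std::string> result;
    if (!locale.empty()) {
        result.push_back(locale);
        const auto pos = locale.find('_');
        if (pos != std::string::npos) {
            result.push_back(locale.substr(0, pos));  // language only
        }
    }
    result.push_back("");  // default locale, always tried last
    return result;
}

// (virtual path, locale) -> resource contents
using Registry = std::map<std::pair<std::string, std::string>, std::string>;

void registerResource(Registry& reg, const std::string& virtualPath,
                      const std::string& locale,
                      const std::string& contents) {
    reg[std::make_pair(virtualPath, locale)] = contents;
}

// Return the first match in fallback order, or nullptr if none exists.
const std::string* lookup(const Registry& reg,
                          const std::string& virtualPath,
                          const std::string& locale) {
    for (const auto& candidate : localeCandidates(locale)) {
        const auto it = reg.find(std::make_pair(virtualPath, candidate));
        if (it != reg.end()) {
            return &it->second;
        }
    }
    return nullptr;
}
```

With a resource registered for "fr" and for the default locale only, a
lookup in "fr_FR" yields the "fr" variant, while a lookup in "de_DE"
falls back to the default one.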

    To see how this is used, you can look at
    simgear/simgear/embedded_resources/embedded_resources_test.cxx. The
    only difference from real use is that in this file, resource
    contents and registering calls with the EmbeddedResourceManager
    have been written manually instead of by fgrcc. Apart from
    embedded_resources_test.cxx, here are two examples of client usage
    of the EmbeddedResourceManager:

    (a) With EmbeddedResourceManager::getString():

          #include <simgear/embedded_resources/EmbeddedResourceManager.hxx>
          #include <simgear/debug/logstream.hxx>

          [...]

          const auto& resMgr = simgear::EmbeddedResourceManager::instance();
          SG_LOG(SG_GENERAL, SG_INFO,
                 "Resource contents: '" <<
                 resMgr->getString("/virtual/path/to/resource") << "'");

    (b) With EmbeddedResourceManager::getIStream():

          #include <cstddef>            // std::size_t
          #include <memory>             // std::unique_ptr
          #include <simgear/io/iostreams/sgstream.hxx>
          #include <simgear/embedded_resources/EmbeddedResourceManager.hxx>

          [...]

          sg_ofstream outFile(SGPath("/tmp/whatever"));
          if (!outFile) {
            <handle open error>
          }

          const auto& resMgr = simgear::EmbeddedResourceManager::instance();
          auto resStream = resMgr->getIStream("/virtual/path/to/resource");
          // One possible way of handling errors from resStream[8]:
          // resStream->exceptions(std::ios_base::badbit);

          constexpr std::size_t bufSize = 4096;
          std::unique_ptr<char[]> buf(new char[bufSize]); // intermediate buffer

          do {
            resStream->read(buf.get(), bufSize);
            outFile.write(buf.get(), resStream->gcount());
          } while (*resStream && outFile); // resStream *points* to an std::istream

          <handle possible errors that might have caused the loop to stop
           prematurely>


3) About the XML resource declaration files
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may want to read the output of 'fgrcc --help', which explains a few
things, in particular how to write an XML resource declaration file that
fgrcc can use. At the time of this writing, such files are already
present as flightgear/src/EmbeddedResources/FlightGear-resources.xml and
flightgear/src/EmbeddedResources/FGData-resources.xml in the FlightGear
repository. In case you need resources from elsewhere, it's easy to add
other XML resource declaration files:

  1) If you want the .cxx/.hxx resource files to be automatically
     generated as part of the FlightGear build:

     Copy and adapt the add_custom_command() call in
     flightgear/src/Main/CMakeLists.txt[9] that invokes fgrcc on
     flightgear/src/EmbeddedResources/FlightGear-resources.xml.

  2) In flightgear/src/Main/CMakeLists.txt, add paths for your new
     fgrcc-generated .cxx and .hxx files to the SOURCES and HEADERS
     CMake variables for the 'fgfs' target.

  3) Assuming you passed for instance
     --init-func-name=initFoobarEmbeddedResources in step 1, add a call
     to initFoobarEmbeddedResources() after this code in fgMainInit()
     (flightgear/src/Main/main.cxx):

       simgear::EmbeddedResourceManager::createInstance();
       initFlightGearEmbeddedResources();


4) The ResourceProxy class
   ~~~~~~~~~~~~~~~~~~~~~~~

SimGear contains a ResourceProxy class that allows one to access real
files or embedded resources in a unified way. When using it, one can
switch from one data source to the other with minimal code changes,
possibly even at runtime (in which case there is obviously no code
change at all).

Sample usage (from FlightGear):

  simgear::ResourceProxy proxy(globals->get_fg_root(), "/FGData");
  proxy.setUseEmbeddedResources(false); // can also be set via the constructor

  std::string s = proxy.getString("/some/path");
  std::unique_ptr<std::istream> streamp = proxy.getIStream("/some/path");

This example would retrieve contents from the real file
$FG_ROOT/some/path. If true had been passed in the
proxy.setUseEmbeddedResources() call, it would instead have used the
default-locale version of the embedded resource whose virtual path is
/FGData/some/path.
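
The path mapping just described can be pictured with a toy class. This
is a sketch under assumptions, not the ResourceProxy implementation:
ToyProxy and its EmbeddedMap are hypothetical, the embedded side is a
plain std::map instead of the EmbeddedResourceManager, and error
handling is reduced to throwing std::runtime_error. It only shows how
one relative path can be resolved against either a real root directory
or a virtual root.

```cpp
#include <fstream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

class ToyProxy {
public:
    // virtual path -> contents (stand-in for embedded resources)
    using EmbeddedMap = std::map<std::string, std::string>;

    ToyProxy(std::string realRoot, std::string virtualRoot,
             EmbeddedMap embedded, bool useEmbedded = false)
        : realRoot_(std::move(realRoot)),
          virtualRoot_(std::move(virtualRoot)),
          embedded_(std::move(embedded)),
          useEmbedded_(useEmbedded) {}

    void setUseEmbeddedResources(bool use) { useEmbedded_ = use; }

    // 'path' starts with '/' and is appended to the selected root.
    std::string getString(const std::string& path) const {
        if (useEmbedded_) {
            const auto it = embedded_.find(virtualRoot_ + path);
            if (it == embedded_.end()) {
                throw std::runtime_error("no embedded resource at " +
                                         virtualRoot_ + path);
            }
            return it->second;
        }

        std::ifstream in(realRoot_ + path, std::ios::binary);
        if (!in) {
            throw std::runtime_error("cannot open " + realRoot_ + path);
        }
        std::ostringstream contents;
        contents << in.rdbuf();  // read the whole file
        return contents.str();
    }

private:
    std::string realRoot_;
    std::string virtualRoot_;
    EmbeddedMap embedded_;
    bool useEmbedded_;
};
```

Calling code only ever passes "/some/path"; which root gets prepended is
decided by the flag, which is what makes switching data sources possible
without touching the call sites.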

For more information about this class, see [10] and [11].


Footnotes
=========

[1] E.g., FlightGear or FGData, as long as the path to the latter is
    provided to the FG build system, which is currently possible but not
    required (passing '-D FG_DATA_DIR:PATH=...' to CMake when
    configuring the FlightGear build).

[2] The differences with the QRC format[3] are explained in the output
    of 'fgrcc --help'. Here is the relevant excerpt:

    ,----
    | 1. The <!DOCTYPE RCC> declaration at the beginning should be omitted (or
    |    replaced with <!DOCTYPE FGRCC>, however such a DTD currently doesn't
    |    exist). I suggest to add an XML declaration instead, for instance:
    |
    |      <?xml version="1.0" encoding="UTF-8"?>
    |
    | 2. <RCC> and </RCC> must be replaced with <FGRCC> and </FGRCC>,
    |    respectively.
    |
    | 3. The FGRCC format supports a 'compression' attribute for each 'file'
    |    element. At the time of this writing, the allowed values for this
    |    attribute are 'none', 'zlib' and 'auto'. When set to a value that is
    |    not 'auto', this attribute of course bypasses the algorithm for
    |    determining whether and how to compress a given resource (algorithm
    |    which relies on the file extension).
    |
    | 4. Resource paths (paths to the real files, not virtual paths) are
    |    interpreted relatively to the directory specified with the --root
    |    option. If this option is not passed to 'fgrcc', then the default root
    |    directory is the one containing INFILE, which matches the behavior of
    |    Qt's 'rcc' tool.
    `----

[3] http://doc.qt.io/qt-5/resources.html

[4] The main reason why I wrote the classes in
    simgear/simgear/io/iostreams/{CharArrayStream,zlibstream}.cxx is
    thus not to maximize memory-efficiency with very large resources;
    rather, it is to make the implementation of the following parts
    simple, clean and modular:
      - the resource compiler (fgrcc);
      - the EmbeddedResourceManager.

[5] The EmbeddedResourceManager architecture would make it quite easy to
    also support runtime loading of resources from files (a thing the Qt
    resource system supports), but it is not very clear how interesting
    this would be, compared to having the files loaded from $FG_ROOT.
    Well, maybe for large files [apt.dat.gz & Co] that we would want to
    load but not see in the FGData repository at all. But then there
    would be the requirement, of course, that “something” puts the files
    in a clearly-defined, platform-dependent location known to the
    EmbeddedResourceManager.

[6] https://sourceforge.net/p/flightgear/fgmeta/ci/next/tree/python3-flightgear/rebuild-fgdata-embedded-resources

[7] https://sourceforge.net/p/flightgear/simgear/ci/next/tree/simgear/embedded_resources/

[8] We know that in some buggy C++ implementations, the
    std::ios_base::failure exception can't be caught, at least not under
    its name, due to some ABI compatibility mess:

      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66145

    However, it still causes the program to abort, and since this
    error handling technique makes for much more readable and less
    error-prone code, I think it's still a good way to handle IOStreams
    errors even now, unless you really need to *catch* the
    std::ios_base::failure exception.

[9] flightgear/CMakeModules/GenerateFlightgearResources.cmake in my
    'i18n-and-init-work-v2-rebased' branch (not merged into 'next' at
    the time of this writing).

[10] https://sourceforge.net/p/flightgear/simgear/ci/next/tree/simgear/embedded_resources/ResourceProxy.hxx

[11] https://sourceforge.net/p/flightgear/simgear/ci/next/tree/simgear/embedded_resources/embedded_resources_test.cxx