diff --git a/Docs/README.embedded-resources b/Docs/README.embedded-resources new file mode 100644 index 000000000..42ece9653 --- /dev/null +++ b/Docs/README.embedded-resources @@ -0,0 +1,474 @@ +-*- coding: utf-8; fill-column: 72; -*- + +The Embedded Resources System +============================= + +This document gives an overview of FlightGear's embedded resources +system and related classes. For specific information on the C++ +functions, the reference documentation is in the corresponding header +files. + + +Contents +-------- + +1. The CharArrayStream and ZlibStream classes +2. The “embedded resources” system +3. About the XML resource declaration files +4. The ResourceProxy class + + +Introduction +------------ + +The embedded resources system allows FlightGear to use data from files +without relying on FG_ROOT to be set. This can be used, for instance, to +grab the contents of XML files at FG build time, from any repository[1], +and use said contents in the C++ code. The term “embedded” is used to +avoid confusion with the ResourceProvider and ResourceManager classes +provided by SimGear, which have nothing to do with the system described +here. + +The embedded resources system relies on classes present in +simgear/io/iostreams/{zlibstream.cxx,CharArrayStream.cxx}, which were +implemented as a way to address a concern that embedding a few XML files +in the fgfs binary could use precious memory. The resource compiler +(fgrcc) compresses resources before writing them in C++ form---except +for some extensions, and it's configurable on a per-resource basis +anyway. Then, the EmbeddedResourceManager instance, which lives in the +fgfs process, can decompress them on-the-fly, incrementally, +transparently. So, there is really no reason to worry about memory +consumption, even for several dozens of XML files. + +fgrcc is the resource compiler: it turns arbitrary files into C++ code +the EmbeddedResourceManager can make use of, in order to “serve” the +files' contents at runtime. It is named this way, because it fulfills +the same role as Qt's rcc tool. It supports a thin superset of the +XML-based format used by rcc for declaring resources[2][3]. +'fgrcc --help' gives a lot of info. + + +1) The CharArrayStream and ZlibStream classes + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The CharArrayStream* files in simgear/io/iostreams/ implement +CharArrayStreambuf and related IOStreams classes for working with char +arrays, namely: + - CharArrayStreambuf subclass of std::streambuf stream buffer + - ROCharArrayStreambuf subclass of CharArrayStreambuf stream buffer + - CharArrayIStream subclass of std::istream input stream + - CharArrayOStream subclass of std::ostream output stream + - CharArrayIOStream subclass of std::iostream input/output stream + +(in the 'simgear' namespace, of course) + +CharArrayStreambuf is a stream buffer class allowing to read from, and +write to char arrays (std::strstream has been deprecated since C++98). +Contrary to std::strstream, this class does no dynamic allocation: it is +very simple, strictly staying for both reads and writes within the +bounds of the buffer specified in its constructor. Contrary to +std::stringstream, CharArrayStreambuf allows one to work on an array of +char (that could be static data, on the stack, whatever) without having +to make a whole copy of it. + +ROCharArrayStreambuf is a read-only subclass of CharArrayStreambuf +(useful for const-correctness). CharArrayIStream, CharArrayOStream and +CharArrayIOStream are very simple convenience stream classes using +either CharArrayStreambuf or ROCharArrayStreambuf as their associated +stream buffer class. + +While these classes can be of general-purpose usefulness, the particular +reason they have been written for is to make the embedded resources +system clean and memory-friendly. Concretely, this system supports both +compressed and uncompressed resources, all of which can be read from +their respective static arrays like this (think pipelines): + +static char array +(uncompressed ---------------> data available via an std::istream + resource) CharArrayIStream or std::streambuf interface + or ROCharArrayStreambuf + +static char array +(compressed ---------------> compressed data -------------------> ditto + resource) CharArrayIStream ZlibDecompressorIStream + or ZlibDecompressorIStreambuf + +where ditto = uncompressed data available via an std::istream or + std::streambuf interface + +So, whether the resource data stored in static arrays by fgrcc is +compressed or not, end-user code can read it in uncompressed form using +an std::istream or std::streambuf interface, which means the resource +never needs to be copied in memory a second time. This is particularly +interesting with compressed resources, because: + + 1) The in-memory static data is much smaller in general than the + uncompressed contents, and it's the only one we really have to + “pay” for if one uses these stream-based interfaces. + + 2) The data is transparently decompressed on-demand as the end-user + code reads from the ZlibDecompressorIStream or + ZlibDecompressorIStreambuf instance. + +In other words, these CharArrayStream classes complement the ones in +zlibstream.cxx and make it easy to implement all kinds of pipelines to +incrementally read or write, and possibly on-the-fly compress or +decompress data from or to in-memory buffers (cf. +writeCompressedDataToBuffer() in +simgear/simgear/embedded_resources/embedded_resources_test.cxx, or +ResourceCodeGenerator::writeEncodedResourceContents() in +flightgear/src/EmbeddedResources/fgrcc.cxx for examples). + +Since all of these provide standard IOStreams interfaces, they can be +easily plugged into existing code. For instance, readXML() in +simgear/simgear/xml/easyxml.cxx and readProperties() in +simgear/props/props_io.cxx can incrementally read and parse data from an +std::istream instance, and thus are able to directly read from a +resource containing the compressed version of an XML file. + +This incremental stuff is of course really interesting with large +resources... which probably won't be used in FlightGear, in order not to +waste RAM[4][5]. The EmbeddedResourceManager also has a getString() +method to simply get an std::string when you don't care about the fact +that this operation, by std::string design, will necessarily make a copy +of the whole resource contents (in uncompressed form in the case of a +compressed resource). This getString() method should be convenient and +quite acceptable for reasonably-sized resources. + +Finally, all of these classes---CharArray*Stream*, the classes in +zlibstream.cxx, the EmbeddedResourceManager and related classes---can +handle text and binary data in exactly the same way (std::string doesn't +care, and neither do the other classes). + + +2) The “embedded resources” system + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The embedded resources system works this way: + + (1) The fgrcc resource compiler reads an XML file which has almost the + same syntax[2] as Qt's .qrc files[3] and writes a .cxx file + containing: + - static char arrays initialized with resource contents + (possibly compressed, this is automatic unless explicitly + specified in the XML file); + - a function definition containing calls to + EmbeddedResourceManager::addResource() that register each of + these resources with the EmbeddedResourceManager instance. + + If you pass the --output-header-file option to fgrcc, it also + writes a header file that goes with the generated .cxx file. For + other options, see the output of 'fgrcc --help'. + + It is quite possible to call fgrcc several times, each time with a + different (XML input file, .cxx/.hxx output files) tuple: for + instance, one call for resources present in the FlightGear repo, + and possibly another call for resources in FGData. The point of + this is that paths in the XML input file should be relative to + avoid being system-dependent, and fgrcc accepts a --root option to + indicate what you want them to be relative to, in order to let it + find the real files. Thus, on a first invocation of fgrcc, one can + make --root point to a path to the FlightGear repository when + building, and on the second call use it to indicate a path to the + FGData repository. Other variations are possible, of course. + + Notes: + + 1) The example given here with FGData would *not* freeze the + FGData location at FG compile time; this is only to allow + files from FGData to be turned into generated .cxx files + inside the FG source tree, that will make their contents + available as embedded resources at runtime. + + 2) At the time of this writing, resources from the FlightGear + repository are compiled at build time, and resources from the + FGData repository are compiled offline using the + 'rebuild-fgdata-embedded-resources' script[6] (a + convenience wrapper for fgrcc), before being committed to the + FlightGear repository. + + (2) SimGear contains an EmbeddedResourceManager class with, among + others, createInstance() and instance() methods similar to the + ones of NavDataCache. See [7] for the corresponding code. + + FlightGear creates an EmbeddedResourceManager instance at startup + and calls the various init functions generated by fgrcc, each of + which registers the resources present in its containing .cxx file + (using EmbeddedResourceManager::addResource()). + + End-user FG code can then use EmbeddedResourceManager methods such + as getResource(), getString(), getStreambuf() and getIStream() + to access resource contents: + - getResource() returns an + std::shared_ptr + - getString() returns an std::string + - getStreambuf() returns an std::unique_ptr + - getIStream() returns an std::unique_ptr + + AbstractEmbeddedResource is an abstract base class that you can + think of as a resource descriptor: it points to (not contains!) + the resource data (which is normally of static storage class), and + contains + gives access to metadata such as the compression type + and resource size (compressed and uncompressed). + + AbstractEmbeddedResource currently has two derived concrete + classes: RawEmbeddedResource for resources stored as-is + (uncompressed) and ZlibEmbeddedResource for resources compressed by + fgrcc. It's quite easy to add new subclasses if wanted, e.g. for + LZMA compression or other things. + + Resource fetching requires two things: + + - an std::string key (fgrcc manipulates them with SGPath, but the + EmbeddedResourceManager code in SimGear is so far completely + agnostic of the kind of data stored in keys; this could be + changed, though, if we wanted for example to be able to query + at runtime all available resources in a given “virtual + directory”); + + - a “locale” name, similar to what FlightGear's XML translation + files and FGLocale use. We used double quotes here, because + fgrcc and the EmbeddedResourceManager expect “locale” names to + be of one of these forms: + * empty string: default locale, typically but not necessarily + English (it is “engineering English” in FlightGear, i.e., + English written by programmers in the code, before + translators possibly fix it up :) + * en, fr, de, es, it... + * en_GB, en_US, fr_FR, fr_CA, de_DE, de_CH, it_IT... + + There is no encoding part, contrary to POSIX locales, hence the + use of double quotes around the term “locale” in this context. + + The FGLocale::getPreferredLanguage() method returns the preferred + “locale” in the form described above, according to user choice + (from fgfs' --language option) and/or settings (system locale). + This allows FG to tell the EmbeddedResourceManager the preferred + “locale” for resource fetching (same syntax as in Qt's rcc tool for + declaration in the XML file, using the 'lang' attribute on + 'qresource' elements). + + [ Regarding the default locale, the way things are currently set + up, I would use no 'lang' attribute for resources suitable for + English in the XML input file for fgrcc, except when a + country-specific variant is desired (en_GB, en_US, en_AU...). In + such a case, there should also be a generic variant with no + 'lang' attribute declared for the same resource virtual path. + This matches what I did for FGLocale::getPreferredLanguage(), + that maps unset locales and locales such as C and C.UTF-8 to the + default locale for the EmbeddedResourceManager, which is the + empty string. This is a matter of policy, of course, and could be + changed if desired. ] + + The EmbeddedResourceManager class has getLocale() and + selectLocale() methods to manage the _selected locale_. Each + resource-fetching method of this class (getResourceOrNullPtr(), + getResource(), getString(), getStreambuf() and getIStream()) has + two overloads: + - one taking only a virtual path (the key mentioned above); + - one taking a virtual path and a “locale” name. + + (we'll write “locale” without enclosing double-quotes from now on, + otherwise it gets too painful to read; but we're *not* talking + about POSIX-style locales ending with an encoding part) + + The first kind of overload uses the selected locale to look up the + resource, whereas the second kind uses the explicitly specified + locale. Then resource lookup behaves as one could expect. For + instance, assuming a resource is looked up for in the "fr_FR" + locale, then the EmbeddedResourceManager tries in this order: + - "fr_FR"; + - if no resource has been registered for "fr_FR" with the provided + virtual path, it then tries with the "fr" locale; + - if this is also unsuccessful, it finally tries with the default + locale: ""; + - if this third attempt fails, the resource-fetching method + throws an sg_exception, except for getResourceOrNullPtr(), + which returns a null + std::shared_ptr instead. + + To see how this is used, you can look at + simgear/simgear/embedded_resources/embedded_resources_test.cxx. The + only difference with real use is that in this file, resource + contents and registering calls with the EmbeddedResourceManager + have been written manually instead of by fgrcc. Apart from + embedded_resources_test.cxx, here are two examples of client usage + of the EmbeddedResourceManager: + + (a) With EmbeddedResourceManager::getString(): + + #include + #include + + [...] + + const auto& resMgr = simgear::EmbeddedResourceManager::instance(); + SG_LOG(SG_GENERAL, SG_INFO, + "Resource contents: '" << + resMgr->getString("/virtual/path/to/resource") << "'"); + + (b) With EmbeddedResourceManager::getIStream(): + + #include // std::size_t + #include + #include + + [...] + + sg_ofstream outFile(SGPath("/tmp/whatever")); + if (!outFile) { + + } + + const auto& resMgr = simgear::EmbeddedResourceManager::instance(); + auto resStream = resMgr->getIStream("/virtual/path/to/resource"); + // One possible way of handling errors from resStream[8]: + // resStream->exceptions(std::ios_base::badbit); + + constexpr std::size_t bufSize = 4096; + std::unique_ptr buf(new char[bufSize]); // intermediate buffer + + do { + resStream->read(buf.get(), bufSize); + outFile.write(buf.get(), resStream->gcount()); + } while (*resStream && outFile); // resStream *points* to an std::istream + + + + +3) About the XML resource declaration files + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You may want to read the output of 'fgrcc --help', which explains a few +things, in particular how to write an XML resource declaration file that +fgrcc can use. At the time of this writing, such files are already +present as flightgear/src/EmbeddedResources/FlightGear-resources.xml and +flightgear/src/EmbeddedResources/FGData-resources.xml in the FlightGear +repository. In case you need resources from elsewhere, it's easy to add +other XML resource declaration files: + + 1) If you want the .cxx/.hxx resource files to be automatically + generated as part of the FlightGear build: + + Copy and adapt the add_custom_command() call in + flightgear/src/Main/CMakeLists.txt[9] that invokes fgrcc on + flightgear/src/EmbeddedResources/FlightGear-resources.xml. + + 2) In flightgear/src/Main/CMakeLists.txt, add paths for your new + fgrcc-generated .cxx and .hxx files to the SOURCES and HEADERS + CMake variables for the 'fgfs' target. + + 3) Assuming you passed for instance + --init-func-name=initFoobarEmbeddedResources in step 1, add a call + to initFoobarEmbeddedResources() after this code in fgMainInit() + (flightgear/src/Main/main.cxx): + + simgear::EmbeddedResourceManager::createInstance(); + initFlightGearEmbeddedResources(); + + +4) The ResourceProxy class + ~~~~~~~~~~~~~~~~~~~~~~~ + +SimGear contains a ResourceProxy class that allows one to access real +files or embedded resources in a unified way. When using it, one can +switch from one data source to the other with minimal code changes, +possibly even at runtime (in which case there is obviously no code +change at all). + +Sample usage (from FlightGear): + + simgear::ResourceProxy proxy(globals->get_fg_root(), "/FGData"); + proxy.setUseEmbeddedResources(false); // can also be set via the constructor + + std::string s = proxy.getString("/some/path"); + std::unique_ptr streamp = proxy.getIStream("/some/path"); + +This example would retrieve contents from the real file +$FG_ROOT/some/path. If true had been passed in the +proxy.setUseEmbeddedResources() call, it would instead have used the +default-locale version of the embedded resource whose virtual path is +/FGData/some/path. + +For more information about this class, see [10] and [11]. + + +Footnotes +========= + +[1] E.g., FlightGear or FGData, as long as the path to the latter is + provided to the FG build system, which is currently possible but not + required (passing '-D FG_DATA_DIR:PATH=...' to CMake when + configuring the FlightGear build). + +[2] The differences with the QRC format[3] are explained in the output + of 'fgrcc --help'. Here is the relevant excerpt: + +,---- +| 1. The declaration at the beginning should be omitted (or +| replaced with , however such a DTD currently doesn't +| exist). I suggest to add an XML declaration instead, for instance: +| +| +| +| 2. and must be replaced with and , +| respectively. +| +| 3. The FGRCC format supports a 'compression' attribute for each 'file' +| element. At the time of this writing, the allowed values for this +| attribute are 'none', 'zlib' and 'auto'. When set to a value that is +| not 'auto', this attribute of course bypasses the algorithm for +| determining whether and how to compress a given resource (algorithm +| which relies on the file extension). +| +| 4. Resource paths (paths to the real files, not virtual paths) are +| interpreted relatively to the directory specified with the --root +| option. If this option is not passed to 'fgrcc', then the default root +| directory is the one containing INFILE, which matches the behavior of +| Qt's 'rcc' tool. +`---- + +[3] http://doc.qt.io/qt-5/resources.html + +[4] The main reason why I wrote the classes in + simgear/simgear/io/iostreams/{CharArrayStream,zlibstream}.cxx is + thus not to maximize memory-efficiency with very large resources; + rather, it is to make the implementation of the following parts + simple, clean and modular: + - the resource compiler (fgrcc); + - the EmbeddedResourceManager. + +[5] The EmbeddedResourceManager architecture would make it quite easy to + also support runtime loading of resources from files (a thing the Qt + resource system supports), but it is not very clear how interesting + this would be, compared to having the files loaded from $FG_ROOT. + Well, maybe for large files [apt.dat.gz & Co] that we would want to + load but not see in the FGData repository at all. But then there + would be the requirement, of course, that “something” puts the files + in a clearly-defined, platform-dependent location known to the + EmbeddedResourceManager. + +[6] https://sourceforge.net/p/flightgear/fgmeta/ci/next/tree/python3-flightgear/rebuild-fgdata-embedded-resources + +[7] https://sourceforge.net/p/flightgear/simgear/ci/next/tree/simgear/embedded_resources/ + +[8] We know that in some buggy C++ implementations, the + std::ios_base::failure exception can't be caught, at least not under + its name, due to some ABI compatibility mess: + + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66145 + + However, it stills causes the program to abort, and since this + error handling technique makes for much more readable and less + error-prone code, I think it's still a good way to handle IOStreams + errors even now, unless you really need to *catch* the + std::ios_base::failure exception. + +[9] flightgear/CMakeModules/GenerateFlightgearResources.cmake in my + 'i18n-and-init-work-v2-rebased' branch (not merged into 'next' at + the time of this writing). + +[10] https://sourceforge.net/p/flightgear/simgear/ci/next/tree/simgear/embedded_resources/ResourceProxy.hxx + +[11] https://sourceforge.net/p/flightgear/simgear/ci/next/tree/simgear/embedded_resources/embedded_resources_test.cxx