
65 commits

Author SHA1 Message Date
Scott Giese
0058ef27c5 TerraSync: Rate Limiter
SourceForge seems to have an aggressive API rate limiter set at approximately 50 requests/minute.
2021-01-31 21:58:41 -06:00
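A minimal sketch of client-side throttling in the spirit of commit 0058ef27c5, assuming a sliding window of 50 requests per 60 seconds. The RateLimiter class, its parameters and the injectable clock/sleep functions are hypothetical; this is not the code from the commit.

```python
import time


class RateLimiter:
    """Allow at most max_requests calls to wait() per period (sliding window).

    Hypothetical helper for illustration; not the class used by terrasync.py.
    """

    def __init__(self, max_requests=50, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.timestamps = []          # times of the recently recorded requests

    def _forgetOldRequests(self, now):
        # Keep only the requests that are still inside the sliding window.
        self.timestamps = [t for t in self.timestamps
                           if now - t < self.period]

    def wait(self):
        """Block until another request may be sent, then record it."""
        now = self.clock()
        self._forgetOldRequests(now)
        if len(self.timestamps) >= self.max_requests:
            # Sleep until the oldest recorded request leaves the window.
            self.sleep(self.period - (now - self.timestamps[0]))
            now = self.clock()
            self._forgetOldRequests(now)
        self.timestamps.append(now)
```

Calling wait() right before each HTTP request would keep the client under the assumed limit. The clock and sleep parameters exist only so that the behavior can be tested without real delays.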
Torsten Dreyer
7621569834 terrasync.py: add option to use basic authentication 2021-01-31 14:25:06 +01:00
Florent Rougon
e342a0f41e terrasync.py: add support for the HTTPS protocol
- Instantiate an HTTPSConnection object when the URL scheme is 'https'.

- Clarify and simplify the initialization of HTTPSocketRequest and
  HTTPDownloadRequest:
    + clarify initialization of their 'callback' attribute (it's the
      method of the same name; make it clear that the base class
      constructor, namely HTTPGetCallback.__init__(), doesn't modify the
      'callback' attribute when an object of class HTTPSocketRequest or
      HTTPDownloadRequest is initialized);
    + HTTPDownloadRequest doesn't need access to the TerraSync object
      -> remove the corresponding instance attribute and constructor
         argument.

- Don't use super() when initializing HTTPDownloadRequest objects
  (see [1]).

[1] https://fuhm.net/super-harmful/
2020-12-15 15:40:40 +01:00
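The scheme-based choice described in commit e342a0f41e can be sketched as follows. The function name makeConnection is hypothetical; only http.client.HTTPConnection, http.client.HTTPSConnection and urllib.parse.urlsplit are real standard-library names, and the actual terrasync.py code may be organized differently.

```python
import http.client
import urllib.parse


def makeConnection(url):
    """Return an unconnected HTTP(S) connection object for 'url'.

    Hypothetical helper for illustration; not taken from terrasync.py.
    """
    parts = urllib.parse.urlsplit(url)
    if parts.scheme == "https":
        cls = http.client.HTTPSConnection
    elif parts.scheme == "http":
        cls = http.client.HTTPConnection
    else:
        raise ValueError("unsupported URL scheme: {!r}".format(parts.scheme))
    # parts.port is None when the URL has no explicit port; the connection
    # class then falls back to its default port (80 or 443).
    return cls(parts.hostname, parts.port)
```

No network traffic happens at construction time; the socket is only opened when connect() or request() is called on the returned object.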
Florent Rougon
c5e45f2b49 terrasync.py: two improvements
- Refuse to recursively delete a directory that does not contain a
  .dirindex file. This will protect users against data loss in case they
  inadvertently use the --remove-orphan option with the wrong target
  directory.

- Correctly handle the case where we have a file on disk that is now
  listed as a directory on the server: remove the file if we are in
  'sync' mode, so that the directory can be created and sync'ed from the
  server.
2020-10-04 14:02:13 +02:00
Scott Giese
3f2ee2de04 Use python3 default implementation 2020-10-03 10:28:46 -05:00
Florent Rougon
8009f46a51 terrasync.py: improve code readability 2020-10-03 14:55:25 +02:00
Florent Rougon
692ab6835f terrasync.py: more thorough checking of .dirindex contents
- only accept ASCII-encoded .dirindex files (this is guaranteed to work
  fine "everywhere");

- reject .dirindex files with a 'path' entry that contains a backslash
  or starts with a slash;

- reject .dirindex files with a 'path' entry that contains a '..'
  component;

- reject .dirindex files with an 'f', 'd' or 't' entry whose name field
  contains a slash or a backslash;

- reject .dirindex files with an 'f', 'd' or 't' entry whose name field
  is '..';

- add comment lines (starting with '#') in the sample good .dirindex
  file used by unit tests.
2020-10-03 14:18:29 +02:00
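The checks listed in commit 692ab6835f could look roughly like the sketch below. It assumes that each .dirindex entry is a colon-separated line whose first field is the entry type and whose second field is the path or name; the function names are hypothetical and the real parser lives in the DirIndex module.

```python
def checkDirIndexLines(lines):
    """Raise ValueError on the first entry rejected by the rules above.

    Assumption: entries look like 'type:value:...' (colon-separated).
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # blank line or comment line
        fields = line.split(":")
        kind = fields[0]
        if kind == "path" and len(fields) > 1:
            path = fields[1]
            if "\\" in path or path.startswith("/"):
                raise ValueError("bad 'path' entry: {!r}".format(path))
            if ".." in path.split("/"):
                raise ValueError("'..' component in {!r}".format(path))
        elif kind in ("f", "d", "t") and len(fields) > 1:
            name = fields[1]
            if "/" in name or "\\" in name or name == "..":
                raise ValueError("bad name field: {!r}".format(name))


def checkDirIndexBytes(data):
    """Check raw .dirindex contents given as a bytes object."""
    # Only ASCII-encoded contents are accepted: decode() raises
    # UnicodeDecodeError, a subclass of ValueError, otherwise.
    checkDirIndexLines(data.decode("ascii").splitlines())
```

Rejecting these entries up front means that a malformed or malicious index cannot make the client touch paths outside the target directory.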
Florent Rougon
2a991c9874 terrasync.py: test_virtual_path.py can't be run directly
Remove the 'if __name__ == "__main__": unittest.main()'. Indeed, the
module can't be run this way due to its imports. Tests from this module
can be run with:

  cd scripts/python/TerraSync
  python3 -m unittest tests.test_virtual_path
2020-10-02 16:38:08 +02:00
Florent Rougon
13f943b4a1 terrasync.py: rename DirIndex attributes and remove accessors
In Python, common usage is not to define accessors, but to directly use
class or instance attributes (especially when the associated data is
constant after instance creation). If it later happens that a given
attribute needs getter or setter logic, this can always be done via the
@property decorator, and doesn't affect calling code at all. See for
instance:

  https://docs.python.org/3/library/functions.html#property
  https://mail.python.org/pipermail/tutor/2012-December/thread.html#92990

Apply this to the DirIndex class and rename the following attributes for
better readability: f -> files, d -> directories, t -> tarballs.
2020-10-02 16:38:08 +02:00
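To illustrate the point made in commit 13f943b4a1, here is a small sketch with two hypothetical classes (neither is the real DirIndex). The first exposes 'files' as a plain attribute; the second adds getter logic later via @property, and calling code stays the same.

```python
class PlainIndex:
    """Hypothetical stand-in: 'files' is a plain instance attribute."""

    def __init__(self, files):
        self.files = files            # no getFiles() accessor needed


class IndexWithProperty:
    """Same public interface, but 'files' now runs getter logic."""

    def __init__(self, files):
        self._files = list(files)

    @property
    def files(self):
        # Getter logic added after the fact: always return a sorted copy.
        return sorted(self._files)


for index in (PlainIndex(["b", "a"]), IndexWithProperty(["b", "a"])):
    # Callers use 'index.files' in both cases, without parentheses.
    print(type(index).__name__, index.files)
```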
Florent Rougon
477d9f7a9a terrasync.py: move the DirIndex class to its own module and add unit tests
The tests can be run from directory 'scripts/python/TerraSync' using:

  python3 -m unittest tests.test_dirindex

(or just 'python3 -m unittest' to run all tests pertaining to
terrasync.py).
2020-10-02 16:38:08 +02:00
Scott Giese
431844138b python compatibility: make __ne__ explicit.
python3 has a default implementation for __ne__ when __eq__ is defined.  The opposite is not true -- having only __ne__ does not have a default __eq__ implementation.
Also note that there are cases when eq/ne will both be True or both False, therefore, developers are encouraged to explicitly define these methods in pairs.
2020-10-01 23:06:37 -05:00
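The asymmetry described in commit 431844138b can be checked with two throwaway classes (both hypothetical, for illustration only): one defines only __eq__, the other only __ne__.

```python
class OnlyEq:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, OnlyEq):
            return NotImplemented
        return self.value == other.value


class OnlyNe:
    def __init__(self, value):
        self.value = value

    def __ne__(self, other):
        if not isinstance(other, OnlyNe):
            return NotImplemented
        return self.value != other.value


# Python 3 derives != from __eq__, so these two results are consistent.
eqResults = (OnlyEq(1) == OnlyEq(1), OnlyEq(1) != OnlyEq(1))

# Nothing derives == from __ne__: == falls back to identity comparison,
# so two distinct objects with equal values are neither == nor !=.
a, b = OnlyNe(1), OnlyNe(1)
neResults = (a == b, a != b)
```

Here eqResults is (True, False), whereas neResults is (False, False), which is one of the inconsistent cases the commit message warns about.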
Scott Giese
22e9d0e2f1 python: use a with-statement to manage file close 2020-10-01 22:51:31 -05:00
Florent Rougon
7714abd56e terrasync.py: fix a DeprecationWarning
Using or importing the ABCs from 'collections' instead of from
'collections.abc' is deprecated since Python 3.3, and in 3.9 it will
stop working.
2020-10-01 19:44:19 +02:00
Scott Giese
df67cc2bd9 Python: best practices
Check eq/ne
Anticipate file issues
Ensure file closure
2020-08-29 12:07:15 -05:00
Thomas LESNE
31f434d3ad Migrate FGFSDemo.py, FlightGear.py, demo.py and nasal_api_doc.py to Python 3
Also apply minor changes to FGFSDemo.py and demo.py: add python3
shebang, make the scripts executable, improve an error message.
This closes ticket #224:
<https://sourceforge.net/p/flightgear/flightgear/merge-requests/224/>.
2020-08-05 14:01:57 +02:00
Thomas LESNE
8e3274a7c2 FGFSDemo.py: update properties to match the ones available in 2020.x
New properties available at least with the c172p.
2020-08-05 12:45:59 +02:00
Stuart Buchanan
9553577397 Add support for tarballs in terrasync 2020-02-16 20:27:44 +00:00
Scott Giese
b0e157dd88 Bug fix to set a valid default value for the right edge.
Courtesy of report by Brendan Black.
2018-10-04 18:59:15 -05:00
Scott Giese
e093ac19bc Bug Fix: #2013.
TerraSync did not download all specified tiles in some cases.
Contribution by Peter Duda.
2018-09-27 23:46:16 -05:00
Florent Rougon
c7f6c55423 terrasync.py: user-friendly error message when no 'path' found in .dirindex
During the recent SourceForge migration, the TerraSync server hosted
there used to send .dirindex files with the following contents:

  Project web is currently offline pending the final migration of its
  data to our new datacenter.

Make sure terrasync.py aborts with a user-understandable error in such a
case.
2018-02-18 20:22:58 +01:00
Florent Rougon
5cca99bbae virtual_path.py: add more functions and properties to VirtualPath
Add more functions and properties to VirtualPath that directly
correspond to functions and properties of pathlib.PurePath, except that
types are adapted of course, and that for API consistency, VirtualPath
methods use mixedCaseStyle whereas those of pathlib.PurePath use
underscore_style.
2018-02-18 19:04:58 +01:00
Florent Rougon
dd4fc36a9d terrasync.py: make assert statements more useful
When the second argument of an assert statement is a string, using
repr() is more helpful than relying on the default representation.
2018-02-18 19:04:58 +01:00
Florent Rougon
5aae639a0d terrasync.py: print full URL in exception messages, pass full URL to callback
Due to some misleading 'url' variable name, network error messages used
to contain things such as '/scenery/Airports/N/E/4/.dirindex' (i.e., the
path on the server) instead of the full URL. For the same reason, the
callback function of HTTPGetCallback objects was passed this
path-on-server instead of the URL. This should all be fixed now.
2018-02-08 00:10:18 +01:00
Florent Rougon
1ae1ecc6c2 terrasync.py: accept paths using backslash separators for --only-subdir
We don't lose anything by accepting this, because using backslashes in
file or dir names is out of the question, as Windows users can't see them.
2018-02-07 23:31:38 +01:00
Florent Rougon
cb6b267430 terrasync.py: fix a small formatting issue with --report
Remove unneeded blank lines between listed files or dirs when using
--report.
2018-02-07 11:38:41 +01:00
Florent Rougon
c30298ffce terrasync.py: add option --only-subdir
Option --only-subdir allows one to restrict terrasync.py processing[1]
to a chosen subdirectory of the TerraSync repository. Example:

  terrasync.py --target=/your/TerraSync/repo --only-subdir="Airports/L/F/P"

[1] This works in both 'check' and 'sync' modes.
2018-02-07 11:38:41 +01:00
Florent Rougon
c0e1f29a75 terrasync.py: add and use a VirtualPath class; also add MutableVirtualPath
Add classes VirtualPath and MutableVirtualPath (the latter derived from
the former) to manipulate slash-separated paths where the root '/'
represents the TerraScenery root. This makes it clear what a function
expects when you see that one of its arguments is a VirtualPath
instance: you don't have to ask yourself whether it can start or end
with a slash, how to interpret it, etc. Operating on these paths is also
easy[1], be it to assemble URLs in order to retrieve files or to join
their relative part with a local directory path in order to obtain a
real (deeper) local path.

VirtualPath and MutableVirtualPath are essentially the same; the former
is hashable and therefore has to be immutable, whereas the latter can be
modified in-place with the /= operator (used to append path components),
and therefore can't be hashable. As a consequence, MutableVirtualPath
instances can't be used as dictionary keys, elements of a set or
frozenset, etc.

VirtualPath and MutableVirtualPath use the pathlib.PurePath API where
applicable (part of this API has been implemented in
[Mutable]VirtualPath; more can be added, of course). These classes have
no assumptions related to TerraSync and thus should be fit for use in
other projects.

To convert a [Mutable]VirtualPath instance to a string, just use str()
on it. The result is guaranteed to start with a '/' and not to end with
a '/', except for the virtual root '/'. Upon construction, the given
string is interpreted relatively to the virtual root, i.e.:

  VirtualPath("") == VirtualPath("/")
  VirtualPath("abc/def/ghi") == VirtualPath("/abc/def/ghi")
  etc.

VirtualPath and MutableVirtualPath instances sort like the respective
strings str() converts them to. The __hash__() method of VirtualPath is
based on the type and this string representation, too. Such objects can
only compare equal (using ==) if they have the same type. If you want to
compare the underlying virtual paths inside a VirtualPath and a
MutableVirtualPath, use the samePath() method of either class.

For more info, see scripts/python/TerraSync/terrasync/virtual_path.py
and unit tests in scripts/python/TerraSync/tests/test_virtual_path.py.

[1] Most useful is the / operator, which works as for SGPath:

      VirtualPath("/abc/def/ghi") == VirtualPath("/abc") / "def" / "ghi"
      VirtualPath("/abc/def/ghi") == VirtualPath("/abc") / "def/ghi"
2018-02-07 11:38:41 +01:00
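The behavior described in commit c0e1f29a75 can be approximated by the following heavily simplified sketch. It is not the code from virtual_path.py: only the constructor, str(), the / and /= operators, comparison, hashing and samePath() are modeled, and details such as the internal '_parts' tuple are assumptions made for the illustration.

```python
class VirtualPath:
    """Immutable, hashable slash-separated path rooted at '/' (sketch)."""

    def __init__(self, path=""):
        # Interpret the string relatively to the virtual root and drop
        # empty components, so that "", "/" and "//" all mean the root.
        self._parts = tuple(p for p in str(path).split("/") if p)

    def __str__(self):
        # Always starts with '/', never ends with '/' except for the root.
        return "/" + "/".join(self._parts)

    def __truediv__(self, other):
        return type(self)(str(self) + "/" + str(other))

    def __eq__(self, other):
        # Objects of different types never compare equal.
        return type(self) is type(other) and self._parts == other._parts

    def __lt__(self, other):
        return str(self) < str(other)

    def __hash__(self):
        return hash((type(self), str(self)))

    def samePath(self, other):
        """Compare the underlying paths regardless of the object types."""
        return str(self) == str(other)


class MutableVirtualPath(VirtualPath):
    __hash__ = None                   # modifiable in place, hence unhashable

    def __itruediv__(self, other):
        self._parts += tuple(p for p in str(other).split("/") if p)
        return self
```

With this sketch, the equalities quoted in the commit message hold, and a MutableVirtualPath extended with /= matches a VirtualPath only through samePath(), not through ==.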
Florent Rougon
e613c81b4c terrasync.py: move custom exception classes to a separate module
Add new module 'terrasync.exceptions' for our custom exception classes.
2018-02-07 11:38:41 +01:00
Florent Rougon
c72de824d2 terrasync.py: more modular code layout
- New directory scripts/python/TerraSync/terrasync.

- Move scripts/python/terrasync.py to
  scripts/python/TerraSync/terrasync/main.py (main module in the new
  structure).

- Add empty __init__.py file to scripts/python/TerraSync/terrasync/ to
  make this directory a Python package.

- Wrap the main code from previous terrasync.py in a main() function of
  the terrasync.main module. Also move command-line arguments parsing to
  a separate parseCommandLine() function.

- Add an executable script scripts/python/TerraSync/terrasync.py for end
  users, that just calls terrasync.main.main().

For end users, the only difference is that they now have to use
scripts/python/TerraSync/terrasync.py instead of
scripts/python/terrasync.py (which doesn't exist anymore, since all this
lives under the scripts/python/TerraSync directory from now on).

This structure will make it possible to cleanly split the code into modules and to
add unit tests.
2018-02-07 11:38:41 +01:00
Florent Rougon
eb23f8906d terrasync.py: use os.path.abspath() in TerraSync.setTarget()
Using os.path.abspath() here in TerraSync.setTarget() adds a safety
layer in case the process later calls os.chdir() or similar[1], which
would change the meaning of the "." directory. Also remove the strip()
call which I don't consider useful here, see my message at:

  https://sourceforge.net/p/flightgear/mailman/message/36208140/

[1] Not the case currently, but who knows what will happen in the
    future...
2018-02-03 11:22:56 +01:00
Florent Rougon
77513a2498 terrasync.py: update script header
In particular, terrasync.py appears *not* to require dnspython. :-)
2018-02-03 11:22:56 +01:00
Florent Rougon
8693e442d7 terrasync.py: add options --mode and --report
You may now call terrasync.py with --mode=sync or --mode=check. 'sync'
mode is the default and corresponds to terrasync.py's usual behavior.

In 'check' mode, terrasync.py never writes to disk and aborts at the
first mismatch between local and remote data. The exit status in 'check'
mode is:
  - 0 if the program terminated successfully and no mismatch was found
    between the local and remote repositories;
  - 1 in case an error was encountered;
  - 2 if there was a mismatch between local and remote data.

In 'sync' mode, the exit status is:
  - 0 if the program terminated successfully;
  - 1 in case an error was encountered.

A mismatch in 'check' mode is *not* an error, it is just one of the two
expected results. An error is a worse condition (uncaught exception,
network retrieval aborted after retrying failed, stuff like that).

Additionally, calling terrasync.py with --report causes it to print
lists of:
  - files and dirs that were missing or had mismatching hashes (this is
    okay in 'sync' mode: these things have been "fixed" in the target
    directory before the report was printed);
  - files and dirs that have been found to be orphaned (i.e., found
    under the target directory but not mentioned in the corresponding
    .dirindex file). These are the ones removed in 'sync' mode when
    --remove-orphan is passed.
2018-02-03 11:22:56 +01:00
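The exit status rules from commit 8693e442d7 can be summarized in a few lines. The enum and function names below are hypothetical, and giving an error precedence over a mismatch is an assumption of this sketch.

```python
import enum


class ExitStatus(enum.IntEnum):
    """Exit codes as listed in the commit message (names are made up)."""
    SUCCESS = 0
    ERROR = 1
    CHECK_MODE_FOUND_MISMATCH = 2


def exitStatus(mode, errorOccurred, mismatchFound):
    """Map the outcome of a run to the exit status described above."""
    if errorOccurred:
        return ExitStatus.ERROR
    if mode == "check" and mismatchFound:
        # A mismatch is not an error, just one of the two expected results.
        return ExitStatus.CHECK_MODE_FOUND_MISMATCH
    # In 'sync' mode, mismatches have been fixed on disk by the time we exit.
    return ExitStatus.SUCCESS
```

A calling script can thus distinguish "repositories differ" (2) from "something went wrong" (1) when running with --mode=check.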
Florent Rougon
6d323bbbdc terrasync.py: prepare the terrain for --mode and --report
- Add computeHash() utility function that can work with any file-like
  object (e.g., a connected socket).

- Rename hash_of_file() to hashForFile(), and of course implement it
  using our new computeHash().

- Add class HTTPSocketRequest derived from HTTPGetCallback. It allows
  one to process data from the network without storing it to a file (it
  uses the file-like interface provided by http.client.HTTPResponse).

  The callback returns the http.client.HTTPResponse object, which can be
  conveniently used in a 'with' statement.

- Simplify the API of TerraSync.updateDirectory(): its 'dirIndexHash'
  argument must now be a hash (a string); the None object is not allowed
  anymore (with the soon-to-come addition of --mode=check, having to
  deal with this special case in updateDirectory() would make the logic
  too difficult to follow, or we would have to really completely
  separate check-only mode from update mode, which would entail code
  duplication).

  Since TerraSync.updateDirectory() must now always have a hash to work
  with, compute the hash of the root '.dirindex' file from the server in
  TerraSync.start(), using our new HTTPSocketRequest class---which was
  written for this purpose, since that will have to work in check-only
  mode (but not only), where we don't want to write any file to disk.

- TerraSync.updateFile(): correctly handle the case where a directory
  inside the TerraSync repository is (now) a file according to the
  server: the directory must be recursively removed before the file can
  be downloaded in the place formerly occupied by the directory.

- Add stub class Report. Its methods do nothing for now, but are already
  called in a couple of appropriate places. The class will be completed
  in a future commit, of course.
2018-02-03 11:22:56 +01:00
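A possible shape for the two hashing helpers named in commit 6d323bbbdc is sketched below. The function names come from the commit message, but the bodies and the blockSize parameter are assumptions; the use of SHA-1 follows the description of hash_of_file() in a neighboring commit.

```python
import hashlib
import io


def computeHash(fileLike, blockSize=65536):
    """Return the hex SHA-1 digest of everything read from fileLike.

    Works with any object offering read(n) that returns bytes, e.g. an
    open binary file, an http.client.HTTPResponse or an io.BytesIO.
    """
    sha1 = hashlib.sha1()
    while True:
        block = fileLike.read(blockSize)
        if not block:                 # empty bytes object: end of data
            break
        sha1.update(block)
    return sha1.hexdigest()


def hashForFile(path):
    """Return the hex SHA-1 digest of the file at 'path'."""
    with open(path, "rb") as f:
        return computeHash(f)


# Hashing in-memory data works the same way as hashing a file or a socket.
digest = computeHash(io.BytesIO(b"version:1\npath:\n"))
```

Reading in fixed-size blocks keeps memory use bounded, which matters when the data comes straight from the network and is never written to disk.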
Florent Rougon
af021cc1ef terrasync.py: add function removeDirectoryTree() and exception UserError
The goal of removeDirectoryTree() is to provide a safety net around
recursive directory removal with shutil.rmtree(), in order to prevent
user or bug-caused catastrophic events such as /, /home, /home/joeuser or
C:\ being recursively erased.
2018-02-03 11:22:56 +01:00
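One way to build such a safety net (commit af021cc1ef) is to refuse any removal target that is not strictly below a known base directory. The signature and the exact checks below are assumptions for illustration; the real removeDirectoryTree() may apply different rules.

```python
import os
import shutil


class UserError(Exception):
    """Raised for conditions caused by wrong user input (sketch)."""


def removeDirectoryTree(base, whatToRemove):
    """Recursively remove whatToRemove, which must lie strictly under base."""
    base = os.path.abspath(base)
    target = os.path.abspath(whatToRemove)
    try:
        common = os.path.commonpath([base, target])
    except ValueError:                # e.g. paths on different Windows drives
        common = None
    if common != base or target == base:
        raise UserError("refusing to remove {!r}: not strictly below {!r}"
                        .format(target, base))
    shutil.rmtree(target)
```

With this check, a call such as removeDirectoryTree("/home/joeuser/TerraSync", "/") raises UserError instead of reaching shutil.rmtree().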
Florent Rougon
2c6f93aa46 terrasync.py: improve retry logic and error handling
- Add method assembleUrl() to HTTPGetter.

- Raise a NetworkError exception with the particular URL and number of
  retries when the retries have been exhausted.

- Number of retries is now trivial to expose as a parameter, and set to
  5 in HTTPGetter.

- Sleep for one second between self.httpConnection.close() and
  self.httpConnection.connect() when retrying a failed HTTP request.

- Apply DRY principle.
2018-02-03 11:22:56 +01:00
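The retry scheme from commit 2c6f93aa46 can be sketched as below. The function getWithRetries(), its parameters and the NetworkError constructor are hypothetical, and the sketch assumes that "5 retries" means five additional attempts after the first one and that only http.client.HTTPException triggers a retry.

```python
import http.client
import time


class NetworkError(Exception):
    """Sketch of a network error carrying the URL and the retry count."""

    def __init__(self, url, retries):
        super().__init__("unable to retrieve {!r} after {} retries"
                         .format(url, retries))
        self.url = url
        self.retries = retries


def getWithRetries(connection, url, doRequest, nbRetries=5, sleep=time.sleep):
    """Call doRequest(connection, url), reconnecting between failed attempts.

    'connection' needs close() and connect() methods, as offered by
    http.client.HTTPConnection; 'doRequest' is supplied by the caller.
    """
    for attempt in range(nbRetries + 1):
        try:
            return doRequest(connection, url)
        except http.client.HTTPException:
            if attempt == nbRetries:
                raise NetworkError(url, nbRetries) from None
            connection.close()
            sleep(1)                  # one second between close() and connect()
            connection.connect()
```

Other exceptions, including KeyboardInterrupt, propagate immediately, so Ctrl-C still stops the program at once.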
Florent Rougon
c27ae92c73 terrasync.py: improve error handling
- New generic exception class TerraSyncPyException.

- Add subclass NetworkError of TerraSyncPyException.

- Raise a NetworkError exception when the HTTP return code is not 200.

- hash_of_file() does not silently ignore errors anymore; exceptions
  should be dealt with wherever appropriate by the callers.

  Whenever hash_of_file() returns, its return value is now the SHA-1
  hash of the specified file. This is less error-prone IMHO than
  returning None. Otherwise, calling code could erroneously conclude
  that there is a matching hash when the file to check is actually
  missing. For a concrete example, see the 'dirIndexHash' parameter of
  TerraSync.updateDirectory(), which so far is used precisely with the
  value None to express that "we are just starting the recursion and
  have no hash from the server to compare to".
2018-02-03 11:22:56 +01:00
Florent Rougon
19209a71f0 terrasync.py: improve the HTTPGetter and HTTPGetCallback APIs
When called, the callback passed to HTTPGetter.get() is now explicitly
passed the URL and the http.client.HTTPResponse instance.

Remove the HTTPGetCallback.result attribute (not needed anymore, leaves
more freedom when implementing HTTPGetCallback subclasses...).
2018-02-03 11:22:56 +01:00
Florent Rougon
de4291b851 terrasync.py: make HTTPGetter.get() return the HTTPGetCallback.callback ret value
This way, the caller can get access to interesting things from the
callback, such as the HTTPResponse object (wrapper for the socket).
2018-02-03 11:22:56 +01:00
Florent Rougon
b7fc63d896 terrasync.py: remove methods TerraSync.isReady() and TerraSync.update()
These methods don't work, because HTTPGetter objects have no isReady()
and update() methods either.
2018-02-03 11:22:56 +01:00
Florent Rougon
417d81d4c3 terrasync.py: minor changes
These changes should not alter the behavior of terrasync.py.
2018-02-03 11:22:56 +01:00
Torsten Dreyer
a27ea4dfe6 remove orphan directories with --remove-orphan 2017-09-01 10:33:35 +02:00
Saikrishna Arcot
bb0869599b Switch to using argparse in terrasync.py, which is actually maintained, and fix a couple of errors. 2016-12-28 08:32:55 -06:00
Saikrishna Arcot
eb4738cb02 Switch to using optparse module, and allow setting bounds from command line. 2016-12-27 22:41:59 -06:00
Saikrishna Arcot
e67ea4e0cb Add in the ability to download scenery only for a specific section. 2016-12-27 21:17:46 -06:00
Saikrishna Arcot
2c5429c589 Improve error handling.
If the response to the HTTP request isn't 200 (success), then don't save
the response, and don't call the callback.

Additionally, only retry in the case of HTTPException. This allows using
Ctrl-C to work correctly (and easily).
2016-12-27 20:46:59 -06:00
Torsten Dreyer
b9cba13e32 Fix the root cause for terrasync.py timeouts
Thanks to Andre Coetzee for spotting.

Also, don't hardcode port 80, instead use the port given in the url
2016-06-06 12:17:12 +02:00
Torsten Dreyer
a93dd29c85 First attempt to handle the nasty socket timeout
Retry once if a http get fails
2016-05-25 16:19:36 +02:00
Torsten Dreyer
489be2ce16 Add user-agent request header 2016-05-18 15:14:59 +02:00
Torsten Dreyer
6921c98933 Much better terrasync.py
- tortellini instead of spaghetti code (use oop)
- reuse connection
2016-05-18 12:51:29 +02:00
Torsten Dreyer
e2afbb4678 terrasync.py: cleanup and add some more power
- add option --quick
  check sha1sum of .dirindex files and skip directory if hash matches
- add option --remove-orphan
  remove orphan files (files exist locally but not on server)
- be less verbose
- write .dirindex files locally
2016-05-11 23:04:24 +02:00