- Instantiate an HTTPSConnection object when the URL scheme is 'https'.
- Clarify and simplify the initialization of HTTPSocketRequest and
HTTPDownloadRequest:
+ clarify initialization of their 'callback' attribute (it's the
method of the same name; make it clear that the base class
constructor, namely HTTPGetCallback.__init__(), doesn't modify the
'callback' attribute when an object of class HTTPSocketRequest or
HTTPDownloadRequest is initialized);
+ HTTPDownloadRequest doesn't need access to the TerraSync object
-> remove the corresponding instance attribute and constructor
argument.
- Don't use super() when initializing HTTPDownloadRequest objects
(see [1]).
[1] https://fuhm.net/super-harmful/
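As a rough illustration of the first item above, choosing the connection
class from the URL scheme could look like the following sketch. The
function name make_connection() is hypothetical and is not taken from
terrasync.py; only the standard http.client and urllib.parse modules are
used.

```python
import http.client
import urllib.parse


def make_connection(url):
    """Return an unconnected HTTP(S) connection suited to the URL scheme.

    Hypothetical helper, for illustration only.
    """
    parsed = urllib.parse.urlparse(url)
    if parsed.scheme == "https":
        # A port of None selects the default port (443 for HTTPS).
        return http.client.HTTPSConnection(parsed.hostname, parsed.port)
    if parsed.scheme == "http":
        return http.client.HTTPConnection(parsed.hostname, parsed.port)
    raise ValueError("unsupported URL scheme: {!r}".format(parsed.scheme))
```

No network traffic happens at construction time; the connection is only
opened when a request is sent.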
- Refuse to recursively delete a directory that does not contain a
.dirindex file. This will protect users against data loss in case they
inadvertently use the --remove-orphan option with the wrong target
directory.
- Correctly handle the case where we have a file on disk that is now
listed as a directory on the server: remove the file if we are in
'sync' mode, so that the directory can be created and sync'ed from the
server.
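The safeguard on recursive deletion described above can be sketched as
follows. This is a minimal illustration, not the code from terrasync.py:
the function name remove_directory_tree() and the exception type are
assumptions.

```python
import os
import shutil


def remove_directory_tree(path):
    """Recursively delete 'path', but only if it contains a .dirindex file.

    Hypothetical helper: a directory without .dirindex is assumed not to
    be managed by TerraSync and is left untouched.
    """
    if not os.path.isfile(os.path.join(path, ".dirindex")):
        raise RuntimeError(
            "refusing to delete {!r}: no .dirindex file found".format(path))
    shutil.rmtree(path)
```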
- only accept ASCII-encoded .dirindex files (this is guaranteed to work
fine "everywhere");
- reject .dirindex files with a 'path' entry that contains a backslash
or starts with a slash;
- reject .dirindex files with a 'path' entry that contains a '..'
component;
- reject .dirindex files with an 'f', 'd' or 't' entry whose name field
contains a slash or a backslash;
- reject .dirindex files with an 'f', 'd' or 't' entry whose name field
is '..';
- add comment lines (starting with '#') in the sample good .dirindex
file used by unit tests.
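The checks listed above could be expressed roughly as in the sketch
below. The function names and the use of ValueError are assumptions made
for the example; the real parser may be organized differently.

```python
def decode_dirindex(raw_bytes):
    """Decode .dirindex contents, accepting ASCII only.

    Raises UnicodeDecodeError (a ValueError subclass) on non-ASCII data.
    """
    return raw_bytes.decode("ascii")


def check_path_entry(path):
    """Raise ValueError if the value of a 'path' entry looks unsafe."""
    if "\\" in path or path.startswith("/"):
        raise ValueError("invalid 'path' entry: {!r}".format(path))
    if ".." in path.split("/"):
        raise ValueError("'..' component in 'path' entry: {!r}".format(path))


def check_name_field(name):
    """Raise ValueError if the name field of an 'f', 'd' or 't' entry is unsafe."""
    if "/" in name or "\\" in name:
        raise ValueError("slash or backslash in name field: {!r}".format(name))
    if name == "..":
        raise ValueError("name field must not be '..'")
```

Note that the '..' test on 'path' entries is done per component, so a
name such as 'a..b' is not rejected by it.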
Remove the 'if __name__ == "__main__": unittest.main()' block: the
module can't be run this way because of its imports. Tests from this
module can be run with:
cd scripts/python/TerraSync
python3 -m unittest tests.test_virtual_path
In Python, the usual practice is not to define accessors, but to directly use
class or instance attributes (especially when the associated data is
constant after instance creation). If it later happens that a given
attribute needs getter or setter logic, this can always be done via the
@property decorator, and doesn't affect calling code at all. See for
instance:
https://docs.python.org/3/library/functions.html#property
https://mail.python.org/pipermail/tutor/2012-December/thread.html#92990
Apply this to the DirIndex class and rename the following attributes for
better readability: f -> files, d -> directories, t -> tarballs.
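To illustrate the point about @property, here is a small sketch. The
class DirIndexSketch is a hypothetical stand-in and does not reproduce
the real DirIndex class; it only shows that turning an attribute into a
property leaves calling code unchanged.

```python
class DirIndexSketch:
    """Hypothetical stand-in for DirIndex, for illustration only."""

    def __init__(self, files, directories, tarballs):
        # Plain attributes: callers read them directly, without accessors.
        self.files = files
        self.directories = directories
        self._tarballs = tarballs

    @property
    def tarballs(self):
        # Getter logic added after the fact (here: return a copy).
        # Callers still simply write 'index.tarballs'.
        return list(self._tarballs)


index = DirIndexSketch(["a.txt"], ["subdir"], ["data.tgz"])
names = index.files + index.directories + index.tarballs
```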
The tests can be run from directory 'scripts/python/TerraSync' using:
python3 -m unittest tests.test_dirindex
(or just 'python3 -m unittest' to run all tests pertaining to
terrasync.py).
Python 3 provides a default implementation of __ne__ when __eq__ is defined. The opposite is not true: defining only __ne__ does not provide a matching __eq__ implementation.
Also note that there are cases where == and != are both True or both False for the same operands; developers are therefore encouraged to define these methods explicitly in pairs.
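The following sketch shows the asymmetry with two throwaway classes
(OnlyEq and OnlyNe are names made up for this example). With only
__ne__ defined, == falls back to the default identity comparison, so two
distinct but equivalent objects are neither "equal" nor "not equal".

```python
class OnlyEq:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, OnlyEq) and self.value == other.value


class OnlyNe:
    def __init__(self, value):
        self.value = value

    def __ne__(self, other):
        return not (isinstance(other, OnlyNe) and self.value == other.value)


# __ne__ is derived from __eq__: consistent results.
eq_says_equal = OnlyEq(1) == OnlyEq(1)        # True
eq_says_different = OnlyEq(1) != OnlyEq(1)    # False

# __eq__ is NOT derived from __ne__: == compares identities here.
a, b = OnlyNe(1), OnlyNe(1)
ne_says_equal = a == b                        # False
ne_says_different = a != b                    # False as well
```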
During the recent SourceForge migration, the TerraSync server hosted
there sent .dirindex files with the following contents:
Project web is currently offline pending the final migration of its
data to our new datacenter.
Make sure terrasync.py aborts with a user-understandable error in such a
case.
Add more functions and properties to VirtualPath that directly
correspond to functions and properties of pathlib.PurePath, except that
types are adapted of course, and that for API consistency, VirtualPath
methods use mixedCaseStyle whereas those of pathlib.PurePath use
underscore_style.
Due to a misleadingly named 'url' variable, network error messages used
to contain things such as '/scenery/Airports/N/E/4/.dirindex' (i.e., the
path on the server) instead of the full URL. For the same reason, the
callback function of HTTPGetCallback objects was passed this
path-on-server instead of the URL. This should all be fixed now.
Option --only-subdir allows one to restrict terrasync.py processing[1]
to a chosen subdirectory of the TerraSync repository. Example:
terrasync.py --target=/your/TerraSync/repo --only-subdir="Airports/L/F/P"
[1] This works in both 'check' and 'sync' modes.
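For readers unfamiliar with how such an option is typically declared,
here is a minimal argparse sketch. Only the option names --target and
--only-subdir come from the text above; the function name, help strings
and defaults are assumptions and do not reproduce the actual parser.

```python
import argparse


def parse_command_line(argv):
    """Parse a reduced, illustrative subset of the command line."""
    parser = argparse.ArgumentParser(prog="terrasync.py")
    parser.add_argument("--target", default=".",
                        help="local TerraSync repository directory")
    parser.add_argument("--only-subdir", default="",
                        help="restrict processing to this subdirectory "
                             "of the TerraSync repository")
    return parser.parse_args(argv)


args = parse_command_line(["--target=/your/TerraSync/repo",
                           "--only-subdir=Airports/L/F/P"])
```

argparse stores the value of --only-subdir in args.only_subdir (dashes
are turned into underscores).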
Add classes VirtualPath and MutableVirtualPath (the latter derived from
the former) to manipulate slash-separated paths where the root '/'
represents the TerraScenery root. This makes it clear what a function
expects when you see that one of its arguments is a VirtualPath
instance: you don't have to ask yourself whether it can start or end
with a slash, how to interpret it, etc. Operating on these paths is also
easy[1], be it to assemble URLs in order to retrieve files or to join
their relative part with a local directory path in order to obtain a
real (deeper) local path.
VirtualPath and MutableVirtualPath are essentially the same; the former
is hashable and therefore has to be immutable, whereas the latter can be
modified in-place with the /= operator (used to append path components),
and therefore can't be hashable. As a consequence, MutableVirtualPath
instances can't be used as dictionary keys, elements of a set or
frozenset, etc.
VirtualPath and MutableVirtualPath use the pathlib.PurePath API where
applicable (part of this API has been implemented in
[Mutable]VirtualPath; more can be added, of course). These classes have
no assumptions related to TerraSync and thus should be fit for use in
other projects.
To convert a [Mutable]VirtualPath instance to a string, just use str()
on it. The result is guaranteed to start with a '/' and not to end with
a '/', except for the virtual root '/'. Upon construction, the given
string is interpreted relative to the virtual root, i.e.:
VirtualPath("") == VirtualPath("/")
VirtualPath("abc/def/ghi") == VirtualPath("/abc/def/ghi")
etc.
VirtualPath and MutableVirtualPath instances sort like the respective
strings str() converts them to. The __hash__() method of VirtualPath is
based on the type and this string representation, too. Such objects can
only compare equal (using ==) if they have the same type. If you want to
compare the underlying virtual paths inside a VirtualPath and a
MutableVirtualPath, use the samePath() method of either class.
For more info, see scripts/python/TerraSync/terrasync/virtual_path.py
and unit tests in scripts/python/TerraSync/tests/test_virtual_path.py.
[1] Most useful is the / operator, which works as for SGPath:
VirtualPath("/abc/def/ghi") == VirtualPath("/abc") / "def" / "ghi"
VirtualPath("/abc/def/ghi") == VirtualPath("/abc") / "def/ghi"
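The behavior described above can be approximated with a very small
class. The sketch below is a simplified, hypothetical illustration named
SketchPath; it is not the code from virtual_path.py, and details such as
the handling of repeated slashes are assumptions.

```python
class SketchPath:
    """Simplified illustration of an immutable, hashable virtual path."""

    def __init__(self, path=""):
        # Keep non-empty components only, so that "", "/" and "abc/"
        # are all interpreted relative to the virtual root.
        self._parts = tuple(p for p in str(path).split("/") if p)

    def __str__(self):
        # Always starts with '/', never ends with '/' except for the root.
        return "/" + "/".join(self._parts)

    def __truediv__(self, other):
        # The / operator appends one or more path components.
        return type(self)(str(self) + "/" + str(other))

    def __eq__(self, other):
        # Objects of different types never compare equal.
        return type(self) is type(other) and self._parts == other._parts

    def __hash__(self):
        # Based on the type and the string representation.
        return hash((type(self), str(self)))


airport_dir = SketchPath("Airports") / "L" / "F/P"
```

Because instances are hashable, they can serve as dictionary keys or set
elements, which is the property a mutable variant has to give up.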
- New directory scripts/python/TerraSync/terrasync.
- Move scripts/python/terrasync.py to
scripts/python/TerraSync/terrasync/main.py (main module in the new
structure).
- Add empty __init__.py file to scripts/python/TerraSync/terrasync/ to
make this directory a Python package.
- Wrap the main code from previous terrasync.py in a main() function of
the terrasync.main module. Also move command-line arguments parsing to
a separate parseCommandLine() function.
- Add an executable script scripts/python/TerraSync/terrasync.py for end
users, that just calls terrasync.main.main().
For end users, the only difference is that they now have to use
scripts/python/TerraSync/terrasync.py instead of
scripts/python/terrasync.py (which doesn't exist anymore, since all this
lives under the scripts/python/TerraSync directory from now on).
This structure makes it possible to cleanly split the code into modules
and to add unit tests.