What
What this is about
Git-Annex is an interesting extension for Git that I’m testing to synchronize files between my Android devices and computers. Git-Annex can be run in Termux on an Android phone without having to root the phone.
When
Synchronizing files with mobile devices
Copying photos, screenshots, notes and downloads off the phone and onto the computer is a recurring task, both to free up space and to synchronize data to and from my phone.
Most people synchronize their phones with Google Drive, Apple iCloud or Dropbox. That’s a lot more convenient, but it also means giving up a lot of privacy.
Background
Git-Annex
Git-Annex is an extension for Git, a versioning system for source code well-known among software developers.
With Git, source code lives in “repositories”. Each developer can “clone” or “checkout” their own working copy. Files are added with “add” and then “commit”ed to the local repository, before they are “push”ed to the remote repository on a server.
This allows multiple developers to work on the same code simultaneously without destroying each other’s work. Git keeps a history of all changes and can automatically resolve most of the issues of bringing together the working copies.
As Git is designed for source code and built to analyse line-by-line differences in text files, it’s inefficient with large binary files. Each change requires storing a new copy of the entire binary file, and working copies grow large because every file is always downloaded when cloning. Checking for differences in binary files is also slow and makes little sense.
Large binary files can be very useful, for instance as input files for software testing, and so extensions to Git have been built to enable storing large files.
Git-Annex is one such extension for Git that I am testing for synchronizing files between my Android devices and computer.
The benefit is a cloud-less, decentralized solution to the problem of synchronizing files. The drawback lies in its complexity compared to other solutions.
How
Setting up Git-Annex on Android
Setting up Git-Annex with Android devices is a bit involved. The following describes how I’ve configured it on my Android devices.
Note: there is also a very helpful walk-through guide I’ve linked below [1].
For each Android device
The first step is to install Termux. Termux is a Linux environment for Android. It’s installed like a regular Android app. The Google Play Store versions are currently outdated and use incompatible libssl builds that break curl and wget, so it’s best to install from F-Droid.
On the Android device, in Termux we first install and configure an SSH-Server:
Now we can log in from a computer:
The last command will print either armabi or aarch64 depending on whether the device uses a 32-bit or 64-bit processor. There are different prebuilt versions of git-annex depending on the type of processor.
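A minimal sketch of that decision, assuming the machine name comes from `uname -m`. The helper name `annex_arch_bits` is made up for illustration, and the patterns are a guess at the strings a device may report (some 32-bit userlands print `armv7l` or `armv8l`, for example).

```shell
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the word size of the git-annex prebuilt to download.
annex_arch_bits() {
    case "$1" in
        aarch64|arm64*) echo 64 ;;
        arm*)           echo 32 ;;
        *)              echo unknown ;;
    esac
}

# Classify the device this shell is running on.
annex_arch_bits "$(uname -m)"
```

The 64-bit patterns are checked first, so a name such as `arm64-v8a` is not mistaken for a 32-bit one.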
Get the Git-Annex prebuilts
If it’s a 32-Bit Android device:
If it’s a 64-Bit Android device, instead:
Now we can configure git-annex:
Close and re-open the Termux-App, so that the new git-annex configuration is properly loaded.
On most devices there are two directories
and
that both point to the same directory. So watch out not to mix the two, as this can confuse git-annex.
Android uses a file system called “sdcardfs” that emulates FAT32 and as such doesn’t allow symbolic links.
This is inconvenient for use with git-annex, as it would usually place symlinks in the repository’s main directory that link to the locations of the actual binaries in the Git directory tree.
Without symlinks git-annex has to resort to using copies instead. This means there are always two copies of the data: one in the Git repository’s directory tree and one in the main directory. This of course uses more storage, but works like the symlinks for most use cases.
Git-Annex will occasionally state this in printouts saying it is using a “crippled filesystem” and is using the “direct mode”.
The Android file system also doesn’t support setting write permissions, which causes errors with git-annex. We can disable this by setting:
Now we can add and commit files:
The Android device is now ready to be synced with git-annex from other devices.
For convenience you may want to add a script “sync.sh” to /data/data/com.termux/files/home/.shortcuts.
and make it executable
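A minimal sketch of both steps. The repository path in `ANNEX_REPO` is a placeholder to adjust, and `SHORTCUTS_DIR` is a variable introduced here only so the target directory can be overridden; by default it resolves to `.shortcuts` in the Termux home directory.

```shell
# Directory that holds the shortcut scripts (overridable, e.g. for testing).
SHORTCUTS_DIR="${SHORTCUTS_DIR:-$HOME/.shortcuts}"
mkdir -p "$SHORTCUTS_DIR"

# Write the script; the quoted 'EOF' keeps the shell from expanding anything inside.
cat > "$SHORTCUTS_DIR/sync.sh" <<'EOF'
#!/data/data/com.termux/files/usr/bin/sh
# Placeholder path: point this at the git-annex repository on the device.
ANNEX_REPO="$HOME/storage/shared/annex"
cd "$ANNEX_REPO" || exit 1
# Sync with the remotes and transfer file contents as well.
git annex sync --content
EOF

# Make the script executable so it can be launched.
chmod +x "$SHORTCUTS_DIR/sync.sh"
```

The script only syncs; new files still have to be added to the annex beforehand.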
With the “Termux:Widget” app installed and the widget added to the home screen, synchronization can be triggered directly from the home screen by tapping the sync.sh script in the widget’s list of scripts.
Setting up Git-Annex on Linux
On the Linux-PC (or Windows WSL)
We can then add one or more Android devices as remotes:
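As a rough sketch, a phone can be registered with plain `git remote add` using an SSH URL. The Termux sshd listens on port 8022 by default. The helper name `add_phone_remote` and the remote name, user, address and repository path in the usage comment are placeholders, not values from my setup.

```shell
# Hypothetical helper: register an Android device as a Git remote over SSH.
# Arguments: remote name, Termux user, device address, repository path on the device.
add_phone_remote() {
    remote_name=$1
    remote_user=$2
    remote_host=$3
    repo_path=$4
    # 8022 is the default port of the Termux sshd.
    git remote add "$remote_name" "ssh://${remote_user}@${remote_host}:8022${repo_path}"
}

# Example (placeholder values), run inside the repository on the PC:
#   add_phone_remote phone u0_a123 192.168.1.50 /data/data/com.termux/files/home/annex
```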
Files can be synchronized by running
In case of errors
annex ignore
I ran into some issues with git-annex marking a remote with “annex-ignore = true”. This happens when the remote is not correctly configured. Once a remote is labeled with “annex-ignore”, it is no longer synchronised.
After fixing whatever the issue was, we can remove the flag by editing the repository’s Git config
and removing the “annex-ignore = true” line from the section of the affected remote.
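As an alternative to editing the file by hand, the same flag can be cleared with `git config --unset`. This is only a sketch: the helper name `clear_annex_ignore` and the remote name `phone` are made up for illustration.

```shell
# Hypothetical helper: remove the annex-ignore flag from a named remote.
clear_annex_ignore() {
    git config --unset "remote.$1.annex-ignore"
}

# Example, run inside the affected repository:
#   clear_annex_ignore phone
#   git annex sync
```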
vanished files
Another issue I ran into while messing around with git-annex was that at one point all files appeared to have vanished. This was likely caused by re-running “git-annex init”, which created a new repo and repo UUID.
I was able to see the files still existed under
In this state one has to be careful not to run “git” commands that may automatically trigger a “git gc”, because that will in fact remove the files.
The commands
and
can be run, but didn’t help.
However
showed the files that had vanished and
listed their file names.
I found that I had a faulty commit that could be seen with
Though I have no idea where that came from, reverting it by its commit hash solved the issue
After re-committing
and syncing on all devices
the files reappeared on all devices.
accidental 'git add' instead of 'git annex add'
Git-Annex extends Git and as such we can use mixed repositories with source and binaries.
Especially when one is used to plain Git, it is easy to accidentally add a file with “git add” instead of “git annex add”.
If the added file is also committed, then the usual painful approaches to get the file out of the git repository are required. One would have to “git revert” the commit and then clean the git tree to make sure the unused file doesn’t increase the git repository size.
Progress
Conclusions
So far git-annex works well for me. There is a learning curve and it does require familiarity with Git and the Linux command-line. At the stage I’m currently using it, it probably doesn’t beat simple “rsync” approaches.
However git-annex has some more advanced features. One can, for instance, specify how many copies to keep on which remotes to enforce redundancy.
It might be very useful to be able to keep the last 30 days of photos on the phone and drop older photos, if they have already been synchronized to two (or more) other remotes.
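A rough sketch of that idea. `git annex numcopies` and `git annex drop` are existing commands, and `drop` refuses to remove a file’s content unless it can verify that enough other copies exist. The helper names `old_files` and `drop_old_files`, and selecting files by modification time with `find`, are my own assumptions for illustration. On a repository that uses symlinks rather than copies, `-type f` would need adjusting.

```shell
# Hypothetical helper: list files under directory $1 last modified more than
# $2 days ago, skipping Git's own metadata.
old_files() {
    find "$1" -type f -mtime "+$2" ! -path '*/.git/*'
}

# Hypothetical helper: ask git-annex to drop the local content of those files.
drop_old_files() {
    old_files "$1" "$2" | while IFS= read -r f; do
        # </dev/null keeps git-annex (or ssh) from consuming the file list.
        git annex drop "$f" </dev/null
    done
}

# Example, run inside the repository on the phone:
#   git annex numcopies 2    # require two copies elsewhere before a drop succeeds
#   drop_old_files . 30
```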
Backup hard drives can be added and can be automatically checked for data consistency.
In my view the approach is also superior to cloud-based backups as it preserves data privacy, is a lot faster in synchronisation speed and is decentralized - it doesn’t require an always-on and power consuming server. It can even be combined with off-site cloud storage by adding the cloud storage as an additional git remote.
With git-annex I can sync my Android devices to my laptop or desktop computer, depending on which is available at a given time. I can then sync the two with each other later without having to manually sort out conflicts.
Due to the heavy use of checksums in Git and git-annex, data corruption can also be detected.
For software developers, git-annex is also a useful extension to Git that is worth knowing about for adding large binary files, for instance test input files, to source-code repositories.
[1] https://git-annex.branchable.com/walkthrough/