I'm excited to share my recent project, where I took on the challenge of porting a popular but untested 600+ line Bash script to Python. The outcome is [`rsync-time-machine.py`](https://github.com/basnijholt/rsync-time-machine.py), a Python implementation of the [`rsync-time-backup`](https://github.com/laurent22/rsync-time-backup) script. It provides Time Machine-style backups using rsync and creates incremental backups of files and directories to the destination of your choice.
The tool is designed to work on Linux, macOS, and Windows (via WSL or Cygwin). Its advantage over Time Machine is its flexibility - it can back up from/to any filesystem and works on any platform. You can also back up to a TrueCrypt drive without any issues.
Unlike the original Bash script, `rsync-time-machine.py` is fully tested. It has no external dependencies (only requires Python ≥3.7), and it is fully compatible with [`rsync-time-backup`](https://github.com/laurent22/rsync-time-backup). It offers pretty terminal output and is fully typed.
Key features include:
* Each backup is in its own folder named after the current timestamp.
* Backup to/from remote destinations over SSH.
* Files that haven't changed from one backup to the next are hard-linked to the previous backup, saving space.
* Safety check - the backup will only happen if the destination has explicitly been marked as a backup destination.
* Resume feature - if a backup has failed or was interrupted, the tool will resume from there on the next backup.
* Exclude file - support for pattern-based exclusion via the `--exclude-from` rsync parameter.
* Automatically purge old backups based on a configurable expiration strategy.
* "latest" symlink that points to the latest successful backup.
I appreciate any feedback and contributions! Feel free to file an issue on the GitHub repository for any bugs, suggestions, or improvements. Looking forward to hearing your thoughts.
Happy backing up!
Please, do let me know if you have any questions or need any further information.
I bet it was a fun project, but in reality rsync-based backups are far less efficient than restic, borg, etc. in speed, storage use, and likely other ways. After discovering these newer tools, I vowed to never put myself through the pain of rsync backups ever again.
Actually, for raw speed, rsync is much faster than any of the tools you mentioned (see e.g., https://github.com/borgbackup/borg/issues/4190). I really like a lightweight solution, where I do not even need any tool to restore backups. The tools you mentioned are great though.
Did you see the last reply on the thread you linked? The guy messed up an ENV variable in borg and was backing up too many accounts as new archives, killing the cache when the same account was backed up the next day. Borg will always be faster than rsync for incremental backups, but of course it has a learning curve coming from the simplicity of rsync.
I used the bash rsync-time-backup script for a while to back up datasets, because the backups use less space than a full copy by hard-linking to unchanged files in previous backups. Now I'm using DVC, which takes a git-like approach to the same problem and additionally makes it easier to distribute datasets with a simple pull command.
Despite constantly Googling for backup solutions and replacements for TimeMachine!
Side note: I recently put together a TrueNAS Scale-based NAS box for Time Machine.
It's running 5x 4TB drives in ZFS RAIDZ1, and it's the best networked multi-user Time Machine destination I've ever used (short of a large direct-connected TB SSD)!
It's much more responsive to browse and restore over my LAN than I'd have expected!
If you have a filesystem like ZFS that gives you snapshots, you don't need a tool like the OP's to keep multiple copies. You can run rsync periodically (think cron job) and include the `--delete` flag so that rsync keeps the destination an exact replica of the source. The trick is to keep taking cheap ZFS snapshots on the destination so that those deletions are captured in the snapshots. When you need a file that was deleted a year ago, or its state as of a year ago, simply browse to the snapshot from that time and get your data back.
ZFS stores only the deltas anyway, so snapshots are quite cheap.
This is how I back up the primary storage of a huge HPC cluster - 2.5 petabytes of research data.
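A minimal sketch of that scheme as a crontab entry (the pool and dataset names here are made up; adjust paths and schedule to your setup, and note that `%` must be escaped in crontab):

```shell
# crontab entry (hypothetical names): mirror /data nightly at 02:00, then
# take a cheap ZFS snapshot so the deletions remain recoverable.
0 2 * * * rsync -a --delete /data/ /tank/mirror/ && zfs snapshot tank/mirror@$(date +\%Y-\%m-\%d)
```

Old snapshots can later be pruned with `zfs destroy` on whatever retention schedule you like.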
Are you open to a single dependency [0]? Entirely native tooling is an admirable thing that I greatly appreciate, but parsing subprocess output is fraught with issues (I know, I've done this as well).