Synametrics Technologies

Revolving around the core of technology

Comparing rsync With Other Technologies

Understanding File-Level backups

When discussing file-level backup types, there are two main schools of thought: differential and incremental.

Differential backups start with a full backup, and each backup after that transfers everything that has changed since that full backup, until you decide to do a full backup again. For example, if I make a full backup on Monday and run a differential backup on Thursday, it will contain any changes I have made between Monday and Thursday. If I then run another differential backup on Sunday, it will contain all changes made from Monday to Sunday. The cycle resets when I do a full backup.

Incremental backups improve on this idea. With incremental backups, I also start with a full backup. Every backup after that transfers only what has changed since the last backup, be that a full backup or an incremental backup. This means that if I run a full backup on Monday, my Thursday incremental backup covers only Monday to Thursday, and my incremental backup on Sunday covers only Thursday to Sunday.

Simply put, differential backups are more encompassing, take longer to back up, and are quicker to restore, whereas incremental backups are more precise and quicker to back up, but take longer to restore.
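To make the Monday, Thursday, Sunday example concrete, here is a minimal sketch in Python. The change log, the file names, and the helper functions are all hypothetical and exist only for illustration; real backup software detects changes in its own way.

```python
# Day numbers for the example week: Monday = 0, Thursday = 3, Sunday = 6.
MONDAY, THURSDAY, SUNDAY = 0, 3, 6

# Hypothetical change log: (day the change happened, file that changed).
changes = [(1, "a.txt"), (2, "b.txt"), (4, "c.txt"), (5, "a.txt")]


def files_changed(log, after_day, up_to_day):
    """Files with at least one change after after_day and up to up_to_day."""
    return sorted({name for day, name in log if after_day < day <= up_to_day})


def differential(log, last_full_day, backup_day):
    # Differential: always measured from the last FULL backup.
    return files_changed(log, last_full_day, backup_day)


def incremental(log, previous_backup_day, backup_day):
    # Incremental: measured from the previous backup of any kind.
    return files_changed(log, previous_backup_day, backup_day)


thursday_differential = differential(changes, MONDAY, THURSDAY)
sunday_differential = differential(changes, MONDAY, SUNDAY)
thursday_incremental = incremental(changes, MONDAY, THURSDAY)
sunday_incremental = incremental(changes, THURSDAY, SUNDAY)

print(sunday_differential)  # everything changed since Monday
print(sunday_incremental)   # only what changed since Thursday
```

The two Thursday backups are identical, because in both cases the previous backup is Monday's full backup. The difference only shows up on Sunday, where the differential backup carries b.txt again and the incremental backup does not.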

	Step 1	Step 2	Step 3
rdiff	Calculate Delta	Transfer Delta	N/A
rsync	Calculate Delta	Transfer Delta	Merge Delta

Of course, there are many other differences between rdiff and rsync, but the general concept holds. As you can see from the table above, they are virtually identical with the exception of the final step. However, this final step is very important once you understand its ramifications. If you look at the destination of both of these backups, you will see very different results.

An rdiff backup does not merge the delta during backup; it does so during restore. This means that the destination will contain only delta files. With rsync backups, the delta is merged at the time of backup, meaning that the destination will contain complete files that you can open and use.
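The three steps can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the actual rdiff or rsync algorithm: it compares fixed, aligned blocks and leaves out the rolling checksum that the real tools use. The block size, function names, and sample data are assumptions made for the example.

```python
import hashlib

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def block_hash(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def signature(old: bytes) -> dict:
    """Map the hash of every block in the old file to its block index."""
    return {block_hash(old[i:i + BLOCK]): i // BLOCK
            for i in range(0, len(old), BLOCK)}


def calculate_delta(old_signature: dict, new: bytes) -> list:
    """Step 1: describe the new file as references to old blocks or raw data."""
    delta = []
    for i in range(0, len(new), BLOCK):
        chunk = new[i:i + BLOCK]
        digest = block_hash(chunk)
        if digest in old_signature:
            delta.append(("copy", old_signature[digest]))  # block already exists
        else:
            delta.append(("data", chunk))  # only this part must be transferred
    return delta


def merge_delta(old: bytes, delta: list) -> bytes:
    """Step 3: rebuild the complete new file from the old file plus the delta."""
    out = bytearray()
    for kind, value in delta:
        if kind == "copy":
            out += old[value * BLOCK:(value + 1) * BLOCK]
        else:
            out += value
    return bytes(out)


old_file = b"AAAABBBBCCCC"
new_file = b"AAAAXXXXCCCC"

delta = calculate_delta(signature(old_file), new_file)

# rdiff-style destination: stops after step 2 and stores only the delta.
rdiff_style_destination = delta
# rsync-style destination: performs step 3 and stores a complete, usable file.
rsync_style_destination = merge_delta(old_file, delta)
```

In this sketch only the middle block travels as raw data. The rdiff-style destination ends up holding a list of instructions, while the rsync-style destination holds the complete new file.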

For the purpose of this article, I will compare three backup scenarios: an rdiff backup, an rsync backup, and an rsync backup over HTTP. I feel that rsync over HTTP is worth looking at, since it opens up many possibilities for improving backups as a whole. The metrics that we will be using are as follows: Manageability, Useability, Reliability, Security, and Performance. For all of these, we will assume that generic software is using these algorithms.

Manageability

	rdiff	rsync	rsync over HTTP
Configuration	Source Configurable	Source & destination configurable	Source & destination configurable
User Management	Limited to no User management	Limited to no User management	High User management
Destination Differences	Incomplete Destination Files	Complete Destination Files	Complete Destination Files
Versioning	Versioning by Default	Versioning with Custom Scripts	Versioning via Software

With rdiff, you can configure a source to select which files to back up and direct them to a storage destination. When the files get to the destination, they are simply stored. With rsync, you can configure both the source and the destination. With rsync over HTTP, you can configure both of these via a web portal.

A benefit of rdiff over rsync is that rdiff will always maintain versions of a file, whereas rsync maintains only the most recent version. With rsync over HTTP, there are methods that can be implemented to add versioning as a feature.
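As a rough idea of what a custom versioning script does, the sketch below saves the old destination file as a numbered version before overwriting it. It uses a plain file copy in place of a real rsync transfer, and the function name, file names, and ".vN" naming scheme are hypothetical. It does not describe how any particular product implements versioning.

```python
import shutil
import tempfile
from pathlib import Path


def backup_keeping_versions(source: Path, dest: Path, versions_dir: Path) -> None:
    """Copy source over dest, first saving the old dest as a numbered version."""
    if dest.exists():
        versions_dir.mkdir(parents=True, exist_ok=True)
        # Count the versions already saved to pick the next number.
        number = len(list(versions_dir.glob(dest.name + ".v*"))) + 1
        shutil.copy2(dest, versions_dir / f"{dest.name}.v{number}")
    # Stand-in for the real transfer: the destination gets the newest copy.
    shutil.copy2(source, dest)


def demo() -> list:
    """Back up a file three times and report what the destination holds."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "report.txt"
        dest = root / "backup.txt"
        versions = root / "versions"
        for text in ("first", "second", "third"):
            source.write_text(text)
            backup_keeping_versions(source, dest, versions)
        saved = sorted(p.name for p in versions.iterdir())
        return [dest.read_text(), saved]


result = demo()
```

After three backups, the destination file holds the newest content and the versions directory holds the two older copies. Without the extra step, only the newest copy would survive.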

Useability

	rdiff	rsync	rsync over HTTP
Network Requirement	Can run over a slow network	Can run over a slow network	Can run over a slow network
Flexibility	Not Flexible	Flexible	Very Flexible
Portability	Source Files not Portable	Portable Source Files	Portable Source and Destination Files
File Interactiveness	Can Only interact from Source	Can Interact from Source and Destination from their locations	Can Interact from Source and Destination from any location
Synchronization	No Synchronization	Synchronization Possible	Multi-level Synchronization

When it comes to useability, there are many factors. For an average user, rdiff and rsync are fairly similar when it comes to triggering backups from, say, a command line. Software built on these algorithms adds features and ease of use for the average user, but what the users can do with each is different. rdiff is not as flexible as rsync, specifically because it lacks the destination receiver that rsync requires, which is also why rdiff cannot merge at the destination.

The biggest drawback for rdiff in this category is the loss of synchronization. With rdiff, one side has meaningful and useable files, whereas the other has only the delta files. With rsync, we can keep A and B identical across a network, and with rsync over HTTP you add the possibility of keeping multiple machines in sync with each other.

Reliability

	rdiff	rsync	rsync over HTTP
File Corruption	Corruption is a Major Issue	Corruption is a Minor Issue	Corruption is a Minor Issue
Half Finished Backups	Can Recognize Partial Backups	Can Recognize Partial Backups	Can Recognize Partial Backups

With respect to reliability, there is a major concern with rdiff vs. rsync. The entire point of a backup is to have the ability to restore data when needed. If your backup software maintains versions of files, this is even more true. The restore process differs between the two algorithms, again due to the merging process.

When you restore via rdiff, the source machine pulls the necessary files from the destination machine and rebuilds the file starting from the base, applying each version in increasing order until it reaches the most recent copy of the file. The rsync algorithm does not do versioning by default, since it merges the deltas at the destination when they get there. Software that uses rsync over HTTP can offer features that maintain versions of files as well, with the difference being that restoration runs in the reverse direction to rdiff.

For example, say that in all three scenarios we back up a previously backed-up file five times with changes, and now we need to restore the most recent copy. However, in this situation Version 2 is corrupt. You can see the effect below; remember that rsync by itself does not maintain these versions.

Green = We can successfully restore this

Red = We cannot restore this

Blue = Can only restore this if it is the most current version.

rdiff	rsync	rsync over HTTP
Original File	New Current File	Original File
Version 1	New Current File	Version 1
Version 2	New Current File	Version 2
Version 3	New Current File	Version 3
Version 4	New Current File	Version 4
Most Current File	New Current File	Most Current File

This may sound a bit confusing, but think of the restore process for each as follows:

rdiff

rdiff doesn't merge the delta at the destination after a backup, so the only full copy of a file is the very first backup.
In order to get to any version of a backup, rdiff must take the original file and merge it with the next version (Ex: Original -> Version 1 -> Version 2 -> etc.)
If a version is corrupted, it can no longer merge since the prerequisite version is broken, and subsequent versions no longer merge properly.

rsync

rsync merges the delta at the destination at the time of backup, so the destination always holds a complete copy of the most recent file.
rsync by itself does not maintain earlier versions, so only the most recent copy can be restored.
A restore does not depend on a chain of earlier versions, so there is no prerequisite version that can break it.

rsync over HTTP

Software that uses rsync over HTTP merges the delta at the destination, so the most recent copy of a file is always a complete file.
When such software maintains versions, restoration runs in the reverse direction to rdiff: it starts from the most recent copy and works back toward older versions (Ex: Most Current -> Version 4 -> Version 3 -> etc.)
If a version is corrupted, the versions newer than it can still be restored, since they do not depend on it.

In most cases, you will not need to restore the original copy of a file, but a version in between. In this situation, the longer a backup has been running, the more reliable rsync over HTTP becomes, since you can still retrieve files after a corruption occurs.
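The effect of the corrupt Version 2 can be modelled with a small Python sketch. It assumes, based on the description above, that rdiff rebuilds versions forward from the original file and that versioning software using rsync over HTTP rebuilds them backward from the most current file. The function names are hypothetical, and the model ignores how any real product stores its versions.

```python
VERSIONS = ["Original", "Version 1", "Version 2",
            "Version 3", "Version 4", "Most Current"]


def restorable_forward(versions, corrupt):
    """Forward chain: start at the first entry and stop at the first corrupt one."""
    reachable = []
    for name in versions:
        if name in corrupt:
            break  # every later version depends on this broken one
        reachable.append(name)
    return reachable


def restorable_reverse(versions, corrupt):
    """Reverse chain: start at the newest entry and walk back toward the oldest."""
    newest_first = list(reversed(versions))
    reachable = restorable_forward(newest_first, corrupt)
    return list(reversed(reachable))  # report in the original order


corrupt = {"Version 2"}

rdiff_like = restorable_forward(VERSIONS, corrupt)
rsync_http_like = restorable_reverse(VERSIONS, corrupt)

print(rdiff_like)       # versions an rdiff-style chain can still rebuild
print(rsync_http_like)  # versions a reverse chain can still rebuild
```

Under these assumptions, the forward chain loses everything from Version 2 onward, including the most current file, while the reverse chain keeps the most current file and every version newer than the corrupt one.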

Security

rdiff	rsync	rsync over HTTP
SSH	SSH	No SSH
No SSL	SSL	SSL
No Authentication / 2FA	Authentication / No 2FA	Authentication + 2FA

Security is a major concern with backups, especially with data breaches becoming more and more commonplace. Typical rdiff and rsync backups don't provide much security during the transfer, but they allow you to secure both machines in whatever way you wish. rsync over HTTP, on the other hand, offers a variety of additional security measures, specifically on the fly. There are many methods of transferring data over HTTP with rsync, and software that uses them provides a multitude of additional security measures, such as requiring SSL, requiring 2FA on accounts, and using built-in AES encryption.

Performance

Performance is a slightly different metric compared to the others. Looking at backup time alone, rdiff should be faster than both rsync and rsync over HTTP, since it does not merge during backup. Restoration should be faster with rsync and rsync over HTTP than with rdiff, since rdiff still has to merge at that point. Taken together, their overall performance is very similar.

Conclusion

Overall, the choice of which algorithm, or software, to use when backing up files is different for everyone. Some situations may not need the advantages of rsync over HTTP, such as a completely closed network backup. However, once you introduce a network into the picture, the overall benefits and scalability provided by rsync over HTTP are unmatched.

Mailing Address

490 State Rt. 33,
Bldg 2, Unit 2
Millstone, NJ 08535

Phone/Email

Phone: +1609-750-0007
Email: sales@synametrics.com
https://www.synametrics.com
