Single vs Multi-tier Backup
The importance of backup is usually not a tough sell. Even individuals who never back up their data will swear they should. Your data is valuable. It will cost you time and effort to recreate it, and that costs money. In the worst-case scenario, you may never be able to recover from a disaster if you do not have a backup.
With so many backup technologies and products on the market today, it can be difficult to understand and select the right technology for your environment. Although most backup software performs similar tasks, the methods used by these products can be quite different. One such difference lies in the notion of single-tier vs multi-tier backup.
Single-tier Backup
In the simplest terms, a single-tier backup is one where only one CPU is involved. In other words, only one process backs up files to the target destination, for example when backing up files to a USB drive or a tape drive. In this case, the process that reads data from the source is also responsible for writing the files to the target device.
Multi-tier Backup
Unlike a single-tier backup, a multi-tier backup involves two CPUs running two different processes. The first process is responsible for reading data and sending it to the target machine. The second process receives incoming data from the source and is responsible for writing it to the destination. Typically, the process running on the source machine is called a client and the one running on the target is called a server. A good example of a multi-tier backup is a private or public cloud backup.
Comparing the two mechanisms
Single-tier backups are suitable when the source and target are physically close to each other. In other words, a single-tier backup is typically used to back up data locally. In such cases, the entire file is copied to the destination even if just a fraction of the data has been modified. Since local data transfer is typically very fast, there is no need to determine how much of a file differs between source and target. As far as I/O goes, this is done with one read operation on the source and one write operation to the target.
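As a rough illustration of the single-tier case, the sketch below uses one process to read a file and write it in full to a local target. The function name `single_tier_backup` and the example paths are hypothetical and are not taken from any particular product.

```python
import shutil
from pathlib import Path


def single_tier_backup(source: Path, target_dir: Path) -> Path:
    """Copy the whole file to the target, no matter how little of it changed."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # The same process reads the source and writes the target.
    shutil.copy2(source, target)
    return target


# Hypothetical usage: back up a document to a mounted USB drive.
# single_tier_backup(Path("report.docx"), Path("/mnt/usb/backup"))
```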
On the other hand, a multi-tier mechanism is a better choice when data needs to be sent offsite. Most multi-tier backup systems assume they are working on a slow network, such as across the Internet, and therefore try to minimize network traffic as much as possible. In such cases, a backup client on the source machine communicates with a corresponding server process running on the target side. These two processes (client and server) together determine exactly what needs to be sent across the network. This architecture is similar to a multi-tier relational database system where a client sends a SQL query and the back-end database returns exactly what the client asked for, greatly reducing network traffic.
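One way the client and server could decide what needs to cross the network is to compare fingerprints of each file. The sketch below assumes the server keeps an index of SHA-256 hashes for the files it already holds; that index, and the names `fingerprint` and `files_to_send`, are assumptions made for illustration rather than a description of any specific product.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies the content."""
    return hashlib.sha256(data).hexdigest()


def files_to_send(client_files: dict[str, bytes], server_index: dict[str, str]) -> list[str]:
    """Return the names of files whose content the server does not already hold."""
    return sorted(
        name
        for name, data in client_files.items()
        if server_index.get(name) != fingerprint(data)
    )


# Hypothetical state: a.txt is unchanged, b.txt was modified on the client.
server_index = {"a.txt": fingerprint(b"hello"), "b.txt": fingerprint(b"old")}
client_files = {"a.txt": b"hello", "b.txt": b"new"}
print(files_to_send(client_files, server_index))  # ['b.txt']
```

Only the short fingerprints need to be exchanged to make this decision, so unchanged files cost almost no network traffic.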
A multi-tier backup system typically backs up only the changes (the delta) within a file. For example, when backing up a 10GB file, it tries to determine how much of the file has changed and sends only those changes to the other side. The process of determining the change within a file is called block matching, and it results in a smaller file called a delta, which represents the difference between the source and target files. Depending upon the file type, the delta is usually a small fraction of the original size. Once block matching is complete and a delta has been created, the delta is sent to the server across the network. The server rebuilds the file by merging the delta with its existing copy, producing the new version of the file on the server machine.
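The block matching and merge steps described above can be sketched as follows. This is a deliberately simplified version that compares fixed-size blocks by their hashes; it does not handle data that shifts position within a file, which real products usually address with more elaborate matching. The names `block_hashes`, `make_delta`, `apply_delta` and the tiny `BLOCK_SIZE` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is easy to follow


def block_hashes(old_data: bytes, block_size: int = BLOCK_SIZE) -> dict[str, int]:
    """Server side: map the hash of each block in the old file to its block index."""
    hashes: dict[str, int] = {}
    for offset in range(0, len(old_data), block_size):
        digest = hashlib.sha256(old_data[offset:offset + block_size]).hexdigest()
        hashes.setdefault(digest, offset // block_size)
    return hashes


def make_delta(new_data: bytes, old_hashes: dict[str, int], block_size: int = BLOCK_SIZE) -> list:
    """Client side: reference blocks the server already has, send the rest as raw bytes."""
    delta = []
    for offset in range(0, len(new_data), block_size):
        block = new_data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest in old_hashes:
            delta.append(("copy", old_hashes[digest]))
        else:
            delta.append(("data", block))
    return delta


def apply_delta(old_data: bytes, delta: list, block_size: int = BLOCK_SIZE) -> bytes:
    """Server side: rebuild the new version from the old file plus the delta."""
    out = bytearray()
    for kind, value in delta:
        if kind == "copy":
            start = value * block_size
            out += old_data[start:start + block_size]
        else:
            out += value
    return bytes(out)


old = b"AAAABBBBCCCCDDDD"  # version already on the server
new = b"AAAAXXXXCCCCDDDD"  # version on the client, one block changed
delta = make_delta(new, block_hashes(old))
assert apply_delta(old, delta) == new
```

In this example only the 4 changed bytes travel as raw data; the other three blocks are sent as short references to blocks the server already holds.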
Maximizing resource management
Three resources are involved in a typical backup:
When backing up off-site, you have no choice but to involve the network. The slower the network, the longer it takes to back up data. Therefore, it is important to transfer only what is needed, and multi-tier systems do exactly that.
File Versioning
One other benefit of a multi-tier architecture is its ability to store deltas. Although different versions of the same file can certainly be stored in a single-tier backup system, multiple versions take significantly more disk space if entire files are stored.
Imagine keeping 10 versions of a 10GB file, which can add up to 100GB of storage. If a backup system is already creating deltas between source and target, it can save these deltas to maintain different versions of the same file. In this case, 10 versions of a 10GB file can take a lot less than 100GB, since one version is saved in its entirety and the other versions are stored only as deltas.
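To put rough numbers on this, the sketch below compares the two approaches. The 5% change rate per version is purely an assumed figure for illustration; actual delta sizes depend on the file type and on how much of the file really changes.

```python
def storage_needed_gb(file_size_gb: int, versions: int, change_percent: int):
    """Return (GB used by full copies, GB used by one full copy plus deltas)."""
    full_copies = file_size_gb * versions
    # One complete copy, then one delta for each additional version.
    with_deltas = file_size_gb + (versions - 1) * file_size_gb * change_percent / 100
    return full_copies, with_deltas


# 10 versions of a 10GB file, assuming 5% of the file changes per version.
print(storage_needed_gb(10, 10, 5))  # (100, 14.5)
```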
FTP and WebDAV - a common misconception
Many backup vendors offer a remote backup solution through FTP or WebDAV. Although backing up to an FTP or WebDAV server fits the above definition of a multi-tier backup architecture, the second tier (the target side) does not play any role in reducing network traffic. In other words, an FTP or WebDAV server can neither compute a delta nor merge an incoming delta with the existing file on its end. Therefore, every time you back up a 10GB file, the entire file is transferred, over and over again.
WebDAV servers can have another limitation: size. The WebDAV protocol sits on top of HTTP, which uses the Content-Length header to hold the size of a file. The HTTP specification itself does not cap this value, but implementations that store it in a signed 32-bit integer cannot hold a value above 2,147,483,647 bytes (about 2.14GB). With such implementations, backing up larger files through WebDAV is not possible.
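The 2.14GB figure corresponds to the largest value a signed 32-bit integer can hold. The short check below shows the arithmetic; the helper name `fits_in_signed_32` is hypothetical, and the example assumes 1GB means 1000^3 bytes.

```python
MAX_SIGNED_32 = 2**31 - 1  # largest value of a signed 32-bit integer


def fits_in_signed_32(size_bytes: int) -> bool:
    """Return True if a file size can be stored in a signed 32-bit integer."""
    return 0 <= size_bytes <= MAX_SIGNED_32


GB = 1000**3  # assumption: decimal gigabytes

print(MAX_SIGNED_32)                 # 2147483647
print(fits_in_signed_32(2 * GB))     # True
print(fits_in_signed_32(10 * GB))    # False
```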
Conclusion
Single-tier backup systems are great when creating a local backup. However, when creating an off-site backup, a multi-tier backup system works much better. One such multi-tier backup system is Syncrify from Synametrics Technologies, Inc.