Single vs Multi-tier Backup

With so many backup technologies and products on the market today, it can be very difficult to understand and select the right technology for your environment. Although most backup software performs similar tasks, the methods these products use can be quite different. One such difference lies in the notion of single-tier vs multi-tier backup.

Single-tier Backup

In the simplest terms, a single-tier backup involves just one CPU. In other words, a single machine runs the backup process, copying files to their target destination, for example, backing up files to a USB drive or a tape drive. In this case, the process that reads data from the source is also responsible for writing the files to the target device.
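To make this concrete, here is a minimal single-tier sketch in Python; the paths are hypothetical, and the point is simply that one process performs both the read and the write.

    import shutil

    # Single-tier backup: one process on one machine does both the read
    # and the write. Paths below are illustrative.
    SOURCE = "/home/user/documents/report.docx"
    TARGET = "/mnt/usb-drive/backup/report.docx"

    # copy2 reads the whole file from SOURCE and writes it to TARGET,
    # preserving timestamps along the way.
    shutil.copy2(SOURCE, TARGET)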

Multi-tier Backup

Unlike a single-tier backup, a multi-tier backup involves two CPUs running two different processes. The first process is responsible for reading data and sending it to the target machine. The second process receives incoming data from the source and is responsible for writing it to the destination. Typically, the process running on the source machine is called a client and the one running on the target is called a server. A good example of a multi-tier backup is a private or public cloud backup.

Comparing the two mechanisms

Single-tier backups are suitable when the source and target are physically close to each other. In other words, a single-tier backup is typically used to back up data locally. In such cases, the entire file is copied to the destination even if only a fraction of the data has been modified. Since the data transfer speed is typically very fast, there is no need to determine how much of a file differs between source and target. As far as I/O goes, the job is done with one read operation on the source and one write operation on the target.

On the other hand, a multi-tier mechanism is a better choice when data needs to be sent off-site. Most multi-tier backup systems assume they are working over a slow network, for example, across the Internet; therefore, they try to minimize network traffic as much as possible. In such cases a backup client on the source machine communicates with a corresponding server process running on the target side. These two processes (client and server) together determine exactly what needs to be sent across the network. This architecture is similar to a multi-tier relational database system, where a client sends a SQL query and the back-end database returns exactly what the client asked for, reducing network traffic tremendously.

A multi-tier backup system will typically back up only the delta changes within a file. For example, when backing up a 10GB file, it will try to determine how much of the file has changed and send only the changes to the other side. The process of determining the change within a file is called block matching, and it produces a smaller file called a delta, which represents the difference between the source and target file. Depending upon the file type, the actual delta size is usually a small fraction of the original size. Once block matching is complete and a delta has been created, the delta is sent to the server across the network. The server rebuilds the file by merging the delta with the original file, producing a new version of the same file on the server machine.
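The sketch below shows the client/server division of labor using fixed-size blocks. It is a simplification under stated assumptions: the block size and function names are made up, and real products typically use rolling checksums (as rsync does) so that insertions do not break block alignment.

    import hashlib

    BLOCK = 4096  # assumed block size for this sketch

    def block_signatures(old_data: bytes) -> dict:
        """Server side: hash each block of the copy it already has."""
        return {hashlib.sha1(old_data[i:i + BLOCK]).hexdigest(): i
                for i in range(0, len(old_data), BLOCK)}

    def make_delta(new_data: bytes, signatures: dict) -> list:
        """Client side: for each block of the new file, emit either a
        cheap reference to a block the server already has, or the raw
        bytes of a block that changed."""
        delta = []
        for i in range(0, len(new_data), BLOCK):
            block = new_data[i:i + BLOCK]
            digest = hashlib.sha1(block).hexdigest()
            if digest in signatures:
                delta.append(("copy", signatures[digest]))
            else:
                delta.append(("raw", block))
        return delta

    def apply_delta(old_data: bytes, delta: list) -> bytes:
        """Server side: merge the delta with the old file to rebuild
        the new version without receiving the unchanged blocks."""
        result = bytearray()
        for kind, value in delta:
            if kind == "copy":
                result += old_data[value:value + BLOCK]
            else:
                result += value
        return bytes(result)

Only the ("raw", ...) entries carry real payload across the network; if 2% of the blocks changed, roughly 2% of the bytes travel over the wire.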

Maximizing resource management

Three resources are involved in a typical backup:
  1. CPU
  2. Disk I/O
  3. Network
CPUs nowadays are very fast. Disk I/O, although not as fast as the CPU, is still faster than sending data over a network. Since no network is usually involved in a local backup, the two faster resources can get the job done and the slowest resource never comes into the picture. This is why single-tier backups work great when backing up locally.

When backing up off-site, you have no choice but to get the network involved. The slower the network, the longer it takes to back up the data. Therefore, it is important to transfer only what is needed, and multi-tier systems do exactly that.
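A back-of-the-envelope calculation shows why this matters. The throughput figures below are assumptions, not measurements, though they are in a realistic range for a local disk and a modest Internet uplink:

    size_mb = 10 * 1024    # a 10GB file
    disk_mb_s = 200        # assumed local disk throughput
    network_mb_s = 1.25    # assumed 10 Mbit/s uplink, i.e. 1.25 MB/s

    print(f"Local copy:      {size_mb / disk_mb_s:.0f} s")                   # ~51 s
    print(f"Full upload:     {size_mb / network_mb_s / 3600:.1f} h")         # ~2.3 h
    print(f"2% delta upload: {size_mb * 0.02 / network_mb_s / 60:.1f} min")  # ~2.7 min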

File Versioning

One other benefit of a multi-tier architecture is its ability to store deltas. Although different versions of the same file can certainly be stored in a single-tier backup system, multiple versions take significantly more disk space if entire files are stored.

Imagine a 10GB file with 10 versions, which can potentially add up to 100GB of storage. If a backup system is already creating deltas between source and target, it can save these deltas to maintain different versions of the same file. In this case, 10 versions of a 10GB file can take far less than 100GB, since one version is saved in its entirety and the remaining versions are only deltas.
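As a quick sanity check (the 2% average delta size is an assumption; real ratios depend on the file type and how it changes):

    full_gb = 10
    versions = 10
    delta_ratio = 0.02  # assumed: each delta is ~2% of the full file

    naive = full_gb * versions                                      # 100 GB
    delta_based = full_gb + (versions - 1) * full_gb * delta_ratio  # 11.8 GB
    print(f"Full copies: {naive} GB, delta-based: {delta_based} GB")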

FTP and WebDAV - a common misconception

Many backup vendors offer a remote backup solution through FTP or WebDAV. Although backing up to an FTP or WebDAV server qualifies as multi-tier under the definition above, the second tier (the target side) plays no role in reducing network traffic. In other words, an FTP or WebDAV server can neither compute a delta nor merge an incoming delta with the existing file on its end. Therefore, if you back up a 10GB file, the entire file is transferred over and over again.

WebDAV servers have another limitation: size. The WebDAV protocol sits on top of HTTP, which uses the Content-Length header to hold the size of a file. Many implementations store this value as a signed 32-bit integer, which cannot hold a value above 2,147,483,647 bytes (roughly 2.14GB). As a result, backing up large files through such a WebDAV server is not possible.
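The arithmetic behind that limit, for the record: the largest value a signed 32-bit integer can hold is 2^31 - 1.

    max_signed_32 = 2**31 - 1
    print(max_signed_32)          # 2147483647 bytes
    print(max_signed_32 / 10**9)  # ~2.147, i.e. the 2.14GB ceiling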

Conclusion

Single-tier backup systems are great when creating a local backup. However, when creating an off-site backup, a multi-tier backup system works much better. One such multi-tier backup system is Syncrify from Synametrics Technologies, Inc.


Created on: Apr 13, 2015
Last updated on: Nov 30, 2024
