Back to basics: The differences between Backup and Replication
While data backup and replication have their similarities, they are not the same. Rather than competing with one another, they can be used as complementary tools to maximize the efficiency of an IT environment.
Data backup is the process of taking a copy of data at a fixed point in time and storing it for a set time frame (the retention period) in a location separate from its original source.
Backups are typically used to meet data protection regulations and compliance requirements, and to protect against data loss.
Data replication also requires a copy of data to be taken and transferred to an alternate storage platform. Replication, however, creates a synchronous or near-synchronous copy, usually designed to limit potential downtime should primary systems fail.
Backup is an essential tool for organizations of all sizes and goes some way to meeting data protection legislation and industry compliance requirements, such as the Schools Financial Value Standard (SFVS) for schools in the UK, the Protection of Personal Information Act (POPI) in South Africa or the USA PATRIOT Act in the United States.
The ability to efficiently restore a file that has been lost, corrupted or deleted is a driver for organizations to invest in backup solutions, whether these are onsite, offsite or hybrid.
If replication gives me an identical copy almost instantly, why is it not an effective backup?
An instant, or near-instant, failover is most effective in the case of a full system failure or loss. A disaster such as a fire or flood at a primary site has the potential to cause significant financial losses if a business cannot continue to operate. Being able to fail over and keep systems such as web or mail servers online can allow businesses to keep trading.
If a file is corrupted or deleted on a primary system, that corruption or deletion will also be copied to the replicated system, so a historical version may be needed to access a usable copy of the data. A backup is one way of making sure there is an intact copy of the file.
Ransomware remains a threat to all organizations and comes in many forms and strains. Should a system become infected with ransomware, the infection will be replicated to the secondary copy and would render that system unusable as well. However, as long as there is a secure offsite backup, the data can be restored to the primary storage systems.
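The difference described above can be illustrated with a minimal Python sketch. It is a toy model only: the `Replica` and `BackupStore` classes, the file name and the snapshot label are all hypothetical and do not correspond to any real product. It assumes replication mirrors the current state of the primary, while backup keeps separate point-in-time copies.

```python
class Replica:
    """Hypothetical secondary system that mirrors the primary's current state."""

    def __init__(self):
        self.files = {}

    def sync(self, primary_files):
        # Replication copies whatever the primary holds right now,
        # including any corruption or deletion.
        self.files = dict(primary_files)


class BackupStore:
    """Hypothetical backup store that keeps point-in-time snapshots."""

    def __init__(self):
        self.snapshots = {}  # label -> copy of the files at that moment

    def take_snapshot(self, label, primary_files):
        self.snapshots[label] = dict(primary_files)

    def restore(self, label):
        return dict(self.snapshots[label])


primary = {"report.docx": "quarterly figures"}
replica = Replica()
backups = BackupStore()

# Normal operation: the replica is in sync and a backup has been taken.
replica.sync(primary)
backups.take_snapshot("monday", primary)

# The file is corrupted on the primary (for example by ransomware),
# and the next replication cycle carries the damage across.
primary["report.docx"] = "ENCRYPTED"
replica.sync(primary)

# The replica now holds the damaged file; the backup still holds the original.
restored = backups.restore("monday")
print(replica.files["report.docx"])
print(restored["report.docx"])
```

In this model the replica is only ever as healthy as the primary at the last sync, whereas the snapshot taken before the incident is unaffected by it.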
When is it best to use replication?
Replication can drastically reduce an organization’s Recovery Time Objective (RTO) and Recovery Point Objective (RPO), thanks to its near-instant copy and the ability to fail over to secondary systems.
- The Recovery Time Objective (RTO) is the time limit set by the business for recovering data and having systems running at a normal level again. Subsets of data may have different RTOs depending on their importance or availability.
- The Recovery Point Objective (RPO) is the maximum acceptable age of the most recent recoverable copy of data, in other words the maximum amount of time between recovery points. If the business can afford to lose a day’s work, this is likely to be set at 24 hours.
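As a worked illustration of the two objectives, the following Python sketch checks a single incident against an RPO and an RTO. The timestamps, the 4-hour RTO and the helper names are assumptions made up for the example; only the 24-hour RPO comes from the text above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_copy_time, failure_time):
    """Work done between the last recoverable copy and the failure is lost."""
    return failure_time - last_copy_time


def meets_objective(actual, objective):
    """An objective is met when the measured duration does not exceed it."""
    return actual <= objective


# Objectives set by the business (the RTO value is an assumed example).
rpo = timedelta(hours=24)
rto = timedelta(hours=4)

# Hypothetical incident timeline.
last_backup = datetime(2024, 1, 1, 22, 0)
failure = datetime(2024, 1, 2, 15, 30)
service_restored = datetime(2024, 1, 2, 18, 0)

loss = data_loss_window(last_backup, failure)   # 17 hours 30 minutes
downtime = service_restored - failure           # 2 hours 30 minutes

print("RPO met:", meets_objective(loss, rpo))
print("RTO met:", meets_objective(downtime, rto))
```

With a nightly backup the loss window can approach 24 hours, whereas a near-synchronous replica would shrink it to seconds or minutes, which is why replication lowers the achievable RPO.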
Replication as a tool for business continuity and disaster recovery is something that enterprise organizations have relied on for years. Traditionally this would involve a secondary data centre or storage platform, identical to the primary, being provisioned and maintained at a significant additional cost to the organization.
With public cloud storage platforms such as Microsoft’s Azure platform or Amazon’s Glacier platform, and the ability to run virtual machines in these environments, replication is becoming more accessible to smaller organizations.
Replication is most effective as a tool for near-instant recovery, but not for keeping historical copies or for meeting legislative requirements.
Which one should I use?
Whether replication should be used ultimately depends on the requirements of your organization and the policies that are in place. Backup, however, should always be used in one form or another.
If there is a requirement for high availability or an RTO of less than 12 hours, then replication is a good fit. However, unless cloud-based infrastructure is used, this can still be a very costly investment.
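The rule of thumb above can be written as a small Python sketch. The function name `recommend` and its parameters are hypothetical, the 12-hour threshold is taken from the guidance above, and cost is deliberately left out, so treat it as an illustration rather than a sizing tool.

```python
from datetime import timedelta

# Threshold taken from the rule of thumb in the text.
REPLICATION_RTO_THRESHOLD = timedelta(hours=12)


def recommend(rto, needs_high_availability=False):
    """Return the tools suggested for a given RTO and availability need."""
    tools = ["backup"]  # backup is always used in one form or another
    if needs_high_availability or rto < REPLICATION_RTO_THRESHOLD:
        tools.append("replication")
    return tools


print(recommend(timedelta(hours=4)))
print(recommend(timedelta(hours=24)))
print(recommend(timedelta(hours=24), needs_high_availability=True))
```

Backup appears in every result because replication complements it rather than replacing it.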