Private cloud computing offers a number of significant advantages – including lower costs, faster server deployments, and higher levels of resiliency. What is often over looked is how the Private Cloud can dramatically changes the game for IT disaster recovery in terms of significantly lower costs, faster recovery times, and enhanced testability.
Before we talk about the private cloud, let’s explore the challenges of IT disaster recovery for traditional server systems.
Most legacy IT systems are comprised of a heterogeneous set of hardware platforms – added to the system over time – with different processors, memory, drives, BIOS, and I/O systems. In a production environment, these heterogeneous systems work as designed, and the applications are loaded onto the servers and maintained and patched over time.
Offsite backups of these heterogeneous systems can be performed and safely stored at an offsite location. There are really 2 options for backing up and restoring the systems:
1) Back up the data only – where the files are backed up from the local server hard drives to the offsite location either through tapes, online or between data centers over a dedicated fiber connection. The goal is to assure that all of the data is captured and recoverable. To recover the server in the case of a disaster, the operating system needs to be reloaded and patched to the same level as the production server, the applications need to be reloaded, re-patched, and configured, and then the backed up data can be restored to the server. Reloading the operating system and applications can be a time consuming process, and assuring that the system and applications are patched to the same levels as the production server can be subject to human memory and error – both of which can lengthen the recovery time. (This is why I hate upgrading my laptop hardware. I have to invest days to get a new laptop to match the configuration of my old laptop).
2) Bare Metal Restore – a much faster way to recover the entire system. BMR creates an entire snapshot of the operating system, applications, system registry and data files, and restores the entire system on similar hardware exactly as it was configured in the production system. The gotcha is the “similar hardware” requirement. This often requires the same CPU version, BIOS, and I/O configuration to assure the recovery will be operational. In a heterogeneous server environment, duplicate servers need to be on-hand to execute a bare metal restoration for disaster recovery. As a result, IT disaster recovery for heterogeneous servers systems either sacrifice recovery time or requires the hardware investment be fully duplicated for a bare metal restoration to be successful.
Enter disaster recovery for private cloud computing. First, with all of the discussion about “cloud computing”, let me define what I mean by private cloud computing. Private Cloud computing is a virtualized server environment that is:
Designed for rapid server deployment – as with both public and private clouds, one of the key advantages of cloud computing is that servers can be turned up & spun down at the drop of a hat.
Dedicated – the hardware, data storage and network are dedicated to a single client or company and not shared between different users.
Secure – Because the network is dedicated to a single client, it is connected only to that client’s dedicated servers and storage.
Compliant – with the dedicated secure environment, PCI, HIPAA, and SOX compliance is easily achieved.
As opposed to public cloud computing paradigms, which are generally deployed as web servers or development systems, private cloud computing systems are preferred by mid and large size enterprises because they meet the security and compliance requirements of these larger organizations and their customers.
When production applications are loaded and running on a private cloud, they enjoy a couple of key attributes which dramatically redefine the approach to disaster recovery:
1) The servers are virtualized, thereby abstracting the operating system and applications from the hardware.
2) Typically (but not required) the cloud runs on a common set of hardware hosts – and the private cloud footprint can be expanded by simply adding an additional host.
3) Many larger private cloud implementations are running with a dedicated SAN and dedicated cloud controller. The virtualization in the private cloud provides the benefits of bare metal restoration without being tied to particular hardware. The virtual server can be backed up as a “snapshot” including the operating system, applications, system registry and data – and restored on another hardware host very quickly.
This opens up 4 options for disaster recovery, depending on the recovery time objective goal.
1) Offsite Backup – The simplest and fastest way to assure that the data is safe and offsite is to back up the servers to a second date center that is geographically distanced from the production site. If a disaster occurs, new hardware will need to be located to run the system on, which can extend the recovery time depending on the hardware availability at the time of disaster.
2) Dedicated Warm Site Disaster Recovery – This involves placing hardware servers at the offsite data center. If a disaster occurs, the backed up virtual servers can be quickly restored to the host platforms. One advantage to note here is that the hardware does not need to match the production hardware. The disaster recovery site can use a scaled down set of hardware to host a select number of virtual servers or run at a slower throughput than the production environment.
3) Shared Warm Site Disaster Recovery – In this case, the private cloud provider delivers the disaster recovery hardware at a separate data center and “shares” the hardware among a number of clients on a “first declared, first served” basis. Because most disaster recovery hardware sits idle and clients typically don’t experience a production disaster at the same time, the warm site servers can be offered at a fraction of the cost of a dedicated solution by sharing the platforms across customers.
4) Hot Site SAN-SAN Replication – Although more expensive than warm site disaster recovery, SAN-SAN replication between clouds at the production and disaster recovery sites provides the fastest recovery and lowest data latency between systems. Depending on the recovery objectives, the secondary SAN can be more cost effective in terms of the amount and type of storage, and the number and size of physical hardware servers can also be scaled back to accommodate a lower performance solution in case of a disaster.
Conclusion: An often overlooked benefit of private cloud computing is how it changes the IT disaster recovery game. Once applications are in production in a private cloud, disaster recovery across data centers can be done at a fraction of the cost compared to traditional heterogeneous systems, and deliver much faster recovery times.
Mike T Klein