Are You Really Protected from Disaster?

High Availability / Disaster Recovery

Just how reliant is your organization on technology to carry out its business objectives? If you’re among the majority that collectively spent billions of dollars to ensure that their companies successfully made it through the century date change, the answer is very reliant. Yet, when it comes to ensuring that mission-critical business operations don’t come to a standstill due to other potential disasters, most companies have not been quite so astute at making certain that these same systems remain operational. In fact, enterprise resource planning (ERP), Internet-based applications, and globalization, among other issues, are driving the need for 24/7 availability. As a result, more organizations are finding that they require zero data loss and recovery time measured in minutes rather than hours if critical operations are disrupted.

The reality in most organizations is that this is simply not being achieved. In fact, the Vulnerability Index—an independently conducted survey commissioned by Comdisco and cosponsored by BellSouth that identifies how prepared companies are in the event of disruptions or lack of access to computer systems, data, and applications—indicates that the majority of organizations do not have practices in place to deliver on that level of expectation. In many instances, organizations require high availability, yet their plans (if they have them) are not effective, and applications such as e-commerce promise to make a serious problem only more acute.

Missing Effectiveness in Program Planning

Given the increasing importance of technology in carrying out strategic business goals, logic would dictate that organizations have programs in place to protect these critical systems and applications. However, as Figure 1 indicates, the majority of organizations do not have effective programs in place to ensure the viability of systems that support their operations. At best, just 39 percent of organizations have an effective continuity program for enterprise application servers, according to the 1999 Vulnerability Index. While this is a slight increase from 1997, when 35 percent of organizations reported effective programs in place, it does not keep pace with the increasing reliance on these systems. Similarly, only about one-third of organizations have effective continuity programs in place for their WANs and LANs.

Of growing concern is the fact that even fewer organizations have continuity programs for their Internet-related activities. Currently, only 14 percent of those with Internet applications have effective Internet business continuity plans in place. As more organizations move to Internet-reliant functions, this lack of preparation will become even more alarming.

Bad Things Do Happen

Just how important is a formal business continuity plan to a company’s ability to reduce its vulnerability successfully? It’s plenty important, according to the study’s findings. For example, among companies without formal programs, average Vulnerability Index scores were 175 percent higher for enterprise and application servers and 34 percent higher for Internet-based applications than scores for those with formal continuity programs in place.

Those taking the hope-for-the-best approach should be aware that the odds aren’t so good when it comes to avoiding disaster. About one in four organizations (26 percent) participating in this survey has experienced a disruption of its computer systems or an inability to access them. Additionally, as Figure 2 shows, among companies that experienced a disruption, nearly one-fourth had a disruption of more than 24 hours. The median length of time of these companies’ computer disruptions was eight hours.

High Availability Required but Not in Place

Given the criticality of systems and applications, these types of disruptions are intolerable to most organizations. In fact, nearly one-half of the organizations surveyed said that they require 99 percent or greater availability of key applications. However, as Figure 3 shows, the most common steps taken (fault-tolerant hardware and redundant hardware on-site) do not address availability should an organization lose access to its location. Additionally, 13 percent have taken no steps whatsoever to ensure high availability.
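
To put an availability target in concrete terms, it can help to convert the percentage into permitted downtime. The following is a minimal Python sketch of that arithmetic; the function name and the assumption of round-the-clock operation over a 365-day year are illustrative and are not taken from the survey.

```python
HOURS_PER_YEAR = 365 * 24  # assumes 24/7 operation over a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% availability allows about "
          f"{allowed_downtime_hours(target):.1f} hours of downtime per year")
```

Under these assumptions, a 99 percent target permits roughly 87.6 hours of downtime a year, so a single outage of more than 24 hours would consume over a quarter of the annual allowance.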

Generally, the level of availability required is based on the criticality of systems in day-to-day operations and on the financial impact of not having those systems operational. In continuity terms, this translates into recovery time objectives and recovery point objectives. A recovery time objective (RTO) is the amount of downtime tolerable as a result of a disruption, and a recovery point objective (RPO) is the amount of data loss tolerable from the point of the last backup to the disaster. For example, most organizations surveyed indicated that they typically target an RTO of between four and 24 hours; one in five companies reported objectives of less than four hours. RPOs are even more stringent, with one in four companies reporting that it can tolerate no loss of data under any circumstances.
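
The two definitions can be expressed as a simple check against a single incident. This is only a sketch: the function name, the incident timestamps, and the 24-hour and four-hour objectives are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta


def check_objectives(last_backup, disruption, restored, rto, rpo):
    """Compare one disruption against the RTO and RPO.

    Downtime is measured from the disruption to restoration; data loss is
    measured from the last backup to the disruption.
    """
    downtime = restored - disruption
    data_loss = disruption - last_backup
    return {
        "downtime": downtime,
        "data_loss": data_loss,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss <= rpo,
    }


# Hypothetical incident: nightly backup at 01:00, failure at 15:00,
# service restored at 23:00 the same day.
result = check_objectives(
    last_backup=datetime(2000, 4, 3, 1, 0),
    disruption=datetime(2000, 4, 3, 15, 0),
    restored=datetime(2000, 4, 3, 23, 0),
    rto=timedelta(hours=24),
    rpo=timedelta(hours=4),
)
print(result)
```

In this hypothetical case the eight hours of downtime fit within the RTO, but the 14 hours of transactions entered since the last backup exceed the RPO, which is the kind of gap that a nightly backup alone cannot close.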

Two factors make it increasingly difficult for organizations to achieve their RTOs and RPOs using traditional recovery techniques. First is the sheer volume of data that must be recovered and the impact that has on RTOs. For example, simply loading backup tapes for a large database can easily take eight or more hours, not counting the time it takes to ship tapes to an alternate site or the dozens of other steps that must occur in the recovery process. Second is the fact that all data entry occurs online with applications such as e-commerce and electronic data interchange (EDI), significantly complicating RPOs. In these instances, the only record of transactions is the electronic data itself. Therefore, there is no way to re-create transactions manually. If the system goes down, those transactions that have occurred since the last backup are lost. Given these factors, the reality is that most organizations seeking RTOs and RPOs of a single-digit or even double-digit number of hours, depending on their volume of data and backup procedures, have to employ one or more high-availability solutions, such as remote journaling, mirroring, or clustering.
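
The effect of data volume on the RTO can be estimated by adding up the steps of a tape-based recovery. In the sketch below, only the eight-hour tape load comes from the example above; the other step names and durations are hypothetical placeholders, and the steps are assumed to run one after another.

```python
# Hypothetical step durations in hours; only the eight-hour tape load
# comes from the article, the rest are illustrative placeholders.
RECOVERY_STEPS = [
    ("Declare disaster and notify staff", 1.0),
    ("Ship tapes to alternate site", 4.0),
    ("Load backup tapes for large database", 8.0),
    ("Verify data and restart applications", 2.0),
]


def total_recovery_hours(steps):
    """Sum the step durations, assuming the steps run one after another."""
    return sum(hours for _, hours in steps)


def meets_rto(steps, rto_hours):
    """Return True if the sequential recovery fits within the RTO."""
    return total_recovery_hours(steps) <= rto_hours


total = total_recovery_hours(RECOVERY_STEPS)
for rto in (4, 24):
    status = "is within" if meets_rto(RECOVERY_STEPS, rto) else "exceeds"
    print(f"Estimated recovery of {total:.0f} hours {status} a {rto}-hour RTO")
```

With these placeholder figures, the 15-hour estimate fits a 24-hour RTO but cannot satisfy a four-hour one, no matter how well the plan is executed.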

One reason why organizations may not employ these high-availability solutions to the extent required is that they simply have no idea of whether or not they’re within their objectives. In fact, a significant number of companies have not validated their RTOs and RPOs. As Figure 4 shows, only slightly more than one-half of those with enterprise or application servers have validated RTOs and RPOs for these systems, indicating a serious discrepancy between required availability and what may indeed be in place in most organizations.


When Online Applications Go Offline

This lack of preparation and testing of RTOs and RPOs in traditional environments becomes even more alarming considering the increasing importance of the Internet. For example, more than one-third of organizations now report relying on the Internet for business-critical applications, yet only one-third of these organizations have any plan in place, and only 14 percent have a truly effective continuity program when measured against key components that should be in a comprehensive continuity plan for Internet operations. These measures include Internet-specific factors as well as those that should be a part of any effective continuity program. (See below for more on these components.)

As Figure 5 indicates, for the minority of organizations that have put programs in place to reduce the vulnerability of their Internet-reliant business functions, the most commonly practiced procedure is management of short-term outages. However, critical elements, such as data synchronization, multiple ISPs, and testing and evaluation programs, are much less likely to be used. With two-thirds of organizations believing that they will eventually use the Internet as a vehicle for conducting e-commerce and with the publicity surrounding outages of several well-known companies, it is of growing concern that more organizations aren’t taking precautions to protect the availability of these applications.

Vulnerability to Availability

There is no doubt that organizations are highly reliant on their technology infrastructures to support the goals of their businesses. Based on findings from the Vulnerability Index, it’s also apparent that these infrastructures are often at considerable risk. The issue now becomes how to minimize risk and ensure that systems are available. These are some of the basic steps that organizations should take to develop effective continuity programs for their technology infrastructures:

• Naming a crisis management team with clearly delineated responsibilities
• Estimating the cost of a disruption
• Creating an ongoing system for preventing or reducing the likelihood of a disruption
• Identifying critical information
• Storing computer data off-site
• Recovering critical information
• Designating an alternate site to relocate to in the event of a disruption
• Establishing procedures for communicating a plan within the organization
• Planning for evacuation and transportation of key people and materials
• Having a command center for continuing operations
• Identifying necessary support staff
• Having a testing and evaluation program for a recovery plan

For WAN applications, companies should also include the following components in their continuity programs:

• Alternate WAN transport
• A process for identifying single points of failure
• Management of short-term outages
• Testing and evaluation
• A detailed, written plan
• Local server replication
• Redundant routers

And for Internet-based applications, companies should include the following:


• Alternate network access
• An ongoing system for preventing single points of failure
• Management of short-term outages
• A testing and evaluation program
• A detailed, written plan
• Local server replication
• Redundant routers
• Data synchronization
• Multiple ISPs or multihoming
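
One way to look for single points of failure is to model the infrastructure as a graph and test what happens when each component is removed. The following Python sketch uses a hypothetical topology with two ISPs and one router; the component names and helper functions are assumptions for illustration only.

```python
from collections import deque

# Hypothetical topology: each key lists the components it connects to.
TOPOLOGY = {
    "internet": ["isp_a", "isp_b"],
    "isp_a": ["internet", "router_1"],
    "isp_b": ["internet", "router_1"],
    "router_1": ["isp_a", "isp_b", "web_server"],
    "web_server": ["router_1"],
}


def reachable(graph, start, goal, removed=None):
    """Breadth-first search from start to goal, skipping the removed node."""
    if start == removed or goal == removed:
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in graph[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def single_points_of_failure(graph, start, goal):
    """Return intermediate components whose loss disconnects start from goal."""
    return sorted(
        node for node in graph
        if node not in (start, goal)
        and not reachable(graph, start, goal, removed=node)
    )


print(single_points_of_failure(TOPOLOGY, "internet", "web_server"))
```

In this made-up topology, the two ISPs protect each other, but the lone router is reported as a single point of failure, which is why redundant routers appear on the list alongside multiple ISPs.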

The ability of organizations to conduct business is now so tightly tied to computer and communication technology that, in many instances, if the technology is not available, the business simply cannot function. As a result, it’s crucial that those responsible for ensuring the availability of the IT infrastructure understand the impact not having that infrastructure or specific components available would have on critical business operations. Once this is known, put a program in place that reflects the importance of these technologies. If you notice there are steps that I have outlined that your organization hasn’t taken, the time to do so is now.

If you’d like to get a quick snapshot of where your programs stand today, you can take your own Vulnerability Index assessment, accessible via Comdisco’s Web site at www.comdisco.com. After completing a brief online questionnaire, you will receive a customized analysis that compares your company’s progress with that of respondents to the Vulnerability Index.

Continuity Program Effectiveness by Type of Technology (effective / partially effective / ineffective)

• Enterprise applications and servers: 39% / 22% / 38%
• Wide area networks: 36% / 20% / 44%
• Local area networks: 28% / 15% / 57%
• Internet-related functions: 14% / 22% / 64%

Figure 1: Most organizations do not have effective high-availability and continuity operations in place.


[Bar chart of outage duration: 4 hours or less, 5 to 8 hours, 9 to 24 hours, more than 24 hours]

Figure 2: Among companies that have experienced a disruption to their computing environment, nearly one in four reported outages lasting more than 24 hours.

High-Availability Steps Taken Among Organizations Requiring High Availability

• Fault-tolerant hardware: 66%
• Redundant hardware (on-site): 56%
• Redundant hardware at business continuity site: 32%
• Clustering (on-site): 27%
• Standby applications at business continuity site: 21%
• Remote clustering: 11%
• No steps taken: 13%

Figure 3: Fault-tolerant servers are the most popular high-availability options that customers choose.


Organization Has Validated RTO and RPO Objectives Among Companies with Each Type of System

[Bar chart comparing validated time objectives and point objectives for enterprise/application servers, wide area networks, and local area networks]

Figure 4: Only slightly more than half of the companies surveyed have tested their RTOs and RPOs against their ability to recover for their main servers.

Critical Procedures of an Internet Business Continuity Plan

• Management of short-term outages: 58%
• System to prevent single points of failure: 43%
• Local server replication: 40%
• Alternative network access: 35%
• Testing and evaluation for disaster recovery: 35%
• Redundant routers: 33%
• Written recovery plan: 33%
• Data synchronization: 29%
• Multiple ISPs: 22%
• None of the above: 33%

Figure 5: Doing business on the Web amplifies the need for recovery programs.

