Application monitoring tools can help companies tune up their engines.
Marshall & Swift may be situated in sunny Los Angeles, but the company barely averted disaster last August when Hurricane Charley ripped through southwest Florida.
Some of the largest insurance companies in the US rely on Marshall & Swift's 200-plus servers to process claims and calculate the costs of rebuilding commercial and residential properties. Within one month of the hurricane making landfall, the number of claims jumped from 20,000 to a whopping 180,000. This sudden surge in server utilization could have spelled disaster. Fortunately, Marshall & Swift had turned to ProactiveNet just a couple of months earlier.
ProactiveNet's flagship product, ProactiveNet 6.0, is a performance measurement and analysis tool that identifies when an application or system is going outside of its normal parameters and pinpoints the most likely source of the problem. In the case of Marshall & Swift, ProactiveNet alerted the company's IT department to an improper balance of application, Web and database servers. Some servers were being underutilized while others were being overburdened, thereby causing degradations in overall system performance.
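The idea of flagging a system that drifts outside its normal parameters can be illustrated with a minimal sketch. This is not ProactiveNet's method, which the article does not describe; it assumes a simple rule in which "normal" is the historical mean plus or minus a few standard deviations. The function name, the sample figures and the multiplier `k` are all hypothetical.

```python
from statistics import mean, stdev


def out_of_baseline(history, latest, k=3.0):
    """Return True when `latest` lies more than k standard deviations
    from the mean of `history` (an assumed definition of "normal")."""
    if len(history) < 2:
        return False  # too little data to define a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return latest != mu
    return abs(latest - mu) > k * sigma


# Hypothetical CPU utilization samples (percent) for one server.
cpu_history = [41.0, 39.5, 42.2, 40.1, 38.9, 41.7, 40.4]

print(out_of_baseline(cpu_history, 40.8))  # a reading close to the usual level
print(out_of_baseline(cpu_history, 93.0))  # a reading far above the usual level
```

Choosing `k`, and deciding how much history counts as representative, is the kind of tuning that can take months in practice.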
Using ProactiveNet, Marshall & Swift began the painstaking process of monitoring the usage patterns of each server and identifying peak utilization periods. Although the process took months to perfect, Geoff Garlow, Marshall & Swift's director of operations, says it prepared the company for the deluge of claims precipitated by Hurricane Charley.
"Without ProactiveNet, we would not have survived [the onslaught of claims]," says Garlow. "We were able to actively move certain servers around to ensure that all claims were processed in a timely manner."
Marshall & Swift isn't the only company that's taking a chance on today's string of application performance management (APM) solutions. An increasing number of companies are banking on APM tools to improve application availability and performance, enforce service-level agreements (SLAs), enhance end-user experience and cut infrastructure costs through improved capacity planning. In fact, Jean-Pierre Garbani, vice president for computing systems at Forrester Research, estimates that more than 60 percent of Fortune 2000 companies are using some variation of an APM product today.
At costs ranging from $US100,000 to $US500,000 for a two-year licence, however, Garbani warns that companies shouldn't expect to see instant results from APM tools. Implementation periods can span from days to months, depending on the complexity of a situation. And while there are out-of-the-box monitoring tools for applications from big-name vendors such as SAP and Oracle, Garbani says that companies with custom-built applications will likely have to rely on highly configurable APM tools from niche players. But that's not all. "Understanding how to set parameters for application monitoring software is the biggest challenge," says Garbani, noting that the task of setting standards and thresholds for what constitutes "normal" application behaviour can entail months of adjustments and minor modifications.
Despite these difficulties, businesses can no longer afford to let outages and degradations go unnoticed. According to research company Gartner, application problems are the single largest source of downtime, causing 40 percent of annual downtime hours and 32 percent of average downtime costs.
Companies also pay for performance problems in other ways. Lopsided server utilization, for instance, wasn't the only price Marshall & Swift was paying for poorly managed applications. Prior to deploying ProactiveNet, the company was doling out $US220,000 a month to Qwest Communications to monitor, manage and host its servers and applications. Despite these high costs, Garlow says, Marshall & Swift's clients were constantly complaining of inexplicable outages, annoying lag times and the inability of some applications to support multiple end users. As a result, the company was paying SLA penalties that sometimes exceeded $US20,000 a month.
By using ProactiveNet's real-time performance analysis and revising its policies and procedures, Marshall & Swift now delivers 99.7 percent SLA availability and has eliminated practically all financial penalties from application outages and performance degradations, Garlow says. With an in-house monitoring tool in place and five new employees to manage the system, the company is on the cusp of extending its SLAs to include weekends, thereby broadening its revenue stream. And the company estimates savings of $US1.2 million a year in managed services costs and consultant fees.
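To put a 99.7 percent availability figure in perspective, it helps to convert it into a downtime allowance. The short calculation below is only an illustration: the article does not say how Marshall & Swift measures availability, so the 30-day, around-the-clock measurement window is an assumption, as is the function name.

```python
def allowed_downtime_hours(availability_pct, period_hours):
    """Hours of downtime permitted in a period at a given availability."""
    return period_hours * (1 - availability_pct / 100)


# Assumed window: a 30-day month measured 24 hours a day (720 hours).
month_hours = 30 * 24
budget = allowed_downtime_hours(99.7, month_hours)
print(f"{budget:.2f} hours, or about {budget * 60:.0f} minutes")
```

Under those assumptions, 99.7 percent leaves roughly 2.16 hours of downtime a month. A window that excludes weekends would shrink that allowance accordingly.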
Still, Garlow warns that an APM solution shouldn't be viewed as a cure-all. "You can have the best performance monitoring system in the world, but if you don't have the correct staff in place and the correct procedures, then you're just wasting your money," he says.
Ending Mystery Crashes
At Illinois-based online brokerage OptionsXpress, application performance problems can have a serious impact on livelihoods. Nearly 7000 options traders visit OptionsXpress's Web site at any given time, completing nearly 20,000 transactions a day. With all this online traffic, the brokerage's IT administrators were always up against the clock when re-creating troublesome applications offline in the development environment.
"Even when we did try to re-create the problem a month later, when it finally reached the development queue, the developer was often unable to re-create the situation just based on the time lapse," says David Kalt, president of OptionsXpress. That's because OptionsXpress's application data is constantly being updated as customers perform trades. By the time the company's IT administrators would get around to exploring the problem, it would be next to impossible to re-create the same production environment.
What's more, OptionsXpress's reliance on third-party software would often obscure the real source of a problem, causing IT administrators to waste time pointing fingers at vendors. That is, until the company deployed Identify's AppSight Black Box software in late 2002.
Rather than replicate an application problem, Identify's Black Box software technology records real-time, forensic logs of software and system events. While the application runs in production, the software captures every system event and condition at every level, from user inputs and system configuration to code. Identify's application support solution, AppSight, then organizes these logs into time-synchronized views to pinpoint the root cause of each problem. The system avoids costly downtime by letting applications remain running, even as problems are being recorded and analyzed.
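The notion of a time-synchronized view can be sketched in a few lines. This is not how AppSight is implemented, which the article does not detail; it assumes each source (user input, database, application) produces its own log already sorted by timestamp, and it merges them into one timeline. The `Event` type, the `synchronized_view` function and the sample events are all hypothetical.

```python
import heapq
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds from the start of the recording
    source: str       # which layer recorded the event
    message: str


def synchronized_view(*streams: Iterable[Event]) -> list[Event]:
    """Merge per-source logs, each already in time order, into one timeline."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp))


# Hypothetical recordings from three layers of one application.
user_events = [Event(1.0, "user", "submit order"),
               Event(4.0, "user", "resubmit order")]
db_events = [Event(2.5, "db", "no SQL connections available")]
app_events = [Event(3.0, "app", "process restarted")]

timeline = synchronized_view(user_events, db_events, app_events)
for event in timeline:
    print(f"{event.timestamp:>5.1f}  {event.source:<4}  {event.message}")
```

Read top to bottom, the merged timeline places the database event between the user's first order and the restart, which is the kind of ordering that is hard to see when each log is read separately.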
Vlad Karpel, OptionsXpress's vice president of IT, recalls struggling to unlock the mystery behind a troublesome trading application that was forcing traders to resubmit orders. "At some point," he says, "the application would just die and then restart itself on its own."
Typically, Karpel's IT team would have needed to re-create the entire application, examine every line of code, add tracing statements and recompile the application to identify the source of the problem. However, upon activating AppSight, Karpel quickly discovered that an error in the number of SQL connections was rendering the application unstable.
Dow Chemical didn't want to waste any time discovering the source of its application performance issues. With 90 percent of its clients located outside its Michigan-based headquarters, Dow needed a solution that would help the science and technology company track its end users' online experience. Dow caters to 500 locations worldwide, and its electronic channels generate annual sales of $US5 billion. Failing to accurately monitor how applications perform for customers - from Pennsylvania to Finland - was a risk Dow couldn't afford to take.
International differences in Web browser versions and the multiple ways in which ISPs measure and manage their network layers forced Dow's geographically scattered end users to endure a wide variety of online experiences. For example, Dow's US-based IT department would schedule system backups during the early hours of the morning, not realizing that this was prime time for customers in Japan to place online orders. As a result, these customers would experience order processing delays. Hoping to capture a much-needed, end-to-end view of its worldwide performance metrics, Dow enlisted Mercury Interactive's Service Level Management solution.
The first step for Dow was to examine performance trends and gather baseline information so that the IT department could set realistic service-level objectives for availability and response times for the different geographies they serve. The Mercury Service Level Management solution was then configured to send alerts when performance dipped near those levels. By creating an early warning system, Dow now diagnoses problems and can take action before business interruptions arise. The company can also aggregate service-level data in the form of detailed reports that match specific activities with select time periods so that administrators can pinpoint activities - such as system backups - that may cause delays.
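An early warning of this kind can be sketched as a check that fires before a service-level objective is actually breached. The example below is an assumption-laden illustration, not Mercury's product behaviour: the class, the regions, the response-time limits and the 80 percent warning margin are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceLevelObjective:
    region: str
    max_response_ms: float      # agreed limit for this geography
    warn_fraction: float = 0.8  # assumed margin: warn at 80% of the limit


def check(slo, observed_ms):
    """Classify one response-time measurement against an objective."""
    if observed_ms >= slo.max_response_ms:
        return "breach"
    if observed_ms >= slo.max_response_ms * slo.warn_fraction:
        return "warning"  # near the limit: act before users notice
    return "ok"


# Hypothetical per-region objectives, set from baseline measurements.
japan = ServiceLevelObjective(region="Japan", max_response_ms=2000)
finland = ServiceLevelObjective(region="Finland", max_response_ms=2500)

print(japan.region, check(japan, 1200))
print(japan.region, check(japan, 1700))
print(finland.region, check(finland, 2600))
```

Keeping a separate objective per region reflects the point that one global threshold would hide problems, such as a backup window in one time zone colliding with peak ordering hours in another.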
Gaining an end-to-end perspective of its online operations provides "a great deal of comfort" to Dow's customers, according to Mack Murrell, Dow's senior global director of enterprise IT operations and services. But more than simply enhancing customer satisfaction, Mercury has helped Dow increase the availability of key applications by 35 percent by reducing the amount of time it takes to isolate, identify and diagnose these application issues.
By delivering solid results, Mercury is just one of countless vendors to establish a foothold in today's crowded APM market. Management stalwarts such as BMC Software and Computer Associates, up-and-comers such as ProactiveNet and Wily Technology, and 800-pound gorillas such as Hewlett-Packard and IBM all offer APM solutions and services that promise to improve response times and application availability. In fact, research company IDC estimates performance management software revenue will experience a 7.5 percent annual growth rate over the next five years, reaching $US3.6 billion by 2008. All of which leaves companies with little excuse - and plenty of options - for eliminating poorly performing applications.