Wisconsin blizzard vs. data center: How Marquette won
- 01 March, 2011 03:40
Marquette University's IT department deployed unified communications tools to improve collaboration among faculty and staff - IT staff collaboration wasn't the priority. But as it turned out, Microsoft's Lync suite of voice, videoconferencing and instant messaging tools proved to be IT's life raft during a snowstorm-related data center calamity.
During a January blizzard so severe that the Milwaukee-based university closed, the HVAC units that cool Marquette's data center short-circuited after wind-driven snow piled up and then melted inside the air conditioning condensers on the roof.
"We ended up losing all three of the AC units," says Dan Smith, Marquette's Senior Director of IT Services.
It quickly became dangerously warm in the data center. Smith and his staff realized they had to start shutting down servers before the servers shut themselves down from the heat.
But on a day of heavy snow, when all IT team members were at their homes or travelling, it wasn't easy to get everyone on the same page to discuss a plan of action. The team had not used Lync's conference call feature for an all-hands-on-deck crisis situation before.
Marquette systems manager Adam Garsha, who was the first to be alerted about the overheating data center, set up a conference call with 12 IT team members using Lync. Marquette had tested Lync as part of Microsoft's TAP program and chose to deploy it instead of Cisco's UC suite due to Microsoft's lower licensing costs. During the rollout, Marquette moved the university's entire faculty and staff off Siemens PBX phone systems and onto VoIP-enabled Polycom phones that use Lync as the call manager.
The Lync conference call allowed IT team members to talk and IM inside the same app (videoconferencing is also a Lync feature, but the Marquette team did not use video in this instance). There was much discussion about what to do if the AC units were still down when the university reopened the next day.
"We were all at home or travelling, but we all joined a Lync conference call and brainstormed," says Smith. "We realized we'd have to shut down most servers and start up in the redundant data center if the HVAC units could not be fixed. We do not have automatic failover for systems, so that migration could take hours."
At about this time, the Web server in the data center shut down from the heat.
Smith and team members braved the snow and headed to the data center to open windows and turn on fans. The team concluded that if the Web server was down, the same would happen with other important servers. Servers at risk included those running Oracle databases, which control the financial and student registration systems, and the D2L e-learning systems that professors use to post syllabuses and class schedules online.
The team then shut down servers running their biggest systems, but they were able to keep the e-mail and Lync server up. However, "it was warm, and we weren't sure how long we could keep the Lync server running in that heat," says Smith.
Long enough, it turns out. The Lync conference call became the team's virtual headquarters where the dispersed staff worked together to curb a crisis. Team members who went mobile could dial in to the Lync conference on their cellphones that were running OCS R2, Lync's backward-compatible predecessor.
"When one of my admins had to go into the data center, he would switch over to his cellphone but stay in the Lync conference call so we could give him directions on what server to shut down," says Victor Martinez, Marquette's Windows Lead and a Technical Lead for Lync.
After a full day of coordinating, the Marquette IT team was able to get the first HVAC unit fixed by 7 pm, bringing the temperature down. A few hours later the second HVAC was working again, and because the data center can get by with two HVACs, the team started up all servers that night. The third HVAC unit was fixed the next morning.
Marquette IT team members agree that the voice, video and messaging tools in a UC suite - in this case Lync - provide the best means for any business group to communicate and collaborate, Garsha says.
"I was Googling around for a freebie voice and IM Web service because I felt that without Lync we would be in the dark," says Garsha.
At one point, Smith was collecting people's personal e-mail addresses as a last resort in case the Lync and e-mail server overheated.
"If we lost Lync, we would be sending e-mails around to the group," says Smith, "which is obviously not as timely or efficient as voice and IM in one app."
Shane O'Neill covers Microsoft, Windows, Operating Systems, Productivity Apps and Online Services for CIO.com.