After distributing a buggy antivirus update that apparently disabled hundreds of thousands of computers on Wednesday, McAfee is still at a loss to explain exactly what happened.
McAfee says that just a small fraction of its corporate customers -- less than 0.5 percent -- were affected by the glitch, which caused some Windows XP Service Pack 3 systems to crash and reboot repeatedly. McAfee blamed a bad virus definition update shipped out Wednesday morning, Pacific time, which ended up quarantining a critical Windows process called svchost.exe.
By the end of the day, the antivirus vendor still couldn't say exactly what caused the problem. "We're investigating how it was possible some customers were impacted and some not," said Joris Evers, a McAfee spokesman, speaking via instant message. One common factor amongst the victims of the glitch, however, is that they'd enabled a feature called "Scan Processes on Enable" in McAfee VirusScan software.
Added in version 8.7 of the product, this feature lets McAfee's malware scanner check processes in the computer's memory when it starts up. According to Evers, it is currently not enabled by default. However, some versions of VirusScan did ship with it enabled. McAfee has published instructions for repairing affected computers.
A large number of users reported major problems after installing McAfee's bad update Wednesday.
Systems at Intel were knocked offline before the bad update could be stopped, according to Intel spokesman Chuck Mulloy. He couldn't say how many PCs were affected, but said that the problem was "significant."
"There were quite a few clients, laptops and PCs [affected]," he said. "We were able to get it stopped fairly early on, but clearly not soon enough."
About 40 percent of machines in Washington's Snohomish County were affected by the problem, according to John Storbeck, the county's engineering services supervisor. "This is a nightmare," he said in an e-mail message.
In Iowa, a local disaster response exercise was disrupted when 911 computer systems crashed, according to Deb Hale, a security administrator with Internet service provider Long Lines in Sioux City, Iowa. County IT staff soon started getting calls from other departments -- including police, fire and emergency response -- and began an emergency shutdown of all computers on the assumption that a virus was spreading.
After finishing the exercise, using a radio system for dispatch, participants learned that there was no virus, just a bad McAfee update, Hale said in a blog post. "Thanks to McAfee we were forced to test our response to a disaster while in the midst of a real 'disaster,'" she wrote.
The problem took out PCs at about 40 percent of the customers of U.K. IT outsourcing company Centrality, according to Managing Director Mike Davis. "It's absolutely massive in terms of what we're seeing here," he said in a telephone interview as he prepared to leave work at 1:30 a.m.
The problem started late in the afternoon, Davis said. "We started getting calls about 4 p.m. U.K. time on our help desk from customers that were having their XP-based machines just reboot seemingly randomly," he said. After realizing that it was happening to several different customers simultaneously, Centrality quickly figured out that the problem had to do with McAfee's update, and started shutting down McAfee ePolicy Orchestrator management servers to keep the problem from spreading. By then, however, several thousand computers had disappeared from the networks it manages.
Because the update knocked PCs offline, there was no easy way to fix the broken computers over the network, so harried system administrators had to either walk users through the repair process or fix the affected machines themselves, one by one.
For many, the problem was strangely similar to a widespread virus outbreak.
"This is the worst glitch that I've ever had to deal with," said Ken Whittaker, a desktop support technician with a Michigan university that had about 10,000 desktops affected.
Whittaker said that only his VirusScan 8.7 users were hit -- others, using the older 8.5 version, were not.
It's not unheard of for antivirus vendors to mistakenly flag legitimate software with their updates. Criminals have become so good at switching up their code that companies like McAfee are now churning out millions of signatures in a cat-and-mouse game to identify malware that is in circulation. That leads to errors.
Still, that McAfee allowed a major Windows component to be misidentified demonstrates "a complete failure in their quality control process," said Amrit Williams, CTO with systems management vendor BigFix. "You're not talking about some obscure file from a random third party; you're talking about a critical Windows file," he said. "The fact that it wasn't found is extremely troubling."
Williams knows what he's talking about. He's a former director of engineering with McAfee.
Late Wednesday, McAfee's executive vice president of support, Barry McPherson, posted a short note saying that he had "talked to literally hundreds of my colleagues around the world and emailed thousands to try and find the best way to correct these issues."
He didn't apologize to customers but added, "Let me say this has not been my favorite day. Not for me, or for McAfee. Not by a long shot."