This week CIO is exploring the top tech trends expected to disrupt data centres in the future, as well as IT departments, business units and the c-suite.
In a previous article published on Monday – 5 tech trends that will impact on data centres in future – we discussed the first five trends as shared by research analyst David Cappuccio at the Gartner Infrastructure, Operations & Data Centre Summit in Sydney last week.
With IT workers often slaving away over the last big project and spending up to three quarters of their budget keeping the lights on, new trends swoop in with the potential to disrupt the entire company. But without forward planning and consideration, IT can’t see the forest for the trees.
Below are the next five of Cappuccio’s top 10 IT trends, spanning technological, societal and organisational changes that are likely to have an impact on IT operations.
Disaggregated systems
In the previous article we ended on integrated systems. In this piece, we start with the opposite: disaggregated systems.
“Put them together – take them apart,” says Cappuccio.
The basis of this trend is that a system built on a shared set of interconnects lets users swap individual components, such as processors, in and out as they need, rather than replacing the whole server at once.
“Rather than upgrading servers and all the pieces attached to it – because today I have a processor, I/O, memory, power supplies, etc. I’m upgrading everything – what if I just upgrade to a new processor only? Everything else is still usable,” says Cappuccio.
Many large web-scale companies like Facebook are following this trend. As networking and storage infrastructure is frequently purchased and configured separately from servers, these tech giants are buying many of their components from smaller vendors in Taiwan or China rather than from the US-based technology stalwarts.
“Buying a shared interface using something like silicon photonics at Intel, based on a rack environment, I could put these components anywhere within that rack - so long as that interconnect was the same and they have the same interfaces,” adds Cappuccio.
“I could grow that thing as I needed to grow it, so it could have massive scale, massive amount of compute, very small footprint, and a fairly low price point relative to general purpose because I’m buying straight from the manufacturer.”
Though disaggregated systems are a good solution for large web-scale environments, Cappuccio says they are not quite ready for prime time in general-purpose environments due to a lack of access to original design manufacturers.
“Do I have a supply chain or maintenance cycle between them? Do I have a parts supply? Do I have vendors who actually understand what these things are and can maintain them for me? In many cases today, the answer is no.”
He likens this trend to the open software movement, which emerged in the '90s, when enterprises and software developers gained free access to large amounts of code. It only took off, however, once vendors stepped up and offered to ensure the free software was valid and tested, along with maintenance services.
“Those vendors took over responsibility to maintain it…it’s hard to find an enterprise today without some type of open software running for the web servers or whatever, but it’s all managed by these vendors,” says Cappuccio.
“Open hardware is the same thing right now; we have early phases of movement where we’re beginning to see vendors pop up; we’re early in the life cycle right now but we think it’s going to have an impact.”
Proactive infrastructures
IT leaders are seeing more and more intelligence built into infrastructures, especially as they move into sophisticated data analytics. So how does it impact them?
There has been a level of analytics in IT for quite some time, with many companies putting in data centre infrastructure management (DCIM) tools to manage energy consumption. In response, vendors have started to tie asset management and workflow management tools into their offerings to allow a more granular view of the data surrounding these assets.
The next step involves adding a whole level of diagnostics in order to get ideas about not just what happened and why, but what will happen next.
“You can ask the data - if I install these new devices what’s going to happen for the truth of IT? Or I want to accomplish something, how can we make it happen?” says Cappuccio.
“Either let me run the intelligence to understand what could happen down the road and how it can happen – or, better yet, do this analysis for me and let me know that you’re seeing what I can’t see.”
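As a rough illustration of the "what will happen next" step, the sketch below fits a straight line to past readings and estimates when the trend will reach a limit. It is a minimal example under stated assumptions, not how any DCIM product works: the function names, the weekly power readings and the 8 kW rack limit are all hypothetical.

```python
def fit_line(ys):
    """Least-squares straight line through equally spaced readings.

    Returns (slope, intercept), with x = 0, 1, 2, ... as the time index.
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two readings to fit a trend")
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def periods_until_limit(readings, limit):
    """Estimate how many more periods pass before the trend reaches `limit`.

    Returns None when the trend is flat or falling, and 0.0 when the
    fitted line has already reached the limit.
    """
    slope, intercept = fit_line(readings)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope  # time index where the line hits the limit
    return max(0.0, crossing - (len(readings) - 1))


# Hypothetical weekly peak power draw for one rack, in kW.
weekly_peak_kw = [4.0, 4.5, 5.0, 5.5, 6.0]
weeks_left = periods_until_limit(weekly_peak_kw, limit=8.0)
print(f"Trend reaches the rack limit in about {weeks_left:.1f} weeks")
```

A real predictive tool would account for seasonality, noise and interactions between devices; a straight-line trend is only the simplest possible stand-in for the idea.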
We’re starting to see this kind of analytics impacting on generators, storage, and eventually networking. This is a level of analysis that most companies aren’t using yet, but Gartner anticipates it taking off mostly in hybrid environments.
“With hybrid today, I’ve got something running off-premise, and something else running on-premise and the problem is, at the end of the day, you still own that end user experience.
“You need somebody who understands how all those pieces tie together and who can monitor them with very granular detail, and which software vendors can do that today? Not many,” says Cappuccio.
This trend is useful in determining the right way to optimise workloads, including the right time to shut them down or move them, and how applications should be restructured according to demand.
Proactive infrastructures should gain credibility as IT shops don’t have the right skillsets to understand how all the pieces tie together themselves, nor the time to learn.
“As we get closer to prescriptive analytics, all that system support and automation is going to be run via software. This will have impact on the human effort and this is good news or bad news depending on where you’re coming from,” says Cappuccio.
“Some would argue the more I do with analytics, that the more dangerous it is because the fewer skills I’m going to have in my organisation.
“I would say if you did it right, then there’s still a need for those high level workers at the top of each one of those vertical stacks. They are going to be involved, as the most credible people you have, but there’s not going to be many of them.”
IT service continuity
A series of failed disaster recovery (DR) plans, despite excellent infrastructure, has led many organisations to question why DR is approached from a data centre perspective, chasing 100 per cent availability, when it is really about the service.
“A few years ago we saw this happen a lot in the US – a number of companies had been hit by natural disasters, with big snow storms and hurricanes on the east coast,” says Cappuccio.
“A lot of them had been through other disasters over the years and they had a few massive data centres and good DR plans and they’re all sitting back, thinking ‘it’s a storm, big deal’.”
Sure enough, when the storm hit, the data centres all stayed up. The real problems began once the state shut down the roads due to hazards like falling trees, preventing fuel trucks from reaching the data centres.
“If the trucks can’t get to the data centre, they can’t run them, and this little light bulb went off and they said, ‘we’ve designed disaster recovery around the wrong thing – it’s not about the hardware, it’s about the service’,” says Cappuccio.
Those companies then began to consult with providers, asking if they could design an environment where specific services will always be available. Not just a tier 4 data centre, but running the service in multiple places at once.
“With a good enough network, and a trustworthy level of latency, if I know a storm is a potential risk to my company I can offload the service somewhere else temporarily until the risk is evaded, then we can bring it back again. Or I can run multiple services in different places at the same time and link them together,” he says.
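The idea of running a service in several places and steering it away from a site under threat can be sketched in a few lines. This is a simplified illustration only: the `Site` class, the `at_risk` flag and the city names are hypothetical, and a production setup would rely on DNS or a global load balancer rather than application code like this.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Site:
    name: str
    healthy: bool = True
    at_risk: bool = False  # e.g. a storm warning covering the site's region


def eligible_sites(sites):
    """Prefer healthy sites that are not at risk; otherwise use any healthy site."""
    safe = [s for s in sites if s.healthy and not s.at_risk]
    return safe or [s for s in sites if s.healthy]


def route(sites, n_requests):
    """Spread requests round-robin across eligible sites and return their names."""
    targets = eligible_sites(sites)
    if not targets:
        raise RuntimeError("no healthy site available")
    rotation = cycle(targets)
    return [next(rotation).name for _ in range(n_requests)]


# Hypothetical deployment of the same service in three locations.
sites = [Site("sydney"), Site("melbourne"), Site("perth")]
print(route(sites, 4))  # traffic shared across all three sites

sites[0].at_risk = True  # storm warning: drain Sydney before it hits
print(route(sites, 4))  # traffic now goes to Melbourne and Perth only

sites[1].healthy = False  # Melbourne fails outright
print(route(sites, 2))  # Perth carries the load on its own
```

From the customer's side nothing changes between the three calls; only the set of sites answering the requests does.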
Many e-commerce companies are now putting web services and their customer databases in several sites at the same time, and practising load balancing across different disaster zones. If any of the sites fail, the customer data can then be rerouted to a secondary site.
“They don’t see the difference and that’s the key,” says Cappuccio. “The customers’ perception is that you’re always available - it might be a slightly degraded performance but it’s available.
“If that site goes down, you start losing customers and we all know winning back customers is a whole lot more difficult than gaining new ones. Last impressions are forever.”
Indeed, reputation drives customer retention, which drives revenue, and so businesses have realised disaster recovery needs to be designed around business continuity for critical applications.
“I’ve seen people build tier 4 environments across multiple tier 2 data centres which actually cost them less at the end of the day than if they’d built out one tier 4 data centre by themselves,” adds Cappuccio.
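The availability side of that claim can be checked with some simple arithmetic. The sketch below uses the availability figures commonly cited for the Uptime Institute tiers (99.741 per cent for tier 2, 99.995 per cent for tier 4), which do not come from the article. It also assumes the sites fail independently and that either site can carry the full service alone. It ignores failover time and the network between sites, and says nothing about cost.

```python
MINUTES_PER_YEAR = 365 * 24 * 60

# Commonly cited tier availability figures (an assumption, not from the article).
TIER_2 = 0.99741
TIER_4 = 0.99995


def combined_availability(site_availability, n_sites):
    """Availability of a service that is up while at least one site is up.

    Assumes identical sites that fail independently of each other.
    """
    return 1 - (1 - site_availability) ** n_sites


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime in a year at the given availability."""
    return (1 - availability) * MINUTES_PER_YEAR


two_tier2_sites = combined_availability(TIER_2, 2)

print(f"One tier 4 site:  {TIER_4:.4%} "
      f"(~{downtime_minutes_per_year(TIER_4):.1f} min/year down)")
print(f"Two tier 2 sites: {two_tier2_sites:.4%} "
      f"(~{downtime_minutes_per_year(two_tier2_sites):.1f} min/year down)")
```

Under those assumptions, two tier 2 sites come out at roughly 99.9993 per cent, ahead of a single tier 4 site. Correlated failures, such as one storm hitting both sites, would erode that advantage, which is why the sites need to sit in different disaster zones.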
Part of the problem for service continuity is determining what’s critical and what’s not. IT teams need to sit down with business units to single out mission-critical applications, and the business impact if these go down. Meanwhile, non-mission-critical services and applications can be shifted to traditional disaster recovery services.
“It’s like asking a business unit, from a DR perspective, how fast do you need this back up? Well, now! Right well that will cost you a million dollars a minute – okay we’ll need it next month… you’ve got to ask the right questions,” says Cappuccio.
On top of these changes, location and networking options are also becoming the key to everything. Colocation providers like Equinix have recently realised that their data centres were becoming irrelevant, and the real offering was the network between them.
“They presented the idea that if you want to run services in your data centre, but your racks are in their data centre, they can offer you an almost guaranteed latency between sites all around the world.”
If you want to run multiple services or the same services in multiple places around the world, these colocation and hosting providers can do that for you. On top of this, some of their customers could be cloud providers.
“Say you wanted to run PaaS on Azure, well they’d say Azure happens to be running at 19 of this provider’s data centres around the world, and so they’re able to grant you private access to public cloud providers in the same site,” adds Cappuccio.
“Suddenly the way I look at colocation and hosting begins to change pretty dramatically. It’s all about service continuity not about what’s the cost of floor space. It’s not about hardware anymore.”
Bimodal IT
This trend explores how IT can deal with the business pushing it to deliver things really fast, while at the same time trying to follow a strict process.
“We don’t like to do stupid things for the right reasons, we don't want to mess everything else up in order to be fast, but demand for an agile IT environment is increasing across the board and we don’t have the skills to cater to it, so it’s a huge problem,” says Cappuccio.
Now there’s a new term called ‘bimodal’, which came out of the DevOps world. The idea of bimodal IT is that nowadays to develop an application, we need to get it out there as fast as possible, and if it fails, we’ll fix it as we go.
Historically, IT has been designed around the idea that protecting the business is paramount. If critical applications go down, the business starts losing money, so processes and controls are in place that make operations slower but safer.
“Nowadays, to enable the business we’ve got to do deadlocks, we’ve got to do fast development, we’ve got to put in change as quickly as possible, which is totally contradictory to the first objective – and that’s bimodal,” says Cappuccio.
So mode one is the old way and mode two is the new way. Mode one is all about reliability, planning, control and change management. Mode two is the opposite: it’s all about agility.
“Go for it; fail. Failure’s good because you learn from failure. Don’t do the same thing twice but learn from that failure and move on. It’s all about revenue, customer experience, and reputation,” says Cappuccio.
Most companies look at this and ask, how do we get everything to mode two? The reality is, you’ve got to do both, depending on different application types or business goals. It is a cultural shift that has to be absorbed into your existing culture, rather than replacing it.
Another problem with bimodal IT, according to Cappuccio, is that the people who only work one side – the IT traditionalists – actually want to be working in mode two because it looks like more fun.
“They’re both important, so we need to find a way to incentivise people to do mode one or jump back and forth between both.”
Scarcity of IT skills
This is getting worse and worse, according to Cappuccio, not because there aren't people out there to hire, but because the complexity of IT is becoming so great that finding people who already understand everything is a major challenge for IT shops.
“It’s not that we don’t have smart people, we do. The problem is in many organisations we’ve got people organised in vertical stacks where they’re very good at what they do and they’re incentivised to be good at what they do.
“A virtualisation expert is an expert we can’t live without, that expertise is their claim to fame, and they want to stay in that place because of that. But I would offer that in many organisations you’ll find that the people on top of those vertical stacks are getting bored because they’re doing the same thing day in day out.”
The real challenge is getting those experts out of that box; not changing their job but getting them thinking horizontally, instead of vertically. The more those high-level workers are spread around, the more skills they gain in other technologies and fields, and the more valuable they become to IT as a whole.
“When you have an application that’s running like crap, you ask the developers what’s wrong with it, and what do they say first? It’s the network! Then the network team say, it’s the storage!
“Everybody blames everybody else - all the vertical stacks always have a reason why it’s not their problem. But a person who can look at all the pieces and say it’s a cascade effect, when that happens this happens and it’s all tied together, they’re what we need for business.”
This also helps with employee engagement, as workers who leave their box start getting more interested in their job because they’re learning something new.
“They’re not being penalised for doing something wrong because they’re not a storage expert, they’re learning and that’s okay.
“Then other people lower in the stacks see this and think hey, it’s okay for me to get out of the box, I can start learning new things too, so you start getting new people going horizontally.”
This system is not for everyone, as Cappuccio says, “you don’t want to be a mile wide and an inch deep”, but you need some people who have expertise across the board, at a time when everything is going to be chaos.