Splunk adds visibility into virtual environments
- 10 September, 2012 19:46
When transitioning workloads to virtual environments, one of the big drawbacks for data center administrators can be a loss of visibility.
When a problem occurs, it can be difficult to get a handle on details such as which users are affected and by how much, as well as the causal links between the user layer, the application layer and the underlying infrastructure. This is often because the hypervisor abstracts away data about the underlying hardware.
"Monitoring the dynamic nature of virtualization with tools designed for single-technology silos creates a significant challenge for administrators," says Dave Bartoletti, senior analyst at Forrester Research. "There is a growing need for solutions that provide cross-tier visibility to effectively troubleshoot, monitor and analyze data across silos and deliver real-time business insights and operational intelligence."
Splunk (provider of an engine that collects, indexes and analyzes massive volumes of machine-generated data) thinks big data is the answer. Splunk customer CloudShare (a San Mateo, Calif.-based provider of pre-production cloud for dev and test, demos and POCs) sees a constant stream of data from its network, gateways, firewalls, backend, virtual machines, applications, web servers, databases and storage.
CloudShare's infrastructure as a service (IaaS) platform is designed to grant each customer (including a large number of Fortune 500 firms like HP, SAP, Microsoft and IBM) its own private multi-VM networked environment, including compute resources, networking, IP and preinstalled OS. During peak hours, its system performs about 500 VM resume/suspend operations an hour. Its VMware performance data alone comes in at about 2 million events per hour.
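To put those hourly figures in per-second terms, here is a minimal back-of-the-envelope sketch in Python. The two hourly counts are the ones quoted above; the constant and function names are illustrative only and do not come from CloudShare or Splunk.

```python
# Hourly figures quoted in the article (peak load).
VM_OPS_PER_HOUR = 500             # VM resume/suspend operations
PERF_EVENTS_PER_HOUR = 2_000_000  # VMware performance events
SECONDS_PER_HOUR = 3600


def per_second(count_per_hour: float) -> float:
    """Convert an hourly count into an average per-second rate."""
    return count_per_hour / SECONDS_PER_HOUR


def seconds_between(count_per_hour: float) -> float:
    """Average gap in seconds between consecutive occurrences."""
    return SECONDS_PER_HOUR / count_per_hour


events_per_second = per_second(PERF_EVENTS_PER_HOUR)
seconds_per_vm_op = seconds_between(VM_OPS_PER_HOUR)

print(f"{events_per_second:.0f} performance events per second")  # 556
print(f"one VM operation every {seconds_per_vm_op:.1f} seconds")  # 7.2
```

These are averages over an hour; the article does not say how bursty the real traffic is.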
Getting a handle on that data, let alone correlating and analyzing it, is a tricky proposition. Elad Gotfrid, CloudShare's director of IT, says that in its early days the company got by with traditional monitoring tools. But it soon outgrew them.
Scaling Out With Splunk
"In the beginning, we used a traditional monitoring tool, which was good for a small scale," Gotfrid says. "Once you start to grow up, you see the scale doesn't allow you to use a traditional monitoring system anymore. You need higher visibility."
Gotfrid explains that CloudShare went with a new offering from Splunk, then in beta, called Splunk App for VMware, which is specifically designed for the VMware virtual layer. Originally, CloudShare brought in Splunk to monitor the performance of its virtual machines. But once the company saw the possibilities, use of Splunk spread to every area of the business. He notes that CloudShare uses Splunk to collect performance stats, logs and events from the virtualization layer and then correlate that information with network, storage, OS and application events. This allows IT to contextualize infrastructure data and track business metrics such as usage and resource costs per trial and business user.
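The general idea of correlating virtualization-layer events with events from other tiers can be illustrated with a small Python sketch. This is not Splunk's implementation or CloudShare's configuration: the `Event` fields, the `vm_id` join key, the 60-second window and the sample messages are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some epoch
    source: str       # e.g. "vmware", "storage", "app" (hypothetical labels)
    vm_id: str        # hypothetical key shared across tiers
    message: str


def correlate(events, anchor_source="vmware", window_seconds=60.0):
    """Pair each anchor-source event with other-tier events on the same VM
    that occurred within window_seconds of it."""
    by_vm = defaultdict(list)
    for event in events:
        by_vm[event.vm_id].append(event)

    pairs = []
    for vm_events in by_vm.values():
        vm_events.sort(key=lambda e: e.timestamp)
        for anchor in vm_events:
            if anchor.source != anchor_source:
                continue
            related = [
                e for e in vm_events
                if e.source != anchor_source
                and abs(e.timestamp - anchor.timestamp) <= window_seconds
            ]
            pairs.append((anchor, related))
    return pairs


# Made-up sample data, for illustration only.
events = [
    Event(100.0, "vmware", "vm-1", "cpu ready time high"),
    Event(130.0, "app", "vm-1", "request latency above threshold"),
    Event(400.0, "storage", "vm-1", "datastore latency spike"),
    Event(105.0, "app", "vm-2", "login ok"),
]

for anchor, related in correlate(events):
    print(anchor.message, "->", [e.message for e in related])
```

With this sample data, only the application event 30 seconds after the VMware event is linked to it. The storage event falls outside the window, and `vm-2` has no virtualization-layer event to anchor on.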
Dashboards link operational data from both physical and virtual sources, providing vital information to network operations, customer support, marketing, sales and R&D. CloudShare even uses Splunk to fight fraud, drawing on network device and firewall information to create attack signatures that trigger automatic blocks or alerts to network operations.
"At CloudShare, we think of Splunk as our eyes and ears," Gotfrid says. "Splunk software enables us to understand and oversee every aspect of our operations. The key asset we achieve from Splunk software is the ability to correlate business data with performance metrics. Compiling data about our customers and understanding which resources are being utilized allows us to understand and plan our capacity based on clear trends we identify."
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com.