Hadoop skills are in high demand
- 11 November, 2011 10:04
NEW YORK -- The growing enterprise interest in Hadoop and related technologies is driving demand for professionals with big data skills.
Analysts and IT managers at the Hadoop World conference here this week repeatedly pointed to skills availability as one of the key challenges companies face in adopting Hadoop and said that those with the right skills could command healthy premiums.
One indication of just how limited that skills supply is: IT executives from JP Morgan Chase and EBay who delivered keynote addresses at the conference used the opportunity to recruit from the audience.
Hugh Williams, vice president of experience, search and platforms at EBay, told audience members that the auction site is recruiting Hadoop professionals and he invited those interested in exploring opportunities to speak with him.
Larry Feinsmith, managing director at JP Morgan Chase, who followed Williams, only half-jokingly told the audience that Chase was also hiring and would be willing to pay 10% more than EBay.
"Hadoop is the new data warehouse. It is the new source of data" within the enterprise, said James Kobielus, an analyst with Forrester Research. "There is a premium on people who know enough about the guts of Hadoop" to help companies take advantage of it, he said.
Hadoop allows companies to store and manage far larger volumes of structured and unstructured data than can be managed affordably by today's relational database management systems.
A growing number of companies have begun tapping the technology to store and analyze petabytes of data such as Web logs, clickstream data and social media content to gain better insights about their customers and their business.
The increasing enterprise adoption is driving demand for people with advanced analytics skills, Kobielus said. That includes people with backgrounds in areas such as multivariate statistical analysis, data mining, predictive modeling, natural language processing, content analysis, text analysis and social network analysis, he said.
"Big data in the broader sense -- and Hadoop in particular -- is driving demand for people who have experience doing advanced analytics using newer approaches such as MapReduce and R for predictive and statistical modeling," he said. These are the data analysts or data scientists who will work with structured and unstructured data in Hadoop environments to deliver new insights and intelligence to the business, he said.
Interest in Hadoop is also creating demand for Hadoop platform management professionals, Kobielus said. Their job will be to implement, secure, manage and optimize Hadoop clusters and to ensure that the clusters remain available for enterprise use. "These are the people who build out and optimize the platform" on which Hadoop applications run, he said.
"The database administrators who administer Teradata and [Oracle's] Exadata are the same people who are now beginning to redefine their roles as Hadoop cluster administrators," he said. "They realize this is a brand new world." Also, expect to see demand for storage management professionals and for those who can help integrate Hadoop environments with existing relational database technologies.
Demand for Hadoop professionals falls into three broad categories: data analysts or data scientists; data engineers; and IT data management professionals, said Martin Hall, CEO of Karmasphere, which sells software products for Hadoop environments.
The data management professionals will be the ones who choose, install, manage, provision and scale Hadoop clusters, Hall said. These are the IT professionals who decide whether Hadoop is located in the cloud or on premises, which vendors to choose, which distribution of Hadoop to use, the size of the cluster and whether it will be used for running production applications or for quality-testing purposes.
The skills required for this role are similar to those required for doing the same tasks in traditional relational database and data warehouse environments, he said.
Hadoop data engineers, meanwhile, are those responsible for creating the data processing jobs and building the distributed MapReduce algorithms for use by data analysts. Those with skills in areas such as Java and C++ could find more opportunities as enterprises begin deploying Hadoop, he said.
The third category of professionals in demand is data scientists with experience in statistical tools such as SAS and SPSS and in programming languages such as R, Hall said. These are the professionals who will generate, analyze, share and integrate intelligence gathered and stored in Hadoop environments.
For the moment, the shortage of Hadoop manpower means companies need help from service providers to deploy the technology. One indication of this is the fact that the revenues generated by professional consulting and systems integration firms involving Hadoop are significantly larger than the revenues from the sale of Hadoop products, Kobielus said.
Companies such as Cloudera, MapR, Hortonworks and IBM today offer training courses in Hadoop that companies can take advantage of to build their own Hadoop centers of excellence, he said.
Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld. Follow Jaikumar on Twitter at @jaivijayan or subscribe to Jaikumar's RSS feed. His e-mail address is jvijayan@computerworld.com.