Yahoo drops its own Hadoop distribution
- 02 February, 2011 07:24
Yahoo is discontinuing its distribution of the Hadoop platform and will instead focus on Apache Hadoop, the Hadoop Team at Yahoo said this week.
Hadoop, which was built initially by Apache Chairman Doug Cutting while he was at Yahoo, has become prominent in data centers and cloud computing. Yahoo will halt its own distribution, remove all references to a Yahoo distribution from its Web site, and close its GitHub facility for Hadoop. "Our intent is to return to helping Apache produce binary releases of Apache Hadoop that are so bulletproof that Yahoo and other production Hadoop users can run them unpatched on their clusters," said Eric Baldeschwieler, vice president of Hadoop development at Yahoo, in the company's announcement.
The Apache Hadoop community has been "very turbulent" lately, according to Baldeschwieler. "Over the last few months we have been developing Hadoop enhancements in our internal git repository while doing a complete review of our options. Our commitment to open sourcing our work was never in doubt, but the future of the Yahoo distribution of Hadoop was far from clear. We've concluded that focusing on Apache Hadoop is the way forward," said Baldeschwieler.
Yahoo will have to sort out how to contribute several man-years' worth of work to Apache to "unwind the Yahoo git repositories," Baldeschwieler said. Yahoo has proposed a 20.100 release of Hadoop, featuring stability and high performance. Yahoo has also set up a feature branch called hadoop-future. A draft list of proposed features includes federation, with the ability to use more storage per Hadoop cluster; a new metrics framework; and optimization of the Hadoop MapReduce parallel applications framework for use with small jobs.
Yahoo said that until the Hadoop 0.20 release, Yahoo committers worked as release masters to produce binary Apache Hadoop releases for the entire community to use on clusters. "As the community grew, we experimented with using the Yahoo distribution of Hadoop as the vehicle to share our work. Unfortunately, Apache is no longer the obvious place to go for Hadoop releases. The Yahoo team wants to return to a world where anyone can download and directly use releases of Hadoop from Apache. We want to contribute to the stabilization and testing of those releases," Baldeschwieler said.
This article, "Yahoo drops its own Hadoop distribution," was originally published at InfoWorld.com.