VANCOUVER, BC -- Last year's founding of the Open Data Platform Initiative (ODPi), a collaborative project of The Linux Foundation that aims to reduce complexity surrounding the Hadoop ecosystem, made waves in parts of the Apache Software Foundation (ASF), where some were concerned by the creation of an external organization that could exert influence over Apache projects.
At the Apache: Big Data North America conference in Vancouver, BC this week, the ODPi moved to ease those concerns through dialog and sponsorship of the ASF.
The idea behind ODPi's creation was to provide a big data kernel in the form of a tested reference core of Apache Hadoop, Apache Ambari and related Apache source artifacts. ODPi released a runtime specification and test suite earlier this year.
Big-name big data members
The organization has dozens of members. Most of them are big data solution providers like Hortonworks, Pivotal, EMC, IBM and SAS. Some end users are also in the mix, and the organization is encouraging new members to join, including more end users. Even so, one of the concerns within ASF is that ODPi's member companies now employ the majority of committers to Hadoop ecosystem projects.
"Right now, ODPi is this sort of super organization of Hadoop vendors," Jim Jagielski, senior director in the Tech Fellows program at Capital One and one of the developers and founders of ASF, said in a panel about the issue yesterday. "Worst case, there could be a concerted effort by a single entity to basically create a Hadoop ecosystem that ODPi wants and not necessarily what the community wants."
"It's something that we will be looking at," he added. "The thing that really differentiates the ASF model from a lot of the other models out there is that it really is focused on the individual developers."
For instance, there are currently two somewhat overlapping projects around authorization and data security in the Hadoop ecosystem: Apache Sentry, a top-level project supported by Hadoop distribution vendor Cloudera, and Apache Ranger (incubating), a project supported by Hadoop distribution vendor Hortonworks. Cloudera is not a member of ODPi but Hortonworks is. If ODPi certification becomes a differentiator for end users, the potential exists for ODPi to favor Apache Ranger implementations, regardless of merit.
That said, Jagielski added that he felt that scenario was unlikely.
"I think ODPi and the folks that are spending big money behind everything actually realize that they would be shooting the gift horse in the mouth," he said.
The nuclear option
It should also be noted that ASF holds the nuclear option of wiping the slate of committers to a project and establishing a whole new team.
Another concern expressed by some members of ASF is that the presence of ODPi as a middleman between Apache developers and end users could have a stifling effect on innovation.
"Part of the concern about an organization like ODPi is that it creates a layer, maybe an undue layer, between the people developing Hadoop and the end users," Jagielski said. "Open source, in general, thrives by a very, very tight feedback loop between the developers of the code and the end users of that code. One of the concerns that we have is making sure that that feedback loop is not curtailed in any way."
But the ODPi, says John Mertic, director of Program Management for ODPi and Open Mainframe Project at The Linux Foundation, is focused on downstream users. It seeks to gather insight from ISVs and other end users and surface those insights to the project communities at ASF.
"The ASF isn't really organized to do a lot of the, for lack of a better word, handholding," Jagielski acknowledged. "That's the sort of area where ODPi can really, really shine, to provide that sort of help for us."
"At the ODPi, we don't make software," Mertic added. "With what we do, our number one priority is making sure this technology can permutate into organizations of all sizes, all cultures, everywhere. I consider that we have a bit of a civic duty in doing this in an open source manner in getting this technology to as many people as possible."
In an effort to show that ODPi exists to support ASF's Hadoop ecosystem projects rather than muscle in on the foundation's turf, ODPi announced yesterday that it had become a gold sponsor of the foundation, joining existing ASF sponsors Pivotal, IBM, WANDisco and Hortonworks (all of which are also ODPi members).
"We have a number of our members that are sponsors at different levels as well, but we wanted to make a real public show of our support of the ASF," Mertic explained. "We are in support of the efforts of the ASF. For us, this is the start of a continuing dialog. We're not trying to fork Hadoop. We want to get rid of that idea."
"When we came out, we probably didn't do the right thing with the ASF," Mertic added. "This is probably a dialog we should have had then. We're both focused on the same thing. We both want Hadoop to succeed. We don't want to be in competition with each other. There's tons of barriers to Hadoop adoption. What can we do to resolve that?"