The Apache Software Foundation (ASF) will hold its second annual Apache: Big Data North America conference in Vancouver, BC, starting Monday next week. Alongside keynotes from companies like Netflix and IBM, and panels on a huge range of topics — from security and storage to managing distributed systems and machine learning — the foundation will also host a forum that looks to cut to the heart of its community model and how private companies should be involved in its work.
On Wednesday afternoon, Jim Jagielski, senior director in the Tech Fellows program at Capital One and one of the developers and founders of the ASF, and John Mertic, director of Program Management for ODPi and Open Mainframe Project at The Linux Foundation, will host a panel dubbed "ODPi and ASF Collaboration: Ask Us Anything!"
"Jim cares deeply about the foundation and our licensing and community models," says Rich Bowen, executive vice president of the Apache Software Foundation. "They're going to be talking about how ODPi, as an external organization, will collaborate with the projects within the organization. There are some concerns within the foundation about ODPi's model, though I think the folks there have their hearts in the right place."
The nonprofit Open Data Platform Initiative (ODPi) is a collaborative project of The Linux Foundation that formed last year to reduce the complexity surrounding the Hadoop and big data environment. The idea was to provide a big data kernel in the form of a tested reference core of Apache Hadoop, Apache Ambari and related Apache source artifacts. ODPi released a runtime specification and test suite earlier this year.
ODPi has dozens of members. Most of them are big data solution providers like Hortonworks, Pivotal, EMC, IBM and SAS, though end users are also in the mix.
"One of the things we do at Apache is we provide a place where projects can do their thing," Bowen says. "As the foundation, we don't provide them a lot of direction. One of the big concerns that we have, and the reason [primary Apache Web server developer] Brian Behlendorf insisted the Apache Web server be under a permissive license in the first place, is that we at the foundation are very concerned about project independence."
Bowen explains that the foundation considers it essential that projects not be governed by any particular company. When they are, he says, and the company loses interest, projects often wither on the vine and die.
"One of [ODPi's] stated goals is to provide road maps to these projects," he says. "From a consumer perspective, this is very appealing. As the Apache Foundation, we are concerned that some of the organizations that are part of ODPi will exercise undue influence over various projects. We have companies that are involved in Apache projects that don't respect our trademarks, that speak of these projects as though they control them and indeed operate as though they control them. Project independence is, I believe, critical for open source projects to survive."
"I think John and his crew at ODPi care about this," he adds. "I think we can achieve a useful dialog."
That sort of dialog was largely the reason for the birth of ApacheCon in 2000. ApacheCon will be taking place alongside Apache: Big Data North America in Vancouver next week.
Sharing ideas (and code)
"ApacheCon tends to be traditionally focused on community-building events, inter-project bonds," Bowen says. "People aren't operating in a vacuum. Projects need to know what other projects are doing so there are points of connection. It's about sharing ideas and sharing code."
Last year, given the enormous growth of big data projects at ASF, the foundation split the big data portion off into its own simultaneous event.
"We now have around 300 projects that are represented in some way at this event," Bowen explains. "That presents challenges as far as presenting a particular focus. That is why last year we started doing a big data-focused event. The big data software world happens at Apache; most of the major big data projects happen at Apache."
Bowen notes that big data projects have become the most active projects at ASF in terms of mailing list activity and code commits.
"It seems like every month or two we're graduating a new project from the Incubator and over half of them are in the big data space," he says. "This is just a new chapter in the Apache evolution. We're seeing big data projects just really taking over the foundation in terms of active projects."
Step up to the barcamp
Apache: Big Data is about bringing all those projects together under one roof so project code committers and developers can collaborate with users and discuss the issues, technologies, techniques and best practices shaping the data ecosystem.
Bowen says he's especially excited about BarCampApache, scheduled for Wednesday.
"It will use a traditional BarCamp model," he says. "You show up with your ideas and the schedule is made up on the fly. It's less formal and more of a round table discussion. That's an event I would love to see more people at. Every year, some new project gets spawned out of a BarCamp event."