Splice Machine, the startup behind a dual-engine relational database management system (RDBMS) powered by Apache Hadoop and Apache Spark, last week announced that it would release that technology to open source.
Splice Machine uses resource isolation — separate processes and resource management for its Hadoop and Spark components — to ensure that large, complex online analytical processing (OLAP) queries don't overwhelm time-sensitive online transaction processing (OLTP) queries. The hybrid architecture allows you to run analytical workloads and transactional workloads concurrently — a boon for use cases ranging from digital marketing to ETL acceleration, operational data lakes, data warehouse offloads, Internet of Things (IoT) applications, web, mobile and social applications and operational applications.
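To make the isolation idea concrete, here is a minimal sketch of how a dual-engine system might keep the two workloads apart. It is an illustration only, not Splice Machine's implementation: the `DualEngineRouter` class, the `OLAP_ROW_THRESHOLD` value and the row-count heuristic are all hypothetical, and thread pools stand in for the separate Hadoop and Spark processes described above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cutoff: queries expected to touch more rows than this are
# treated as analytical (OLAP). This is not a real Splice Machine setting.
OLAP_ROW_THRESHOLD = 10_000


class DualEngineRouter:
    """Toy router that gives each workload type its own worker pool."""

    def __init__(self, oltp_workers=4, olap_workers=2):
        # Separate pools mean a backlog of long analytical queries can
        # never occupy the workers reserved for short transactions.
        self.pools = {
            "oltp": ThreadPoolExecutor(max_workers=oltp_workers),
            "olap": ThreadPoolExecutor(max_workers=olap_workers),
        }

    def classify(self, estimated_rows):
        """Pick an engine from a (hypothetical) row-count estimate."""
        return "olap" if estimated_rows > OLAP_ROW_THRESHOLD else "oltp"

    def submit(self, estimated_rows, work, *args):
        """Run `work(*args)` on the matching pool; return (engine, future)."""
        engine = self.classify(estimated_rows)
        return engine, self.pools[engine].submit(work, *args)

    def shutdown(self):
        """Wait for outstanding work on both pools, then release them."""
        for pool in self.pools.values():
            pool.shutdown(wait=True)
```

A caller would create one `DualEngineRouter`, pass each query's estimated size to `submit`, and read the result from the returned future. The design point is simply that each workload type has its own capacity budget, so heavy analytics cannot starve time-sensitive transactions.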
Founded in 2012, the San Francisco-based company has been attracting customers and investors alike. It's raised $31 million in four rounds from five investors — in its most recent round, in January of this year, it raised $9 million in series C funding. Last year, IDG Connect named it one of 20 red-hot, pre-IPO companies in B2B technology.
Splice Machine Co-Founder and CEO Monte Zweben says starting as a proprietary software company was the right decision — it's much easier for a software project to get somewhere with a small group of developers that are all under one direction, he says. But this is the right time to transition the company's RDBMS to open source and grow its user base.
"From our perspective, it's clear that the proprietary software model adds friction to the adoption process," he says. "We want tens of thousands of users, not hundreds of users. That's a major new opportunity, given the open source movement."
Until now, Zweben explains, Splice Machine has gone to market with a proprietary software model that was appropriate for a young startup. The sales team would ask hard sales questions, like: "What's this project about? Is there budget for it? Who's your boss?"
But that has a chilling effect on developers seeking to learn something new.
"If you're a developer interested in simply examining the state of the art and experimenting, that's not a conversation you're comfortable with," Zweben says.
Zweben says that releasing the technology to open source will let developers experiment freely with it and, ultimately, lead to more enterprise deployments.
"Customers view open source software as standard," he says. "This is an insurance policy for them. There's no single point of failure. If we're acquired or change focus, there's a large community that can continue development. It reduces vendor lock-in."
As an added benefit, he notes, a large open source community around a technology means organizations can find talent who know the technology inside and out.
"The evolution of Splice Machine from being the first transactional RDBMS on Hadoop, to incorporating Apache Spark as an analytical engine, has been amazing to watch as a member of their Advisory Board," Mike Franklin, former Chair of the School of Computer Science at UC Berkeley and incoming Chair of Computer Science at the University of Chicago, said in a statement last week. "Our AmpLab at Berkeley has initiated many open source projects, including Apache Spark, Apache Mesos and Alluxio (formerly Tachyon). I applaud Splice Machine in taking the leap and joining the open source community."
Of course, releasing a technology to open source is one thing; building a successful community around it is another.
"There is no doubt that the developer community does not come automatically," Zweben says. "We're under no illusions about that. I'm pivoting the majority of our marketing spend on not generating leads in terms of enterprises but actually building out an entire community infrastructure."
That will include investing in evangelists, a community site, Slack and IRC channels, best practices and more.
"We're so excited to build this community around us," he says. "Hopefully it will deliver the only cost-effective dual-workload database that can handle the analytical and transactional needs of modern applications."
Splice Machine is releasing its technology on its own GitHub, and is "90 percent of the way" to applying to the Apache Software Foundation's Apache Incubator for the project.
"We put out a call last week for mentors and champions," Zweben says. "We've had a healthy response to that from the Apache community."
Splice Machine plans to offer its RDBMS in a free, full-featured Community Edition and a licensed Enterprise Edition. The Enterprise Edition license will include support and features focused on operations. These won't be features required to use a database, Zweben hastens to add. They'll be features useful for running, governing and tuning an RDBMS around the clock: backup and restore, authentication, security.
"The support model, I think, is less successful in the marketplace," Zweben says. "It seems as if the best way to build an open source company, having spoken to many different CEOs, bankers and venture capitalists, is actually to create a free community edition and an enterprise edition."