Hadoop alternative to be open sourced

Information provider LexisNexis is releasing its HPCC Systems technology, but it may have an uphill battle

LexisNexis is planning to release its internally developed supercomputing platform as open source, providing developers with an alternative to the Hadoop framework for large-scale data processing, the company said Wednesday.

LexisNexis has been developing the technology, dubbed HPCC Systems, for the past 10 years, according to the company, which provides a variety of information services to legal firms, libraries, corporations and government entities.

"We've been doing this quietly for years for our customers with great success. We are now excited to present it to the community to spur greater adoption," said James Peck, CEO of LexisNexis' Risk Solutions division, in a statement. "We look forward to leveraging the innovation of the open source community to further the development of the platform for the benefit of our customers and the community."

HPCC Systems runs on clusters of commodity hardware and is made up of a number of components, centered around the company's Enterprise Control Language, a "declarative, data-centric programming language optimized for large-scale data management and query processing," LexisNexis said.

A component called Thor handles data ETL (extraction, transformation and loading) chores, while another component named Roxie delivers "highly scalable, high-performance online query processing and data warehouse capabilities," LexisNexis said.

The system is able to analyze petabyte-sized volumes of data "significantly faster and more accurately than current technology systems," scaling up to thousands of nodes, the company said.

LexisNexis will offer both a community edition and commercial enterprise edition of HPCC Systems, which will be overseen by company CTO Armando Escalante.

At first, HPCC Systems will be offered as a virtual machine for testing by the community, with full binaries and the source code to be issued a number of weeks later.

The community edition will be released under the GNU Affero GPL v3 license. New code contributed by LexisNexis and community members will go to the open-source edition first, according to a detailed FAQ document on the company's site.

However, LexisNexis stressed that HPCC Systems won't involve the release of any of its "data sources, data products, the unique data linking technology, or any of the linking applications that are built into its products."

The community edition will also have a number of limitations compared to the enterprise edition, such as a restriction of one Thor process per node, according to a comparison chart. It will also get only "basic testing against different Linux distributions," while the enterprise edition will undergo a much more rigorous certification.

Pricing for the enterprise edition, which is offered with a number of support tiers, was not available.

Enterprise Edition subscription customers have access to a number of add-on modules as well, including a tool that converts the Pig Latin language used in Hadoop to ECL.

ECL specifications will be released under a Creative Commons license, the company said.

Despite HPCC Systems' pedigree and apparent long-term success as an operational system for LexisNexis, it will face an uphill battle in the broader marketplace, according to Forrester Research analyst James Kobielus.

"I don't doubt [HPCC Systems] does what they say it does, but they're late to the game. I don't know that they are going to get a lot of traction, because a lot of vendors and users have placed their bets with Hadoop," he said.

LexisNexis will also be hindered by the fact that it is not a database or data warehousing company, and despite its claims about HPCC Systems' capabilities, the company "is not going to win a performance battle against the whole Hadoop community," Kobielus said. "Venture capital money is going into startups left and right to involve that community."

Chris Kanaracus covers enterprise software and general technology breaking news for The IDG News Service. Chris's email address is Chris_Kanaracus@idg.com
