Open Source 'Lingual' Helps SQL Devs Unlock Hadoop

One commonly cited stumbling block to the broader adoption of Apache Hadoop in the enterprise is the difficulty and expense of finding and hiring developers who understand and can think in Hadoop MapReduce. But Big Data application framework specialist Concurrent wants to change that.

Concurrent is already the driving force behind Cascading, a stand-alone open source Java application framework designed by Concurrent founder and CTO Chris Wensel as an alternative API to MapReduce.

Cascading gives Java developers the capability to build Big Data applications on Hadoop using their existing skillset. Now Concurrent hopes to help SQL users get into the act with Lingual, an open source ANSI-standard SQL engine that runs on top of Cascading.

Lingual Lets Analysts, Developers Tap SQL Skills for Hadoop

Lingual, which will be publicly available under the Apache 2.0 license within the next few weeks, gives analysts and developers familiar with SQL, JDBC and traditional BI tools the capability to create and run Big Data applications on Hadoop using their existing skillsets.

"Concurrent was established with the belief that there had to be a simpler path to mass Hadoop adoption," Wensel says. "And since day one, we have worked to create solutions that make it easier for developers to build powerful and robust Big Data applications quickly and easily. With the Lingual project, we are one huge step closer to realizing our mission."

To date, many Hadoop users have turned to Apache Hive (a data warehouse infrastructure built for Hadoop) and Apache Pig (a high-level platform for creating MapReduce programs) to achieve SQL-like capabilities.

"Pig and Hive have their own qualities and actually are quite good, but sometimes you just want SQL," Wensel says. "Lingual is great for people who don't know how to use Hadoop but know SQL. The best way to get value out of something in many cases is just to use SQL."

"We just want to make it easier for people to get data off Hadoop or to port their apps to Hadoop using skills they already know," he adds.

Use Cases for Open Source Lingual SQL Parser

Example use cases for Lingual include the following:

  • Giving data analysts, scientists and developers the capability to "cut and paste" existing ANSI SQL code from traditional data warehouses and instantly access data locked on a Hadoop cluster

  • Giving developers the capability to use a standard Java JDBC interface to create new Hadoop applications or use any of the Cascading APIs and languages, like Scalding and Cascalog

  • Giving companies the capability to query and export data from Hadoop directly into traditional BI tools
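The JDBC use case above can be sketched with nothing but the standard java.sql API. This is a minimal illustration, not Concurrent's own example: it assumes a JDBC driver for the target system is already on the classpath and that the caller supplies a connection URL in whatever form that driver documents. The table name events and the column name event_type are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class LingualJdbcSketch {

    // Builds a plain ANSI SQL aggregate. The table and column names are
    // hypothetical; identifiers are not escaped, so pass only trusted names.
    static String countByQuery(String table, String column) {
        return "SELECT " + column + ", COUNT(*) AS total FROM " + table
                + " GROUP BY " + column;
    }

    // Runs the query over any java.sql.Connection and prints one line per group.
    static void printCounts(Connection connection, String sql) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(sql)) {
            while (rows.next()) {
                System.out.println(rows.getString(1) + "\t" + rows.getLong(2));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.err.println("usage: LingualJdbcSketch <jdbc-url>");
            return;
        }
        // Assumption: args[0] is a URL understood by a driver on the classpath.
        String sql = countByQuery("events", "event_type");
        try (Connection connection = DriverManager.getConnection(args[0])) {
            printCounts(connection, sql);
        }
    }
}
```

Because the code depends only on java.sql interfaces, the same query text could in principle be pointed at a traditional data warehouse or at a Hadoop-backed SQL engine by changing the URL alone, which is the portability the "cut and paste" use case describes.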

"We are very excited about the prospect of using standard SQL to provide seamless access to the billions of events that we track daily," says Zack Shapiro, director of engineering at Kontagent, a Concurrent customer.

"Rather than filtering through events and exporting them to MySQL, our customer support staff and data scientists will finally be able to work with tools they already know to query the raw data directly within our Hadoop cluster through the use of Lingual and Cascading," Shapiro says.

Cascading has already been adopted by some of the biggest and best-known Big Data companies, including eBay, Etsy and Twitter. Twitter uses Cascading to streamline its data processing, data filtering and workflow optimization for large volumes of unstructured and semi-structured data. Cascading is also the driving force behind three popular open source language extensions: PyCascading (Python + Cascading), Scalding (Scala + Cascading) and Cascalog (Clojure + Cascading).

"eBay has picked that up and is running it as well," Wensel notes. "All of eBay's search is now running on Scalding."

Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers. Follow Thor on Twitter @ThorOlavsrud.
