Informatica rolls out data parser for Hadoop
- 03 November, 2011 07:09
Informatica has strengthened its hand in the burgeoning market for Hadoop, the open-source programming framework for large-scale data processing, unveiling a new data parser on Wednesday that can transform piles of unstructured information into a more structured form for use in running Hadoop jobs.
The release builds on Informatica's June release of a Hadoop connector, which was aimed at data movement in and out of a Hadoop cluster, rather than data transformation. It also comes amid a wave of announcements from vendors such as Sybase and MarkLogic in the run-up to next week's Hadoop World conference.
Hadoop has emerged as one of the highest-profile technologies associated with "Big Data," an industry buzzword referring to the large amounts of unstructured information generated by websites, sensors and other non-relational sources, as well as the desire by companies to sift through such data for insights about their customers and businesses.
Informatica has been in the data-parsing business for some time. The new product, HParser, includes a set of libraries for various data types, from standards like XML to industry-specific formats such as HIPAA, which is used in healthcare, and ASN.1 for telcos.
It comes in three editions: two commercial versions, HParser Industry Standards and HParser for Documents, and a community version. The community version is available at no cost, but premium services and add-ons are for sale.
Also on Wednesday, Informatica announced that the community version of HParser will be available to download from the website of Hortonworks, a Yahoo spinoff that announced a preview version of its own Hadoop distribution this week.
The news drew a pair of thumbs-up from industry analysts.
The parser represents "great news for the Hadoop community," as it gives them "field-proven" technology, said James Kobielus, senior analyst with Forrester Research.
The Hortonworks announcement illustrates the "sorts of vendor partnerships that Hortonworks is building in the Hadoop community that will drive continued development of the fully open-source Apache Hadoop stack," Kobielus added.
One big stumbling block for Hadoop has been that many IT shops don't have the skills to easily adopt it. HParser's graphical development environment could help mitigate this problem, wrote David Menninger, vice president and research director at Ventana Research, in a blog post Wednesday.
"Using a graphical environment to develop these routines should make it easier and faster to create the code necessary to parse the data," he wrote.
Chris Kanaracus covers enterprise software and general technology breaking news for The IDG News Service. Chris's e-mail address is Chris_Kanaracus@idg.com