Is the concept of the enterprise data warehouse dead? For a show called Data Warehouse World, there have been a lot of attacks here on the idea of data warehouses.
The thrust of many panel sessions and discussions here at the show, due to close tomorrow, was summed up in the title of one session, "The Great Data Warehouse Debate -- Enterprise Data Warehouse vs. Data Mart."
But the topic was not confined to this one panel; many keynote and session speakers spent a great deal of time addressing the subject.
Data warehouses, designed to provide decision support for top-level corporate managers, integrate data related to the subjects a company needs to analyze to do business effectively. Data marts are a subset of the data warehouse, containing less data and less history, and customized for a specific department.
Data marts are built so that end users in different departments can quickly get answers to their queries, and to give departmental managers control over their own decision support systems.
"That's where the end user gets to touch and feel things," said Bill Inmon, founder of Pine Cone Systems Inc. and a speaker here.
Inmon, however, is an unabashed advocate of building enterprise data warehouses before data marts. Building data marts results in the problem of integrating different departmental-level data -- the very issue that data warehouses try to solve in the first place, Inmon said.
"There is massive redundancy of common data because each data mart wants its own source of common data," said Inmon. "There is no sharing."From an engineering point of view, data marts pose the problem of requiring a separate interface to be built from each data source -- for example, transaction processing applications, telemarketing applications, shop-floor inventory applications -- to each data mart.
If a data warehouse is built first, only one interface needs to be built from each source to the warehouse, and then one interface from the warehouse to each mart. This eliminates the need for multiple sources of data to be linked to multiple marts, Inmon said.
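The arithmetic behind Inmon's point can be sketched in a few lines of Python. This is only an illustration under a worst-case assumption that is not stated in the talk: every data mart draws on every data source. The function names are hypothetical.

```python
def interfaces_marts_first(num_sources: int, num_marts: int) -> int:
    """Marts built directly on sources: one interface per (source, mart) pair.

    Assumes, as a worst case, that every mart needs every source.
    """
    return num_sources * num_marts


def interfaces_warehouse_first(num_sources: int, num_marts: int) -> int:
    """Warehouse in the middle: one interface per source into the warehouse,
    plus one interface from the warehouse out to each mart."""
    return num_sources + num_marts


if __name__ == "__main__":
    # Illustrative sizes only; these are not figures from the conference.
    for sources, marts in [(3, 2), (3, 5), (10, 8)]:
        direct = interfaces_marts_first(sources, marts)
        hub = interfaces_warehouse_first(sources, marts)
        print(f"{sources} sources, {marts} marts: "
              f"{direct} direct interfaces vs {hub} via a warehouse")
```

With only a couple of sources and marts the difference is small, but the direct approach grows multiplicatively while the warehouse approach grows additively.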
Building a data warehouse first will also ensure that metadata definitions are uniform throughout the enterprise, allowing data to be shared, speakers and attendees agreed.
Metadata can be divided into three broad categories: that needed for users, for database data models, and for technical implementation, according to Jim Doak, a solutions architect for database and client/server applications at IBM's Toronto Software Lab. Doak was here to help out in IBM's booth on the show floor.
There is metadata for users, who need to know how business terms are defined; for database data models, which need definitions for the entities they store and their relationships; and for the technology implementation, which needs specifications for source and target applications and for how data is to be propagated over which protocols.
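One way to picture Doak's three categories is as three kinds of records kept in a single shared catalog. The sketch below is an assumption made for illustration, not a description of any IBM product; all class and field names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BusinessTerm:
    """User-facing metadata: how a business term is defined."""
    name: str
    definition: str


@dataclass
class EntityModel:
    """Data-model metadata: a stored entity and the entities it relates to."""
    name: str
    attributes: list[str]
    related_to: list[str] = field(default_factory=list)


@dataclass
class PropagationSpec:
    """Technical metadata: source and target applications and the protocol used."""
    source_app: str
    target_app: str
    protocol: str


@dataclass
class MetadataCatalog:
    """Hypothetical enterprise-wide catalog holding all three categories."""
    user_terms: list[BusinessTerm] = field(default_factory=list)
    data_models: list[EntityModel] = field(default_factory=list)
    propagation: list[PropagationSpec] = field(default_factory=list)

    def counts(self) -> dict[str, int]:
        """Number of entries recorded in each category."""
        return {
            "user": len(self.user_terms),
            "data_model": len(self.data_models),
            "technical": len(self.propagation),
        }


if __name__ == "__main__":
    catalog = MetadataCatalog()
    catalog.user_terms.append(
        BusinessTerm("customer", "A party that has placed at least one order"))
    catalog.data_models.append(
        EntityModel("Customer", ["id", "name"], related_to=["Order"]))
    catalog.propagation.append(
        PropagationSpec("order_entry", "warehouse", "batch file transfer"))
    print(catalog.counts())
```

Keeping the three categories in one catalog reflects the uniformity argument made above: every mart would read its definitions from the same place.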
"You can build data warehouses any way you want," said Inmon. But if data marts are built first, "you are going to have big problems," he added. "There is no uniformity and no way ever to achieve uniformity."While many speakers and attendees here agreed that in concept it's best to build a data warehouse first and then move on to data marts, in practice this is not happening.
Eighty percent of data warehouse applications being built now are actually data marts, according to Doug Hackney, president of consultancy Enterprise Group Inc., citing GartnerGroup Inc. figures. Hackney spoke at the "Great Data Warehouse Debate" today.
According to a cross-section of speakers and attendees here, data warehouse projects are so lengthy and costly that companies are increasingly starting with the data mart idea. Data warehouses can take two to three years and two to five million dollars to build, according to Hackney. Data marts can take three to six months and cost less than a million dollars -- though costs can run into several million dollars, Hackney conceded.
The other element involved in building data warehouses is that they are very risky from a company-politics point of view, Hackney said. If a project of that size fails, the project manager is thrown out the door. In addition, since corporate executives' performance is measured in terms of quarters, it is difficult to get and maintain corporate sponsors for a project that takes years, and difficult to keep a team together that long, Hackney said.
Attendees and vendors here agreed that enterprise data warehouses, though based on valid underlying concepts, are being overtaken by practical reality.
"There are a lot of problems involved in starting off with a data mart," said Andy Hood, a systems engineer for Silicon Graphics Inc., and an attendee here.
"If the different data marts use different models, you've got a problem there."But SGI is still selling systems to be used for data marts, he confirmed.
And while IBM in the early 1990s pushed data warehouse applications on the basis of an enterprisewide technology blueprint, lately the company's focus has been on data marts, according to Doak.
"Data warehouses are just too much for people to get their arms around ... they can take years," said Doak.
What's the answer? The consensus emerging here is to take a hybrid approach, adopting the data warehouse methodology and building data marts with enterprise needs in mind.
To do this, Hackney suggested, department managers should be gathered to agree on the common ground on which their different marts should be built.
Broad agreement should be reached on five basic issues: subjects on which an enterprise is interested in maintaining data; basic business dimensions (such as factory production or sales revenue by salesperson); metrics and measures; business rules; and semantics defining basic terms.
Only general principles need to be agreed on, and the exercise shouldn't take more than one day, Hackney said.
By doing this a company stands a better chance of being able to integrate the data mart information when it feels ready to build a full-scale data warehouse, said Hackney.
Many industry insiders here agreed with the start-small approach, but issued some caveats.
"If you're experienced with data warehouses, fine, go for it," said Tony Rodoni, executive director of the data warehouse business unit at Informix Software Inc. and a speaker at the show. "But if you're not, you might not want to start with such a big project. Just make sure you're data mart is built on scaleable technology, so you have room to grow."Other industry insiders also agreed with the hybrid approach.
"There is the danger of stovepipe solutions," where source data gets sent to a data mart and can't be shared, said Michael Brill, vice president of marketing for Whitelight Systems Inc. Though just an attendee here, Brill will within months be trying to sell a product designed to integrate data from data marts.
"If different departments have different definitions of a customer, then you're going to have trouble sharing customer data," Brill said. Hackney's suggestion that company's reach agreement on high-level basic terms would go a long way to solving that problem, Brill said.
But there are other problems to be surmounted. As Pine Cone's Inmon warned, data marts might require multiple interfaces to multiple data sources, which would probably need to get reworked when a data warehouse is built.
"Sure, you might need to re-engineer things, but that's life," Brill said.
One way to avoid large-scale re-engineering of the interfaces among the various data marts and data sources is to construct a limited number of data marts before tackling an enterprise data warehouse, said Informix's Rodoni. In addition, the architecture of the enterprise data warehouse should start to be laid out as data marts are being built, to help ensure uniformity of data, he said.
While such an approach might be a compromise, it answers real-world needs.
"Architecturally, galactic data warehouses are right on. You wouldn't dream of doing it any other way," said Hackney. "But we've gone out of ... the ivory tower, out on to reality."Information on Data Warehouse World and speakers at the conference can be found at http://wwwdciexpo.com/.