Executive Summary

Data Warehouse Appliances: Evolution or Revolution?

The concept of a low-cost, high-performance data warehouse appliance is a hot topic. This is due not only to the increasing number of new data warehouse appliance vendors entering the market, but also to the competing solutions now being offered by established system hardware and software vendors. Further, demand for data warehousing among businesses, both large and small, is steadily increasing.

The objective of this research study was to analyze the current state of the art in data warehouse appliance technology and products, and to evaluate the strengths, weaknesses and benefits of an appliance approach to data warehousing. The study also looked at how organizations are using appliances in their data warehousing projects, and included a survey to gauge industry perceptions about the value and acceptance of data warehouse appliances. The research was sponsored by leading vendors of data warehouse appliance solutions, namely DATAllegro, Greenplum, IBM, Netezza, and Teradata.

A Historical Perspective

Although data warehouse appliances are usually perceived as a recent innovation, this is not the case. The concept of low-cost, integrated hardware and software for data-intensive operations first appeared in the 1980s in the form of database machines. These machines were designed primarily to offload processing from expensive mainframe computers. Interest in database machines waned with the advent of cheaper client/server computing solutions. After a hiatus of many years, interest picked up again as companies began to face increasing costs for handling the rapidly growing data volumes in their data warehouses. The result was the emergence of the data warehouse appliance.

What is a Data Warehouse Appliance?

An appliance has one purpose, comes in one package, requires one install, and has one support point. Our research study defines the purpose of a data warehouse appliance to be:

“The enablement of high-performance data warehousing with a total cost of ownership (TCO) that provides a rapid return on investment (ROI) to the business.”

The two important phrases to note in this definition are “high-performance data warehousing” and “rapid return on investment (ROI) to the business.” There is a business case for using a data warehouse appliance only if its total installation and operating costs are less than the business benefits achieved by its use. Current business pressures also make it important that this ROI be achieved as rapidly as possible.

All the customers interviewed for the study were achieving significant ROI and were very satisfied with the appliances they were using. All of them also commented, however, that the total cost of a data warehousing project involves more than just the cost of the data warehouse appliance. The largest cost of many projects is the effort involved in acquiring and integrating source data for loading into a data warehouse. A data warehouse appliance does not reduce this cost.
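The business-case condition described above (benefits must exceed total installation and operating costs, and the return should arrive quickly) can be sketched as simple arithmetic. The following is a minimal illustration, not a method taken from the study: the function names, the flat yearly cost and benefit model, and all figures are hypothetical. The `integration_cost` parameter is an assumption that reflects the customers' comment that source data integration is a project cost the appliance does not reduce.

```python
def payback_years(install_cost, integration_cost, annual_operating_cost, annual_benefit):
    """Years needed for the net yearly benefit to repay the up-front costs.

    Returns None when the yearly benefit does not exceed the yearly
    operating cost, i.e. when this simple model shows no business case.
    """
    net_annual = annual_benefit - annual_operating_cost
    if net_annual <= 0:
        return None
    return (install_cost + integration_cost) / net_annual


def roi(install_cost, integration_cost, annual_operating_cost, annual_benefit, years):
    """Return on investment over a period: (benefit - cost) / cost."""
    total_cost = install_cost + integration_cost + annual_operating_cost * years
    total_benefit = annual_benefit * years
    return (total_benefit - total_cost) / total_cost


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the calculation.
    figures = dict(
        install_cost=400_000,
        integration_cost=200_000,
        annual_operating_cost=100_000,
        annual_benefit=400_000,
    )
    print("Payback period (years):", payback_years(**figures))
    print("ROI over 3 years:", round(roi(**figures, years=3), 3))
```

Under this simplified model, a shorter payback period corresponds to the "rapid ROI" in the definition, and a larger integration cost lengthens the payback period regardless of which appliance is chosen.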

Types of Data Warehouse Appliance

The study identified four types of data warehouse appliance.

  • Native data warehouse appliance where the hardware and software are tightly integrated into a single data warehouse solution. The software and hardware are not individually licensed and cannot be separated. Examples of vendors providing native data warehouse appliances include DATAllegro, Netezza, and Teradata.

  • Software data warehouse appliance where commercial or open source relational DBMS software is designed and/or optimized for data warehouse processing. The software supports hardware solutions purchased from one or more third-party vendors. Examples of vendors providing software data warehouse appliances include Greenplum and Sybase (Sybase IQ).

  • Packaged data warehouse appliance where commercial software and hardware are tuned for data warehousing, are packaged and supplied by a single vendor, and are installed and maintained as a single system. Examples of vendors providing packaged data warehouse appliances include HP (NeoView), IBM (Balanced Warehouse), and Sun/Greenplum (Data Warehouse Appliance).

  • Data management appliance that offloads data-intensive operations from a host computer. The offloaded workload may involve operational, specialized analytic, or archival processing. Examples of vendors providing data management appliances include ParAccel and Dataupia.

Evaluation of the various solutions in the marketplace showed that it is becoming increasingly difficult to determine the best product for any given data warehousing project. The biggest difference between the products was their ability to support enterprise data warehousing projects involving mixed workloads of complex and simple queries, and concurrent data warehouse updates. Important distinguishing features here are performance and scalability, hardware and software reliability, flexibility for expansion, support for preferred third-party hardware and software suppliers, workload management features, and data management and utility capabilities. All the customers interviewed, however, recommended that a full proof of concept study be performed in addition to feature/function comparisons.

Survey Results

A survey of 254 IT professionals carried out during the study showed that nearly 50 percent of the respondents were actively involved with data warehouse appliances, whether evaluating, planning, deploying, or running these products. Of those using data warehouse appliances, 50 percent said the solution performed well and satisfied their needs; 29 percent said the project was more difficult than expected or the appliance fell below their expectations. Only about 10 percent of respondents indicated that they did not consider a data warehouse appliance a viable approach. The top three features cited for data warehouse appliances were high performance, data scalability, and high reliability and availability.
