Data Vault and Data Virtualization: Double Agility

A Technical Whitepaper

The Problems of Data Vault

Data Vault is a modern approach for designing enterprise data warehouses. Its two key benefits are data model extensibility and reproducibility of reporting results. Unfortunately, from a query and reporting point of view, a Data Vault model is complex. Developing reports straight on a Data Vault-based data warehouse requires very complex SQL statements that almost always result in poor reporting performance. The reason is that in such a data warehouse the data is distributed over a large number of tables, so even a simple report has to join many of them.
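To make the join complexity concrete, the following is a minimal sketch rather than a reference design. It assumes a hypothetical miniature Data Vault with two hubs, one link, and three satellites; all table and column names (hub_customer, sat_customer_address, load_end_date, and so on) are invented for illustration and do not come from any particular implementation. It uses Python's built-in sqlite3 module with an in-memory database. The report "total order amount per customer and current city" touches six tables and must filter each satellite on its current version.

```python
import sqlite3

# Hypothetical miniature Data Vault: two hubs, one link, three satellites.
# All names are illustrative assumptions, not taken from a real model.
SCHEMA = """
CREATE TABLE hub_customer (customer_hk INTEGER PRIMARY KEY, customer_bk TEXT);
CREATE TABLE hub_order (order_hk INTEGER PRIMARY KEY, order_bk TEXT);
CREATE TABLE link_customer_order (customer_hk INTEGER, order_hk INTEGER);
CREATE TABLE sat_customer_name (
    customer_hk INTEGER, load_date TEXT, load_end_date TEXT, name TEXT);
CREATE TABLE sat_customer_address (
    customer_hk INTEGER, load_date TEXT, load_end_date TEXT, city TEXT);
CREATE TABLE sat_order_amount (
    order_hk INTEGER, load_date TEXT, load_end_date TEXT, amount REAL);
"""

# Sample rows; customer 1 has moved, so the address satellite holds two versions.
ROWS = """
INSERT INTO hub_customer VALUES (1, 'C-001'), (2, 'C-002');
INSERT INTO hub_order VALUES (10, 'O-100'), (11, 'O-101'), (12, 'O-102');
INSERT INTO link_customer_order VALUES (1, 10), (1, 11), (2, 12);
INSERT INTO sat_customer_name VALUES
    (1, '2020-01-01', NULL, 'Alice'),
    (2, '2020-01-01', NULL, 'Bob');
INSERT INTO sat_customer_address VALUES
    (1, '2020-01-01', '2021-01-01', 'Utrecht'),
    (1, '2021-01-01', NULL, 'Amsterdam'),
    (2, '2020-01-01', NULL, 'Rotterdam');
INSERT INTO sat_order_amount VALUES
    (10, '2021-02-01', NULL, 50.0),
    (11, '2021-03-01', NULL, 25.0),
    (12, '2021-03-05', NULL, 40.0);
"""

# One simple report: six tables, five joins, three "current version" filters.
REPORT = """
SELECT n.name, a.city, SUM(o.amount) AS total
FROM hub_customer hc
JOIN sat_customer_name n
  ON n.customer_hk = hc.customer_hk AND n.load_end_date IS NULL
JOIN sat_customer_address a
  ON a.customer_hk = hc.customer_hk AND a.load_end_date IS NULL
JOIN link_customer_order l ON l.customer_hk = hc.customer_hk
JOIN hub_order ho ON ho.order_hk = l.order_hk
JOIN sat_order_amount o
  ON o.order_hk = ho.order_hk AND o.load_end_date IS NULL
GROUP BY n.name, a.city
ORDER BY n.name
"""


def run_report():
    """Build the in-memory vault, run the report, and return its rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript(ROWS)
    rows = conn.execute(REPORT).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_report():
        print(row)
```

In a conventional star schema the same question would typically need one fact table and one dimension. A production Data Vault usually has far more hubs, links, and satellites than this sketch, so the number of joins per report grows accordingly.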

The Typical Approach to Solving the Data Vault Problems

To solve the performance problems with Data Vault, many organizations have developed countless derived data stores that are designed specifically to offer much better performance. In addition, a multitude of ETL programs are developed to refresh these derived data stores periodically. Although such a solution solves the performance problems, it introduces several new ones. Valuable time must be spent on designing, optimizing, loading, and managing all these derived data stores. The existence of all these extra data stores diminishes the very flexibility that organizations try to gain by implementing Data Vault. Philosopher and statesman Francis Bacon would have said: "The remedy is worse than the disease."

