Free eBook: Data Warehouse Off-loading (62 pages)
Read this definitive guide to implementing a data warehouse off-loading pattern with data virtualization.
Operating a data warehouse is expensive, so companies are choosing to off-load part of their historical data onto cheaper Hadoop-based platforms such as Cloudera and Hortonworks to drive costs down. However, introducing Hadoop alongside an existing data warehouse creates new problems with no simple way to bridge the gap: there are now two data stores in place with very different modes of access, protocols, data formats, and performance and security capabilities. These data silos present several challenges, such as creating a unified report that combines data from the two systems. IT teams may respond by physically integrating the data, but such approaches are time-consuming, resource-intensive, and expensive.
Table of Contents:
- INDEX OF FIGURES
- HYBRID DATA WAREHOUSE
- DATA WAREHOUSE OFF-LOADING PATTERN
- DATA PARTITIONS IN DATA VIRTUALIZATION
- EXAMPLE USE CASE
- DATA WAREHOUSE OFF-LOADING IN ACTION
- APPENDIX A: Loading the Sample Data in Teradata
- APPENDIX B: Loading the Sample Data in Cloudera
- APPENDIX C: Loading the Sample Data in Hortonworks
- APPENDIX D: Raw Performance Data