Ventana Research: Fulfilling the Promise of Data Lakes
The expanding volume and variety of data originating from sources both internal and external to the enterprise make it difficult for businesses to harness their big data for actionable insights. To overcome these challenges, organizations are exploring data lakes as consolidated repositories of massive volumes of raw, detailed data of various types and formats. But creating a physical data lake presents its own hurdles, one of which is the need to store the data twice, which can lead to governance challenges with regard to data access and quality. Data lakes can also become data silos, since they are often built for particular departments, such as Marketing, and must subsequently be combined with other enterprise data (e.g., CRM, ERP, or other data lakes) for analysis.
To overcome the limitations of physical data lakes, progressive organizations are turning to data virtualization to extend their physical data lakes by creating a "virtual" or "logical" data lake through a layer of abstraction. Benchmark research on information optimization conducted by Ventana Research indicates that data virtualization techniques are increasingly popular in big data scenarios such as data lakes. Twenty-six percent of organizations participating in this research stated that data virtualization is a key activity for big data analytics.
Data virtualization can make accessing and exploring critical data faster and more cost-effective. By extending physical data lakes as logical data lakes, it can also improve an organization’s ability to govern its data lakes and extract more value from them.