The Journey to a Data Lake
Apache Hadoop didn’t disrupt the datacenter, the data did. Today, every enterprise has a Data Warehouse that serves to model and capture the essence of the business from their enterprise systems.
The explosion of new types of data in recent years - from inputs such as the web and connected devices, or just sheer volumes of records - has put tremendous pressure on the EDW and other data systems.
In response to this disruption, an increasing number of organizations have turned to Apache Hadoop to help manage the enormous increase in data while maintaining coherence of the Data Warehouse.
This paper discusses Apache Hadoop, its capabilities as a data platform and how the core of Hadoop and its surrounding ecosystem solution vendors provides the enterprise requirements to integrate alongside the Data Warehouse and other enterprise data systems as part of a modern data architecture, and as a step on the journey toward delivering an enterprise ‘Data Lake’.