“Data! Data! Data! I can’t make bricks without clay.” — Sherlock Holmes
In the era of Artificial Intelligence, it is hard to think about anything other than DATA! Big Data is a term used to describe a collection of data that is huge in volume and still growing exponentially with time. The real challenge is storing such complex and enormous amounts of data. This is where the Data Warehouse, Data Mart, and Data Lake come into the picture. They might sound alike, but they actually differ.
When you think about the term ‘warehouse’, the first thing that comes to mind is a place used for storing goods. But is a warehouse a place where everything gets dumped? Not really. In fact, it’s more like a storehouse where only necessary things get stored (without any order or category or subcategory) so that they can be collected when needed.
A Data Warehouse, often abbreviated as DW or DWH and also known as an Enterprise Data Warehouse (EDW), can be thought of as a central repository of integrated data from one or more disparate sources. A DW stores current and historical data in one single place, and this data is used for analysis and reporting. One can also use it for time series analysis, such as year-over-year growth or last month’s sales.
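To make the year-over-year idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a warehouse. The `sales` table, its columns, the two source names, and the `yoy_growth` helper are all hypothetical and exist only for illustration. A real DW would hold far more data and run on a dedicated platform.

```python
import sqlite3

# Hypothetical warehouse table: one row per sale, loaded from two source systems.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, source TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2022-03-15", "web_store", 100.0),
        ("2022-11-02", "retail_pos", 150.0),
        ("2023-01-20", "web_store", 200.0),
        ("2023-08-09", "retail_pos", 175.0),
    ],
)

# Total sales per year, across all sources (current and historical data together).
yearly = dict(
    conn.execute(
        "SELECT strftime('%Y', sale_date) AS year, SUM(amount) "
        "FROM sales GROUP BY year ORDER BY year"
    ).fetchall()
)


def yoy_growth(totals, year):
    """Return the growth of `year` over the previous year as a fraction."""
    previous = totals[str(int(year) - 1)]
    return (totals[year] - previous) / previous


growth_2023 = yoy_growth(yearly, "2023")
print(f"2023 vs 2022: {growth_2023:.1%}")
```

Because the historical rows stay in the same place as the current ones, the comparison is a single query plus a little arithmetic.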
While creating a Data Warehouse, a large portion of the effort is spent on deciding which data to include and which to exclude. So much time goes into this decision because data in a DW is not subject to frequent updates.
You must have gone to a supermarket for grocery shopping. Things are arranged by category and, where needed, by subcategory. Have you ever seen yogurt kept alongside cookies, or pulses kept in refrigerators? No, right? Well, that is what a mart is: a place where products are organized according to their usage and categories.
A structure specific to the DW environment is termed a Data Mart. A Data Mart can be seen as a subset of a Data Warehouse that is oriented to a specific business line or team. Data marts contain repositories of summarized information collected for analysis on a selected section or unit inside a company, for example, the purchasing department or the sales department. Thus, we can say that the data in a Data Mart pertains to a single department.
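As a rough illustration of a mart being a summarized, department-specific subset of the warehouse, the sketch below again uses Python's built-in `sqlite3` module. The `warehouse_transactions` table, its columns, and the `sales_mart` view are hypothetical names chosen for this example. Modelling the mart as a view is just one simple option; in practice a mart is often a separate table or database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse table holding records for every department.
conn.execute(
    "CREATE TABLE warehouse_transactions "
    "(department TEXT, month TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_transactions VALUES (?, ?, ?)",
    [
        ("sales", "2023-01", 500.0),
        ("sales", "2023-01", 300.0),
        ("sales", "2023-02", 450.0),
        ("purchasing", "2023-01", 900.0),
    ],
)

# The "mart": a summarized slice of the warehouse for the sales department only.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT month, SUM(amount) AS total_amount, COUNT(*) AS n_transactions "
    "FROM warehouse_transactions "
    "WHERE department = 'sales' "
    "GROUP BY month"
)

sales_mart_rows = conn.execute(
    "SELECT month, total_amount, n_transactions FROM sales_mart ORDER BY month"
).fetchall()
for row in sales_mart_rows:
    print(row)
```

The sales team queries only `sales_mart` and never sees the purchasing rows, which is the single-department focus described above.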