Why You Need a Data Warehouse
Introduction
You likely have heard about data warehousing, but are unsure exactly what it is and if your company needs one. I will attempt to help you to fully understand what a data warehouse can do and the reasons to use one so that you will be convinced of the benefits and will proceed to build one.
In my experience, not nearly as many companies have a data warehouse as I would have expected. And many of those that say they have a data warehouse don’t really have a true data warehouse, but rather a dumping ground for tables that are copied from source systems with little modification.
For a company to be successful in the future, they must make good decisions. And to make good decisions requires all relevant data to be taking into consideration. And the best source for that data is a well-designed data warehouse.
The concept of data warehousing is pretty simple: Data is extracted on a periodic basis from source systems, which are applications such as ERP systems that contain important company info. Data from these systems is moved to a dedicated server that contains a data warehouse. When it is moved it is cleaned, formatted, validated, reorganized, summarized, and supplemented with data from many other sources. This resulting data warehouse will become the main source of information for report generation and analysis via reporting tools that can be used for such things as ad-hoc queries, canned reports, and dashboards.
Building data warehouses has become easier over the years due to improvements in the tools, improvements in the processes (i.e. see Ralph Kimball Books) and a better understanding of the architectures (see Building an Effective Data Warehouse Architecture). And of course there are consultants who can help!
Why you need Business Intelligence
A goal of every business is to make better business decisions than their competitors. That is where business intelligence (BI) comes in. BI turns the massive amount of data from operational systems into a format that is easy to understand, current, and correct so decisions can be made on the data. You can then analyze current and long-term trends, be instantly alerted to opportunities and problems, and receive continuous feedback on the effectiveness of your decisions. See Why you need Business Intelligence.
The most important ingredient to a BI solution is that it must include a data warehouse.
Data Warehouse Basics
The concept of a data warehouse is not difficult to understand. Basically the idea is to create a permanent storage space for the data needed to support reporting, analysis, and other BI functions. While it may seem wasteful to store data in multiple places (source systems and the data warehouse), the many advantages of doing that more than justify the effort and expense.
Data warehouses reside on servers dedicated to this function running a database management system (DBMS) such as SQL Server and using Extract, Transform, and Load (ETL) software such as SQL Server Integration Services (SSIS) to pull data from the source systems and into the data warehouse.
Typical data warehouses hold multiple subject areas, and from the data warehouse are built data marts, which each hold a single subject area such as sales or finance (see Data Warehouse vs Data Mart).
The data needed to provide reports, dashboards, analytic applications and ad-hoc queries all exists within the production applications inside your company, so why not use the BI tools directly against this data? Well, there are many reasons why you would want to use a data warehouse instead of the “direct access” approach:
- You need to integrate many different sources of data in near real-time. This will allow for better business decisions because users will have access to more data. Plus this will save users lots of time because they won’t waste precious time retrieving data from multiple sources
- You have tons of historical data that you need to gather in one easily accessible place in which it will have common formats, common keys, common data model, and common access methods.
- You need to keep historical records, even if the source transaction systems does not
- You can restructure the data and rename tables and fields (i.e. numbers without decimal points or tables named F017) so if makes more sense to the users.
- You need to use master data management to consolidate many tables, such as customers, into one table
- Users are running reports directly against operational systems, causing performance problems. Instead, create a data warehouse so users can run reports off of that. Plus, the data warehouse is optimized for read access, resulting in faster report generation
- The data warehouse can be housed on a server built specifically for a data warehouse, resulting is much quicker access than hardware designed for handling transactions
- There is a risk that BI users might misuse or corrupt the transaction data
- Having an easy to use data warehouse allows users to create their own reports without having to get IT involved. Leading to “Self Service BI”
- A data warehouse is a convenient place to create and store metadata
- Improve data quality by cleaning up data as it is imported into the data warehouse (providing more accurate data) as well as providing consistent codes and descriptions
- Reports using the data warehouse wont be affected by new releases of application software. Also, trying to maintain reports that summarize data originating within multiple releases of software is very difficult
- Having one version of the truth, so each department will produce results that are in line with all the other departments, providing consistency
- Having a data warehouse makes it easy to create business intelligence solutions on top of it, such as SSAS cubes
- Companies that have implemented data warehouses and complementary BI systems have generated more revenue and saved more money than companies that haven’t invested in BI systems and data warehouses
Even though additional hardware and software are needed, the presence of a data warehouse costs less and delivers more value than a direct connection. With the continued drop in costs for processing power and storage, that makes the case for a data warehouse even stronger.
Benefits of a Data Warehouse and BI solution
Once a data warehouse is in place and populated with data, it will become a part of a BI solution that will deliver benefits to business users in many ways:
- End user creation of reports: The creation of reports directly by end users is much easier to accomplish in a BI environment. They can also create much more useful reports because of the power and capability of BI tools compared to a source application. And moving the creation of reports to a BI system increases consistency and accuracy and usually reduces cost
- Ad-hoc reporting and analysis: Since the data warehouse eliminates the need for BI tools to compete with the transactional source systems, users can analyze data faster and generate reports more easily, and slice-and-dice in ways they could never do before. The Microsoft BI toolset vastly improves the ability to analyze data
- Dynamic presentation through dashboards: Managers want access to an interactive display of up-to-date critical management data. That is accomplished via dashboards, which are sophisticated displays that show information in creative and highly graphical forms, much like the instrument panel on an automobile
- Drill-down capability: Users can drill down into the details underlying the summaries on dashboards and reports. The allows users to slice and dice to find underlying problems
- Support for regulations: Sarbanes-Oxley and other related regulations have requirements that transactional systems are sometimes not able to support. With a data warehouse, the necessary data can be retained as long as the law requires
- Metadata creation: Descriptions of the data can be stored with the data warehouse to make it a lot easier for users to understand the data in the warehouse. This will make report creation much simpler for the end-user
- Support for operational processes: A data warehouse can help support business needs, such as the ability to consolidate financial results within a complex company that uses different software for different divisions
- Data mining: Once you have built out a data warehouse, there are data mining tools that you can use to help find hidden patterns using automatic methodologies. While reporting tools can tell you where you have been, data mining tools can tell you where you are going
- Security: A data warehouse makes it much easier to provide secure access to those that have a legitimate need to specific data and to exclude others
- Analytical tool support: There are many vendors who have analytical tools (i.e QlikView, Tableau) that allow business units to slice and dice the data and create reports and dashboards. These tools will all work best when extracting data from a data warehouse
This long list of benefits is what makes BI based on a data warehouse an essential management tools for companies.
This diagram shows a complete architecture of an end-to-end BI solution:
More info:
Why & When Data Warehousing? Is it Relevant?
Top Five Benefits of a Data Warehouse
Microsoft EDW Architecture, Guidance and Deployment Best Practices
Data Warehouse Architecture – Kimball and Inmon methodologies
Data Warehouse Architecture Presentation Slides
Why Do We Need a Data Warehouse?
Designing an ETL process with SSIS: two approaches to extracting and transforming data
Should you use a data warehouse with a tabular cube?
Why do we need a Data Warehouse ?
What is the purpose of a data warehouse?
Data Warehousing Fundamentals Recorded Webinar
Business Intelligence is a must
Great blog post! You brought up a very interesting observation about the scarcity of data warehouses throughout Corporate America. I also observed this. Do you think the trend is changing?
Thanks for the post Malik. I do think the trend is changing. I have been surprised by how many companies do not have a “true” data warehouse. But it seems most realize the benefits of a data warehouse and BI and are taking steps to build a true DW along with BI solutions. I get a lot of requests from clients to come in and help them understand how to build a DW and the proper architecture. Most want to know best practices and how everyone else is building a DW. It’s a smart move to bring in a consultant who has built many DW’s and get guidance and certainty that you are building it correctly than to just do it all on your own, especially if you have not build one before. Kinda like trying to build a house on your own without asking a contractor for help!
Though I have never been a fan of a 3NF DWH, there are good uses for 3NF–including hierarchy enforcement in a staging area. What really chaps my @$$ is the misuse of the term “data mart”. It is the old “business dept” vs “business process subject area” argument.
The sad fact, however, is that it is human nature to “cluster and operate in groups”, and that tendency is demonstrated to the fullest in most unsuccessful DWH projects. I believe human nature itself is the biggest reason for DWH failure, and that is why we do not see more of them. However, when properly incenitivized and shown a truthful proof-of-concept, data warehouse projects can succeed in the same manner in which a good bill becomes a productive law in the US–education, executive support, compromise, and no bulls__t leadership.
Steven Neumersky
Pingback:Why use a SSAS cube? - SQL Server - SQL Server - Toad World
Pingback:Best Applications | Bemused
Pingback:Why use a SSAS cube? | James Serra's Blog
Pingback:Preparation for the 70-467 SQL BI exam | x86x64
very nice and informative post. easy to understand about data warehouse after reading this.
Read here also : http://dwbimaster.com/data-warehouse/data-warehouse-definition-properties-and-benefits/
Pingback:Why use a data lake? | James Serra's Blog
Its very informative.
Pingback:Understanding data warehousing and its importance | Aviana Global
Pingback:DP-900: Azure Data Fundamentals Certification Study Guide
Pingback:DP-900 Exam Study Guide (Microsoft Azure Data Fundamentals)