Operational Data Store (ODS) Defined
I see a lot of confusion on what exactly is an Operational Data Store (ODS). While it can mean different things to different people, I’ll explain what I see as the most common definition. First let me mention that an ODS is not a data warehouse or data mart. A data warehouse is where you store data from multiple data sources to be used for historical and trend analysis reporting. It acts as a central repository for many subject areas and contains the “single version of truth”. A data mart serves the same purpose but comprises only one subject area. Think of a data warehouse as containing multiple data marts. See my other blogs that discuss this is more detail: Data Warehouse vs Data Mart,Building an Effective Data Warehouse Architecture, and The Modern Data Warehouse.
The purpose of an ODS is to integrate corporate data from different heterogeneous data sources in order to facilitate operational reporting in real-time or near real-time . Usually data in the ODS will be structured similar to the source systems, although during integration the data can be cleaned, denormalized, and business rules applied to ensure data integrity. This integration will happen at the lowest granular level and occur quite frequently throughout the day. Normally an ODS will not be optimized for historical and trend analysis as this is left to the data warehouse. And an ODS is frequently used as a data source for the data warehouse.
To summarize the differences between an ODS and a data warehouse:
- An ODS is targeted for the lowest granular queries whereas a data warehouse is usually used for complex queries against summary-level or on aggregated data
- An ODS is meant for operational reporting and supports current or near real-time reporting requirements whereas a data warehouse is meant for historical and trend analysis reporting usually on a large volume of data
- An ODS contains only a short window of data, while a data warehouse contains the entire history of data
- An ODS provides information for operational and tactical decisions on current or near real-time data while a data warehouse delivers feedback for strategic decisions leading to overall system improvements
- In an ODS the frequency of data load could be every few minutes or hourly whereas in a data warehouse the frequency of data loads could be daily, weekly, monthly or quarterly
Major reasons for implementing an ODS include:
- The limited reporting in the source systems
- The desire to use a better and more powerful reporting tool than what the source systems offer
- Only a few people have the security to access the source systems and you want to allow others to generate reports
- A company owns many retail stores each of which track orders in its own database and you want to consolidate the databases to get real-time inventory levels throughout the day
- You need to gather data from various source systems to get a true picture of a customer so you have the latest info if the customer calls customer service. Custom data such as customer info, support history, call logs, and order info. Or medical data to get a true picture of a patient so the doctor has the latest info throughout the day: outpatient department records, hospitalization records, diagnostic records, and pharmaceutical purchase records
More info:
Comparing Data Warehouse Design Methodologies for Microsoft SQL Server
Defining the Purpose of the Operational Data Store
Operational data store – Implementation and best practices
On-Demand Webinar: ODS versus Data Lake: Which is Right for your Business?
I liked your second sentence.
15 years ago I worked on a Big Five system where the reports ran, slowly, on the OLTP db. I put together a nice project charter proposing that we build a data warehouse. Denied!!! “DW project failure rates are too high, you must come up with another solution”. I opened MSWord and did a “find” for “warehouse” and a “replace” with “mart” and resubmitted. Denied!!! “Although dm project failure rates are less than dw failure rates, it is still too risky. You must come up with another solution.” Back to MS Word…this time I did find on “data mart” and a replace with “ODS”. I resubmitted. Approved!!!
Eight months later I was giving a project closure presentation where I outlined the final architecture of the “ODS” as well as how the reports now ran in minutes instead of DAYS.
One of the project sponsors stopped me and asked, “You just built a data warehouse, didn’t you.”
Moral of the story: a rose by any other name…
Hi Dave,
Do you have a model copy of your proposal. I have similar to your situation and team assigned me for doing the paper work
I agree with your definition of an ODS and it’s value. On a number of projects I’ve worked on, there has been a virtual prohibition of even mentioning the term ODS. I’ve used the global search and replace method to change it to Reporting Data Store and they loved it.
Any idea where the ODS got the reputation it has in so many companies?
Thanks for the comment Mel! I don’t know where ODS got the bad name. Honestly I have not run into a client yet who gave me dirty looks when I talked about an ODS 🙂
I think ODS has a bad rap because it’s usually implemented at companies that are poorly managed. If enough time is spent in the planning stage you are more likely to successfully build a data warehouse. Too many companies try to build a data warehouse, but end up with an ODS because they didn’t plan or put much effort in gathering requirements. In many cases the ODS is an appropriate need, but often it’s the result of poor project management.
Pingback:Data Loading performance considerations with Clustered Columnstore indexes | Garrett Edmondson
We do have Operational Data Store implemented on SQL Server with near real time reporting, and we build the ODS because of the similar reasons that that you mentioned. We have too many sources where entire business process runs and its imperative for us provide ODS to give the operational reporting.
Coming to the question: What’s your opinion on building ODS using big data technology. Due to the kind of integration and transformation we have, i feel, big data may not be performant and cost effective. Any thoughts?
Great explanation on the differences between the ODS and data warehouse. Totally new to warehousing and this has shed some light for me. Question on the ODS layer, in a real world scenario, which is better practice, full load or delta?
Delta is better if it can be done. More info at https://www.jamesserra.com/archive/2011/08/methods-for-populating-a-data-warehouse/
Thank you! There are a lot of gems here, I’ll be sure to over them slowly.