Power BI and a Star Schema
I sometimes get asked from customers if they should use a star schema or should they use one large single table/flatfile for their dataset in Power BI. The answer is almost always star schema, and Patrick LeBlanc does a great job explaining what a star schema is and why star schemas are so important for Power BI in his video Why Power BI loves a Star Schema. I wanted to list out his four reasons:
- Usability: If you have just one large table with all the data, when creating a report it is difficult to find the fields you are looking for in the “Fields” list. With a star schema, the Fields list is much easier to navigate. Star schemas make your data model much cleaner and easier to use
- Simpler DAX: The code for DAX will be much simpler against a star schema compared to a wide flat table. It will be shorter, easier to write, easier to read, and easier to maintain
- Performance: Star schema’s can handle huge amounts of data without impacting performance. The test Patrick ran was against 7 million rows and a query took 29ms against the one large table compared to 7ms against the equivalent star schema. You can imagine the performance difference against 10’s of millions of rows
- Faster refresh: In Power BI Desktop, Patrick used Power Query against 7.8 million records from a SQL Server Database and choose Close and Apply. For the large table, this took 34 minutes (2069 rows/second). The star schema took only 4 minutes (28k rows/second)
Patrick also has a video showing how to turn a flat file into a star schema: Power BI Tutorial | From Flat File To Data Model.
And a report using a flattened table could return inaccurate numbers, as pointed out by sqlbi: The importance of star schemas in Power BI.
For more information on star schema, check out the “bible” by Ralph Kimball: The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling.
More info:
Understand star schema and the importance for Power BI
Basics of Modeling in Power BI: What is a Dimension Table and Why Say No to a Single Big Table
Power BI Basics of Modeling: Star Schema and How to Build it
Build Your First Star Schema Model in Action: Power BI Modeling Basics
Power BI Tips for Star Schema and Dimensional Data Modeling
I would also add accuracy of results as demonstrated by sqlbi:
https://www.sqlbi.com/articles/the-importance-of-star-schemas-in-power-bi/
I recently did a blog post on the same subject, inspired by the same video 🙂 https://sqlkover.com/star-schema-all-the-things/ Although I appreciate the groundwork done by the Kimball books, I’m more a fan of the “Star Schema – The Complete Reference” book.
When using a Star Schema, users are getting confused when they are seeing dimension values that have not made a transaction on the fact table. If you had a single flat file you will only see in the filter pane vales where there has been some activity / transaction
In your experience does this stand-up to both import and direct query?
Yes, definitely. The direct queries will be must faster against a star schema compared to a large flat table.