SSAS: One giant cube or many small ones?
When using SQL Server Analysis Services (SSAS) to build a cube, you may start out with one fact table, but then decide to add another fact table. You are then faced with a decision: Do you add that fact table to your existing cube and have multiple measure groups, or do you create another cube, each with a single measure group? An example would be if you create a cube with financial data, and then a request is made to have a cube with operational data. Should you put the financial data and operational data together in one cube, or have two cubes, a financial cube and an operational cube? Here is a list of the benefits and disadvantages of both approaches:
One giant cube
Benefits
- A role based security model can be defined once for the database and has to be re-visited only when a new cube or dimension (that requires Dimension Data Access) is created
- Requires just one SSAS project in BIDS. The shared dimensions can be defined once and linked to associated measures groups. Code maintenance is all in one place
- If using dashboarding tools like PerformancePoint, SharePoint etc., common shared dimensions ensure that filters for dashboards can be built once and re-used across all cubes because they share the same structure, keys and value columns
- If you ever need to work with data from two fact tables in the same query or calculation, or if you think you might ever need to in the future, you should go with the single cube approach. The two options for cross-cube querying, linked measure groups and the LookUpCube MDX function, should be avoided
Disadvantages
- Having one database for all cubes means modification to one cube or dimension would require a re-deployment of all the cubes in the database irrespective of the fact nothing has changed in other cubes
- Depending on the nature of change, sometimes all cubes in the SSAS database may have to be reprocessed after a deployment. Structural changes to dimensions especially the shared ones will require all the cubes in the database to be reprocessed after deployment
Many small cubes
Benefits
- Having one cube per database gives the flexibility to customize dimension attributes and hierarchies that is relevant to a specific audience. For example, attributes that are not relevant to a target audience can be removed
- This method has the advantage that only affected cube databases have to be deployed
- Because there is only one cube in the database there is no dependency on shared dimensions
- Having multiple, smaller cubes may result in faster query performance than one large cube in some cases, especially if your fact tables have very different dimensionality
- Maintenance can be easier and less disruptive with multiple cubes: if you need to make changes to a cube while users are querying it, you might end up invalidating users’ connections and dropping caches. With one cube the chances of this disruption affecting more users increases
- It’s easier to scale out with multiple cubes: if you find your server is maxing out, you can simply buy another server and distribute your cubes equally between the two. With a single cube approach you end up having to look at (admittedly not that much) more complex scale-out scenarios like network load balancing
- Facilitates multiple developers in the SSAS BIDS environment (this is a big deal in a team environment). With one cube, only one developer at a time can be working on the cube. With many small cubes, separate developers can be working on each cube
- If you have a cube with many measures, measure groups, dimensions and hierarchies but have to use Standard Edition, you cannot use Perspectives to hide complexity from your users. In this case, creating multiple cubes might be a more user-friendly approach
- If you have complex calculations, especially MDX Script assignments, it’s too easy to write a calculation that has an effect on part of the cube you didn’t want to alter. With multiple cubes, the chances of this happening are reduced
Disadvantages
- Role based security model has to be defined for every database
- Requires a SSAS project in BIDS for each cube. Shared dimensions have to be created once on each project. If sufficient care is not taken during development, it could break the conformity of dimensions. Maintaining consistent attributes and names, hierarchy and levels, key and name columns, sort order could become a development and maintenance challenge
- In this method, you have to replicate similar filters for different cubes
More info:
SSAS: One Database, many Cubes Vs. One Database per Cube
New Data Mart: Create a New SSAS Cube or a Perspective in an Existing Cube?
excellent post James. keep it up.
by the way, regarding the point you mentioned in disadvantage of multiple cubes i.e. “Requires a SSAS project in BIDS for each cube”. – BIDS allows us to have multiple cubes in 1 single SSAS project. and if we choose this method, then its beneficial for developers. But if we have cubes in different SSAS projects, then its too much to develop and maintain just like you have explained.
thanks, khilit
Pingback:Microsoft SSAS Solution: Cube Design Considerations | Khilit's Business Intelligence Blog
Nice article James. I must admit the choise bewteen the two approaches is quite often a difficult one as no single option is clearly better.
Another way of looking at this issue is from a business usage perspective. From my experience the “one giant cube” approach is better for enterprise level reporting i.e. if you want to see what’s going on across the organisation and across business processes then this is better. On the other hand if the reporting is across discrete business units, departments or business processes then the many small cubes approach may be better.
In my opinion it is better to consider the business scenario before the technical implications i.e. when trying to decide whether to have one or many cubes.
Keep up the good work!
Glad you like the article Tino, and thanks for your excellent point about considering the business scenario. I did not think of that!
Great article.
As we are after SQL Server 2012 release, It might be worth clarifying it is for previous versions or SSAS 2012 Multidimensional. Some of the advantages don’t apply to SSAS 2012 Tabular (processing dependencies) but I suspect we cannot call tabular a cube (although I do when I speak with business people as they don’t care and generally it is easier for them understand 😉
I also assume when you refer to database you always mean SSAS Database not data warehouse database? Is that the case?
One question I have are perspectives and linked cubes. Any thoughts on those two features?
Take care and thanks for sharing your thoughts with everyone 😉
Emil
Hi Emil,
Glad you like the article. You are correct in that this is dealing with SSAS multidimensional cubes. And yes I am referring to an SSAS database.
Perspectives work great at limiting what a user sees, especially if you have a big cube. I don’t link linked cubes since linking fact tables is not much help since linked fact tables can only be used on linked dimensions, plus changing the structure breaks the link.
Hi James,
I have similar experience. I like perspective also because first can be physical names and second can be friendly name so rename of friendly name won’t break existing MDX.
I’ve seen linked cubes being used but I didn’t like it for the reasons you specified, I think it is much simpler to maintain without dependencies.
Take care
Emil
Pingback:Bus Matrix Presentation Follow Up Q&A « byoBI.com
Dear James,
We have around 30+ small cubes in our BI project and would like to know if there is any way to deploy all these 30 cubes at once through a script or any other process.
Currently we are deploying each bim file individually which is time consuming task.
Many Thanks,
CR