Tabular query modes: DirectQuery vs In-Memory
When using the Tabular model in SSAS, the deployment options screen offers four choices for “Query Mode”: DirectQuery, DirectQuery with In-Memory, In-Memory, In-Memory with DirectQuery. Query Mode is used to specify the source from which query results are returned when you deploy the BISM project to the SSAS tabular model server. Here is a description of each, with the benefits and drawbacks:
In-Memory: This is the default. The data in the tabular model is processed and compressed using the xVelocity in-memory analytics engine (formerly called VertiPaq). This in-memory columnar storage engine has been optimized for high performance analysis and exploration of data. It provides fast query times for aggregation queries. However, there are some drawbacks:
- The data is not updated when the source data changes, so the model needs to be processed to refresh the data
- When you turn off the computer hosting the model, the cache is saved to disk and must be reopened when you load the model. The save and load operations can be time-consuming
- The server need lots memory if you have a large amount of fact data
DirectQuery: This mode uses relational data that is stored in a SQL Server database (it is similar to the ROLAP mode in a multidimensional model). It lets users retrieve data directly from a SQL Server data source in real-time. Any DAX queries on the data are translated by Analysis Services into equivalent SQL statements against the specified relational data source. To create a model in DirectQuery mode, you must first change the design-time environment so that it supports the use of DirectQuery mode (see Enable DirectQuery Design Mode). There are some additional benefits:
- It is possible to have a model over data sets that are too large to fit in memory on the Analysis Services server. If your source data in the SQL Server data source cannot be compressed into a 1 terabyte or smaller Analysis Services database, consider DirectQuery. Also, if your Analysis Services database does not fit in half of the memory on the machine hosting the Analysis Services instance, consider DirectQuery
- The data is guaranteed to be up-to-date, and there is no extra management overhead of having to maintain a separate copy of the data. Changes to the underlying source data can be immediately reflected in queries against the data model without any processing
- It has the advantage of using provider-side query acceleration such as SQL Server 2012 column indexes
- It could use row-level security provided by the backend database as well as auditing at the SQL Server level when individuals execute queries at the data source
- Analysis Services can perform optimization to ensure the query plan against the backend database will be as efficient as possible
- Reduced memory use, CPU use, and improved start times by Analysis Services
- Metadata discovery operations are faster, because data for a DirectQuery enabled model need not be loaded into memory to complete the discovery operation. This speeds operations like expanding databases in the Object Explorer in SQL Server Management Studio
Be aware there are some design considerations if you are planning to use DirectQuery mode:
- During design phase, you may need to use Preview or Filter function to load subset data into your project
- MDX queries are not supported for a model in DirectQuery mode. You cannot use PerformancePoint, Excel or other clients that only issue MDX queries to consume it (Power View and SSRS are the only two tools you can use so far – see Using Reporting Services with DirectQuery models)
- Currently this model only supports one data connection. You cannot query two or more SQL Servers from a DirectQuery enabled model
- Calculated columns and some DAX functions are not supported. You may need to use SQL View or other technique for that purpose
- DirectQuery enabled models cannot be created in PowerPivot. Only models deployed to a standalone Analysis Services instance can use DirectQuery
In-Memory with DirectQuery: This is a hybrid mode. By default, queries should be answered by using the In-Memory mode, however, the connection string from the client can instead choose to use the DirectQuery mode.
DirectQuery with In-Memory: This is a hybrid mode. By default, queries should be answered by using the DirectQuery mode, however, the connection string from the client can instead choose to use the In-Memory mode.
A hybrid mode provides you with many options:
- When both the cache and the relational data source are available, you can set the preferred connection method, but ultimately the client controls which source is used, using the DirectQueryMode connection string property. So you can serve clients that issue MDX queries and clients that issue DAX queries from the same model
- You can also configure partitions on the cache in such a way that the primary partition used for DirectQuery mode is never processed and must always reference the relational source. There are many ways to use partitions to optimize the model design and reporting experience. For more information, see Partitions and DirectQuery Mode (SSAS Tabular)
- After the model has been deployed, you can change the preferred connection method. For example, you might use a hybrid mode for testing, and switch the model over to DirectQuery only mode only after thoroughly testing any reports or queries that use the model. For more information, see Set or Change the Preferred Connection Method for DirectQuery
Note that queries on a model deployed in DirectQuery mode can return different results than when the same model is deployed in-memory, because data is fetched directly from a relational data store and aggregations required by formulas are performed using the relevant relational engine, rather than using the xVelocity in-memory analytics engine for storage and calculation. For example, there are differences in the way that certain relational data stores handle numeric values, dates, nulls, and so forth.
Also note that after a project is deployed you can change the preferred query data source for a DirectQuery model by changing a property in SSMS. This is done by right clicking on the deployed database in SSMS and selecting properties and you will see a property called DirectQueryMode that is used to make the change. See Set or Change the Preferred Connection Method for DirectQuery.
In the end, your preference when data modeling should be to use the in-memory cache for your tabular model. DirectQuery is a better choice when you are using Power View and your scenario requires one or more of the benefits offered by DirectQuery.
More info:
Deployment Options for BISM Project
SSAS Tabular Model – Query Modes
Formula Compatibility in DirectQuery Mode
DirectQuery Mode (SSAS Tabular)
Difference between DirectQuery and In-memory mode in tabular mode analysis services
Upcoming DirectQuery vs Vertipaq Presentation (slides)
Tabular Projects – DirectQueryMode and QueryMode
Using DirectQuery in the Tabular BI Semantic Model
Hybrid Mode in Tabular BI Semantic Model – Part 1, Part 2
Introducing the BI Semantic Model in Microsoft® SQL Server® 2012
Pingback:Comparing DirectQuery and ROLAP for real-time access | James Serra's Blog
Great article, James, very informative! Do you have any insight into when Microsoft might add MDX capability to the DirectQuery mode?
Pingback:Real-time query access with PDW | James Serra's Blog
One important point, We cannot implement row-level security when we want to deploy tabular model in Direct Query mode. We have to implement dynamic security to achieve security then…