Non-obvious APS/PDW benefits
The Analytics Platform System (APS), which is a renaming of the Parallel Data Warehouse (PDW), has a lot of obvious benefits, which I discuss here. If your database is getting too big or too slow, or you need to integrate non-relational data, check out APS, Microsoft’s MPP solution.
But there are also a lot of non-obvious benefits to using APS, which I have listed below:
- APS is a hardware platform that removes roadblocks for future needs. Be proactive instead of reactive!
- Numerous capabilities (e.g. PolyBase, mixed workload support, scalability, HDInsight) to get you thinking about better ways to use your data
- Much quicker development time due to speed of execution
- Allows removal of ETL complexity (temp tables, aggregation tables, data marts, other band aids and workarounds)
- Permits elimination of SSAS cubes or switch to ROLAP mode (so real-time data and no cube processing time)
- Reduction or elimination of nightly maintenance windows. Instead use intra-day batch cycles
- Tuning, redesigning ETL, and upgrading hardware (more memory, Fusion-io cards, etc.) on SMP typically gets you a 20-50% improvement, versus a 20-50x improvement with MPP
- SMP is optimized for OLTP while APS is optimized for data warehouses
- Allows a clear path to do predictive analytics via tools like Azure ML by having the disk space and processing power
- Don’t think of it as just a solution for a large data warehouse but rather for any size warehouse that needs faster query performance
- Faster query performance allows for adding more parameters to reports that also can be run real-time (instead of at night with fixed parameters) so business users can ask more sophisticated questions and execute ad-hoc queries. This will result in a lot more self-service reporting
- Have the space and performance to consolidate all your various data warehouses and data marts to one place
- Decrease infrastructure costs through consolidation: reduced server count, less duplicate data, fewer licenses, fewer ETL pieces, lower infrastructure management costs (hardware, SQL Server, OS)
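To put the 20-50% versus 20-50x comparison from the list above in concrete terms, here is a small back-of-the-envelope sketch. The 60-minute baseline is a made-up number, and I am assuming that "20-50% improvement" means the runtime shrinks by that percentage, while "20-50x" means the runtime is divided by that factor. Under those assumptions, a 60-minute job lands somewhere between 30 and 48 minutes after SMP tuning, versus roughly 1 to 3 minutes on MPP.

```python
def runtime_after_percent_gain(baseline_minutes, percent):
    """Runtime when tuning removes `percent` of the baseline runtime (assumed meaning of 'X% improvement')."""
    return baseline_minutes * (1 - percent / 100)


def runtime_after_speedup(baseline_minutes, factor):
    """Runtime when the workload runs `factor` times faster (assumed meaning of 'Nx improvement')."""
    return baseline_minutes / factor


baseline = 60.0  # hypothetical nightly job, in minutes

# SMP after tuning: 20% and 50% runtime reduction
smp_tuned = [runtime_after_percent_gain(baseline, p) for p in (20, 50)]

# MPP: 20x and 50x speedup
mpp = [runtime_after_speedup(baseline, f) for f in (20, 50)]

print("SMP after tuning (minutes):", [f"{m:.1f}" for m in smp_tuned])
print("MPP (minutes):", [f"{m:.1f}" for m in mpp])
```

Real results depend heavily on the workload, so treat this as an illustration of the scale difference rather than a prediction.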
Hi James,
Thanks for the wonderful article. Can you please expand on the following point mentioned in the blog “Allows a clear path to do predictive analytics via tools like Azure ML by having the disk space and processing power”.
I am yet to explore Azure ML, but does it support connectivity to PDW/APS? When an ML algorithm executes, does it result in a huge data transfer to the cloud? Please clarify.
Thanks!
The way you would use APS and Azure ML is to use PolyBase to copy data from APS into Azure blob storage. Then, you can point Azure ML to that blob storage and use that data. There is not yet a way that Azure ML can use APS as a data source.
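As a rough sketch of the export step, the snippet below builds a CREATE EXTERNAL TABLE AS SELECT (CETAS) statement, which is the PolyBase construct for writing query results out to external storage. Every object name here (dbo.SalesExport, AzureBlobStore, DelimitedText, dbo.FactSales and its columns) is hypothetical, and it assumes an external data source and file format pointing at your blob container have already been created. The exact options vary by APS appliance update, so check the documentation for your version before running anything like this.

```python
def build_cetas_statement(external_table, location, data_source, file_format, select_sql):
    """Return a CREATE EXTERNAL TABLE AS SELECT statement as a string.

    This only formats text; it does not connect to APS or validate the SQL.
    """
    return (
        f"CREATE EXTERNAL TABLE {external_table}\n"
        "WITH (\n"
        f"    LOCATION = '{location}',\n"
        f"    DATA_SOURCE = {data_source},\n"
        f"    FILE_FORMAT = {file_format}\n"
        ")\n"
        f"AS {select_sql};"
    )


statement = build_cetas_statement(
    external_table="dbo.SalesExport",   # hypothetical external table name
    location="/export/sales/",          # hypothetical folder in the blob container
    data_source="AzureBlobStore",       # assumed to exist already
    file_format="DelimitedText",        # assumed to exist already
    select_sql="SELECT CustomerKey, OrderDate, SalesAmount FROM dbo.FactSales",
)
print(statement)
```

Once the files are in blob storage, Azure ML reads them from there, so only the rows and columns you select are transferred to the cloud.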
Thanks James
James,
You mentioned “switching of cubes to ROLAP mode” (and I assume to DQ only mode in tabular). Have you done this with any clients at all, and do you know of any white papers illustrating the empirical benefit of doing this?
Thanks.
Hi Steve,
I have another blog posting that may help: https://www.jamesserra.com/archive/2014/03/real-time-query-access-with-pdw/
Thanks.
Is the SQL generated in ROLAP mode any better or worse than the SQL generated in DQ Only mode? Are there any tips, tricks, or gotchas on what MDX/DAX coding practices to avoid in an effort to “prevent bad SQL being passed to PDW”?