Parallel Data Warehouse (PDW) Version 2
Version 2 of Parallel Data Warehouse (PDW) is apparently due this month (and can be ordered as of March 1st). The official name is Microsoft SQL Server 2012 Parallel Data Warehouse. Info on HP’s version: HP AppSystem for Microsoft SQL Server 2012 Parallel Data Warehouse. Here are some of the major new features:
- Will have the xVelocity in-memory analytics engine and the xVelocity memory-optimized columnstore index feature in Microsoft SQL Server 2012 (along with the new columnstore index features of being updatable and able to be a clustered index), making it 10-50x faster with 15x compression
- Will use Windows Server 2012 Storage Spaces
- Uses Hyper-V, with everything virtualized
- Runs SQL Server 2012 and Windows Server 2012 Standard
- Failover is now handled by Hyper-V, replacing HPC
- Uses DAS (Direct-Attached Storage) via SAS JBOD, versus a previous strategy of simulating shared-nothing on a SAN (Storage Area Network)
- Will include a new data-processing engine called PolyBase, which is designed to enable queries across relational data and non-relational Hadoop data in the Hadoop Distributed File System (HDFS). You can create an external table in SQL Server (kinda like a linked server) and query it with T-SQL. So you can retrieve data from HDFS with a PDW query (seamlessly joining structured and semi-structured data), import data from HDFS to PDW, and export data from PDW to HDFS. Microsoft Technical Fellow David DeWitt is one of the principals behind PolyBase. PolyBase will be used in PDW for now, and later it will be added to SQL Server. See Seamless insights on structured and unstructured data with SQL Server 2012 Parallel Data Warehouse
- Will have an updated distributed query processor and a new admin console
- Improved speed. At PASS 2012 they demoed a 1PB data warehouse query finishing in under two seconds
- Upgraded hardware. For HP, the new appliance is called the HP AppSystem for Microsoft SQL Server 2012 Parallel Data Warehouse, or EDW V2 (specs). In addition to increased CPU processing power (16 cores per EDW V2 compute node vs. 12 cores per EDW V1 compute node), the EDW V2 contains 256GB of memory per server vs. 96GB for EDW V1. The EDW V2 appliance also contains 35 disks per compute node vs. the 11 or 24 disk options on the EDW V1 appliance. This will allow customers to grow the EDW V2 to support up to 6PB (petabytes) of data. Finally, EDW V2 is available as a quarter-rack system (with EDW V1, two full racks, control plus compute, was the smallest you could go). Hardware will be ProLiant DL360 Gen8 with up to 8 compute nodes per rack and up to 7 racks. The combination of Microsoft software with HP Converged Infrastructure means HP AppSystem for Parallel Data Warehouse offers leading performance for complex workloads, with up to 100x faster query performance and a 30% faster scan rate than previous generations
- Dell hardware will be PowerEdge R620 with up to 9 compute nodes per rack and up to 6 racks
- Allows customers to use their own hardware to perform backup operations (V1 required customers to purchase a backup node and its respective storage)
- Direct query with Power View, PowerPivot, PerformancePoint
- Use SSMS instead of Nexus
- 2.5x lower price per terabyte and 50% lower total hardware list price
- Up to 70% more storage capacity
- Up to double the rate of data loading speed
- Support 15TB – 6PB (v1 was 50TB – 600TB)
- Workload management enhanced: 4 predefined resource classes as server roles; allocation of fixed amounts of memory and PDW concurrency slots; an administrator can associate principals with resource classes; works similarly to Resource Governor
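To put the hardware numbers above in perspective, here is a quick back-of-the-envelope sketch in Python. It assumes the quoted per-rack and per-node maximums simply multiply out, and it counts compute nodes only (no control or management nodes). The class and variable names are mine for illustration; they are not part of any Microsoft, HP, or Dell tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApplianceLimits:
    """Maximum scale-out figures quoted in the feature list (hypothetical helper)."""
    vendor: str
    nodes_per_rack: int
    max_racks: int

    @property
    def max_compute_nodes(self) -> int:
        # Assumption: every rack can be filled to its per-rack maximum.
        return self.nodes_per_rack * self.max_racks


# Figures taken from the list above.
HP = ApplianceLimits(vendor="HP", nodes_per_rack=8, max_racks=7)
DELL = ApplianceLimits(vendor="Dell", nodes_per_rack=9, max_racks=6)

# Per-compute-node figures quoted for the HP EDW V2.
HP_CORES_PER_NODE = 16
HP_MEMORY_GB_PER_NODE = 256
HP_DISKS_PER_NODE = 35


def hp_totals(limits: ApplianceLimits) -> dict:
    """Multiply the HP per-node figures by the maximum compute node count."""
    nodes = limits.max_compute_nodes
    return {
        "compute_nodes": nodes,
        "cores": nodes * HP_CORES_PER_NODE,
        "memory_gb": nodes * HP_MEMORY_GB_PER_NODE,
        "disks": nodes * HP_DISKS_PER_NODE,
    }


if __name__ == "__main__":
    for limits in (HP, DELL):
        print(limits.vendor, "max compute nodes:", limits.max_compute_nodes)
    print("HP fully built out:", hp_totals(HP))
```

These are ceilings derived from the quoted maximums, not a sizing guide; real configurations depend on what the vendor actually sells at each rack increment.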
More info:
Appliance: Parallel Data Warehouse (PDW)
Video Parallel Data Warehouse Version 2
The EDW evolution continues and it is Bigger, Faster and Better !
Microsoft’s SQL Server Parallel Data Warehouse Provides High Performance and Great Value (review of vendors)
Insight Through Integration: SQL Server 2012 Parallel Data Warehouse – PolyBase Demo
Rock your data with SQL Server 2012 Parallel Data Warehouse (PDW)
Rock your data with SQL Server 2012 Parallel Data Warehouse (PDW) – What’s new?
Video Microsoft Parallel Data Warehouse V2 New Features
Video Large-Scale Data Warehousing and Big Data with Microsoft SQL Server Parallel Data Warehouse V2
Video Polybase: Hadoop Integration in SQL Server PDW V2
Rock your data with SQL Server 2012 Parallel Data Warehouse (PDW) – POC Experiences
Is SQL Server Parallel Data Warehouse 2012 an EDW Game Changer?
Hi James,
You mention above that there was supposed to be support for SSMS as an IDE for PDW v2, but to date I have not found any documentation confirming that this support exists. Do you know whether SSMS is supported (we are waiting for an APS appliance with the AU2 update), or is SSDT still the only supported IDE? Thank you for your assistance in this regard, it is very much appreciated.
Hi Herman,
In PDW v2, which was available over a year-and-a-half ago, support for SSDT was added. Support for SSMS was not added (I was mistaken in my post). There are no plans to add support for SSMS as SSDT has all the features needed to support PDW.
My organization is planning to use Microsoft SQL Server 2012 Parallel Data Warehouse (PDW) for BI purposes. My project requires SAS connectivity with PDW, but I have not been able to find any link or clue on how to set up that integration/connectivity.
Could you please provide some insight on that?