Build announcements: Azure Synapse Analytics in public preview and more
There were a few data platform announcements yesterday at Microsoft Build that I wanted to blog about.
The biggest one is that Azure Synapse Analytics is now available in public preview! You can immediately log into your Azure portal and use it. While in the Azure portal, search for “Synapse” and you will see “Azure Synapse Analytics (workspaces preview)”. Choose that and then click “Create Synapse workspace” (you may first need to register the resource provider “Microsoft.Synapse” in your subscription – see Azure resource providers and types).
These new features are now available to you:
Synapse Studio
Collaborative workspaces
Distributed T-SQL Query service
SQL Script editor
Unified security model
Notebooks
Apache Spark
On-demand T-SQL
Code-free data flows
Orchestration Pipelines
Data movement
Integrated Power BI
Check out the full documentation. Note that on the home page of the Synapse workspace, under “Useful links”, there is a “Getting started” link with an option to “Query sample data” that creates a new SQL pool for you and loads sample data into it. It also provides sample scripts so you can start querying the data (there is not much sample data or queries yet).
I also have a large PowerPoint deck on an overview of Azure Synapse Analytics that you may find useful.
Also mentioned were various other new features (see the release notes):
- GA: Workload Isolation, Updatable Hash Key, Materialized View Improvement
- Public preview: COPY statement for data loading, PREDICT Scoring, Bulk Load Wizard, CSV Schema Inference, DeltaLake Tables v0.6 support (in Spark and Data Flow), CDM support, External Table Wizard
- Private preview: SQL MERGE support, Column Encryption, Multi-Column Hash Distribution
Note the COPY statement is now the preferred way to load data into Synapse (see Data loading strategies for Synapse SQL pool).
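To give a feel for what a COPY-based load looks like, here is a minimal sketch in Python that only assembles the T-SQL text of a COPY statement. The table name, storage URL, and the helper function itself are hypothetical and used for illustration. The sketch assumes the source files are reachable without a credential clause, and it does no escaping of its inputs, so treat it as an illustration rather than production code.

```python
def build_copy_statement(table, source_url, file_type="CSV",
                         field_terminator=",", first_row=2):
    """Assemble the text of a T-SQL COPY INTO statement (hypothetical helper).

    Only a few common options are covered; inputs are not escaped.
    """
    options = [f"FILE_TYPE = '{file_type}'"]
    if file_type == "CSV":
        # These options only make sense for delimited text files.
        options.append(f"FIELDTERMINATOR = '{field_terminator}'")
        # FIRSTROW = 2 skips a single header row.
        options.append(f"FIRSTROW = {first_row}")
    option_text = ",\n    ".join(options)
    return (
        f"COPY INTO {table}\n"
        f"FROM '{source_url}'\n"
        f"WITH (\n    {option_text}\n)"
    )


# Hypothetical table and storage location, for illustration only.
statement = build_copy_statement(
    "dbo.Trips",
    "https://myaccount.blob.core.windows.net/mycontainer/trips/*.csv",
)
print(statement)
```

The generated text can then be run against the SQL pool from the SQL Script editor or any client you normally use.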
Make sure to watch the excellent session by Charles Feddersen at Build called “Developing end-to-end analytics solutions with the latest Azure Synapse features” where he demos some of these new features.
Some resources available to you:
- Try the latest Azure Synapse features with an Azure free trial account
- Register for the virtual event “Azure Synapse Analytics: How it Works”
- Step-by-step tutorial on securely and easily sharing data
- Download the Synapse Getting Started Toolkit
Another announcement was Azure Synapse Link (see What is Azure Synapse Link for Azure Cosmos DB (Preview)?). This allows you to take your Azure Synapse Analytics and point it directly at your operational database and do T-SQL queries against it without having to copy the data to Synapse. This means you can do real-time analytics without impacting your online or operational systems. This is especially important when you are talking about lots of data at big scale. Sometimes this is referred to as hybrid transactional-analytical processing (HTAP). See Azure Analytics: Clarity in an instant.
This is achieved by extending the standalone T-SQL query capabilities that previously just worked with ADLS to now work over data stored in Azure operational data services, starting with Cosmos DB. Cosmos DB makes data available to Synapse Link in a columnar structure using the Azure Cosmos DB analytical store. The result is that the T-SQL query engine can query data in the lake and in operational stores, allowing for hybrid capabilities. Eventually Synapse Link will be available for Azure SQL Database, Azure Database for PostgreSQL, and Azure Database for MySQL. See Connect to Synapse Link for Azure Cosmos DB.
Concerning Cosmos DB, now in GA is autoscale (originally named autopilot) as well as serverless modes of operation, allowing better alignment of billing to active usage. Autoscale works by managing provisioned request units between 10% and 100% of a customer-declared maximum, based on demand. Serverless implements per-operation compute pricing. See Create Azure Cosmos containers and databases with autoscale throughput and Autoscale + serverless: new offers to fit any workload.
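The 10% to 100% range is easy to work out for a given maximum. The Python sketch below is a simplified model, not the service's actual scaling algorithm: it assumes provisioned throughput simply follows demand and is clamped to the autoscale range. The function names are made up for illustration.

```python
from typing import Tuple


def autoscale_range(max_rus: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) RU/s for a declared autoscale maximum."""
    if max_rus <= 0:
        raise ValueError("max_rus must be positive")
    # The floor is 10% of the customer-declared maximum.
    return max_rus // 10, max_rus


def provisioned_rus(demand_rus: int, max_rus: int) -> int:
    """Clamp demand to the autoscale range (simplified assumption)."""
    low, high = autoscale_range(max_rus)
    return min(max(demand_rus, low), high)


# With a declared maximum of 4,000 RU/s the range is 400 to 4,000 RU/s.
print(autoscale_range(4000))
for demand in (100, 2500, 9000):
    print(demand, "->", provisioned_rus(demand, 4000))
```

So a container with a 4,000 RU/s maximum never drops below 400 RU/s, even when idle, and never goes above 4,000 RU/s under load.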
Also announced was that the Azure SQL Edge product is now available in public preview. Announced at last year’s Build conference under the name “Azure SQL Database Edge,” the product is a version of Azure SQL Database that can run on small edge devices, including those based on ARM processors. SQL Edge also integrates a specially-implemented version of Azure Stream Analytics.
With SQL Edge’s public preview, this now means Microsoft’s T-SQL language works at the edge, on-premises and in the cloud, on relational and NoSQL operational data, on the data warehouse, on the data lake and in HTAP implementations.
Everything announced at Build can be found in the Microsoft Build 2020 Book of News.
More info:
Microsoft Build brings announcements for cloud data, analytics services, and intersection of the two
5 Reasons why Azure Synapse Analytics should be on your roadmap
Thank you James! This is a great summary of the announcements and I’m going to check your Synapse deck!
This is amazing! Something that I’ve been wondering is, if we create a SQL pool for our Synapse Workspace; does it differ in any way from an existing SQL pool? There’s no special data lake optimisations that exist here?
This is great. Quick question: when will Delta tables integrate with SQL on-demand and SQL pool?
Hi Yohesh, not yet, but on the roadmap.
Thanks for distilling down as-ever.
I have seen a fair amount of news/documentation about on-demand pricing for data stored in the lake/files. Does the on-demand T-SQL pricing also cover data stored in a Warehouse? i.e. instead of provisioning a DW based on size, pay per query like Google Big Query?
Hi James,
is the SQL DB in the sql pool an ‘MPP’ SQL or is it SMP ‘SQL’?
Hi Jude, it is an MPP SQL pool.
Thanks James. Does that mean Synapse studio&analytics is best fit for only SQL DW type scenarios?
The entire workspace & integration is great, but not meant for a regular Azure SQL DB (small scale DW or OLTP)?
Correct, SQL DW is an OLAP solution and should not be used for OLTP.