New Microsoft data governance product: Azure Purview
Azure Purview is a data governance solution that is the successor to Azure Data Catalog, and it is now available in public preview.
Purview catalogs data from on-premises, multi-cloud, or software-as-a-service (SaaS) locations. Purview lets you understand exactly what data you have, manage its compliance with privacy regulations, and derive insights. Create a unified, up-to-date understanding of your data landscape with automated data discovery, sensitive data classification, and end-to-end data lineage with Purview Data Map. Purview aims to maximize the compliant use of your own data by understanding it, how it moves (i.e. lineage), and who it is shared with. It also integrates with Microsoft Information Protection and with Power BI sensitivity labels and certification and promotion labels.
Azure Purview includes three main components:
- Data discovery, classification, and mapping: It automatically finds all of an organization’s data, on-premises or in the cloud, even data managed by other providers, and evaluates the characteristics and sensitivity of the data as it scans it
- Data catalog: It enables all users to search for trusted data using a simple web-based search engine. There are also visual graphs that let you quickly see if data of interest is from a trusted source
- Data governance: It provides a bird’s-eye view of your company’s data landscape, enabling “data officers” to efficiently govern the use of data. This enables key insights such as the distribution of data across multiple environments, how data is moving, and where sensitive data is stored
There is a sophisticated search engine to view all the scanned items:
It tracks data lineage:
Below are the nine sources you can currently scan (more to come soon) via the “Sources” section. I have gotten the scans to work on all of the sources except Power BI, which requires a bit of extra work to scan a workspace outside your subscription (by default, the system uses the Power BI tenant that exists in the same Azure subscription). To register a Power BI workspace outside your subscription, see “Use PowerShell to register and scan Power BI in Azure Purview (preview)”. For sources that are not supported, there is an option to submit data to the catalog via the Azure Purview REST APIs. You can also use the APIs to build your own user experience on the catalog.
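As a rough illustration of what submitting an asset through the REST APIs might look like, here is a minimal Python sketch. It assumes the catalog exposes an Apache Atlas v2-style entity endpoint at `https://{account}.catalog.purview.azure.com/api/atlas/v2/entity`; the account name, token placeholder, type name, and qualified name are all hypothetical, and the request is only built, not sent. Check the Purview REST API documentation before relying on any of these details.

```python
import json
import urllib.request


def build_entity_request(account, token, type_name, qualified_name, name):
    """Build (but do not send) a POST request describing one catalog asset."""
    # Assumption: Atlas v2-style entity endpoint on the Purview catalog host.
    url = f"https://{account}.catalog.purview.azure.com/api/atlas/v2/entity"
    payload = {
        "entity": {
            "typeName": type_name,
            "attributes": {"qualifiedName": qualified_name, "name": name},
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values, for illustration only.
req = build_entity_request(
    account="contoso-purview",
    token="<access-token>",
    type_name="DataSet",
    qualified_name="custom://sales/orders",
    name="orders",
)
```

To actually submit the asset you would pass `req` to `urllib.request.urlopen` with a valid Azure AD access token in place of the placeholder.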
You can also use a “map view” to see all the sources and group them under collections:
Azure Purview comes with system-defined classification rules, and you can also add your own custom classification rules.
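To make the idea of a custom classification rule concrete, here is a conceptual Python sketch of a regex-based rule that labels a column when enough of its sampled values match a pattern. This is not Purview's implementation: the `classify_column` helper, the `EMP-` employee ID pattern, and the 60% threshold are all assumptions chosen for illustration.

```python
import re


def classify_column(values, pattern, min_match_ratio=0.6):
    """Return True when enough non-empty values fully match the pattern."""
    regex = re.compile(pattern)
    non_empty = [v for v in values if v]
    if not non_empty:
        return False
    matches = sum(1 for v in non_empty if regex.fullmatch(v))
    return matches / len(non_empty) >= min_match_ratio


# Hypothetical rule: employee IDs look like "EMP-" followed by six digits.
EMPLOYEE_ID = r"EMP-\d{6}"
sample = ["EMP-001234", "EMP-998877", "n/a", "EMP-000001"]
is_employee_id = classify_column(sample, EMPLOYEE_ID)
```

A threshold below 100% lets a rule tolerate a few dirty values such as `n/a` in an otherwise consistent column.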
Besides the sources listed above, you can also import metadata from “external connections”, which currently include Azure Data Factory (ADF) and Azure Data Share. You can set this up via Management Center -> External connections. Note that to view the external connections menu, you need to be assigned one of the Azure built-in roles: Contributor, Owner, Reader, or User Access Administrator (a role inherited from the subscription or resource group is not sufficient). Be aware that there is no option to “scan” these sources. Instead, run the pipeline in ADF as usual and the lineage will be automatically pushed to Purview (note that currently only the copy, data flow, and execute SSIS package activities are supported). For Data Share, execute a snapshot, and once it is complete, the assets and lineage will be pushed to Purview.
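One way to picture what gets pushed is as a small graph: each pipeline run links its input assets to a process and the process to its output assets. The Python sketch below is a conceptual model only, not Purview's data model; the `LineageGraph` class and the asset and activity names are hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Minimal lineage model: edges point from a node to its direct upstream nodes."""

    def __init__(self):
        self._upstream = defaultdict(set)

    def record_run(self, process, inputs, outputs):
        # Each output was produced by the process; the process read each input.
        for out in outputs:
            self._upstream[out].add(process)
        for inp in inputs:
            self._upstream[process].add(inp)

    def upstream_of(self, asset):
        """Return every process and asset that the given asset depends on."""
        seen, stack = set(), [asset]
        while stack:
            node = stack.pop()
            for parent in self._upstream.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical runs: a copy activity followed by a data flow.
graph = LineageGraph()
graph.record_run("CopyOrders", ["sql://orders"], ["adls://raw/orders"])
graph.record_run("CleanOrders", ["adls://raw/orders"], ["adls://curated/orders"])
upstream = graph.upstream_of("adls://curated/orders")
```

Walking the graph upstream from the curated dataset answers the typical lineage question of where a piece of data originally came from.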
To ramp up quickly, I suggest you visit the Azure Purview product page, get started with the Azure Purview documentation, view the Mechanics video to see Azure Purview in action, and give your feedback via UserVoice.
More info:
A first look at Azure Purview – Data Governance for your data estate
Azure Synapse Analytics – Introduction to Azure Purview
Microsoft introduces Azure Purview data catalog; announces GA of Synapse Analytics
Use Power BI with Azure Purview to achieve better data governance and discovery
Map your data estate with Azure Purview
Unified Data Governance using Azure Purview – preventing Data Lake from becoming a Data Swamp
Power BI Governance, Good Practices: Setting up Azure Purview for Power BI