Microsoft Fabric roadmap
Microsoft Fabric is an awesome product that has now been in public preview for five months. If you are not familiar with it, check out my recent video where I provide a Microsoft Fabric introduction. Also, an excellent training course has just been released to learn all about Fabric: Microsoft Fabric Complete Guide – Future of Data with Fabric.
The Microsoft Fabric roadmap was just released, and you can check it out at https://aka.ms/FabricRoadmap. It’s great to see Microsoft being transparent about which features they are working on and when they will be available.
Here are my top 18 features on the roadmap that I am most excited about (in the order found in the roadmap):
Admin and governance:
Purview hub for administrators and data owners – Public preview
Estimated release timeline: Q4 2023
Fabric admins (Q3 2023) and data owners (Q4 2023) can gain valuable insights about sensitive data and about certified and promoted items. The hub also serves as a gateway to advanced capabilities in the Microsoft Purview portals.
Purview data loss prevention policies for schematized data in OneLake
Estimated release timeline: Q1 2024
Compliance admins can use Microsoft Purview Data Loss Prevention (DLP) policies to detect the upload of sensitive data (such as Social Security numbers) to OneLake. If such an upload is detected, the policy triggers an automatic policy tip that is visible to data owners, and it can also trigger an alert for compliance admins. DLP policies can automate compliance processes to meet enterprise-scale compliance and regulatory requirements effectively.
Microsoft Fabric user REST APIs
Estimated release timeline: Q4 2023
This delivers a user-friendly, standardized API for Fabric’s core functionality and experience APIs, ensuring ease of use for developers. The well-documented Fabric REST API includes authentication, authorization, version control, policy enforcement, and error handling. Additionally, developers can use existing protocol-specific APIs like XMLA and TDS. Examples include workspace and capacity management, CRUD operations on items, and permission management.
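To make the workspace management example concrete, here is a minimal Python sketch of what a call to list workspaces might look like. The base URL, the /workspaces path, and the bearer-token header are my assumptions and are not taken from the roadmap, so check the official API reference before relying on them. The helper name build_list_workspaces_request is made up for illustration, and the sketch only builds the request without sending it.

```python
import urllib.request

# Assumption: base URL of the Fabric REST API (not stated in the roadmap).
FABRIC_API_BASE = "https://api.fabric.microsoft.com/v1"


def build_list_workspaces_request(access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the workspaces collection.

    The access token is assumed to be a Microsoft Entra ID bearer token
    obtained elsewhere; acquiring it is out of scope for this sketch.
    """
    return urllib.request.Request(
        url=f"{FABRIC_API_BASE}/workspaces",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


# Inspect the request that would be sent; no network call is made here.
request = build_list_workspaces_request("<access-token>")
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen with a real token. I kept that out so the sketch can be read and run without credentials.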
OneLake:
OneLake security model for tables and files (public preview)
Estimated release timeline: Q2 2024
Managing data security across multiple analytical engines and copies of data is challenging. OneLake and Fabric simplify this by enabling the use of a single data copy across multiple analytical engines without any data movement or duplication. Taking the “one copy” concept further, OneLake is also enhancing security with a finer-grained model, allowing direct security for tables and folders. These security definitions live with the data and travel across shortcuts to wherever the data is used. Security defined at OneLake is universally enforced no matter which analytical engine is used to access the data.
Folders in workspaces
Estimated release timeline: Preview in Q4 2023
Introducing folders in the workspace allows you to better organize and find items. The preview of this feature will provide the organizational capabilities of folders. Subsequent updates will address folder-related permission management scenarios.
Synapse – Data Warehouse:
Data warehouse SQL security enhancements
Estimated release timeline: Q4 2023 (available now: see “Announcing: Column-Level & Row-Level Security for Fabric Warehouse & SQL Endpoint” on the Microsoft Fabric Blog)
You can define granular row-level security for data in the data warehouse, ensuring restricted access and appropriate viewing based on entitlements.
Synapse – Data Engineering:
Lakehouse data security (Public Preview)
Estimated release timeline: Q2 2024
You’ll have the ability to apply file, folder, and table (or object-level) security in the lakehouse. You can also control who can access data in the lakehouse and the level of permissions they have. For example, you can grant read permissions on files, folders, and tables. Once permissions are applied, they’re automatically synchronized across all engines, which means that permissions will be consistent across Spark, SQL, Power BI, and external engines.
Schema support for Lakehouse
Estimated release timeline: Q2 2024
The lakehouse will support a 3-part naming convention. It enables you to add schemas to your lakehouses, which is consistent with the current warehouse experience.
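As a rough illustration of what 3-part naming means in practice, here is a small Python sketch that expands a table reference into lakehouse.schema.table form. The TableName type, the parse_table_name helper, and the default schema name dbo are assumptions of mine for illustration and are not from the roadmap. The sketch also ignores quoted names that contain dots.

```python
from typing import NamedTuple


class TableName(NamedTuple):
    """A fully qualified table reference: lakehouse.schema.table."""
    lakehouse: str
    schema: str
    table: str

    def qualified(self) -> str:
        return f"{self.lakehouse}.{self.schema}.{self.table}"


def parse_table_name(
    name: str,
    default_lakehouse: str,
    default_schema: str = "dbo",  # assumption: default schema name
) -> TableName:
    """Expand a 1-, 2-, or 3-part name into a full 3-part name."""
    parts = name.split(".")
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(f"expected 1 to 3 non-empty name parts, got {name!r}")
    # Fill in whichever leading parts are missing from the defaults.
    defaults = [default_lakehouse, default_schema]
    padded = defaults[: 3 - len(parts)] + parts
    return TableName(*padded)


print(parse_table_name("sales.orders", default_lakehouse="MyLakehouse").qualified())
```

With schemas in place, a reference like sales.orders resolves inside the current lakehouse, while a full 3-part name can point at a table in another lakehouse.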
Policy management
Estimated release timeline: Q2 2024
Workspace admins will be able to author and enforce policies based on Spark properties, ensuring that your workloads comply with certain rules. For example, they can limit the number of resources, the time that a workload can consume, or prevent users from changing certain Spark settings. This will enhance the governance and security of your Spark workloads.
Copilot integration in notebooks (Public Preview)
Estimated release timeline: Q4 2023
You’ll be able to use Copilot in notebooks to chat about your data, get code suggestions, and debug your code. Copilot will be data aware, which means it will have context about the lakehouse tables and schemas. Copilot is a smart and helpful assistant for data engineering tasks.
Dynamic lineage of data engineering items
Estimated release timeline: Q4 2023
You will be able to trace the lineage within Fabric across code items, such as notebooks and Spark jobs, and data items, such as a lakehouse. This lineage will be dynamic, meaning that if the code adds or removes references to lakehouses, the change will be reflected in the lineage view.
Synapse – Data Science:
Semantic link
Estimated release timeline: Q4 2023 (available now: see “Semantic link in Microsoft Fabric: Bridging BI and Data Science” on the Microsoft Fabric Blog)
Semantic link bridges the gap between data science and BI by providing a Python library (SemPy) that enables data scientists to interact with Power BI datasets and measures. You can use SemPy to read, explore, query, and validate data in Power BI from Python notebooks, and use the library’s features to detect and resolve data challenges. Users can also write back to the Power BI dataset through the lakehouse with Direct Lake mode.
Synapse – Real-Time Analytics:
SQL native support in KQL querysets
Estimated release timeline: Q2 2024
This feature enables customers to use a native SQL editor to run SQL over KQL databases in a queryset, alongside using KQL. With this capability, customers are able to use the SQL editor’s native capabilities, such as syntax highlighting, suggestions, and more.
Co-pilot (Preview)
Estimated release timeline: Q4 2023
KQL Co-pilot allows you to write queries in natural language and have them translated into Kusto Query Language (KQL). You can use Co-pilot to ask your how-to queries, explore your data in a KQL database, and create Kusto entities such as tables, functions, and materialized views.
Create actions and alerts with Data Activator
Estimated release timeline: Q4 2023
This feature provides a low-code/no-code experience to drive actions and alerts from your KQL database data. Data Activator gives you a single place to define actionable patterns in your data. These patterns can range from simple thresholds (such as a value being exceeded) to more complex patterns over time (such as a value trending down). When Data Activator detects an actionable pattern, it triggers an action. That action can be an email or a Teams alert to the relevant person in your organization. It can also trigger an automatic process, via a Power Automate flow or an action in one of your organization’s line-of-business apps.
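To give a feel for the two kinds of patterns mentioned above, here is a minimal Python sketch of a threshold check and a "trending down" check over a series of readings. This is only my illustration of the idea, not how Data Activator is implemented or configured. The function names and the default window size of 3 are assumptions.

```python
from typing import Sequence


def exceeds_threshold(values: Sequence[float], threshold: float) -> bool:
    """Simple pattern: the most recent value is above the threshold."""
    return len(values) > 0 and values[-1] > threshold


def is_trending_down(values: Sequence[float], window: int = 3) -> bool:
    """Pattern over time: each of the last `window` values is lower than the one before it."""
    if window < 2 or len(values) < window:
        return False
    recent = values[-window:]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


readings = [9.0, 7.0, 6.0, 4.0]
if is_trending_down(readings):
    # In Data Activator this is where an email, Teams alert, or flow would fire.
    print("alert: value is trending down")
```

The point of the product is that you define these patterns without writing code. The sketch just shows what kind of logic sits behind a trigger.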
Data Factory:
Fast Copy support in Dataflow Gen2
Estimated release timeline: Q1 2024
We’re adding support for large-scale data ingestion directly within the Dataflow Gen2 experience, utilizing the pipelines Copy Activity capability. This supports sources such as Azure SQL Databases, and CSV and Parquet files in Azure Data Lake Storage and Blob Storage.
This enhancement significantly scales up the data processing capacity of Dataflow Gen2, providing high-scale ELT (Extract-Load-Transform) capabilities.
Copilot in Data Factory
Estimated release timeline: Q4 2023
You’ll be able to use Copilot with dataflows and data pipelines in Data Factory. Copilot in Data Factory empowers both citizen and professional developers to build simple to complex dataflows and pipelines using natural language. You’ll be able to work together with Copilot to iteratively develop dataflows and data pipelines for your data integration needs.
Data Activator: (available now: Announcing the Data Activator public preview)
Metrics, triggers, and actions
Estimated release timeline: Q2 2024
In addition to monitoring business objects, Data Activator will let you define key metrics that assign values from your data stream. These metrics can trigger actions based on aggregated values across multiple dimensions at specific time intervals.
Data activator makes me reminisce about Jheri Curls and Hollywood Shuffle. Get me my data activator, man! 🤣🤣