Ways to land data into Fabric OneLake
Microsoft Fabric is rapidly gaining popularity as a unified data platform, leveraging OneLake as its central data storage hub for all Fabric-integrated products. A variety of tools and methods are available for copying data into OneLake, catering to diverse data ingestion needs. Below is an overview of what I believe are the key options:
— Pardon the interruption for a shameless plug: My book “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh” would make a wonderful Christmas gift! Order it on Amazon in English, Portuguese, or German. —
Fabric Data Pipeline via Copy activity
The Copy activity inside a Fabric Data Pipeline moves data from a wide range of sources into OneLake through a managed, low-code workflow, and the surrounding pipeline adds scheduling, orchestration, and retry logic. It is a good fit for recurring loads that need reliable, repeatable transfers.
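Pipelines are usually built and scheduled in the Fabric portal, but an existing pipeline can also be kicked off programmatically. Below is a minimal Python sketch using the Fabric REST API's on-demand job endpoint; the workspace and pipeline IDs are placeholders, and the exact job type name should be verified against the current API documentation.

```python
# Minimal sketch: trigger an existing Fabric Data Pipeline run on demand.
# Assumes the Fabric REST API "run on-demand item job" endpoint; the job type
# name for pipelines ("Pipeline") is an assumption to verify against the docs.
import requests
from azure.identity import DefaultAzureCredential

WORKSPACE_ID = "<workspace-guid>"      # placeholder
PIPELINE_ID = "<pipeline-item-guid>"   # placeholder

# Acquire a token for the Fabric API scope.
token = DefaultAzureCredential().get_token(
    "https://api.fabric.microsoft.com/.default"
).token

url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{PIPELINE_ID}/jobs/instances?jobType=Pipeline"
)
resp = requests.post(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()

# An accepted run returns 202; the run's status can be polled at the URL in
# the Location response header.
print(resp.status_code, resp.headers.get("Location"))
```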
Fabric Data Pipeline via Copy job
The Copy job provides a streamlined, standalone way to move data from a source to a destination without building a full pipeline. It supports both full (batch) and incremental copy, which makes it a flexible fit for simple, recurring transfers.
Fabric Dataflow Gen2
Create repeatable, scalable ETL (Extract, Transform, Load) processes. Dataflow Gen2 uses the Power Query experience to define transformations visually, which suits business users and data engineers alike.
Local file/folder upload via Fabric Portal Explorer
Leverage drag-and-drop functionality in the Fabric portal for quick, manual uploads of local files and folders to OneLake.
Fabric Eventstreams
Ingest event-driven data in real time. This is an excellent option for use cases like IoT telemetry, application logs, or transactional events.
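One common way to feed an Eventstream is through its custom endpoint source, which is Event Hubs-compatible, so any standard Event Hubs client can publish into it. A minimal Python sketch using the azure-eventhub package follows; the connection string and entity name come from the Eventstream's custom endpoint details and are placeholders here.

```python
# Minimal sketch: publish events to a Fabric Eventstream via its
# Event Hubs-compatible custom endpoint (connection details are placeholders).
import json
from azure.eventhub import EventHubProducerClient, EventData

CONN_STR = "<eventstream-custom-endpoint-connection-string>"
ENTITY_NAME = "<eventstream-entity-name>"

producer = EventHubProducerClient.from_connection_string(
    CONN_STR, eventhub_name=ENTITY_NAME
)

with producer:
    batch = producer.create_batch()
    # Example payload: an IoT-style telemetry reading.
    batch.add(EventData(json.dumps({"deviceId": "sensor-01", "temperature": 21.7})))
    producer.send_batch(batch)
```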
Fabric OneLake File Explorer
Manage your OneLake files as if they were stored locally on your machine. This Windows app syncs OneLake workspaces into Windows File Explorer (much like OneDrive), so you can browse, copy, and update files with familiar desktop tools.
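Because OneLake File Explorer surfaces workspaces as ordinary folders on your machine, anything that can write to the local file system can land files in OneLake. The sketch below just copies a local file into the synced folder and lets the tool sync it up; the sync root, workspace, and lakehouse names are assumptions to adjust for your environment.

```python
# Minimal sketch: drop a file into the OneLake File Explorer sync folder and
# let it sync to OneLake. The sync root below is an assumed default location;
# check where OneLake File Explorer mounts OneLake on your machine.
import shutil
from pathlib import Path

onelake_root = Path.home() / "OneLake - Microsoft"   # assumed sync root
target_dir = (
    onelake_root / "MyWorkspace" / "MyLakehouse.Lakehouse" / "Files" / "raw"
)  # placeholder workspace/lakehouse names

target_dir.mkdir(parents=True, exist_ok=True)
shutil.copy("sales_2024.csv", target_dir / "sales_2024.csv")
```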
Fabric Spark notebooks via APIs
Use Spark notebooks to process and load data programmatically. Combined with OneLake's ADLS Gen2-compatible APIs, this method suits advanced, customizable data ingestion needs.
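Inside a Fabric notebook the Spark session is already wired to OneLake, so landing data can be as simple as reading a source and writing a Delta table. A minimal sketch, assuming a lakehouse is attached to the notebook and a CSV file already sits in its Files area (the path and table name are placeholders):

```python
# Minimal sketch for a Fabric Spark notebook with a lakehouse attached.
# Reads a CSV from the lakehouse Files area and writes it to a managed
# Delta table, which lands in OneLake under the lakehouse's Tables area.
df = (
    spark.read                          # `spark` is predefined in Fabric notebooks
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("Files/raw/sales_2024.csv")    # placeholder path in the attached lakehouse
)

(
    df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("sales_2024")          # placeholder table name
)
```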
Fabric Mirroring
Continuously replicate external databases (for example Azure SQL Database, Azure Cosmos DB, or Snowflake) into OneLake as Delta tables. Once configured, the mirrored data stays up to date without manual intervention.
Azure Storage Explorer
Use this desktop app to manage data across your Azure storage resources, including OneLake. It’s particularly useful for managing large datasets with a familiar interface.
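Storage Explorer reaches OneLake through the same ADLS Gen2-compatible DFS endpoint that the Azure Storage SDKs use, so the same route also works from code. A minimal Python sketch with the azure-storage-file-datalake and azure-identity packages (workspace, lakehouse, and file paths are placeholders):

```python
# Minimal sketch: upload a local file to OneLake through its ADLS Gen2-compatible
# DFS endpoint. Workspace, lakehouse, and file paths are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=DefaultAzureCredential(),
)

# In OneLake, the workspace plays the role of the file system (container).
fs = service.get_file_system_client("MyWorkspace")
file_client = fs.get_file_client("MyLakehouse.Lakehouse/Files/raw/sales_2024.csv")

with open("sales_2024.csv", "rb") as data:
    file_client.upload_data(data, overwrite=True)
```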
AzCopy
Leverage this powerful command-line utility for efficient, large-scale data transfers. It’s the perfect tool for moving massive datasets to OneLake.
OneLake integration for semantic models
Automatically write data imported into model tables to Delta tables in OneLake. This integration simplifies analytics workflows while enhancing data consistency.
Azure Data Factory (ADF)
For enterprise-scale ETL needs, ADF offers robust capabilities that integrate seamlessly with OneLake. While similar to Fabric Data Pipelines, ADF shines in complex, high-volume scenarios.
T-SQL COPY INTO
Load data from files in Azure storage into Fabric Warehouse tables, which are stored in OneLake, using a single T-SQL statement. This method suits developers and database administrators who want a straightforward, SQL-native approach.
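A minimal sketch of what that can look like, wrapped here in Python with pyodbc so it can run outside the Fabric query editor; the SQL endpoint, warehouse, table, source URL, and authentication mode are placeholders, and the COPY INTO options should be checked against the Fabric Warehouse documentation (for example, private storage needs a CREDENTIAL clause):

```python
# Minimal sketch: run a T-SQL COPY INTO statement against a Fabric Warehouse
# from Python via pyodbc. Server, warehouse, table, and source URL are
# placeholders; verify the COPY INTO options in the Fabric Warehouse docs.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-sql-endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=<your-warehouse>;"
    "Authentication=ActiveDirectoryInteractive;"   # assumed auth mode; adjust as needed
    "Encrypt=yes;",
    autocommit=True,
)

copy_sql = """
COPY INTO dbo.sales_2024
FROM 'https://<storageaccount>.blob.core.windows.net/<container>/sales/*.parquet'
WITH (FILE_TYPE = 'PARQUET')
"""  # add a CREDENTIAL clause if the source storage is not publicly accessible

cursor = conn.cursor()
cursor.execute(copy_sql)
cursor.close()
conn.close()
```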
By leveraging these tools and methods, organizations can effectively and efficiently ingest data into Fabric OneLake, ensuring optimal use of its unified data platform capabilities. Each approach has its unique strengths, allowing teams to choose the best fit for their specific use case.
Am I correct in assuming that you will always need to create some sort of compute first (e.g., lakehouse, warehouse, KQL database) before you can actually write something to OneLake? It seems you cannot store files in OneLake like you can with Azure Blob Storage.
Hi Koen… OneLake is automatically created with Fabric (it is ADLS Gen2 under the covers). Yes, you will need to create a lakehouse, warehouse, etc. first (these are just folders in OneLake) before copying data into OneLake.
In some source systems, the only way to get data out is to schedule report data to arrive as attachments in an email.
Do any of these options work to pull attachments out of emails and load the data into a Lakehouse or Warehouse table?