James Serra | James Serra's Blog

About James Serra

James works at Microsoft as a big data and data warehousing solution architect where he has been for most of the last ten years. Before that he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 35 years of IT experience. Check out his book Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh.

View all posts by James Serra

Ways to access data in ADLS Gen2

Posted on September 16, 2019 by James SerraMarch 9, 2020

With data lakes becoming popular, and Azure Data Lake Store (ADLS) Gen2 being used for many of them, a common question I am asked about is “How can I access data in ADLS Gen2 instead of a copy of the … Continue reading →

New product: Azure Data Share

Posted on August 19, 2019 by James SerraMarch 29, 2021

A brand new product by Microsoft called Azure Data Share was recently announced. It is in public preview. To explain the product in short, any data which resides in Azure storage can be securely shared between a data provider and … Continue reading →

Podcasts on Modern Data Warehouse

Posted on August 8, 2019 by James SerraAugust 7, 2019

I did a couple of recent podcasts that I wanted to mention: The first one was with Kirby Repko from the YouTube channel SQLTalk: Modern Data Warehouse Design with James Serra (15 minutes). Check out his playlist of other useful … Continue reading →

Big Data Workshop

Posted on July 10, 2019 by James SerraJuly 10, 2019

A challenge I have with customers who want to get hands-on experience with the Azure products that are found in a modern data warehouse architecture is finding a workshop that covers many of those products. To the rescue is a … Continue reading →

The Microsoft Power Platform

Posted on June 25, 2019 by James SerraApril 13, 2023

The Microsoft Power Platform consists of three products: Power BI, PowerApps, and Microsoft Flow (now called Power Automate). I find customers are confused on the use cases of these products and how they compare to other products (Azure Functions, Logic … Continue reading →

Common Data Model

Posted on June 6, 2019 by James SerraFebruary 1, 2022

The Common Data Model (CDM) is a shared data model that is a place to keep all common data to be shared between applications and data sources. Another way to think of it is is a way to organize data … Continue reading →

Microsoft Build event announcements

Posted on May 16, 2019 by James SerraMay 15, 2019

Another Microsoft event and another bunch of exciting announcements. At the Microsoft Build event last week, the major announcements in the data platform and AI space were: Machine Learning Services enhancements In Private Preview is a new visual interface for … Continue reading →

Where should I clean my data?

Posted on April 25, 2019 by James SerraJanuary 30, 2021

As a follow-up to my blogs What product to use to transform my data? and Should I load structured data into my data lake?, I wanted to talk about where you should you clean your data when building a modern data warehouse … Continue reading →

Two more of my presentations

Posted on April 4, 2019 by James SerraApril 4, 2019

I recently made available two more presentations that you might find helpful. Feel free to download them and present them to others (adding a line that you got them from me is all I ask). There is a full list of all … Continue reading →

Azure Data Explorer

Posted on March 14, 2019 by James SerraApril 5, 2019

Azure Data Explorer (ADX) was announced as generally available on Feb 7th. In short, ADX is a fully managed data analytics service for near real-time analysis on large volumes of data streaming (i.e. log and telemetry data) from such sources … Continue reading →

My latest presentations

Posted on March 11, 2019 by James SerraMarch 10, 2019

I frequently present at user groups, and always try to create a brand new presentation to keep things interesting. We all know technology changes so quickly so there is no shortage of topics! There is a list of all my presentations with … Continue reading →

Azure Data Lake Store Gen2 is GA

Posted on February 19, 2019 by James SerraMarch 10, 2019

Azure Data Lake Store (ADLS) Gen2 was made generally available on February 7th. In short, ADLS Gen2 is the best of the previous version of ADLS (now called ADLS Gen1) and Azure Blob Storage. ADLS Gen2 is built on Blob storage … Continue reading →

Storage options for SQL Server database files in Azure

Posted on January 29, 2019 by James SerraApril 14, 2020

If you are using SQL Server in an Azure VM (IaaS) you have a number of options of where to store the database files (.mdf, .ldf, and .ndf). Most customers use managed disks, available in a number of offerings: Standard HDD, Standard … Continue reading →

What product to use to transform my data?

Posted on January 8, 2019 by James SerraJanuary 17, 2019

If you are building a big data solution in the cloud, you will likely be landing most of the source data into a data lake. And much of this data will need to be transformed (i.e. cleaned and joined together … Continue reading →

Azure Data Factory Data Flow

Posted on December 17, 2018 by James SerraNovember 17, 2019

Azure Data Factory v2 (ADF) has a new feature in public preview called Data Flow. I have usually described ADF as an orchestration tool instead of an Extract-Transform-Load (ETL) tool since it has the “E” and “L” in ETL but … Continue reading →

Should I load structured data into my data lake?

Posted on November 27, 2018 by James SerraMarch 2, 2020

With data lakes becoming very popular, a common question I have been hearing often from customers is, “Should I load structured/relational data into my data lake?”. I talked about this a while back in my blog post What is a data … Continue reading →

Premium blob storage

Posted on November 6, 2018 by James SerraMay 18, 2020

As a follow-up to my blog Azure Archive Blob Storage, Microsoft has released another storage tier called Azure Premium Blob Storage (announcement). It is in private preview in US East 2, US Central and US West regions. This is a performance … Continue reading →

SQL Server 2019 Big Data Clusters

Posted on October 22, 2018 by James SerraApril 13, 2020

At the Microsoft Ignite conference, Microsoft announced that SQL Server 2019 is now in preview and that SQL Server 2019 will include Apache Spark and Hadoop Distributed File System (HDFS) for scalable compute and storage. This new architecture that combines together the … Continue reading →

James Serra's Blog

Big Data and Data Warehousing

Author Archives: James Serra