Common Data Model
The Common Data Model (CDM) is a shared data model: a place to keep all common data to be shared between applications and data sources. Another way to think of it is as a way to organize data … Continue reading →
Another Microsoft event and another bunch of exciting announcements. At the Microsoft Build event last week, the major announcements in the data platform and AI space were: Machine Learning Services enhancements In Private Preview is a new visual interface for … Continue reading →
As a follow-up to my blogs What product to use to transform my data? and Should I load structured data into my data lake?, I wanted to talk about where you should clean your data when building a modern data warehouse … Continue reading →
I recently made available two more presentations that you might find helpful. Feel free to download them and present them to others (adding a line that you got them from me is all I ask). There is a full list of all … Continue reading →
Azure Data Explorer (ADX) was announced as generally available on Feb 7th. In short, ADX is a fully managed data analytics service for near real-time analysis on large volumes of data streaming (i.e. log and telemetry data) from such sources … Continue reading →
I frequently present at user groups, and always try to create a brand new presentation to keep things interesting. We all know technology changes so quickly so there is no shortage of topics! There is a list of all my presentations with … Continue reading →
Azure Data Lake Store (ADLS) Gen2 was made generally available on February 7th. In short, ADLS Gen2 is the best of the previous version of ADLS (now called ADLS Gen1) and Azure Blob Storage. ADLS Gen2 is built on Blob storage … Continue reading →
If you are using SQL Server in an Azure VM (IaaS) you have a number of options of where to store the database files (.mdf, .ldf, and .ndf). Most customers use managed disks, available in a number of offerings: Standard HDD, Standard … Continue reading →
If you are building a big data solution in the cloud, you will likely be landing most of the source data into a data lake. And much of this data will need to be transformed (i.e. cleaned and joined together … Continue reading →
Azure Data Factory v2 (ADF) has a new feature in public preview called Data Flow. I have usually described ADF as an orchestration tool instead of an Extract-Transform-Load (ETL) tool since it has the “E” and “L” in ETL but … Continue reading →
With data lakes becoming very popular, a question I often hear from customers is, “Should I load structured/relational data into my data lake?”. I talked about this a while back in my blog post What is a data … Continue reading →
As a follow-up to my blog Azure Archive Blob Storage, Microsoft has released another storage tier called Azure Premium Blob Storage (announcement). It is in private preview in US East 2, US Central and US West regions. This is a performance … Continue reading →
At the Microsoft Ignite conference, Microsoft announced that SQL Server 2019 is now in preview and that SQL Server 2019 will include Apache Spark and Hadoop Distributed File System (HDFS) for scalable compute and storage. This new architecture that combines the … Continue reading →
At Microsoft Ignite, one of the announcements was for Azure SQL Database Hyperscale, which was made available in public preview October 1st, 2018 in 12 different Azure regions. SQL Database Hyperscale is a new SQL-based and highly scalable service tier for single databases … Continue reading →
As I first mentioned in my blog Microsoft database migration tools, the Azure Database Migration Service (DMS) is a PaaS solution that makes it easy to migrate from on-prem/RDS to Azure and one database type to another. I’ll give a brief overview … Continue reading →
Read Scale-Out is a little-known feature that allows you to load balance Azure SQL Database read-only workloads using the capacity of read-only replicas, for free. As mentioned in my blog Azure SQL Database high availability, each database in the Premium tier (DTU-based … Continue reading →
My last blog post was on Azure SQL Database high availability and I would like to continue along that discussion with a blog post about disaster recovery in Azure SQL Database. First, a clarification on the difference between high availability and … Continue reading →
In this blog I want to talk about how Azure SQL Database achieves high availability. One of the major benefits of moving from on-prem SQL Server to Azure SQL Database is how much easier it is to have high availability … Continue reading →
Dataflows, previously called Common Data Service for Analytics as well as Datapools, will be in preview soon and I wanted to explain in this blog what it is and how it can help you get value out of your data … Continue reading →
There are two really great features just added to Power BI that I wanted to blog about: Composite models and Dual storage mode. This is part of the July release for Power BI Desktop and it is in preview (see … Continue reading →