Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh
Please send me any feedback on the book to jamesserra3@gmail.com. Would love to hear what you think!
I would appreciate any Amazon reviews.
Here are some of the places you can purchase it:
Amazon, Powell’s, Bookshop.org, Target, Barnes & Noble, Waterstones (UK), thriftbooks, eBooks.com, Walmart, Google Play Books, Apple Books, Booktopia (Australia), Harvard Book Store, HudsonBooksellers, Books-a-Million, alibris, BetterWorldBooks, booksrun, Shroff publishers, and just about every online bookseller in North America.
My book is also part of a Humble bundle.
You can also read it on O’Reilly (via subscription – get a 30-day free trial).
Custom signed copies of my book are available at Etsy.
My book is being translated into Portuguese (available now, including at Amazon), Polish, German (available now, including at Amazon), Chinese, and Russian!
Abstract
Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they’re also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each.
James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You’ll learn what data lakehouses can help you achieve, and how to distinguish data mesh hype from reality. Best of all, you’ll be able to determine the most appropriate data architecture for your needs. With this book, you’ll:
- Gain a working understanding of several data architectures
- Learn the strengths and weaknesses of each approach
- Distinguish data architecture theory from the reality
- Pick the best architecture for your use case
- Understand the differences between data warehouses and data lakes
- Learn common data architecture concepts to help you build better solutions
- Explore the historical evolution and characteristics of data architectures
- Learn essentials of running an architecture design session, team organization, and project success factors
Free from product discussions, this book is a timeless resource for years to come.
External reviews
Book Review – Deciphering Data Architectures, Koen Verbeeck
DATA ARCHITECTURES ARE CHOICES, NOT IDEALS, Ole Olesen-Bagneux
Mitul Vadgama
Dipak Shaw
Craig Rothe
Greg Low
Amir Sazgarnia
Varun Singal
Editorial reviews
“There is no one whose knowledge of data architectures and data processes I trust more than James Serra. This book not only provides a comprehensive and clear description of key architectural principles, approaches, and pitfalls, it also addresses the all-important people, cultural, and organizational issues that too often imperil data projects before they get going. This book is destined to become an industry primer studied by college students and business professionals alike who encounter data for the first time (and maybe the second and third time as well!)”
–Wayne Eckerson, President of Eckerson Group
“James’s superpower has always been taking complex subjects and explaining them in a simple way. In this book, he hits all the key points to help you choose the right data architecture and avoid common (and costly!) mistakes.”
–Rod Colledge, Senior Technical Specialist (Data & AI), Microsoft
“James has condensed over 30 years of data architecture knowledge and wisdom into this comprehensive and very readable book. For those who must do the hard work of delivering analytics rather than singing its praises, this is a must-read.”
–Dr. Barry Devlin, Founder and Principal, 9sight Consulting
“Data management is critical to the success of every business. Deciphering Data Architectures breaks down the buzzwords into simple and understandable concepts and practical solutions to help you get to the right architecture for your dataset. James has an innate curiosity to understand things and then to share that in a way that everyone can understand.”
–Matt Usher, Director, Pure Storage
“James’ blog has been my go-to resource for demystifying architectural concepts, understanding technical terminology, and navigating the life of a solution architect or data engineer. His ability to transform complex technical concepts into clear, easy-to-grasp explanations is truly remarkable. This book is an invaluable collection of his work, serving as a comprehensive reference guide for designing and comprehending architectures.”
–Annie Xu, Senior Data Customer Engineer, Google
“Deciphering Data Architectures is not only thorough and detailed, but it also provides a critical perspective on what works, and perhaps more importantly, what may not work well. Whether discussing older data approaches or newer ones such as Data Mesh, the book offers words of wisdom and lessons learned that will help any data practitioner accelerate their data journey.”
–Eric Broda, entrepreneur, data consultant, O’Reilly author of Implementing Data Mesh
“In Deciphering Data Architectures, James Serra does a wonderful job explaining the evolution of leading data architectures and the trade-offs between them. This book should be required reading for current and aspiring data architects.”
–Bill Anton, Data Geek, Opifex Solutions
“Deciphering guides data architects through the modern analytics landscape, covering data warehouses, fabrics, lakehouses, and meshes. Author Serra’s insightful commentary and real-world examples make the concepts accessible for all levels. This masterclass book equips you to navigate the ever-evolving world of data management.”
–William McKnight, McKnight Consulting Group
“Marketing buzz and industry thought-leader chatter have sown much confusion about data architecture patterns. With his depth of experience and skill as a communicator, James Serra cuts through the noise and provides clarity on both long-established data architecture patterns and cutting-edge industry methods that will aid data practitioners and data leaders alike. Put it on your desk– you’ll reference it often.”
–Sawyer Nyquist, Owner, Writer, and Consultant, The Data Shop
“Deciphering Data Architectures is an indispensable vendor-neutral guide for today’s data professionals. It insightfully compares historical and modern architectures, emphasizing key trade-offs and decision-making nuances in choosing an appropriate architecture for the evolving data-driven landscape.”
–Stacia Varga, Author and data analytics consultant, Data Inspirations
“The world of data architectures is complex and full of noise. This book provides a fresh, practical perspective born of decades of experience. Whether you’re a beginner or an expert, everyone with an interest in data must read this book!”
–Piethein Strengholt, author of Data Management at Scale
“This reference should be on every data architect’s bookshelf. With clear and insightful descriptions of the current and planned technologies, readers will gain a good sense of how to steer their companies to meet the challenges of the emerging data landscape. This is an invaluable reference for new starters and veteran data architects alike.”
–Mike Fung, Master Principal Cloud Solution Architect, Oracle
“An educational gem! Deciphering Data Architectures strikes a perfect balance between simplicity and depth, ensuring that technology professionals at all levels can grasp key data concepts and understand the essential tradeoff decisions that really matter when planning a data journey.”
–Ben Reyes, Co-Founder and Managing Partner, ZetaMinusOne LLC
“I recommend Deciphering Data Architectures as a resource that provides the knowledge to understand and navigate the available options when developing a data architecture.”
–Mike Shelton, Cloud Solution Architect, Microsoft
“As a consultant and community leader, I often direct people to James Serra’s blog for up-to-date and in-depth coverage of modern data architectures. This book is a great collection, condensing Serra’s wealth of vendor-neutral knowledge. My favorite is Part III, where James discusses the pros and cons of each architecture design. I believe this book will immensely benefit any organization that plans to modernize its data estate.”
–Teo Lachev, Consultant, Prologika
“This book represents a great milestone in the evolution of how we handle data in the technology industry, and how we have handled it over several decades, or what is easily the equivalent of a career for most. The content offers great insights for the next generation of data professionals in terms of what they need to think about when designing future solutions. ‘Deciphering’ is certainly an excellent choice of wording for this, as deciphering is exactly what is needed when turning requirements into data products.”
–Paul Andrew, CTO, Cloud Formations Consulting
“A fantastic guide for data architects, this book is packed with experience and insights. Its comprehensive coverage of evolving trends and diverse approaches makes it an essential reference for anyone looking to broaden their understanding of the field.”
–Simon Whiteley, CTO, Advancing Analytics Limited
“Deep, practitioner wisdom within, the latest scenarios in the market today have vendor specific skew, latest terminology, and sales options. James takes his many years of expertise to give agnostic, cross cloud, vendor, vertical approaches from small to large.”
–Jordan Martz, Sr Sales Engineer, Fivetran
“Data Lake, Data Lakehouse, Data Fabric, Data Mesh … It isn’t easy sorting the nuggets from the noise. James Serra’s knowledge and experience is a great resource for everyone with data architecture responsibilities.”
–Dave Wells, Industry Analyst, eLearningcurve
“Too often books are ‘how-to’ with no background or logic – this book solves that. With a comprehensive view of why data is arranged in a certain way, you’ll learn more about the right way to implement the ‘how’.”
–Buck Woody, Principal Data Scientist, Microsoft
“No other book I know explains so comprehensively about data lake, warehouse, mesh, fabric and lakehouse! It is a must have book for all data architects and engineers.”
–Vincent Rainardi, data architect and author
About the author
James is a big data and data warehousing solution architect at Microsoft, where he has worked for most of the last nine years. He is a thought leader in the use and application of big data and advanced analytics, including data architectures such as the modern data warehouse, data lakehouse, data fabric, and data mesh. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a former SQL Server MVP with over 35 years of IT experience. He started his career as a software developer, then was a DBA for 12 years, and for the last 12 years he has been working extensively with business intelligence and data warehousing using numerous Microsoft technologies and tools. He has been at times a permanent employee, consultant, contractor, and owner of his own business. All these experiences, along with continuous learning, have helped him deliver many successful data warehouse and BI projects. He is a popular blogger and speaker, having presented at dozens of major events including SQLBits, PASS Summit, Data Summit, and the Enterprise Data World conference.
Table of contents
- Big Data
  - What Is Big Data and How Can It Help You?
  - Data Maturity
  - Self-Service Business Intelligence
  - Summary
- Types of Data Architectures
  - Evolution of Data Architectures
  - Relational Data Warehouse
  - Data Lake
  - Modern Data Warehouse
  - Data Fabric
  - Data Lakehouse
  - Data Mesh
  - Summary
- The Architecture Design Session
  - What Is an ADS?
  - Why Hold an ADS?
  - Before the ADS
  - Conducting the ADS
  - After the ADS
  - Tips for Conducting an ADS
  - Summary
- The Relational Data Warehouse
  - What Is a Relational Data Warehouse?
  - What a Data Warehouse Is Not
  - The Top-Down Approach
  - Why Use a Relational Data Warehouse?
  - Drawbacks to Using a Relational Data Warehouse
  - Populating a Data Warehouse
  - The Death of the Relational Data Warehouse Has Been Greatly Exaggerated
  - Summary
- Data Lake
  - What Is a Data Lake?
  - Why Use a Data Lake?
  - Bottom-Up Approach
  - Best Practices for Data Lake Design
  - Multiple Data Lakes
  - Summary
- Data Storage Solutions and Processes
  - Data Storage Solutions
  - Data Processes
  - Summary
- Approaches to Design
  - Online Transaction Processing (OLTP) Versus Online Analytical Processing (OLAP)
  - Operational and Analytical Data
  - Symmetric Multiprocessing (SMP) and Massively Parallel Processing (MPP)
  - Lambda Architecture
  - Kappa Architecture
  - Polyglot Persistence and Polyglot Data Stores
  - Summary
- Approaches to Data Modeling
  - Relational Modeling
  - Dimensional Modeling
  - Common Data Model (CDM)
  - Data Vault
  - The Kimball and Inmon Data Warehouse Methodologies
  - Summary
- Approaches to Data Ingestion
  - ETL Versus ELT
  - Reverse ETL
  - Batch Processing Versus Real-Time Processing
  - Data Governance
  - Summary
- The Modern Data Warehouse
  - The MDW Architecture
  - Pros and Cons of the MDW Architecture
  - Combining the RDW and Data Lake
  - Stepping Stones to the MDW
  - Case Study: Wilson & Gunkerk’s Strategic Shift to an MDW
  - Summary
- Data Fabric
  - The Data Fabric Architecture
  - Why Transition from an MDW to a Data Fabric Architecture?
  - Potential Drawbacks
  - Summary
- Data Lakehouse
  - Delta Lake Features
  - Performance Improvements
  - The Data Lakehouse Architecture
  - What If You Skip the Relational Data Warehouse?
  - Relational Serving Layer
  - Summary
- Data Mesh Foundation
  - A Decentralized Data Architecture
  - Data Mesh Hype
  - Dehghani’s Four Principles of Data Mesh
  - The “Pure” Data Mesh
  - Data Domains
  - Data Mesh Logical Architecture
  - Different Topologies
  - Data Mesh Versus Data Fabric
  - Use Cases
  - Summary
- Should You Adopt Data Mesh? Myths, Concerns, and the Future
  - Myths
  - Concerns
  - Organizational Assessment: Should You Adopt a Data Mesh?
  - Recommendations for Implementing a Successful Data Mesh
  - The Future of Data Mesh
  - Zooming Out: Understanding Data Architectures and Their Applications
  - Summary
- People and Processes
  - Team Organization: Roles and Responsibilities
  - Why Projects Fail: Pitfalls and Prevention
  - Tips for Success
  - Summary
- Technologies
  - Choosing a Platform
  - Cloud Service Models
  - Software Frameworks
  - Summary