Three more chapters of my data architecture book are available!
As I have mentioned in a prior blog post, I have been writing a data architecture book, which I started last November. The title of the book is “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh” and it is being published by O’Reilly.
There are now five chapters and the preface available in their Early Release program:
- Big Data
- Types of Data Architectures
- The Architecture Design Session
- The Relational Data Warehouse
- Data Lake
It’s 80 printed pages. Check it out here! You can expect to see two additional chapters appear each month. This is a great way to start reading the book without having to wait until the entire book is done. Note that you need an O’Reilly subscription to access it, or you can start a free 10-day trial. The site lists the release date for the full book as May 2024, but I’m expecting it to be available by the end of this year. Please send any feedback on the book to jamesserra3@gmail.com. Would love to hear what you think!
The data lake chapter looks like a copy-paste from a dozen books already available at O’Reilly. In the first paragraph you define a data lake as schema-on-read, but you define the Presentation (Gold) data layer as “data that has been transformed into a specific schema for use in reporting tools”.
So, as in dozens of other books, there is no explanation of why it is “schema-on-read” if almost all data lake users will work with the Presentation data layer, where the schema is predefined.
I greatly appreciate the feedback! I certainly did not copy-and-paste from other O’Reilly books; rather, it came from my own blogs and experiences. Your point on the presentation layer confusion is spot on, and I will update that chapter to make it clearer. Thanks for pointing that out!