Vi bøger
Levering: 1 - 2 hverdage

The Azure Data Lakehouse Toolkit - Ron L'Esteve - Bog

Bag om The Azure Data Lakehouse Toolkit

Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the intricate details of the Data Lakehouse Paradigm and how to efficiently design a cloud-based data lakehouse using highly performant and cutting-edge Apache Spark capabilities using Azure Databricks, Azure Synapse Analytics, and Snowflake. You will learn to write efficient PySpark code for batch and streaming ELT jobs on Azure. And you will follow along with practical, scenario-based examples showing how to apply the capabilities of Delta Lake and Apache Spark to optimize performance, and secure, share, and manage a high volume, high velocity, and high variety of data in your lakehouse with ease. The patterns of success that you acquire from reading this book will help you hone your skills to build high-performing and scalable ACID-compliant lakehouses using flexible and cost-efficient decoupled storage and compute capabilities. Extensive coverage of Delta Lake ensures that you are aware of and can benefit from all that this new, open source storage layer can offer. In addition to the deep examples on Databricks in the book, there is coverage of alternative platforms such as Synapse Analytics and Snowflake so that you can make the right platform choice for your needs. After reading this book, you will be able to implement Delta Lake capabilities, including Schema Evolution, Change Feed, Live Tables, Sharing, and Clones to enable better business intelligence and advanced analytics on your data within the Azure Data Platform. What You Will LearnImplement the Data Lakehouse Paradigm on Microsoft¿s Azure cloud platform Benefit from the new Delta Lake open-source storage layer for data lakehouses Take advantage of schema evolution, change feeds, live tables, and more Writefunctional PySpark code for data lakehouse ELT jobs Optimize Apache Spark performance through partitioning, indexing, and other tuning options Choose between alternatives such as Databricks, Synapse Analytics, and Snowflake Who This Book Is For Data, analytics, and AI professionals at all levels, including data architect and data engineer practitioners. Also for data professionals seeking patterns of success by which to remain relevant as they learn to build scalable data lakehouses for their organizations and customers who are migrating into the modern Azure Data Platform.

Vis mere
  • Sprog:
  • Engelsk
  • ISBN:
  • 9781484282328
  • Indbinding:
  • Paperback
  • Sideantal:
  • 465
  • Udgivet:
  • 14. juli 2022
  • Udgave:
  • 1
  • Størrelse:
  • 255x176x28 mm.
  • Vægt:
  • 908 g.
  • 8-11 hverdage.
  • 18. december 2024
På lager
Forlænget returret til d. 31. januar 2025

Normalpris

Medlemspris

Prøv i 30 dage for 45 kr.
Herefter fra 79 kr./md. Ingen binding.

Beskrivelse af The Azure Data Lakehouse Toolkit

Design and implement a modern data lakehouse on the Azure Data Platform using Delta Lake, Apache Spark, Azure Databricks, Azure Synapse Analytics, and Snowflake. This book teaches you the intricate details of the Data Lakehouse Paradigm and how to efficiently design a cloud-based data lakehouse using highly performant and cutting-edge Apache Spark capabilities using Azure Databricks, Azure Synapse Analytics, and Snowflake. You will learn to write efficient PySpark code for batch and streaming ELT jobs on Azure. And you will follow along with practical, scenario-based examples showing how to apply the capabilities of Delta Lake and Apache Spark to optimize performance, and secure, share, and manage a high volume, high velocity, and high variety of data in your lakehouse with ease.
The patterns of success that you acquire from reading this book will help you hone your skills to build high-performing and scalable ACID-compliant lakehouses using flexible and cost-efficient decoupled storage and compute capabilities. Extensive coverage of Delta Lake ensures that you are aware of and can benefit from all that this new, open source storage layer can offer. In addition to the deep examples on Databricks in the book, there is coverage of alternative platforms such as Synapse Analytics and Snowflake so that you can make the right platform choice for your needs.
After reading this book, you will be able to implement Delta Lake capabilities, including Schema Evolution, Change Feed, Live Tables, Sharing, and Clones to enable better business intelligence and advanced analytics on your data within the Azure Data Platform.
What You Will LearnImplement the Data Lakehouse Paradigm on Microsoft¿s Azure cloud platform
Benefit from the new Delta Lake open-source storage layer for data lakehouses
Take advantage of schema evolution, change feeds, live tables, and more
Writefunctional PySpark code for data lakehouse ELT jobs
Optimize Apache Spark performance through partitioning, indexing, and other tuning options
Choose between alternatives such as Databricks, Synapse Analytics, and Snowflake
Who This Book Is For
Data, analytics, and AI professionals at all levels, including data architect and data engineer practitioners. Also for data professionals seeking patterns of success by which to remain relevant as they learn to build scalable data lakehouses for their organizations and customers who are migrating into the modern Azure Data Platform.

Brugerbedømmelser af The Azure Data Lakehouse Toolkit



Find lignende bøger
Bogen The Azure Data Lakehouse Toolkit findes i følgende kategorier:

Gør som tusindvis af andre bogelskere

Tilmeld dig nyhedsbrevet og få gode tilbud og inspiration til din næste læsning.