The data lakehouse is the next generation of the data warehouse and data lake, designed to meet today's complex and ever-changing analytics, machine learning, and data science requirements. Learn about the features and architecture of the data lakehouse, along with its powerful analytical infrastructure. Appreciate how the universal common connector blends structured, textual, analog, and IoT data. Maintain the lakehouse for future generations through Data Lakehouse Housekeeping and Data Future-proofing. Know how to incorporate the lakehouse into an existing data governance strategy. Incorporate data catalogs, data lineage tools, and open source software into your architecture to ensure your data scientists, analysts, and end users live happily ever after.
Master the most agile and resilient design for building analytics applications: the Unified Star Schema (USS) approach. The USS has many benefits over traditional dimensional modeling. Witness the power of the USS as a single star schema that serves as a foundation for all present and future business requirements of your organization. Data warehouse legend Bill Inmon and business intelligence innovator Francesco Puppini explain step by step why the Unified Star Schema is the recommended approach for business intelligence designs today, and show through many examples how to build and use this new solution.

This book contains two parts. Part I, Architecture, explains the benefits of data marts and data warehouses, covering how organizations progressed to their current state of analytics and the challenges that result from current business intelligence architectures. Chapter 1 covers the drivers behind and the characteristics of the data warehouse and data mart. Chapter 2 introduces dimensional modeling concepts, including fact tables, dimensions, star joins, and snowflakes. Chapter 3 recalls the evolution of the data mart. Chapter 4 explains Extract, Transform, and Load (ETL) and the value ETL brings to reporting. Chapter 5 explores the Integrated Data Mart Approach, and Chapter 6 explains how to monitor this environment. Chapter 7 describes the different types of metadata within the data warehouse environment. Chapter 8 progresses through the evolution to our current modern data warehouse environment.

Part II, the Unified Star Schema, covers the Unified Star Schema (USS) approach and how it solves the challenges introduced in Part I. There are eight chapters within Part II:

Chapter 9, Introduction to the Unified Star Schema: Learn about its architecture and use cases, as well as how the USS approach differs from the traditional approach.
Chapter 10, Loss of Data: Learn about the loss of data and the USS Bridge. Understand that the USS approach does not create any join, and for this reason it has no loss of data.
Chapter 11, The Fan Trap: Get introduced to the Oriented Data Model convention, and learn the dangers of a fan trap through an example (a small sketch of this trap follows the chapter list). Differentiate join and association, and realize that an "in-memory association" is the preferred solution to the fan trap.
Chapter 12, The Chasm Trap: Become familiar with the Cartesian product, and then follow along with an example based on LinkedIn, which illustrates that a chasm trap produces unwanted duplicates. See that the USS Bridge is based on a union, which does not create any duplicates.
Chapter 13, Multi-Fact Queries: Distinguish between multiple facts "with direct connection" and multiple facts "with no direct connection". Explore how BI tools are capable of building aggregated virtual rows.
Chapter 14, Loops: Learn more about loops and five traditional techniques to solve them. Follow along with an implementation that illustrates the solution based on the USS approach.
Chapter 15, Non-Conformed Granularities: Learn about non-conformed granularities, and learn that the Unified Star Schema introduces a solution called "re-normalization".
Chapter 16, Northwind Case Study: Witness how easy it is to detect the pitfalls of Northwind using the ODM convention. Follow along with an implementation of the USS approach on the Northwind database with various BI tools.
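To make the fan trap and the union idea concrete, here is a minimal sketch in plain Python. It is not from the book: the tables, column names, and numbers are invented for illustration. Joining a parent fact (orders) to a child fact (shipments) repeats each order amount once per shipment and inflates the total, whereas stacking the two tables union-style, as the USS Bridge does, keeps each measure exactly once.

```python
# Hypothetical data: one order with two shipments (the classic fan trap setup).
orders = [{"order_id": 1, "amount": 100.0}]
shipments = [
    {"order_id": 1, "ship_id": "A", "qty": 3},
    {"order_id": 1, "ship_id": "B", "qty": 7},
]

# Join approach: each order row is repeated once per matching shipment,
# so the order amount is duplicated and the aggregate is inflated.
joined = [
    {**o, **s}
    for o in orders
    for s in shipments
    if o["order_id"] == s["order_id"]
]
print(sum(row["amount"] for row in joined))  # 200.0 -- wrong, amount counted twice

# Union-style bridge: stack both tables into one, keeping shared keys and
# leaving the other table's measures empty. No row is ever repeated.
bridge = (
    [{"order_id": o["order_id"], "amount": o["amount"], "qty": None} for o in orders]
    + [{"order_id": s["order_id"], "amount": None, "qty": s["qty"]} for s in shipments]
)
print(sum(row["amount"] or 0 for row in bridge))  # 100.0 -- correct
print(sum(row["qty"] or 0 for row in bridge))     # 10   -- also correct
```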
The data lakehouse is the next generation of the data warehouse and data lake, designed to meet the demands of today's complex and ever-changing modern information systems. This book shows you how to construct your data lakehouse as the foundation for your artificial intelligence (AI), machine learning (ML), and data mesh initiatives. Know the pitfalls and techniques for maximizing the business value of your data lakehouse. In addition, be able to explain the core characteristics and critical success factors of a data lakehouse. By reviewing entry errors and key incompatibility, and by ensuring good documentation, you can improve the data quality and believability of your lakehouse. Evaluate criteria for data quality, including accuracy, completeness, reliability, relevance, and timeliness. Understand the different types of storage for the lakehouse, including the under-utilized yet extremely valuable bulk storage. There are three data types in the data lakehouse (structured, textual, and analog/IoT), and for each, learn how to build a robust foundation for artificial intelligence (AI), machine learning (ML), and data mesh. Leverage data models for structured data, ontologies and taxonomies for textual data, and distillation algorithms for analog/IoT data. Learn how to abstract these data types to accommodate future requirements and simplify data lineage. Apply Extract, Transform, and Load (ETL) to create a structure that returns the answers to business problems (a small ETL sketch follows this description). The end result is a data lakehouse that meets our needs. Speaking of human needs, learn Maslow's Hierarchy of Data Lakehouse Needs. Next, explore data integration geared for AI, ML, and data mesh. Then deep dive with us into all of the varieties of analytics within the lakehouse, including structured, textual, and analog analytics. Witness how descriptive data, data catalogs, and metadata can increase the value of the lakehouse. We conclude with a detailed evolution of data architecture, from magnetic tape to the data lakehouse as a bedrock foundation for AI, ML, and data mesh.
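The ETL sketch below illustrates the general pattern only; the source records, field names, and quality rules are invented, not taken from the book. It shows the three stages in miniature: extract raw records, transform them by screening out entry errors and standardizing keys, and load the result into a simple keyed store.

```python
# A minimal ETL sketch (illustration only; all names and rules are hypothetical).

raw_records = [
    {"customer": " Acme Corp ", "revenue": "1200", "region": "EMEA"},
    {"customer": "Beta LLC", "revenue": "", "region": "emea"},  # entry error: missing revenue
]

def extract(records):
    # In practice this stage would read from files, APIs, or operational databases.
    return list(records)

def transform(records):
    cleaned = []
    for r in records:
        if not r["revenue"]:              # screen out entry errors
            continue
        cleaned.append({
            "customer": r["customer"].strip(),
            "revenue": float(r["revenue"]),
            "region": r["region"].upper(),  # standardize for key compatibility
        })
    return cleaned

def load(records, warehouse):
    # The "warehouse" here is just a dict keyed by customer name.
    for r in records:
        warehouse[r["customer"]] = r
    return warehouse

warehouse = load(transform(extract(raw_records)), {})
print(warehouse)
# {'Acme Corp': {'customer': 'Acme Corp', 'revenue': 1200.0, 'region': 'EMEA'}}
```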
The data lakehouse is the next generation of the data warehouse and data lake, designed to meet today's complex and ever-changing analytics, machine learning, and data science requirements. Learn about the features and architecture of the data lakehouse, along with its powerful analytical infrastructure. Appreciate how the universal common connector blends structured, textual, analog, and IoT data. Maintain the lakehouse for future generations through Data Lakehouse Housekeeping and Data Future-proofing. Incorporate data catalogs, data lineage tools, and open source software into your architecture to ensure your data scientists, analysts, and end users live happily ever after. Deep dive into one specific implementation of a data lakehouse: the Databricks Lakehouse Platform.
"Learn why the textual warehouse is valuable today, how to build one, and how it compares to the data warehouse"--
Increase the awareness of your customer's behavior to survive and excel within your industry. One hundred years ago, the voice of the customer was easily and routinely heard by the shopkeeper. In small towns, the shopkeeper knew everyone. Today's world has gotten much bigger and much more complex. No longer does the store owner personally know everyone who comes into the store. Yet there are three important abilities technologies offer that make it possible to listen to the voice of the customer today:

- The ability to acquire, store, and manage huge amounts of data
- The ability to read and understand text in a computerized environment
- The ability to visualize data

This book answers important questions such as:

- Where is the voice of the customer heard?
- How does the corporation find and capture the voice of the customer?
- How is the voice of the customer actually interpreted and understood?
- How do you cope with the volume of messages the customer is sending you?
- How do you separate noise from the important messages?
- How do you analyze the composite voice of the customer over thousands of customers?
- How do you reduce the voice of the customer to a visual format that is understood by management?
- How do you know when the message the customer is sending changes?

After reading this book, the reader will be able to manage, build, and operate a corporate infrastructure that listens to the voice of the customer.
In our distant past, we attempted to create wealth by turning everyday substances into gold. This was early alchemy, and ultimately it did not work. But the world has changed. Today we have a type of modern alchemy that really can create gold: we can transform voluminous text into a wealth of knowledge. Text is a common fabric of society, yet it is still challenging for our technology to make sense of text. This is where taxonomies can help. In this book, the legendary Bill Inmon will introduce you to the concept of taxonomies and how they are used to simplify and understand text. We emphasise the practical aspects of taxonomies and their subsequent usage as a basis for textual analytics (a small illustration of this idea follows below). This book is for managers who have to deal with text, students of computer science, programmers who need to understand taxonomies, systems analysts who hope to draw business value out of a body of text, and especially those who are struggling to decode data lakes. Hopefully for those individuals (and many more), this book will serve as both an introduction to taxonomies and a guide to how taxonomies can be used to bring text into the realm of corporate decision-making. This book will introduce you to the world of taxonomies, as well as explore: simple and complex taxonomies; ontologies; obtaining taxonomies; changing taxonomies; taxonomies and data models; types of textual data; and textual analytics. In addition, several case studies are presented from industries as diverse as banking, call centres, and travel.
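As a rough illustration of how a taxonomy brings raw text into analyzable categories — a sketch of the general idea only, not the book's method; the taxonomy, parent categories, and sample sentence are all invented:

```python
# Hypothetical two-level taxonomy: a word maps to a category, and each
# category rolls up to a parent. All entries here are invented examples.
taxonomy = {
    "mortgage": "loan", "overdraft": "account", "teller": "branch",
}
parents = {"loan": "banking", "account": "banking", "branch": "banking"}

def classify(text):
    """Tag each recognized word with its category and parent category."""
    hits = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in taxonomy:
            category = taxonomy[word]
            hits.append((word, category, parents[category]))
    return hits

print(classify("Customer asked the teller about a mortgage."))
# [('teller', 'branch', 'banking'), ('mortgage', 'loan', 'banking')]
```

Once free-form text is tagged this way, it can be counted, grouped, and trended like any other structured data, which is what makes taxonomies a basis for textual analytics.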
For years, business users have leveraged spreadsheets for storing and communicating data. Although spreadsheets may be easy to create and update, making important corporate decisions based on spreadsheets is risky due to the lack of data credibility. Whether you are a manager, developer, end user, or student, this book will help you turn spreadsheet data into credible, useful, reliable data that can be trusted in order to make important decisions. A chapter is dedicated to each of the following topics:

- Brief history of spreadsheets
- Spreadsheet paradox
- Spreadsheet varieties
- The PDF spreadsheet
- Spreadsheet formatting
- Spreadsheet disambiguation
- The intermediate database
- The ssdef database
- The corporate database
- The metadata database (mnemonic database)
- Political considerations
- Data modeling and the spreadsheet
- Case study
Organizations invest incredible amounts of time and money obtaining and then storing big data in data stores called data lakes. But how many of these organizations can actually get the data back out in a usable form? Very few can turn the data lake into an information gold mine. Most wind up with garbage dumps. Data Lake Architecture will explain how to build a useful data lake, where data scientists and data analysts can solve business challenges and identify new business opportunities. Learn how to structure data lakes as well as analog, application, and text-based data ponds to provide maximum business value. Understand the role of the raw data pond and when to use an archival data pond. Leverage the four key ingredients for data lake success: metadata, integration mapping, context, and metaprocess. Bill Inmon opened our eyes to the architecture and benefits of a data warehouse, and now he takes us to the next level of data lake architecture.