Download Early Release Chapters
Apache Hudi™: The Definitive Guide
Whether you've been using Hudi for years or you're new to its capabilities, this guide will help you build robust, open, and high-performing data lakehouses.

Apache Hudi™ enables you to create and manage a data lake with database-like capabilities, including efficient upserts, deletions, and incremental data processing. It is a revolutionary open source framework that transforms the way data engineers and data scientists interact with large-scale datasets.
With this practical guide, data engineers, data architects, and software architects will discover how to overcome challenges in building transactional guarantees on rapidly changing data using Apache Hudi. Download and read this e-book to learn how to seamlessly build an interoperable lakehouse from disparate data sources and deliver faster insights using your query engine of choice, with practical examples covering analytics from batch to interactive to streaming.
Complete Table of Contents
- What is Apache Hudi?
- Getting Started with Hudi
- Write to Hudi
- Read from Hudi
- Achieve Efficiency with Indexing (not yet available)
- Maintain and Optimize Hudi Tables
- Handle Concurrent Operations
- Building a Lakehouse Using Hudi Streamer
- Running Hudi in Production (not yet available)
- Building an End-to-End Lakehouse Solution (not yet available)