Iceberg, Multi-Engine Data Stack and Catalog Hell

865 listens
Are you struggling with Apache Iceberg catalog complexity? Join us for this deep-dive conversation with Julien, a freelance data engineer with 10+ years of experience, as he and @mehdio explore the current state of table formats, catalog management, and the future of data lakehouse architecture.

📓 Resources:
Boring Catalog GitHub: https://github.com/boringdata/boring-catalog
Boring Data Templates: https://boringdata.io
Julien's Newsletter: https://juhache.substack.com/
DuckLake: https://ducklake.select/

🎥 Related Videos:
DuckLake Deep Dive: https://www.youtube.com/watch?v=hrTjvvwhHEQ&t=5s
Upcoming DuckLake Webinar: motherduck.com/events

00:00 - Introduction & Julien's Background
01:50 - The Data Engineering Career Evolution
02:56 - Boring Data: Solving the Cold Start Problem
05:40 - SaaS vs Self-Hosted Data Stack Trade-offs
10:55 - Iceberg in the Modern Data Stack
12:50 - Performance Challenges: Internal vs Open Formats
15:15 - Multi-Engine Architecture Strategy
17:20 - What's Missing in Iceberg Adoption
18:07 - Catalog Landscape Overview (Polaris, Glue, Unity)
19:26 - S3 Tables & Cloudflare R2 Innovation
23:48 - DuckLake: The New Table Format
26:06 - Advice for Getting Started with Table Formats
29:16 - Abstraction vs Technical Knowledge Trade-offs
30:38 - Live Demo: Boring Catalog
32:26 - Installing & Initializing Boring Catalog
35:06 - Creating Your First Iceberg Table
37:16 - Understanding Iceberg Metadata Structure
39:51 - Time Travel & Snapshot Management
42:06 - DuckDB Integration Demo
44:06 - Future Wishlist: Iceberg + DuckLake
45:26 - Wrap-up & Resources

➡️ Follow Us
LinkedIn: https://linkedin.com/company/motherduck
X/Twitter: https://twitter.com/motherduck
Blog: https://motherduck.com/blog/

#ApacheIceberg #DataEngineering #DataLakehouse #DuckLake #Snowflake #DataArchitecture #duckdb