Making Kafka Cloud-Native: Object Storage Instead of Disks (with Filip Yonov & Josep Prat)

What if Apache Kafka could use cheap S3 storage instead of expensive disks? This architectural shift is worth hundreds of millions, and it's coming to open source.

This week we look at a project that's pushing for a Kafka that uses object storage services like S3 as its main disk, sacrificing a little latency for cheap, infinitely scalable storage. There are several companies trying to walk down that road, and it’s clearly big business: one of them recently got bought out for a rumoured $250m. But one of them is actively trying to get those changes back into the community, pushing to make Apache Kafka speak object storage natively.

Joining me to explain why and how are Josep Prat and Filip Yonov of Aiven. We break down what it takes to make Kafka’s storage layer optional on a per-topic basis, how they’re making sure it’s not a breaking change, and how they plan to get such a foundational feature merged.

Thanks to Aiven for sponsoring this episode.

Diskless Kafka Overview: https://fnf.dev/45fRuFh
Announcement Post: https://aiven.io/blog/guide-diskless-apache-kafka-kip-1150
Aiven’s (Temporary) Fork, Project Inkless: https://github.com/aiven/inkless/blob/main/docs/inkless/README.md

Kafka Improvement Proposal (KIP) Articles:
KIP-1150: Diskless Topics: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
KIP-1163: Diskless Core: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core
KIP-1164: Topic Based Batch Coordinator: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator
KIP-1165: Object Compaction for Diskless: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1165%3A+Object+Compaction+for+Diskless

Support Developer Voices on Patreon: https://patreon.com/DeveloperVoices
Support Developer Voices on YouTube: https://www.youtube.com/@DeveloperVoices/join

Filip on LinkedIn: https://www.linkedin.com/in/filipyonov
Josep on LinkedIn: https://www.linkedin.com/in/jlprat/

Kris on Bluesky: https://bsky.app/profile/krisajenkins.bsky.social
Kris on Mastodon: http://mastodon.social/@krisajenkins
Kris on LinkedIn: https://www.linkedin.com/in/krisjenkins/

0:00 Intro
3:14 The Problem: Why Kafka's Current Storage is Expensive
9:46 Serverless, Diskless and Naming Things
11:53 Why Make A Competitive Advantage Open Source?
17:15 If Kafka Were Started Today, Would It Be Designed Cloud-First?
20:35 The Solution: Retrofitting Object Storage Into Kafka
26:57 Durability in the Face of Errors
34:37 What About Latency?
42:23 Performance Characteristics with Bursty Traffic
46:47 Transaction Support
52:34 How Transparent Is It To The Client?
58:00 How Does This Change Reads?
1:03:04 How Do You Handle The Metadata Layer?
1:09:04 What's The State Of The Open Source Merge?
1:25:30 How Can Users Get Involved?
1:28:18 Outro