Apache Iceberg Architecture Overview - 101 Course #4

This course is an overview of Apache Iceberg's architecture. Apache Iceberg is an open table format for data lakehouses that allows efficient storage of, and access to, large amounts of data. In this course, you'll learn about the components that make up that architecture: catalogs, metadata files, manifest lists, and manifests.

The catalog is where a query engine starts when it looks up a table. It maps each table name to the location of that table's current metadata file.

Metadata files store the metadata for each table. They are JSON files that hold the table's schema (column names and types), its partitioning information, its table properties, when the table was last updated, and the list of snapshots that record the table's state over time.

Manifest lists track which manifests belong to each snapshot of a table. They are Avro files, one per snapshot, and each entry points to a manifest along with summary information, such as partition value ranges, that lets an engine skip manifests a query does not need.

Finally, manifests track the data files themselves. They are also Avro files. Each entry records a data file's path, whether the file was added or deleted, and per-file statistics that can be used to optimize query performance. Metadata files, manifest lists, and manifests are stored alongside the table's data, for example in HDFS or in object storage. Because every snapshot keeps its own manifest list, users can review the state of a table as of any earlier snapshot.

If you're looking for more great content on data lakehouses and Apache Iceberg, be sure to check out Dremio's Subsurface page for tutorials, blogs, webinars and more!

Connect with us!
Twitter: https://bit.ly/30pcpE1
LinkedIn: https://bit.ly/2PoqsDq
Facebook: https://bit.ly/2BV881V
Community Forum: https://bit.ly/2ELXT0W
Github: https://bit.ly/3go4dcM
Blog: https://bit.ly/2DgyR9B
Questions?: https://bit.ly/30oi8tX
Website: https://bit.ly/2XmtEnN
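
To make the layers named in this overview (catalog, metadata file, manifest list, manifest) a little more concrete, here is a minimal Python sketch of how a reader might walk from a table name down to its data files. It is a toy in-memory model, not the Iceberg library or its API: every class, function, field name, and file path below is hypothetical and chosen only for illustration, and real Iceberg metadata carries far more detail.

```python
from dataclasses import dataclass, field


@dataclass
class DataFileEntry:
    """One manifest entry: a data file and its status (hypothetical model)."""
    path: str
    status: str  # "ADDED", "EXISTING", or "DELETED"


@dataclass
class Manifest:
    """Tracks a subset of a table's data files."""
    path: str
    entries: list[DataFileEntry] = field(default_factory=list)


@dataclass
class ManifestList:
    """Lists the manifests that make up one snapshot."""
    path: str
    manifests: list[Manifest] = field(default_factory=list)


@dataclass
class TableMetadata:
    """Stands in for a table's metadata file: schema plus snapshots."""
    schema: dict[str, str]
    current_snapshot_id: int
    snapshots: dict[int, ManifestList] = field(default_factory=dict)


class Catalog:
    """Maps a table name to its current metadata (toy, in-memory)."""

    def __init__(self) -> None:
        self._tables: dict[str, TableMetadata] = {}

    def register(self, name: str, metadata: TableMetadata) -> None:
        self._tables[name] = metadata

    def load(self, name: str) -> TableMetadata:
        return self._tables[name]


def live_data_files(metadata: TableMetadata, snapshot_id=None) -> list[str]:
    """Walk manifest list -> manifests -> entries, skipping deleted files."""
    sid = metadata.current_snapshot_id if snapshot_id is None else snapshot_id
    manifest_list = metadata.snapshots[sid]
    return [
        entry.path
        for manifest in manifest_list.manifests
        for entry in manifest.entries
        if entry.status != "DELETED"
    ]


# Snapshot 1: two files are added.
m1 = Manifest("meta/m1", [DataFileEntry("data/a.parquet", "ADDED"),
                          DataFileEntry("data/b.parquet", "ADDED")])
# Snapshot 2: a.parquet is deleted, b.parquet is kept, c.parquet is added.
m2 = Manifest("meta/m2", [DataFileEntry("data/a.parquet", "DELETED"),
                          DataFileEntry("data/b.parquet", "EXISTING")])
m3 = Manifest("meta/m3", [DataFileEntry("data/c.parquet", "ADDED")])

catalog = Catalog()
catalog.register("db.events", TableMetadata(
    schema={"id": "long", "ts": "timestamp"},
    current_snapshot_id=2,
    snapshots={1: ManifestList("meta/snap-1", [m1]),
               2: ManifestList("meta/snap-2", [m2, m3])},
))

table = catalog.load("db.events")
print(live_data_files(table))                  # files in the current snapshot
print(live_data_files(table, snapshot_id=1))   # files as of the older snapshot
```

Passing an older `snapshot_id` reads the table as it was at that point, which is the idea behind reviewing changes over time: each snapshot keeps its own manifest list, so nothing has to be recomputed to look back.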