Schema-aware PCollections and Beam SQL (Beam Summit Europe 2019)

Apache Beam doesn't have any knowledge of the actual structure of the records in a PCollection, and little understanding of PTransforms. In practice, most PCollections are schematized: Avro records, BigQuery rows, and even POJOs and case classes. Many operations are performed on structured records: filtering by field, grouping by a specific field, and so on. In this talk, we are going to learn about schema-aware PCollections and Beam SQL, see how we can leverage them, and see how they work with Scio, the Scala DSL for Apache Beam.

Speaker: Gleb Kanterov, Staff Engineer @ Spotify

The Beam Summit Europe 2019 was a two-day event held in Berlin at the KulturBrauerei, focused entirely on Apache Beam. For more information about the Beam Summit, follow us on Twitter @BeamSummit or go to the website: https://beamsummit.org/
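The field-level operations named in the abstract (filtering by field, grouping by a specific field) can be illustrated with a small plain-Python sketch. This is not Beam's or Scio's API and is not taken from the talk. The `Play` record type, its fields, and the helpers `filter_by_field` and `sum_by_field` are hypothetical names chosen for illustration. The sketch only shows what becomes possible once the field names and types of a record are known.

```python
from collections import defaultdict
from typing import NamedTuple


class Play(NamedTuple):
    # Hypothetical record type; its field names and types act as the "schema".
    user: str
    country: str
    seconds: int


def filter_by_field(records, field, predicate):
    """Keep the records whose named field satisfies the predicate."""
    return [r for r in records if predicate(getattr(r, field))]


def sum_by_field(records, key_field, value_field):
    """Group records by one named field and sum another named field."""
    totals = defaultdict(int)
    for r in records:
        totals[getattr(r, key_field)] += getattr(r, value_field)
    return dict(totals)


# Made-up sample data, for illustration only.
plays = [
    Play("ann", "SE", 120),
    Play("bob", "DE", 45),
    Play("cat", "SE", 200),
    Play("dan", "DE", 10),
]

# Filter by one field, then group by another and aggregate a third.
long_plays = filter_by_field(plays, "seconds", lambda s: s >= 30)
totals = sum_by_field(long_plays, "country", "seconds")
```

Because both helpers refer to fields by name, the same pipeline could in principle be written as a declarative query, roughly `SELECT country, SUM(seconds) FROM plays WHERE seconds >= 30 GROUP BY country`, which is the kind of statement a SQL layer over schematized collections is meant to accept.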