Ingest Kafka events to Pub/Sub and BigQuery

This shows an end-to-end solution for adopting an events-first approach: data is ingested from Kafka into Pub/Sub first, before being written to BigQuery, and the solution explains why this ordering is important. At the same time, it allows the "relayer" to be deployed close to where the Kafka cluster is located, which makes things such as network and firewall configuration a lot easier. And because the relayer is a rather lightweight service, it is simple enough to deploy and manage even in an on-prem ecosystem.

Further reading:
- The whole solution: https://github.com/rocketechgroup/kafka-pubsub-bigquery-example
- A very interesting read on how to manage the auto-commit behaviour in Kafka without using manual commits: https://github.com/confluentinc/confluent-kafka-python/issues/300
- Kafka Consumer docs: https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html
- Pub/Sub subscription to BigQuery Dataflow template: https://cloud.google.com/dataflow/docs/guides/templates/provided-streaming#pubsub-subscription-to-bigquery
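To make the relayer idea concrete, here is a minimal Python sketch of its core loop, not the repository's actual implementation: it consumes records with kafka-python and republishes them to Pub/Sub. The topic names, project id, bootstrap servers, and the `to_pubsub_message` helper are all made-up placeholders for illustration.

```python
from typing import Optional, Tuple, Dict


def to_pubsub_message(value: bytes, key: Optional[bytes]) -> Tuple[bytes, Dict[str, str]]:
    """Map a Kafka record onto Pub/Sub data plus attributes.

    The Kafka key is carried as a message attribute so downstream
    consumers (e.g. the Pub/Sub-to-BigQuery Dataflow template) can
    still see which key the event was partitioned on.
    """
    attributes: Dict[str, str] = {}
    if key is not None:
        attributes["kafka_key"] = key.decode("utf-8", errors="replace")
    return value, attributes


def run_relayer() -> None:
    # Lazy imports: kafka-python and google-cloud-pubsub are only
    # needed when the relayer actually runs against live services.
    from kafka import KafkaConsumer
    from google.cloud import pubsub_v1

    consumer = KafkaConsumer(
        "example-topic",                       # assumed Kafka topic name
        bootstrap_servers=["localhost:9092"],  # assumed on-prem brokers
        group_id="pubsub-relayer",             # assumed consumer group
        enable_auto_commit=True,               # rely on auto commit (see issue link above)
        auto_offset_reset="earliest",
    )
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("example-project", "example-topic")

    for record in consumer:
        data, attributes = to_pubsub_message(record.value, record.key)
        # publish() is asynchronous; the returned future resolves to
        # the Pub/Sub message id once the publish is acknowledged.
        publisher.publish(topic_path, data, **attributes)
```

Because the relayer only moves bytes from one broker to another, it needs no business logic and very little configuration, which is what makes it easy to run next to the Kafka cluster on-prem while BigQuery ingestion stays on the Google Cloud side.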