Native Support of Prometheus Monitoring in Apache Spark 3.0
Every production environment requires monitoring and alerting. Apache Spark has a configurable metrics system that lets users report Spark metrics to a variety of sinks. Prometheus is one of the popular open-source monitoring and alerting toolkits used together with Apache Spark. Previously, users could
1. use a combination of the Prometheus JMX exporter and the Apache Spark JMXSink,
2. use 3rd party libraries, or
3. implement a custom Sink for more complex metrics like GPU resource usage.
Apache Spark 3.0.0 will add another easy way to support Prometheus for general use cases. In this talk, we will cover the following topics and show a demo.
1. How to enable new Prometheus features.
2. What kind of metrics are available.
3. General tips for monitoring and alerting on structured streaming jobs. (Spark side / Prometheus side)
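As a rough illustration of the first point, the sketch below collects the settings that are generally used to switch on the Prometheus endpoints and renders them as spark-submit arguments. The helper names (prometheus_spark_conf, to_submit_args) are hypothetical, and the configuration keys are taken from the Spark 3.0 configuration and metrics template as best understood here, not from this abstract, so check them against the release you run.

```python
import shlex


def prometheus_spark_conf(streaming: bool = False) -> dict:
    """Return Spark settings assumed to enable Prometheus output in Spark 3.0."""
    conf = {
        # Assumed: exposes executor metrics on the driver UI (experimental in 3.0).
        "spark.ui.prometheus.enabled": "true",
        # Assumed: registers the PrometheusServlet sink for every component ("*").
        "spark.metrics.conf.*.sink.prometheusServlet.class":
            "org.apache.spark.metrics.sink.PrometheusServlet",
        "spark.metrics.conf.*.sink.prometheusServlet.path":
            "/metrics/prometheus",
    }
    if streaming:
        # Assumed: reports Structured Streaming query metrics through the metrics system.
        conf["spark.sql.streaming.metricsEnabled"] = "true"
    return conf


def to_submit_args(conf: dict) -> list:
    """Flatten a settings dict into a list of --conf key=value arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # shlex.join quotes the "*" so the shell does not expand it.
    print(shlex.join(to_submit_args(prometheus_spark_conf(streaming=True))))
```

The same keys, without the spark.metrics.conf. prefix, can instead be placed in a metrics.properties file if that fits the deployment better.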
Currently, Apache Spark exposes metrics at the Master, Worker, Driver, and Executor level, so it can be integrated with an existing Prometheus server with little effort. This is already available in Apache Spark 3.0.0-preview and preview2, so you can try it right now.
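To make the per-component exposure concrete, here is a minimal sketch that builds the URLs a Prometheus server might scrape. The ports are Spark's usual web UI defaults and the paths follow the PrometheusServlet defaults as best understood here; both are assumptions rather than details from this abstract, and the helper name scrape_urls is hypothetical.

```python
from urllib.parse import urlunsplit

# Assumed default web UI port and Prometheus paths per component.
# Executor metrics are assumed to be served by the driver UI.
ENDPOINTS = {
    "master": (8080, ["/metrics/master/prometheus",
                      "/metrics/applications/prometheus"]),
    "worker": (8081, ["/metrics/prometheus"]),
    "driver": (4040, ["/metrics/prometheus",
                      "/metrics/executors/prometheus"]),
}


def scrape_urls(component: str, host: str, port: int = None) -> list:
    """Build the candidate scrape URLs for one Spark component."""
    default_port, paths = ENDPOINTS[component]
    netloc = f"{host}:{port or default_port}"
    return [urlunsplit(("http", netloc, path, "", "")) for path in paths]


if __name__ == "__main__":
    for component in ENDPOINTS:
        for url in scrape_urls(component, "localhost"):
            print(component, url)
```

Each URL would typically become one target (with its metrics path) in the Prometheus scrape configuration.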