Master Reading Spark Query Plans

Spark Performance Tuning

Dive deep into Apache Spark query plans to better understand how Spark operates under the hood. We'll cover how Spark creates logical and physical plans, and the role of the Catalyst Optimizer in applying optimization techniques such as filter (predicate) pushdown and projection pushdown.

The video covers intermediate Apache Spark concepts in depth, with detailed explanations of how to read the Spark UI and how to understand Spark's query plans through code snippets of various narrow and wide transformations: reading files, select, filter, join, group by, repartition, coalesce, hash partitioning, HashAggregate, round-robin partitioning, range partitioning, and sort-merge join. Understanding these will give you a grasp of Spark's step-by-step reasoning and help you identify performance issues and possible optimizations.

📄 Complete code on GitHub: https://github.com/afaqueahmad7117/spark-experiments/blob/main/spark/2_reading_query_plans.ipynb
🎥 Full Spark Performance Tuning playlist: https://www.youtube.com/playlist?list=PLWAuYt0wgRcLCtWzUxNg4BjnYlCZNEVth
🔗 LinkedIn: https://www.linkedin.com/in/afaque-ahmad-5a5847129

Chapters:
00:00 Introduction
01:30 How Spark generates logical and physical plans
04:46 Narrow transformations (filter, select, add or update columns) query plan explanation
09:02 Repartition query plan explanation
12:57 Coalesce query plan explanation
17:32 Joins query plan explanation
23:23 Group by count query plan explanation
27:04 Group by sum query plan explanation
28:05 Group by count distinct query plan explanation
33:59 Interesting observations on Spark's query plans
36:56 When will predicate pushdown not work?
39:07 Thank you

#ApacheSpark #SparkPerformanceTuning #DataEngineering #SparkDAG #SparkOptimization #dataengineering #interviewquestions #azuredataengineer