PySpark Tutorial | Apache Spark Full Course | Spark Tutorial for beginners | PySpark Training Full Course
The only training that covers basic to advanced Spark, including the Spark UI, with live examples. Here is what it covers in depth over the next 6 hrs 45 min:
Chapters:
00:00:00 - What Are We Going to Cover?
00:00:25 - Introduction
00:01:10 - What is Spark?
00:02:29 - How Spark Works - Driver & Executors
00:06:04 - Spark Transformations & Actions
00:10:31 - Spark DataFrames & Execution Plans
00:13:33 - Understand Spark Session
00:21:28 - Write Spark DataFrame Schema
00:32:13 - Cast Column | Add Column | Static Column Value | Rename
00:42:16 - Working with Strings, Dates and Null
00:55:38 - Sorting data, Union and Aggregation in Spark
01:03:18 - Window Functions, Unique Data & Databricks Community Cloud
01:12:33 - Data Repartitioning & PySpark Joins | Coalesce vs Repartition
01:23:20 - Understand Spark UI, Read CSV Files and Read Modes
01:38:28 - Read Complex File Formats | Parquet | ORC
01:47:44 - Read, Parse or Flatten JSON data
02:03:40 - How Spark Writes data | Write modes in Spark
02:17:20 - Understand Spark Execution on Cluster
02:29:27 - User Defined Function (UDF)
02:38:45 - Understand DAG, Explain Plans & Spark Shuffle with Tasks
02:55:18 - Understand and Optimize Shuffle in Spark
03:10:20 - Data Caching in Spark | Cache vs Persist
03:23:23 - Broadcast Variable and Accumulators in Spark
03:35:43 - Optimize Joins in Spark & Understand Bucketing for Faster joins
04:03:35 - Static vs Dynamic Resource Allocation in Spark
04:13:48 - Fix Skewness and Spillage with Salting in Spark
04:34:51 - AQE aka Adaptive Query Execution in Spark
04:46:12 - Spark SQL, Hints, Spark Catalog and Metastore
05:05:20 - Read and Write from Azure Cosmos DB using Spark
05:26:17 - Get Started with Delta Lake using Databricks
06:06:06 - Optimize Data Scanning with Partitioning in Spark
06:13:17 - Data Skipping and Z-Ordering in Delta Lake Tables
06:31:45 - Delta Tables - Deletion Vectors and Liquid Clustering
Original playlist has more than 250k views: {https://www.youtube.com/playlist?list=PL2IsFZBGM_IHCl9zhRVC1EXTomkEp_1zm}
GitHub link with all notebooks: https://github.com/subhamkharwal/pyspark-zero-to-hero
Other popular playlists on our channel, Ease With Data:
Databricks Zero to Hero: {https://www.youtube.com/playlist?list=PL2IsFZBGM_IGiAvVZWAEKX8gg1ItnxEEb}
Spark Streaming with PySpark: {https://www.youtube.com/playlist?list=PL2IsFZBGM_IEtp2fF5xxZCS9CYBSHV2WW}
Follow me on LinkedIn: https://www.linkedin.com/in/subhamkharwal
Follow Ease With Data YouTube Channel: @easewithdata
Make sure to Like and Subscribe 💓
#pyspark #apachespark #spark #dataengineering