This video explains how to set up PySpark on Docker and how to set up PySpark with Jupyter Lab on Docker. We are going to use AWS Cloud services to design the Data Lakehouse, and the processing power of Spark to load and process the data.
GitHub Docker Images - https://github.com/subhamkharwal/docker-images
If you are new to Data Warehousing, check out our playlist on YouTube -
https://www.youtube.com/watch?v=HrFMKGGb1gM&list=PL2IsFZBGM_IE-EvpN9gaZZukj-ysFudag
A new video is uploaded every 3 days. Stay tuned, and make sure to Like and Subscribe.
Chapters:
00:00 - Introduction & System requirements
00:18 - Prerequisites
00:55 - Docker Images GitHub
01:03 - Docker Setup for PySpark with Jupyter Lab
02:20 - Jupyter Lab Login setup
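For readers who want a rough idea of the Docker step before watching, here is a minimal sketch in Python that only builds (and prints) a `docker run` command for a PySpark + Jupyter Lab container. It is not the exact setup from the video: the image name `jupyter/pyspark-notebook` (the public Jupyter Docker Stacks image), the container name `pyspark-lab`, and the helper `build_docker_run` are all assumptions for illustration. The images in the GitHub repository above may use different names and ports.

```python
import shlex


def build_docker_run(image="jupyter/pyspark-notebook", name="pyspark-lab", host_port=8888):
    """Return the argv list for starting a detached Jupyter Lab container.

    All defaults are hypothetical; swap in the image you actually built or pulled.
    Jupyter Lab is assumed to listen on port 8888 inside the container.
    """
    return [
        "docker", "run", "-d",        # run in the background
        "--name", name,               # container name (assumption)
        "-p", f"{host_port}:8888",    # map host port to Jupyter's port
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of executing it, so nothing runs by accident.
    print(shlex.join(build_docker_run()))
```

With the assumed image, running the printed command and then `docker logs pyspark-lab` should show the Jupyter Lab login URL with its token, which is the piece needed for the login step.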
#datawarehousing #datalake #datalakehouse #dw