This course covers the complete set of Big Data Engineering topics.
Part 1 - https://youtu.be/Tyg1FVNq40g
Part 2 - https://youtu.be/k1LaWFNOa68
Resources
========
Hadoop Installation Steps - https://github.com/atozknowledge/bigdata/wiki/Hadoop-Single-Node-Installation
Hadoop Multi Node Cluster Setup Installation Steps - https://bit.ly/3LRwgRi
Big Data Integration Book - https://bit.ly/3ipIlBx
Hive
Hive-site.xml - https://github.com/Gowthamsb12/hive/blob/main/hive-site.xml
Hive ACID commands - https://bit.ly/2V9W1qT
Apache Hive ORC vs TextFile Format - https://bit.ly/3cbIbNl
Hive UDF Code - https://codewithgowtham.blogspot.com/2021/09/hive-udf.html
Spark
Spark Submit Cluster [YARN] Mode Code link - https://github.com/atozknowledge/bigdata
Spark Kafka Cassandra | End to End Streaming Project Code and Steps - https://bit.ly/3LqXXRC
Kafka Installation Video - https://youtu.be/XCOIp-CqGkg
Sqoop Commands - https://codewithgowtham.blogspot.com/2021/03/sqoop-commands.html
Course Outline
00:00 About Instructor
00:48 All about [What is Big Data]
37:52 Big Data Engineering Road Map
54:45 Hadoop Distributed File System (HDFS)
02:16:39 Unboxing [Hadoop Framework]
02:38:06 Hadoop Single Node Setup
03:14:24 HDFS Quotas Use Cases
03:26:57 MapReduce Complete Video
05:30:33 Apache Hive Introduction & Architecture
05:47:23 Installation [Apache Hive 2 with MySQL]
06:00:50 Hive SQL [Create Load Insert Show]
06:19:05 Hive Internal Vs External Table
06:25:53 Hive Partition [Static Vs Dynamic]
06:43:47 Hive Bucket End to End Explained
07:03:09 How to Decide [Bucket Count] in Hive
07:14:09 Hive Partition with Bucket Explained
07:20:42 Hive ORC File Format with Demo
07:28:18 Hive ACID Table
07:35:09 Hive UDF
07:47:39 Apache Spark Introduction
08:35:18 Apache Spark Installation
08:52:36 Apache Spark Scala Wordcount Program (REPL)
09:05:51 Spark Standalone Architecture
𝐒𝐨𝐜𝐢𝐚𝐥𝐬
🎥𝐘𝐨𝐮𝐓𝐮𝐛𝐞 - https://www.youtube.com/@thedatatech
📸𝐈𝐧𝐬𝐭𝐚𝐠𝐫𝐚𝐦 - https://instagram.com/thedatatech.in
💼𝐋𝐢𝐧𝐤𝐞𝐝𝐈𝐧 - https://www.linkedin.com/in/sbgowtham/
🌐𝐖𝐞𝐛𝐬𝐢𝐭𝐞 - https://codewithgowtham.blogspot.com
💻𝐆𝐢𝐭𝐇𝐮𝐛 - http://github.com/Gowthamdataengineer
💬𝐖𝐡𝐚𝐭𝐬𝐀𝐩𝐩 - https://lnkd.in/g5JrHw8q
📱𝐀𝐥𝐥 𝐌𝐲 𝐒𝐨𝐜𝐢𝐚𝐥𝐬 - https://lnkd.in/gf8k3aCH
Hashtags
#DataEngineering #BigData #DataPipeline #ETL #DataProcessing #DataScience #DataAnalytics #DataWrangling #DataOps #DataArchitecture #DataIntegration #DataTransformation #DataStorage #DataManagement #DataPlatform #CloudDataEngineering #AWS #Azure #GCP #DataCloud #CloudComputing #CloudDataPipeline #DataStreaming #Kafka #Spark #Hadoop #NoSQL #DataModeling #DataGovernance #DataLake #DataWarehouse #Redshift #BigQuery #Snowflake #DataVisualization #MachineLearning #AI #APIs #DatabaseManagement #ServerlessComputing #DataMigration #DevOps #MLOps #DataOrchestration #DataAutomation #DataSecurity #CloudMigration #DataEngineeringCommunity #RealTimeData #DataMonitoring #DataEngineeringTools #DataInsights #DataDriven #DataQuality #DataEngineeringProjects #PythonForData #SQL #DataPipelinesSimplified #CloudETL #ModernDataStack #CloudDataOps #DataLakehouse #AnalyticsEngineering #DataFlow #CloudIntegration #DataTools #DataPipelineAutomation #DataModelingSimplified #ETLTools #DataProcessingPipeline #DataCloudExperts #ServerlessData #CloudComputingSolutions #BigDataAnalytics #AdvancedAnalytics #DataInnovation #CloudDataManagement #DataOpsFramework #ETLProcesses #StreamingDataPipeline #DataScienceWorkflow #CloudEngineering #DataEngineerLife #DataEngineerJobs #DataEngineeringForBeginners #CloudSolutions #TechForData #DataScienceCommunity #CloudFirst #DataStorageOptimization #CloudETLTools #DataProcessingFrameworks #RealTimeAnalytics