Learn to build a Vision Transformer (ViT) from scratch using PyTorch! This hands-on course guides you through each component, from patch embedding to the Transformer Encoder. Train your custom ViT model on CIFAR-10 and gain practical experience in image classification. Transition from CNNs to transformers in this efficient, end-to-end tutorial.
Code: https://github.com/MOHAMMEDFAHD/pytorch-collections/blob/main/Building_Vision_Transformer_on_CIFAR_10_From_Scratch_Pytorch.ipynb
Course developed by @programmingoceanacademy
❤️ Support for this channel comes from our friends at Scrimba – the coding platform that's reinvented interactive learning: https://scrimba.com/freecodecamp
⭐️ Contents ⭐️
⌨️ (0:00:00) Intro
⌨️ (0:28:23) Theoretical Explanation of Vision Transformers
⌨️ (0:47:40) Environment Setup and Library Imports
⌨️ (0:55:14) Configurations and Hyperparameter Setup
⌨️ (0:58:28) Image Transformation Operations
⌨️ (1:00:28) Downloading the CIFAR-10 Dataset
⌨️ (1:04:22) Creating DataLoaders
⌨️ (1:11:32) Building the Vision Transformer (ViT) Model
⌨️ (1:43:41) Defining Loss Function and Optimizer
⌨️ (1:45:37) Training Loop and Model Training
⌨️ (2:03:18) Visualizing Accuracy (Training vs Testing)
⌨️ (2:06:08) Making and Visualizing Predictions
⌨️ (2:18:48) Fine-Tuning with Data Augmentation
⌨️ (2:25:08) Training the Fine-Tuned Model
⌨️ (2:27:08) Visualizing Fine-Tuned Accuracy
⌨️ (2:28:38) Predictions After Fine-Tuning
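For a rough idea of what the patch embedding step in the model-building chapter involves, here is a minimal PyTorch sketch. It is not taken from the course notebook: the class name PatchEmbedding and the values patch_size=4 and embed_dim=64 are assumptions chosen for illustration, and only the 32x32 RGB input size comes from CIFAR-10 itself.

```python
import torch
import torch.nn as nn


class PatchEmbedding(nn.Module):
    """Split an image into non-overlapping patches and project each one to a vector.

    Hypothetical helper for illustration; the hyperparameters are assumed values.
    """

    def __init__(self, image_size=32, patch_size=4, in_channels=3, embed_dim=64):
        super().__init__()
        assert image_size % patch_size == 0, "image size must be divisible by patch size"
        self.num_patches = (image_size // patch_size) ** 2
        # A convolution whose kernel and stride both equal the patch size is
        # equivalent to cutting out each patch and applying one shared linear layer.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        # Learnable class token and position embeddings (one per token).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, embed_dim))

    def forward(self, x):
        batch_size = x.shape[0]
        x = self.proj(x)                  # (B, embed_dim, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)  # (B, num_patches, embed_dim)
        cls = self.cls_token.expand(batch_size, -1, -1)
        x = torch.cat([cls, x], dim=1)    # prepend the class token
        return x + self.pos_embed         # add position information


patch_embed = PatchEmbedding()
tokens = patch_embed(torch.randn(8, 3, 32, 32))  # a dummy batch of 8 CIFAR-10-sized images
print(tokens.shape)  # torch.Size([8, 65, 64])
```

With these assumed settings, each 32x32 image becomes 64 patch tokens plus one class token, and that sequence is what a Transformer Encoder would then consume.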
🎉 Thanks to our Champion and Sponsor supporters:
👾 Drake Milly
👾 Ulises Moralez
👾 Goddard Tan
👾 David MG
👾 Matthew Springman
👾 Claudio
👾 Oscar R.
👾 jedi-or-sith
👾 Nattira Maneerat
👾 Justin Hual
--
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://freecodecamp.org/news