SWIN transformer (image recognition)

This video covers the SWIN transformer, a model trained for image classification that is also used as a backbone in a variety of other tasks, replacing ResNet or ViT. It currently serves as the backbone of SOTA object detection models such as DINO.
This is another video from my "Modern Object Detection" series: https://www.youtube.com/playlist?list=PL1HdfW5-F8AQlPZCJBq2gNjERTDEAl8v3
Important links:
- Original paper: https://arxiv.org/pdf/2103.14030.pdf
- My previous video about ViT: https://youtu.be/NcbbPuRjMeE

00:00 - Intro
00:50 - Motivation, "Image Tokenization" Problem
08:14 - Hierarchical Patches Architecture
10:40 - Shifted Windows Attention
17:26 - Relative Positional Bias
21:58 - Results
26:00 - Next Up
