PoseGPT (ChatPose): Chatting about 3D Human Pose

Human pose estimation models struggle to use contextual information in images or video frames. Text-to-pose generation models, meanwhile, are trained on limited data and so cannot generate accurate poses for novel prompts. PoseGPT is a multimodal language model that understands 3D human pose in addition to processing images and text. It targets two tasks that need this combination: speculative pose generation, such as producing the pose a person might adopt when tired, and reasoning-based pose estimation, such as estimating the pose of the individual in an image who is wearing eyeglasses.



Paper link: https://arxiv.org/pdf/2311.18836.pdf
Project page: https://yfeng95.github.io/posegpt/


Table of contents:
00:00 Introduction
07:23 Architecture
11:58 LoRA
19:12 Data Construction
22:41 Speculative Pose Generation (SPG)
25:50 Reasoning-based Pose Estimation (RPE)


Icon made by Freepik from flaticon.com
