Today, we're joined by Jason Corso, co-founder of Voxel51 and professor at the University of Michigan, to explore automated labeling in computer vision. Jason introduces FiftyOne, an open-source platform for visualizing datasets, analyzing models, and improving data quality. We focus on Voxel51’s recent research report, “Zero-shot auto-labeling rivals human performance,” which demonstrates how zero-shot auto-labeling with foundation models can yield significant cost and time savings compared to traditional human annotation. Jason explains how auto-labels, despite being "noisier" at lower confidence thresholds, can lead to better downstream model performance. We also cover Voxel51's "verified auto-labeling" approach, which utilizes a "stoplight" QA workflow (green, yellow, red light) to minimize human review. Finally, we discuss the challenges of handling decision boundary uncertainty and out-of-domain classes, the differences between synthetic data generation in vision and language domains, and the potential of agentic labeling.
🗒️ For the full list of resources for this episode, visit the show notes page: https://twimlai.com/go/735.
🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confirmation=1
🗣️ CONNECT WITH US!
===============================
Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/
Follow us on Twitter: https://twitter.com/twimlai
Follow us on LinkedIn: https://www.linkedin.com/company/twimlai/
Join our Slack Community: https://twimlai.com/community/
Subscribe to our newsletter: https://twimlai.com/newsletter/
Want to get in touch? Send us a message: https://twimlai.com/contact/
📖 CHAPTERS
===============================
00:00 - Introduction
2:10 - Voxel51
9:45 - Data analysis
11:58 - Path to auto-labeling
16:34 - Challenge of uncertainty
19:42 - Challenges of classifying rare data
23:36 - Auto-labeling
29:24 - Cost of labeling and inference
34:28 - Findings on confidence thresholds on models
39:51 - Challenges of auto-labeling
42:32 - Verified auto-labeling approach
43:19 - Stoplight approach
44:44 - Out-of-distribution domain performance
48:07 - Core agentic behavior
49:42 - Future directions
51:44 - Parallels and pitfalls of synthetic dataset generation in vision vs. language domains
🔗 LINKS & RESOURCES
===============================
Zero-shot auto-labeling rivals human performance - https://voxel51.com/blog/zero-shot-auto-labeling-rivals-human-performance
Auto-Labeling Data for Object Detection - https://arxiv.org/abs/2506.02359
Voxel51 Research Reveals Auto-Labeling Achieves up to 95% of Human-Level Performance While Cutting Costs by 100,000x - https://www.prnewswire.com/news-releases/voxel51-research-reveals-auto-labeling-achieves-up-to-95-of-human-level-performance-while-cutting-costs-by-100-000x-302473005.html
📸 Camera: https://amzn.to/3TQ3zsg
🎙️ Microphone: https://amzn.to/3t5zXeV
🚦 Lights: https://amzn.to/3TQlX49
🎛️ Audio Interface: https://amzn.to/3TVFAIq
🎚️ Stream Deck: https://amzn.to/3zzm7F5