The ARC Prize 2024 Winning Algorithm

Daniel Franzen and Jan Disselhoff, the "ARChitects", are the official winners (with co-researcher David Hartmann) of the ARC Prize 2024. Speaking at Tufa Labs in Zurich, they reveal how they achieved 53.5% accuracy by using large language models (LLMs) in new ways. Their techniques include depth-first search for token selection, test-time training, and a novel augmentation-based validation system. Their results were extremely surprising.
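
For readers unfamiliar with the idea, here is a minimal Python sketch of depth-first search over token choices with a probability threshold. It is an illustration only, not the ARChitects' implementation: `toy_model`, `dfs_candidates`, `min_prob`, and the `<eos>` marker are hypothetical names, and the toy model stands in for an LLM's next-token distribution. The search extends a prefix one token at a time and abandons any branch whose cumulative probability falls below the threshold, so it returns every complete sequence above that cutoff rather than a single greedy sample.

```python
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence marker


def dfs_candidates(
    next_probs: Callable[[List[str]], Dict[str, float]],
    min_prob: float,
    max_len: int = 32,
) -> List[Tuple[List[str], float]]:
    """Return all complete sequences whose probability is >= min_prob."""
    results: List[Tuple[List[str], float]] = []

    def visit(prefix: List[str], prob: float) -> None:
        # A sequence is complete once it ends with the EOS marker.
        if prefix and prefix[-1] == EOS:
            results.append((prefix[:-1], prob))
            return
        if len(prefix) >= max_len:
            return
        for token, p in next_probs(prefix).items():
            # Prune: extending a prefix can only lower its probability,
            # so a branch below the threshold can never recover.
            if prob * p >= min_prob:
                visit(prefix + [token], prob * p)

    visit([], 1.0)
    # Most probable candidates first (ties keep discovery order).
    return sorted(results, key=lambda item: -item[1])


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Stand-in for an LLM: a fixed next-token distribution per position."""
    if len(prefix) == 0:
        return {"1": 0.75, "2": 0.25}
    if len(prefix) == 1:
        return {"0": 0.5, "3": 0.5}
    return {EOS: 1.0}


candidates = dfs_candidates(toy_model, min_prob=0.25)
all_sequences = dfs_candidates(toy_model, min_prob=0.1)
```

With the threshold at 0.25, the two sequences starting with "2" are pruned at the second token; lowering it to 0.1 keeps all four. In a real setting the threshold trades search cost against the number of candidate answers handed to a later validation step.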

SPONSOR MESSAGES:
***
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!
https://centml.ai/pricing/

Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier, focused on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers, and they hold events in Zurich.

Go to https://tufalabs.ai/
***

Jan Disselhoff
https://www.linkedin.com/in/jan-disselhoff-1423a2240/

Daniel Franzen
https://github.com/da-fr

ARC Prize: http://arcprize.org/

TRANSCRIPT AND BACKGROUND READING:
https://www.dropbox.com/scl/fi/utkn2i1ma79fn6an4yvjw/ARCHitects.pdf?rlkey=67pe38mtss7oyhjk2ad0d2aza&dl=0

TOC
1. Solution Architecture and Strategy Overview  
[00:00:00] 1.1 Initial Solution Overview and Model Architecture  
[00:04:25] 1.2 LLM Capabilities and Dataset Approach  
[00:10:51] 1.3 Test-Time Training and Data Augmentation Strategies  
[00:14:08] 1.4 Sampling Methods and Search Implementation  
[00:17:52] 1.5 ARC vs Language Model Context Comparison  

2. LLM Search and Model Implementation  
[00:21:53] 2.1 LLM-Guided Search Approaches and Solution Validation  
[00:27:04] 2.2 Symmetry Augmentation and Model Architecture  
[00:30:11] 2.3 Model Intelligence Characteristics and Performance  
[00:37:23] 2.4 Tokenization and Numerical Processing Challenges  

3. Advanced Training and Optimization  
[00:45:15] 3.1 DFS Token Selection and Probability Thresholds  
[00:49:41] 3.2 Model Size and Fine-tuning Performance Trade-offs  
[00:53:07] 3.3 LoRA Implementation and Catastrophic Forgetting Prevention  
[00:56:10] 3.4 Training Infrastructure and Optimization Experiments  
[01:02:34] 3.5 Search Tree Analysis and Entropy Distribution Patterns

REFS
[00:01:05] Winning ARC 2024 solution using 12B param model, Franzen, Disselhoff, Hartmann  
https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf  

[00:03:40] Robustness of analogical reasoning in LLMs, Melanie Mitchell  
https://arxiv.org/html/2411.14215  

[00:07:50] Re-ARC dataset generator for ARC task variations, Michael Hodel  
https://github.com/michaelhodel/re-arc  

[00:15:00] Analysis of search methods in LLMs (greedy, beam, DFS), Chen et al.  
https://arxiv.org/html/2408.00724v2  

[00:16:55] Language model reachability space exploration, University of Toronto  
https://www.youtube.com/watch?v=Bpgloy1dDn0  

[00:22:30] GPT-4 guided code solutions for ARC tasks, Ryan Greenblatt  
https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt  

[00:41:20] GPT tokenization approach for numbers, OpenAI  
https://platform.openai.com/docs/guides/text-generation/tokenizer-examples  

[00:46:25] DFS in AI search strategies, Russell & Norvig  
https://www.amazon.com/Artificial-Intelligence-Modern-Approach-4th/dp/0134610997  

[00:53:10] Paper on catastrophic forgetting in neural networks, Kirkpatrick et al.  
https://www.pnas.org/doi/10.1073/pnas.1611835114  

[00:54:00] LoRA for efficient fine-tuning of LLMs, Hu et al.  
https://arxiv.org/abs/2106.09685  

[00:57:20] NVIDIA H100 Tensor Core GPU specs, NVIDIA  
https://developer.nvidia.com/blog/nvidia-hopper-architecture-in-depth/  

[01:04:55] Original MCTS in computer Go, Yifan Jin  
https://stanford.edu/~rezab/classes/cme323/S15/projects/montecarlo_search_tree_report.pdf