Yuxiong Wang | Bridging Generative & Discriminative Learning in the Open World

Sponsored by Evolution AI: https://www.evolution.ai

Abstract: Generative AI has emerged as the new wave following discriminative AI, exemplified by powerful generative models such as large language models (LLMs) and visual diffusion models. While these models excel at generating text, images, and videos, mere creation is not the ultimate goal. A grander objective is to understand the world and make decisions in it through the generation process. In this talk, I discuss our efforts towards bridging generative and discriminative learning, enabling autonomous agents to perceive, interact, and act in the open world. Several general strategies are explored, such as repurposing the latent representations of generative models and treating generative models as data engines. Beyond these, there has been recent interest in directly formulating generative models, especially LLMs, as agents for problem-solving and decision-making. Along this line, we introduce LATS (Language Agent Tree Search), the first framework that unifies the capabilities of LLMs in reasoning, acting, and planning. Our work enables LLM agents to leverage external feedback from the environment through tree-based search algorithms, while employing LLM-powered value functions and self-reflections for smarter exploration, which provides a more deliberate and adaptive problem-solving mechanism. Finally, we demonstrate how to synergize knowledge from different generative models in the context of modeling human-object interaction, advancing the broader application of generative AI in real-world scenarios.
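
To make the idea of tree-based search guided by a value function and environment feedback more concrete, here is a minimal toy sketch in Python. It is not the LATS implementation: it uses a simplified best-first search, omits self-reflection entirely, and replaces every LLM call with a hand-written stub. The task (reach the number 10 from 1 using "+1" and "*2") and all names (`propose_actions`, `value_estimate`, `env_feedback`, `search`) are hypothetical and chosen only for illustration.

```python
import heapq
import itertools

TARGET = 10  # hypothetical toy goal
ACTIONS = {"+1": lambda s: s + 1, "*2": lambda s: s * 2}


def propose_actions(state):
    # Stand-in for sampling candidate actions from an LLM (assumption).
    return list(ACTIONS)


def value_estimate(state):
    # Stand-in for an LLM-powered value function; higher is more promising.
    return 1.0 / (1.0 + abs(TARGET - state))


def env_feedback(state):
    # External feedback from the environment: True once the task is solved.
    return state == TARGET


def search(start, max_depth=6):
    """Best-first tree search: always expand the most promising node."""
    counter = itertools.count()  # tie-breaker so the heap never compares paths
    frontier = [(-value_estimate(start), next(counter), start, [])]
    seen = {start}
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if env_feedback(state):
            return path  # sequence of actions that solved the task
        if len(path) >= max_depth:
            continue  # depth budget exhausted on this branch
        for action in propose_actions(state):
            nxt = ACTIONS[action](state)
            if nxt not in seen:
                seen.add(nxt)
                entry = (-value_estimate(nxt), next(counter), nxt, path + [action])
                heapq.heappush(frontier, entry)
    return None  # no solution within the depth budget
```

Calling `search(1)` returns a list of action names that, applied in order to the start state, reach the target. In a real agent, the three stubs would be replaced by model calls and by an actual environment, and the selection rule would typically balance exploration against exploitation rather than greedily following the value estimate.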