Pollux: Autoregressive Video Generation as a World Model

Project Details
Field of Study
Computer Science
Division
Computer, Electrical and Mathematical Sciences and Engineering
Project Description
Training an all-purpose robot requires a world model that captures and predicts the complex dynamics of the physical world. Recent advances in video generation offer a promising approach, but existing models primarily generate entire video clips from text, often with fixed durations. To address these limitations, this project aims to build a video generation model with three key advancements:
- Next-frame prediction – Rather than generating full clips, the model will synthesize videos frame by frame, giving finer temporal control and better alignment with physical principles.
- Interactive conditioning – The model will integrate additional signals such as text, camera motion, and actions to enable dynamic generation.
- Efficiency – The design will emphasize computational efficiency, ensuring feasibility for downstream applications like simulators or AI-driven creativity.
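The first two points can be illustrated with a minimal sketch of an action-conditioned, frame-by-frame rollout. This is not the project's model: `toy_next_frame`, `rollout`, and `context_len` are hypothetical names introduced here, and the "model" is a hand-written pixel shift standing in for a learned predictor. The sketch only shows the control flow, in which each generated frame is fed back as input and a new conditioning signal can be supplied at every step.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image: a 1-D row of pixel intensities


def toy_next_frame(context: Sequence[Frame], action: int) -> Frame:
    """Hypothetical stand-in for a learned model: shift the latest frame
    by `action` pixels (wrapping around), roughly as a camera pan would."""
    last = context[-1]
    n = len(last)
    return [last[(i - action) % n] for i in range(n)]


def rollout(
    predict: Callable[[Sequence[Frame], int], Frame],
    first_frame: Frame,
    actions: Sequence[int],
    context_len: int = 4,
) -> List[Frame]:
    """Generate one frame per action, feeding each prediction back as input."""
    frames = [list(first_frame)]
    for action in actions:
        # Only the most recent frames are passed in, so the per-step
        # input size stays bounded however long the video gets.
        context = frames[-context_len:]
        frames.append(predict(context, action))
    return frames


# Three steps of interactive control: pan right, pan right, pan left by two.
video = rollout(toy_next_frame, [1.0, 0.0, 0.0, 0.0], actions=[1, 1, -2])
for frame in video:
    print(frame)
```

Because the loop consumes one action per step, the video length is set by how many actions are supplied rather than fixed in advance, and the caller can choose each action after seeing the previous frame.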
About the Researcher
Hans Schmidhuber
- Professor, Computer Science
- Co-Chair, Center of Excellence for Generative AI
- Principal Investigator, Juergen Schmidhuber Research Group
Desired Project Deliverables
Building a world model for an all-purpose robot is a long-term research objective. For this internship project, the primary deliverable is a high-impact publication at a top-tier conference or in a top-tier journal. The research focus may include, but is not limited to, improving the efficiency or performance of generative models, optimizing inference strategies, or introducing novel technical contributions to world, video, or image generative models.
Recommended Student Background
Strong computer vision skills are essential.
A strong publication record in top AI venues is a plus.