Learning to Perceive and Generate 3D World
2025.05.29

Time: 14:00-15:30, Thursday, May 29, 2025

Venue: Meeting Room 1734, Lide Building

Host: Song Ruihua, Tenured Associate Professor, Gaoling School of Artificial Intelligence, Renmin University of China

Guest Speaker: Xu Yinghao, Assistant Professor, Department of Computer Science and Engineering, Hong Kong University of Science and Technology

Topic: Learning to Perceive and Generate 3D World

Brief Introduction:

Perceiving and generating the 3D world from visual input is fundamental to how humans interact with their physical environment. While computer vision has made remarkable progress in 2D scene understanding, much of it remains constrained by the inherent limitations of 2D imagery, often failing to capture the full spatial and temporal complexity of real-world 3D environments. How can we perceive diverse, dynamic 3D scenes in the wild? How can we recreate the 3D world with human-like capabilities?

In this talk, I will first introduce pioneering 3D perception systems that learn to understand the world from images, mirroring how humans naturally infer 3D structure from visual experience. These systems leverage differentiable scene representations to enable generalizable 3D reconstruction and perception from multi-view imagery, often without requiring extensive supervision. While such perception systems allow machines to interpret existing scenes, they also lay the foundation for a more ambitious goal: enabling machines to generate, control, and interact with 3D environments. Building on these models, I will demonstrate how insights from 3D modeling can be used to develop high-fidelity 3D generative models that offer structured control over generated scenes and agents and can simulate real-world structures and dynamics.

This line of research aims to advance spatial reasoning and decision-making in AI systems, ultimately bringing us closer to human-level intelligence in 3D perception, interaction, and creativity.

Speaker Bio:

Xu Yinghao is currently a Senior Research Scientist at Ant Research and an incoming Assistant Professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology. Previously, he was a Postdoctoral Researcher at Stanford University, advised by Prof. Gordon Wetzstein. He received his Ph.D. from The Chinese University of Hong Kong (MMLab) under the supervision of Prof. Zhou Bolei and Prof. Lin Dahua.

His research interests lie at the intersection of 3D computer vision, graphics, and generative AI. He received the WAIC Star Award in 2024 and was nominated for the Snap Research Fellowship in 2022.
