Research Interest

I'm broadly interested in computer vision and its real-world applications, particularly at its intersection with Robotics and Multimodal AI. In general, I hope to build more intelligent visual systems that can better perceive and understand the physical world.
More specifically, I hope to make computer vision techniques higher in quality and more efficient, likely drawing on frontier Machine Learning models and cutting-edge Optimization algorithms.
Topics I am currently interested in include:

Scalable Visual Recognition Models
3D Visual Generation
World Models
Physical Scene Understanding
Robotics Visual Perception
Video Summarization/Multimodal Reasoning

Relevant Research Fields

Computer Vision, Multimodal AI, Robotics, Machine Learning, Optimization

Application Space

I strongly hope my future research can benefit the field of education and help expand access to education in rural areas.
Some of my potential ideas are:

Comprehend a student's homework or exam and provide a corresponding analysis (at the intersection of computer vision and Multimodal Perception)
Given a specific question (e.g., a computation or proof exercise in mathematics), generate a video with a detailed explanation (at the intersection of computer vision and Text-to-Video Generation/Natural Language Processing)
Create intelligent educational devices or software that can recommend or generate exercise questions based on each student's profile (at the intersection of computer vision and Recommendation Systems/Generative AI)
Build intelligent robots that serve as course assistants or lecture instructors, able to operate classroom devices or deliver a lecture independently (at the intersection of computer vision and Robotics/Embodied AI/Multimodal Learning)