Research Interest
I'm broadly interested in computer vision and its real-world applications, particularly at its intersection with Robotics and Multimodal AI. In general, I hope to build more intelligent visual systems that can better perceive and understand the physical world. More specifically, I hope to make computer vision techniques higher in quality and more efficient, likely by drawing on frontier Machine Learning models and cutting-edge Optimization algorithms.
Topics I am currently interested in include:
- Scalable Visual Recognition Models
- 3D Visual Generation
- World Models
- Physical Scene Understanding
- Robotics Visual Perception
- Video Summarization/Multimodal Reasoning
Relevant Research Fields
Computer Vision, Multimodal AI, Robotics, Machine Learning, Optimization
Application Space
I strongly hope my future research can benefit the field of education and help expand educational coverage in rural areas. Some of my potential ideas are:
- Comprehend a student's homework or exam and provide an analysis accordingly (intersection of computer vision and Multimodal Perception)
- Given a specific question (e.g., a computation or proof exercise in mathematics), generate a video with a detailed explanation (intersection of computer vision and Text-to-Video Generation/Natural Language Processing)
- Create intelligent educational devices/software that can recommend or generate specific exercise questions based on each student's profile (intersection of computer vision and Recommendation Systems/Generative AI)
- Build intelligent robots that serve as course assistants/lecture instructors, able to operate classroom devices or deliver a lecture independently (intersection of computer vision and Robotics/Embodied AI/Multimodal Learning)