Multimodal LLMs for 3D Understanding, Reasoning, and Task Planning, ICLR 2025 Workshop on Embodied Intelligence with Large Language Models in Open City Environment, April 2025.
From Diffusion to Autoregression for Visual Content Generation, Hong Kong CVM 2025 Workshop at HKUST, April 2025.
Towards Compositional and Controllable Visual Content Generation, HKU TechTalk, June 2024.
Towards Compositional and Controllable Visual Content Generation, VALSE Webinar, May 2024.
Towards Compositional and Controllable Visual Content Generation, HKUST(GZ), May 2024.
Towards Compositional and Controllable Visual Content Generation, VALSE 2024 Workshop on Video Generation and World Models, May 2024.
Towards Compositional and Controllable Visual Content Generation, CUHK Multi-Modal Symposium 2024, April 2024.
Visual Content Generation with Diffusion Models, Shanghai Artificial Intelligence Lab, December 2023.
Controllable Diffusion Models for Visual Content Generation and Visual Perception, Tencent Rhino Bird Workshop, July 2023.
Controllable Diffusion Models for Visual Content Generation and Visual Perception, Huawei Media Technology Workshop, July 2023.
Image Synthesis and Editing with Diffusion Models, International Digital Economy Academy (IDEA), January 2023.
Image Synthesis and Editing with Diffusion Models, Tencent ARC Lab, December 2022.
Bridging Vision and Language for Cross-Modal Understanding and Generation, University of Hong Kong (HKU), January 2022.
Bridging Vision and Language for Cross-Modal Understanding and Generation, Hong Kong University of Science and Technology (HKUST), January 2022.
Bridging Vision and Language for Cross-Modal Understanding and Generation, University of California, Berkeley (UC Berkeley), January 2021.
Bridging Vision and Language for Cross-Modal Understanding and Generation, Microsoft Research Asia (MSRA), January 2021.
Bridging Vision and Language for Cross-Modal Understanding and Generation, NVIDIA Research, January 2021.