T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation
Kaiyue Sun, Kaiyi Huang, Xian Liu, Yue Wu, Zihan Xu, Zhenguo Li, Xihui Liu
CVPR 2025 [Paper] [Project Page] [Code] [Leaderboard]
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation
Lijun Li, Zhelun Shi, Xuhao Hu, Bowen Dong, Yiran Qin, Xihui Liu, Lu Sheng, Jing Shao
CVPR 2025 [Paper]
MBQ: Modality-Balanced Quantization for Large Vision-Language Models
Shiyao Li, Yingchun Hu, Xuefei Ning, Xihui Liu, Ke Hong, Xiaotao Jia, Xiuhong Li, Yaqi Yan, Pei Ran, Guohao Dai, Shengen Yan, Huazhong Yang, Yu Wang
CVPR 2025 [Paper] [Code]
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
Zehuan Huang, Yuan-Chen Guo, Xingqiao An, Yunhan Yang, Yangguang Li, Zi-Xin Zou, Ding Liang, Xihui Liu, Yan-Pei Cao, Lu Sheng
CVPR 2025 [Paper] [Project Page] [Code]
HMAR: Efficient Hierarchical Masked AutoRegressive Image Generation
Hermann Kumbong, Xian Liu, Tsung-Yi Lin, Ming-Yu Liu, Xihui Liu, Ziwei Liu, Daniel Y Fu, Christopher Re, David W. Romero
CVPR 2025 [Paper Coming Soon]
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu
ICLR 2025 [Paper] [Code]
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
Yi Chen, Yuying Ge, Yizhuo Li, Yixiao Ge, Mingyu Ding, Ying Shan, Xihui Liu
arXiv 2024 [Paper] [Project Page] [Code]
SAMPart3D: Segment Any Part in 3D Objects
Yunhan Yang, Yukun Huang, Yuan-Chen Guo, Liangjun Lu, Xiaoyang Wu, Edmund Y. Lam, Yan-Pei Cao, Xihui Liu
arXiv 2024 [Paper] [Project Page] [Code] [Dataset]
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities
Chenming Zhu, Tai Wang, Wenwei Zhang, Jiangmiao Pang, Xihui Liu
arXiv 2024 [Paper] [Project Page] [Code]
GameFactory: Creating New Games with Generative Interactive Videos
Jiwen Yu, Yiran Qin, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu
arXiv 2025 [Paper] [Project Page] [Code] [Dataset]
Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu
ECCV 2024 [Paper] [Project Page] [Code] [Data]
TC4D: Trajectory-Conditioned Text-to-4D Generation
Sherwin Bahmani*, Xian Liu*, Yifan Wang*, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B. Lindell
ECCV 2024 [Paper] [Project Page] [Code]
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
ZiDong Wang, Zeyu Lu, Di Huang, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai
ECCV 2024 [Paper] [Code]
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning
Yi Chen, Yuying Ge, Yixiao Ge, Mingyu Ding, Bohao Li, Rui Wang, Ruifeng Xu, Ying Shan, Xihui Liu
arXiv 2024 [Paper] [Project Page] [Code] [Challenge] [Data] [Leaderboard]
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
Yao Teng, Yue Wu, Han Shi, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu
arXiv 2024 [Paper] [Code]
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation
Zhenyu Wang, Enze Xie, Aoxue Li, Zhongdao Wang, Xihui Liu, Zhenguo Li
arXiv 2024 [Paper] [Code]
FiT: Flexible Vision Transformer for Diffusion Model
Zeyu Lu, Zidong Wang, Di Huang, Chengyue Wu, Xihui Liu, Wanli Ouyang, Lei Bai
ICML 2024 [Paper] [Code]
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
Yunhan Yang*, Yukun Huang*, Xiaoyang Wu, Yuan-Chen Guo, Song-Hai Zhang, Hengshuang Zhao, Tong He, Xihui Liu
CVPR 2024 [Paper] [Project Page] [Code]
HumanGaussian: Text-driven 3D Human Generation with Gaussian Splatting
Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu
CVPR 2024 Highlight [Paper] [Project Page] [Code] [Video]
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Tai Wang*, Xiaohan Mao*, Chenming Zhu*, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang
CVPR 2024 [Paper] [Project Page] [Code] [Data]
OV-PARTS: Towards Open-Vocabulary Part Segmentation
Meng Wei, Xiaoyu Yue, Wenwei Zhang, Shu Kong, Xihui Liu, Jiangmiao Pang
NeurIPS 2023 [Paper] [Code] [Data] [Challenge]
Seeing is not always believing: A Quantitative Study on Human Perception of AI-Generated Images
Zeyu Lu*, Di Huang*, Lei Bai*, Jingjing Qu, Chengyue Wu, Xihui Liu, Wanli Ouyang
NeurIPS 2023 [Paper] [Project Page] [Data]
DDP: Diffusion Model for Dense Visual Prediction
Yuanfeng Ji*, Zhe Chen*, Enze Xie, Lanqing Hong, Xihui Liu, Zhaoqiang Liu, Tong Lu, Zhenguo Li, Ping Luo
ICCV 2023 [Paper] [Code]
SAM3D: Segment Anything in 3D Scenes
Yunhan Yang, Xiaoyang Wu, Tong He, Hengshuang Zhao, Xihui Liu
ICCV Workshop 2023 [Paper] [Code]
Back to the Source: Diffusion-Driven Test-Time Adaptation
Jin Gao*, Jialing Zhang*, Xihui Liu, Trevor Darrell, Evan Shelhamer, Dequan Wang
CVPR 2023 [Paper] [Code]
Learning Transferable Spatiotemporal Representations from Natural Script Knowledge
Ziyun Zeng*, Yuying Ge*, Xihui Liu, Bin Chen, Ping Luo, Shu-Tao Xia, Yixiao Ge
CVPR 2023 [Paper] [Code]
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
Xiaoyang Wu, Xin Wen, Xihui Liu, Hengshuang Zhao
CVPR 2023 [Paper] [Code]
RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer
Jiahao Wang, Songyang Zhang, Yong Liu, Taiqiang Wu, Yujiu Yang, Xihui Liu, Kai Chen, Ping Luo, Dahua Lin
CVPR 2023 [Paper] [Project Page] [Code]
More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
WACV 2023 [Paper] [Project Page] [Code]
TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale
Ziyun Zeng, Zhan Tong, Xihui Liu, Bin Chen, Shu-Tao Xia, Yixiao Ge
arXiv 2023 [Paper] [Code]
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao
NeurIPS 2022 [Paper] [Code]
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
Yuying Ge, Yixiao Ge, Xihui Liu, Alex Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo
ECCV 2022 [Paper] [Code]
The ArtBench Dataset: Benchmarking Generative Models with Artworks
Peiyuan Liao*, Xiuyu Li*, Xihui Liu, Kurt Keutzer
arXiv 2022 [Paper] [Project Page] [Data]
Benchmark for Compositional Text-to-Image Synthesis
Dong Huk Park, Samaneh Azadi, Xihui Liu, Trevor Darrell, Anna Rohrbach
NeurIPS Datasets and Benchmarks 2021 [Paper] [Code] [Data]
Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions
Xihui Liu, Zhe Lin, Jianming Zhang, Handong Zhao, Quan Tran, Xiaogang Wang, Hongsheng Li
ECCV 2020 [Paper] [Code] [Video] [Slides]
Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis
Xihui Liu, Guojun Yin, Jing Shao, Xiaogang Wang, Hongsheng Li
NeurIPS 2019 [Paper] [Code] [Slides]
Show, Tell and Discriminate: Image Captioning by Self-retrieval with Partially Labeled Data
Xihui Liu, Hongsheng Li, Jing Shao, Dapeng Chen, Xiaogang Wang
ECCV 2018 [Paper]
Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Dapeng Chen, Hongsheng Li, Xihui Liu, Yantao Shen, Jing Shao, Zejian Yuan, Xiaogang Wang
ECCV 2018 [Paper]