Hao Tang

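Peking University, Assistant Professor/Researcher, Ph.D. Supervisor, Boya/Weiming Young Scholar, Selected for the National Overseas High-Level Talent Program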
Office: 5 Yiheyuan Road, Haidian District, Beijing, 100871, China🇨🇳
Email: bjdxtanghao@gmail.com

Hey, thanks for stopping by! 👋

I am a tenure-track Assistant Professor (Ph.D. Supervisor) at the School of Computer Science, Peking University, China🇨🇳, studying computer vision, machine learning, and AI. Previously, I held postdoctoral positions at both CMU (Robotics Institute), USA🇺🇸, and ETH Zürich (Computer Vision Lab), Switzerland🇨🇭. I earned a master's degree from Peking University, China🇨🇳, and completed my Ph.D. (cum laude) at the University of Trento, Italy🇮🇹. I have also worked at or visited several other institutions, including University of Oxford (UK🇬🇧), MIT (USA🇺🇸), Harvard University (USA🇺🇸), IIAI (UAE🇦🇪), Northeastern University (USA🇺🇸), University of Michigan (USA🇺🇸), UCLA (USA🇺🇸), and UPenn (USA🇺🇸).

Beyond academia, I have also had the honor of serving as a senior technical advisor for numerous AI startups, including those in USA🇺🇸, UK🇬🇧, Romania🇷🇴, and China🇨🇳, with technologies spanning efficient AI, 3D vision, AIGC, and AI4Blockchain.

News & Events

Hiring! We're hiring Postdoc/Ph.D./Master/Intern researchers in Embodied AI and AIGC (including LLMs) for our PKU lab. Feel free to reach out to me directly.
2025-10 🎤Honored to be invited by NVIDIA to share our latest research and vision on Shaping the Future with Generative and Embodied Intelligence, and we have 1 paper (Survey about Multimodal Alignment and Fusion) accepted to IJCV 2025.
2025-09 🎉I was named one of the World's Top 2% Scientists in 2025 by Stanford University, and we have 1 paper (Articulation and Diffusion for Robot Modeling) accepted to the CoRL 2025 LSRW Workshop, 4 papers including 1 spotlight (Parameter Efficient Merging for MLLM + Diffusion-based Adversarial Attacks + Spatial Adversarial Alignment + Dental AI) accepted to NeurIPS 2025, and 1 paper (VLM for Open-Vocabulary Segmentation) accepted to CVIU 2025.
2025-08 🎉I was invited to serve as an Area Chair (AC) at ICLR 2026, and we have 1 paper (VLA for Multi-Task Manipulation) accepted to CoRL 2025.
2025-07 🎉I was invited to serve as a Senior Program Committee (SPC) member at AAAI 2026, and we have 1 paper (Music-Guided Dance Video Synthesis) accepted to TPAMI 2025 and 1 paper (Video Anomaly Detection) accepted to ACM MM 2025.
2025-06 We have 1 paper (Cellular Phenotypic Transdifferentiation) accepted to ICML 2025 Workshop, 1 paper (Monocular Depth Estimation) accepted to TCSVT 2025, 2 oral papers (Surgical Robot + Reconstruction and Editing in V2X Scenarios) accepted to IROS 2025, and 1 paper (Medical Image Segmentation) accepted to ICCV 2025.
2025-05 🎉I was invited to serve on the Editorial Board of Discover Artificial Intelligence (a journal by Springer Nature) and as an Area Chair (AC) at EMNLP 2025, and we have 1 paper (Real-Time ViT on Mobile) accepted to IJCV 2025.
2025-04 We have 3 papers (Single-Step Image SR + In-Context Meta LoRA + Sparse MoE) accepted to IJCAI 2025, 1 paper (Synergistic Immunotherapy in Glioma) accepted to Advanced Science 2025, 1 paper (Continual Gesture Learning) accepted to IJCNN 2025, and 1 paper (Fake News Video Detection) accepted to TMM 2025.
2025-03 🎉I was invited to serve as an Area Chair (AC) at ACM MM 2025, and we have 1 paper (Accident Warning Agent) accepted to IV 2025.
2025-02 🎉🎤Honored to be invited by NVIDIA to share our latest research and vision on Bridging the Gap to Fault-Tolerant Quantum Computing. I was also invited to serve as an Area Chair (AC) at ACL 2025 and a Senior Program Committee (SPC) member at IJCAI 2025, and we have 3 papers including 1 oral (Mamba for Image Compression + 4D Reconstruction + Diffusion Fourier Neural Operator) accepted to CVPR 2025.
2025-01 We have 1 paper (SAR Automatic Target Recognition) accepted to TAES 2025, 1 paper (Person Image Generation) accepted to TPAMI 2025, 1 paper (Explainability in MLLMs) accepted to NAACL 2025 Main Conference, and 1 paper (Urological Surgical Robots) accepted to ICRA 2025.
2024-12 We have 3 papers (Structured Pruning for LLM + FG-SBIR + Hair Transfer via Diffusion Model) accepted to AAAI 2025 and 1 paper (Efficient Fine-Tuning of LLM) accepted to ICASSP 2025.
2024-11 We have 1 paper (Virtual Try-On) accepted to TMM 2024.
2024-10 We have 1 paper (Quantization on Bird's-Eye View Representation) accepted to WACV 2025 and 1 paper (Semantic Segmentation on Autonomous Vehicles Platform) accepted to TCAD 2024.
2024-09 🎉I was named one of the World's Top 2% Scientists in 2024 by Stanford University, and we have 1 paper (Camera-Agnostic Attack) accepted to NeurIPS 2024 and 1 paper (Medical Image Segmentation) accepted to ACCV 2024.
2024-08 🎤I was invited as a speaker at the 2nd Workshop & Challenge on Micro-gesture Analysis for Hidden Emotion Understanding (MiGA) at IJCAI 2024, and we have 2 papers (Guided Image Translation + 3D Human Pose Estimation) accepted to PR 2024.
2024-07 We have 6 papers (Motion Mamba + Dataset Growth + Story Visualization and Completion + Diffusion Model for Semantic Image Synthesis + Generalizable Image Editing + 3D Semantic Segmentation) accepted to ECCV 2024, 1 paper (Survey about Physical Adversarial Attack) accepted to TPAMI 2024, and 2 papers (Talking Head Avatar + Story Visualization and Continuation) accepted to ACM MM 2024.
2024-06 🎉I joined Peking University as an Assistant Professor.
2024-04 🎉I received offers from MIT and Harvard University.
2024-02 We have 7 papers (Explanation for ViT + Faithfulness of ViT + Diffusion Policy for Versatile Navigation + Subject-Driven Generation [Final rating: 455] + Diffusion Model for 3D Hand Pose Estimation + Adversarial Learning for 3D Pose Transfer + Efficient Diffusion Distillation [224->235]) accepted to CVPR 2024.
2024-01 We have 1 paper (Architectural Layout Generation) accepted to TPAMI 2024.
2023-12 We have 1 paper (Sign Pose Sequence Generation) accepted to AAAI 2024.
2023-10 🎉I was named one of the World's Top 2% Scientists in 2023 by Stanford University, and we have 4 papers (BEV Perception + Efficient ViT + 3D Motion Transfer + Graph Distillation) accepted to NeurIPS 2023.
2023-09 We have 1 paper (Practical Blind Image Denoising) accepted to MIR 2023 and 1 paper (Diffusion Model for HDR Deghosting) accepted to TCSVT 2023.
2023-08 🎉I received an offer from CMU.
2023-07 We have 1 paper (Semantic Image Synthesis) accepted to TPAMI 2023.
2023-06 We have 1 paper (Visible-Infrared Person Re-ID) accepted to ICCV 2023.
2023-05 We have 2 papers (Image Restoration Dataset + 3D-Aware Video Generation) accepted to CVPRW 2023 and 1 paper (3D Face Generation) accepted to JSTSP 2023.
2023-04 We have 1 paper (Speed-Aware Object Detection) accepted to ICML 2023, 2 papers (Lottery Ticket Hypothesis for ViT + Zero-shot Character Recognition) accepted to IJCAI 2023, 1 paper (3D Human Pose Estimation) accepted to PR 2023, and 1 paper (SAR Target Recognition) accepted to TGRS 2023.
2023-03 We have 6 papers (HDR Deghosting + Point Cloud Registration + Graph-Constrained House Generation + Mathematical Architecture Design + Text-to-Image Synthesis + Efficient Semantic Segmentation) accepted to CVPR 2023.
2023-02 We have 3 papers (Camouflaged Object Detection + Brain Vessel Image Segmentation + Cross-View Image Translation) accepted to ICASSP 2023 and 1 paper (Camouflaged Object Detection) accepted to TCSVT 2023.
2023-01 We have 1 paper (Semantic Image Synthesis) accepted to ICLR 2023 and 1 paper (Human Reaction Generation) accepted to TMM 2023.
2022-11 We have 4 papers (Real-Time Segmentation + Wearable Design + Efficient ViT Training + Text-Guided Image Editing) accepted to AAAI 2023, 1 paper (Person Pose and Facial Image Synthesis) accepted to IJCV 2022, 1 paper (Salient Object Detection) accepted to TIP 2022, and 1 paper (Object Detection Transformer) accepted to TCSVT 2022.
2022-10 We have 1 paper (Sinusoidal Neural Radiance Fields) accepted to BMVC 2022 and 1 paper (Guided Image-to-Image Translation) accepted to TPAMI 2022.
2022-09 We have 1 paper (Facial Expression Translation) accepted to TAFFC 2022 and 1 paper (Ship Detection) accepted to TGRS 2022.
2022-07 We have 5 papers (Real-Time SR + Video SR + Soft Token Pruning for ViT + 3D-Aware Human Synthesis + Video Semantic Segmentation) accepted to ECCV 2022, 1 paper (Gaze Correction and Animation) accepted to TIP 2022, and 1 paper (Cross-view Panorama Image Synthesis) accepted to PR 2022.
2022-06 We have 2 papers (Character Image Restoration + Character Image Denoising) accepted to ACM MM 2022.
2022-04 We have 1 paper (Real-Time Portrait Stylization) accepted to IJCAI 2022, 1 paper (Wide-Context Transformer for Semantic Segmentation) accepted to TGRS 2022, and 1 paper (Incremental Learning for Semantic Segmentation) accepted to TMM 2022.
2022-03 We have 5 papers including 1 oral (Text-to-Image Synthesis + 3D Human Pose Estimation + Text-Driven Image Manipulation + 3D Face Modeling + 3D Face Restoration) accepted to CVPR 2022, 1 paper (Image Generation) accepted to TPAMI 2022, and 1 paper (Cross-View Panorama Image Synthesis) accepted to TMM 2022.
2021-12 We have 2 papers (Generalized 3D Pose Transfer + Audio-Visual Speaker Tracking) accepted to AAAI 2022.
2021-11 We have 1 paper (Building Extraction in VHR Remote Sensing Images) accepted to TIP 2021.
2021-10 We have 3 papers (Cross-View Image Translation + Data-driven 3D Animation + Natural Image Matting) accepted to BMVC 2021.
2021-08 We have 1 paper (Layout-to-Image Translation) accepted to TIP 2021 and 1 paper (Unpaired Image-to-Image Translation) accepted to TNNLS 2021.
2021-07 We have 2 papers (Continuous Pixel-Wise Prediction + Unsupervised 3D Pose Transfer) accepted to ICCV 2021.
2021-06 We have 1 paper (Cross-View Exocentric to Egocentric Video Synthesis) accepted to ACM MM 2021 and 1 paper (Total Generate) accepted to TMM 2021.
2021-05 🎉I received an offer from ETH Zurich.
2020-09 🎉I received an offer from IIAI.
2020-08 We have 1 paper (Person Image Generation) accepted to BMVC 2020, 2 papers (Semantic Image Synthesis + Unsupervised Gaze Correction and Animation) accepted to ACM MM 2020, and 1 paper (Controllable Image-to-Image Translation) accepted to TIP 2020.
2020-07 We have 1 paper (Person Image Generation) accepted to ECCV 2020.
2020-05 We have 1 paper (Deep Dictionary Learning and Coding) accepted to TNNLS 2020 and 1 paper (Semantic Segmentation of Remote Sensing Images) accepted to TGRS 2020.
2020-02 We have 1 paper (Semantic-Guided Scene Generation) accepted to CVPR 2020.
2019-07 We have 1 paper (Keypoint-Guided Image Generation) accepted to ACM MM 2019.
2019-05 🎉I received an offer from University of Oxford.
2019-02 We have 1 paper (Cross-View Image Translation) accepted to CVPR 2019.
2018-06 We have 1 paper (Hand Gesture-to-Gesture Translation) accepted to ACM MM 2018.
2018-02 We have 1 paper (Monocular Depth Estimation) accepted to CVPR 2018.
2016-07 We have 1 paper (Large Scale Image Retrieval) accepted to IJCAI 2016.
2015-08 We have 1 paper (Gender Classification) accepted to ACM MM 2015.

Position Openings

For prospective collaborators interested in Embodied AI and AIGC (including LLMs), we are offering multiple positions for Postdoc/Ph.D./Master/Intern researchers. If you are interested in joining our group, please email me at haotang@pku.edu.cn or bjdxtanghao@gmail.com with your self-introduction, the project of interest (including the problem you are trying to solve and how you plan to solve it, being as specific as possible), your transcript, and CV. I'm sorry that I may not be able to respond to every email, but I assure you that your message will stand out if you have a strong research background.

For Ph.D./Postdoc/Master applicants, we have several openings for domestic students each year. Please reach out at least 6 months prior to the application deadline.

For international students, PKU CS offers a variety of programs in English, including Master's and Ph.D. programs, Summer/Winter Schools, and various other options. Feel free to reach out if you are interested or have any questions.

For visiting students/undergraduates/research interns/research assistants, we welcome undergraduate and graduate students from all over the world to apply for research internships of more than 6 months. Our visitors/interns have published many top-tier conference/journal papers (e.g., TPAMI, CVPR, NeurIPS) and have been admitted to Postdoc/Ph.D./Master programs or positions at prestigious institutions such as MIT, Harvard, Google, Brown University, UMich, University of Toronto, Caltech, ETH Zürich, NTU, NUS, and TUM.

Research Lab

The mission of our research lab is to harness AI to address real-world challenges. Our research priorities include Embodied AI and AIGC.

  • Qilin Wang (PhD, previously from Fudan University, China🇨🇳)
  • Siyuan Qian (PhD, w/ Shanghang Zhang, previously from BUAA, China🇨🇳)
  • Haoyu Wang (PhD, w/ Shiliang Zhang, previously from HIT, China🇨🇳)
  • Zhen Chen (PhD, w/ Shiliang Zhang, previously from Tongji University, China🇨🇳)
  • Derek Zeng (Master, previously from University of Waterloo, Canada🇨🇦)
  • Jiarui Ye (Master, previously from NUAA, China🇨🇳)
  • Xiaoyuan Wang (Visiting from CMU, USA🇺🇸)
  • Haozhan Tang (Visiting from CMU, USA🇺🇸)
  • Wenbo Gou (Visiting from CMU, USA🇺🇸)
  • Zhenyu Lu (Visiting from CMU, USA🇺🇸)
  • Rohan Siva (Visiting from UT Austin, USA🇺🇸)
  • Jun Liu (Visiting from NEU, USA🇺🇸)
  • Changdi Yang (Visiting from NEU, USA🇺🇸)
  • Na Li (Visiting from Goldman Sachs & UPenn, USA🇺🇸)
  • Zihao Wang (Visiting from UPenn, USA🇺🇸)
  • Yao Gong (Visiting from UPenn, USA🇺🇸)
  • Junjie Zeng (Visiting from UMich, USA🇺🇸)
  • Peng Huang (Visiting from Boston University, USA🇺🇸)
  • Xiaoyi Liu (Visiting from Washington University in St. Louis, USA🇺🇸 -> now Ph.D. at Brown University, USA🇺🇸)
  • Kang Chen (Visiting from Rensselaer Polytechnic Institute, USA🇺🇸)
  • Bin Xie (Visiting from IIT, USA🇺🇸)
  • Huixiu Jiang (Visiting from IIT, USA🇺🇸)
  • Wanru Cheng (Visiting from University of Toronto, Canada🇨🇦)
  • Peize Li (Visiting from KCL, UK🇬🇧)
  • Jingyi Wan (Visiting from University of Cambridge, UK🇬🇧)
  • Xuanyu Lai (Visiting from ICL, UK🇬🇧)
  • Yitong Luo (Visiting from ICL, UK🇬🇧)
  • Baohua Yin (Visiting from University of Sussex, UK🇬🇧)
  • Yitao Song (Visiting from Moscow State University, Russia🇷🇺)
  • Zhiguang Han (Visiting from NTU, Singapore🇸🇬)
  • Zhen Long (Visiting from NUS, Singapore🇸🇬)
  • Ali Haider (Visiting from Kyung Hee University, South Korea🇰🇷)
  • Amitoj Singh Miglani (Visiting from IIT Roorkee, India🇮🇳)
  • Pirzada Suhail (Visiting from IIT Bombay, India🇮🇳)
  • Siddhant Pathak (Visiting from IIT BHU, India🇮🇳)
  • Vrushank Ajay Ahire (Visiting from IIT Ropar, India🇮🇳)
  • Ziwei Li (Visiting from KAUST, Saudi Arabia🇸🇦)
  • Ahmad Imran (Visiting from NUST, Pakistan🇵🇰)
  • Zeyu Zhang (Visiting from Australian National University, Australia🇦🇺)
  • Hongpeng Wang (Visiting from University of Sydney, Australia🇦🇺)
  • Zeyu Ren (Visiting from University of Melbourne, Australia🇦🇺)
  • Haihang Wu (Visiting from University of Melbourne, Australia🇦🇺)
  • Ziang Li (Visiting from UTS, Australia🇦🇺)
  • Zhixing Wang (Visiting from University of Malaya, Malaysia🇲🇾)
  • Pakawat Phasook (Visiting from King Mongkut’s University of Technology Thonburi, Thailand🇹🇭)
  • Ahmed Eldaw Mohamed (Visiting from University of Cape Town, South Africa🇿🇦)
  • Hongfeng Lai (Visiting from University of Hong Kong, Hong Kong🇭🇰)
  • Zicheng Liu (Visiting from University of Hong Kong, Hong Kong🇭🇰)
  • Yihua Shao (Visiting from City University of Hong Kong, Hong Kong🇭🇰)
  • Yuxuan Fan (Visiting from HKUST (Guangzhou), China🇨🇳)
  • Nonghai Zhang (Intern from Peking University, China🇨🇳)
  • Dongjian Li (Intern from Peking University, China🇨🇳)
  • Rui Yang (Intern from Peking University, China🇨🇳)
  • Mingyu Li (Intern from Peking University, China🇨🇳)
  • Zhaohui Wang (Intern from Peking University, China🇨🇳)
  • Ziyan Mao (Intern from Peking University, China🇨🇳)
  • Xinran Kuang (Intern from Peking University, China🇨🇳)
  • Keyu Chen (Intern from Peking University, China🇨🇳)
  • Di Yu (Visiting from Tsinghua University, China🇨🇳)
  • Yuxuan Zhang (Visiting from Shanghai Jiao Tong University, China🇨🇳 -> now Ph.D. at CUHK, Hong Kong🇭🇰)
  • Renkai Wu (Visiting from Shanghai Jiao Tong University, China🇨🇳 -> now Ph.D. at Tsinghua University, China🇨🇳)
  • Junxian Li (Visiting from Shanghai Jiao Tong University, China🇨🇳)
  • Hui Wei (Visiting from Wuhan University, China🇨🇳 -> now Postdoc at University of Oulu, Finland🇫🇮)
  • Xiaofeng Zhang (Visiting from Shanghai Jiao Tong University, China🇨🇳)
  • Ting Huang (Visiting from Shanghai University of Engineering Science, China🇨🇳)
  • I-Tak Ieong (Visiting from Tongji University, China🇨🇳)
  • Kunze Jiang (Visiting from USTC, China🇨🇳)
  • Lei Xin (Visiting from Wuhan University, China🇨🇳)
  • Fanhu Zeng (Visiting from CASIA, China🇨🇳)
  • Mingju Gao (Visiting from ICT, CAS, China🇨🇳)
  • Songtao Li (Visiting from Northeastern University, China🇨🇳)
  • Qinhua Xie (Visiting from East China Normal University, China🇨🇳)
  • Zihang Liu (Visiting from Beijing Institute of Technology, China🇨🇳)
  • Aoming Liang (Visiting from Westlake University, China🇨🇳)
  • Sifan Li (Visiting from Liaoning University, China🇨🇳)

Former members and visitors:

  • Youran Qu (RA from Peking University, China🇨🇳 -> now Master at Dartmouth College, USA🇺🇸)
  • Kaiwen Shi (RA from Peking University, China🇨🇳 -> now PhD at University of Notre Dame, USA🇺🇸)
  • Lujing Xie (RA from Peking University, China🇨🇳 -> now PhD at University of Texas at Dallas, USA🇺🇸)
  • Yaowu Zhang (RA from Peking University, China🇨🇳 -> now Master at Institute of Computing Technology, Chinese Academy of Sciences, China🇨🇳)
  • Haoran Li (RA from Peking University, China🇨🇳 -> now Master at Peking University, China🇨🇳)
  • Yaoxiang Xiong (RA from Peking University, China🇨🇳 -> now Master at Peking University, China🇨🇳)
  • Jinxian Ren (RA from Peking University, China🇨🇳)
  • Guillaume Thiry (RA from ETH Zürich, Switzerland🇨🇭 -> now Software Engineer at Google, Switzerland🇨🇭)
  • Sherwin Bahmani (RA from ETH Zürich, Switzerland🇨🇭 -> now Ph.D. at University of Toronto, Canada🇨🇦)
  • Sanghwan Kim (RA from ETH Zürich, Switzerland🇨🇭 -> now Ph.D. at TUM, Germany🇩🇪)
  • Alexandros Delitzas (RA from ETH Zürich, Switzerland🇨🇭 -> now Ph.D. at ETH Zürich and Max Planck Institute for Informatics, Switzerland🇨🇭 and Germany🇩🇪)
  • Jingfeng Rong (RA from ETH Zürich, Switzerland🇨🇭 -> now Ph.D. at Swiss Finance Institute, Switzerland🇨🇭)
  • Yitong Xia (RA from ETH Zürich, Switzerland🇨🇭)
  • Boyan Duan (RA, now Master at ETH Zürich, Switzerland🇨🇭)
  • Baptiste Chopin (RA, now Postdoc at INRIA, France🇫🇷)
  • Chenyang Gu (RA from Peking University, China🇨🇳 -> now Ph.D. at Peking University, China🇨🇳)
  • Kosta Gjorgjievski (RA from UCLA, USA🇺🇸 -> now Master at Tsinghua University, China🇨🇳)
  • Xiaoyu Yi (RA from Peking University, China🇨🇳)

Teaching

International Collaborations

Our lab maintains strong collaborative relationships with several leading international research institutions, including:

  • USA🇺🇸: MIT, Harvard, Stanford University, CMU, Princeton University, UIUC, UMich, Northeastern University, University of Maryland, University of Texas at Austin, UC Irvine, University of Illinois at Chicago, Illinois Institute of Technology, University of Connecticut, Texas State University, University of Georgia, Clemson University, University of Oregon, College of William & Mary
  • Canada🇨🇦: University of Toronto, Simon Fraser University
  • Switzerland🇨🇭: ETH Zürich, EPFL
  • UK🇬🇧: University of Oxford, University of Cambridge, University of Leicester, University of Warwick
  • Italy🇮🇹: University of Trento, FBK, Politecnico di Milano, University of Modena and Reggio Emilia
  • Germany🇩🇪: TUM, University of Würzburg
  • France🇫🇷: INRIA, University of Lille
  • Finland🇫🇮: University of Oulu
  • Netherlands🇳🇱: TU Delft
  • Belgium🇧🇪: KU Leuven
  • Bulgaria🇧🇬: INSAIT
  • Singapore🇸🇬: NUS, NTU
  • Japan🇯🇵: University of Tokyo, National Institute of Informatics
  • South Korea🇰🇷: Sungkyunkwan University
  • Australia🇦🇺: University of Adelaide, ANU, Monash University, University of Technology Sydney
  • UAE🇦🇪: IIAI, MBZUAI
  • Hong Kong🇭🇰: University of Hong Kong, Hong Kong University of Science and Technology
I am deeply grateful for the opportunities to collaborate with such esteemed institutions and for the valuable contributions they have made to our joint research efforts. Additionally, we maintain long-term collaborations with industry, including Google, Meta, Amazon, Cisco, Western Digital, Mercedes-Benz, Xiaohongshu, Alibaba, Tencent, etc., aiming to translate cutting-edge research into practical applications and drive technological advancement.

Featured Publications

(Including CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, IJCAI, ACM MM, ICRA, IROS, CoRL, NAACL, TPAMI, IJCV)

My Students or Interns, *Corresponding Author(s)

  1. VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery
     Nonghai Zhang,  Zeyu Zhang,  Jiazi Wang,  Yang Zhao,  Hao Tang*
    In Arxiv, 2025
  2. Fidelity-Aware Data Composition for Robust Robot Generalization
     Zizhao Tong,  Di Chen,  Sicheng Hu,  Hongwei Fan,  Liliang Chen,  Guanghui Ren,  Hao Tang,  Hao Dong,  Ling Shao
    In Arxiv, 2025
  3. UniVid: The Open-Source Unified Video Model
     Jiabin Luo,  Junhui Lin,  Zeyu Zhang,  Biao Wu,  Meng Fang,  Ling Chen,  Hao Tang*
    In Arxiv, 2025
  4. StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes
     Zhengri Wu,  Yiran Wang,  Yu Wen,  Zeyu Zhang,  Biao Wu,  Hao Tang*
    In Arxiv, 2025
  5. Nav-R1: Reasoning and Navigation in Embodied Scenes
     Qingxiang Liu,  Ting Huang,  Zeyu Zhang,  Hao Tang*
    In Arxiv, 2025
  6. ReMoMask: Retrieval-Augmented Masked Motion Generation
     Zhengdao Li,  Siheng Wang,  Zeyu Zhang,  Hao Tang*
    In Arxiv, 2025
  7. 3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
     Ting Huang,  Zeyu Zhang,  Hao Tang*
    In Arxiv, 2025
  8. Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance
     Haijie Yang,  Zhenyu Zhang,  Hao Tang,  Jianjun Qian,  Jian Yang
    In Arxiv, 2025
  9. FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation
     Guohao Huo,  Ruiting Dai,  Ling Shao,  Hao Tang*
    In Arxiv, 2025
  10. RFMedSAM 2: Automatic Prompt Refinement for Enhanced Volumetric Medical Image Segmentation with SAM 2
     Bin Xie,  Hao Tang,  Yan Yan,  Gady Agam
    In Arxiv, 2025
  11. Self-Prompt SAM: Medical Image Segmentation via Automatic Prompt SAM Adaptation
     Bin Xie,  Hao Tang,  Dawen Cai,  Yan Yan,  Gady Agam
    In Arxiv, 2025
  12. UDiTQC: U-Net-Style Diffusion Transformer for Quantum Circuit Synthesis
     Zhiwei Chen,  Hao Tang*
    In Arxiv, 2025
  13. Artificial Intelligence for Quantum Error Correction: A Comprehensive Review
     Zihao Wang,  Hao Tang*
    In Arxiv, 2024
  14. PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model
     Yuqing Wang,  Zhongling Huang,  Shuxin Yang,  Hao Tang,  Xiaolan Qiu,  Junwei Han,  Dingwen Zhang
    In Arxiv, 2024
  15. Artificial Intelligence for Central Dogma-Centric Multi-Omics: Challenges and Breakthroughs
     Lei Xin,  Caiyun Huang,  Hao Li,  Shihong Huang,  Yuling Feng,  Zhenglun Kong,  Zicheng Liu,  Siyuan Li,  Chang Yu,  Fei Shen,  Hao Tang*
    In Arxiv, 2024
  16. Text-to-Image Synthesis: A Decade Survey
     Nonghai Zhang,  Hao Tang*
    In Arxiv, 2024
  17. AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations
     Jiawei Mao,  Yu Yang,  Xuesong Yin,  Ling Shao,  Hao Tang*
    In Arxiv, 2024
  18. KMM: Key Frame Mask Mamba for Extended Motion Generation
     Zeyu Zhang,  Hang Gao,  Akide Liu,  Qi Chen,  Feng Chen,  Yiran Wang,  Danning Li,  Hao Tang*
    In Arxiv, 2024
  19. GWQ: Gradient-Aware Weight Quantization for Large Language Models
     Yihua Shao,  Siyu Liang,  Xiaolin Lin,  Zijian Ling,  Zixian Zhu,  Minxi Yan,  Haiyang Liu,  Siyu Chen,  Ziyang Yan,  Yilan Meng,  Chenyu Zhang,  Haotong Qin*,  Michele Magno,  Yang Yang,  Zhen Lei,  Yan Wang,  Jingcai Guo,  Ling Shao,  Hao Tang*
    In Arxiv, 2024
  20. M2M: Learning Controllable Multi of Experts and Multi-Scale Operators Are the Partial Differential Equations Need
     Aoming Liang,  Zhaoyang Mu,  Pengxiao Lin,  Cong Wang,  Mingming Ge,  Ling Shao,  Dixia Fan*,  Hao Tang*
    In Arxiv, 2024
  21. Barbie: Text to Barbie-Style 3D Avatars
     Xiaokun Sun,  Zhenyu Zhang,  Ying Tai,  Qian Wang,  Hao Tang,  Zili Yi,  Jian Yang
    In Arxiv, 2024
  22. InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation
     Zeyu Zhang,  Akide Liu,  Qi Chen,  Feng Chen,  Ian Reid,  Richard Hartley,  Bohan Zhuang,  Hao Tang*
    In Arxiv, 2024
  23. A Survey on Multimodal Wearable Sensor-based Human Action Recognition
     Jianyuan Ni,  Hao Tang,  Syed Tousiful Haque,  Yan Yan,  Anne HH Ngu
    In Arxiv, 2024
  24. StableGarment: Garment-Centric Generation via Stable Diffusion
     Rui Wang,  Hailong Guo,  Jiaming Liu,  Huaxia Li,  Haibo Zhao,  Xu Tang,  Yao Hu,  Hao Tang,  Peipei Li
    In Arxiv, 2024
  25. Multimodal Alignment and Fusion: A Survey
     Songtao Li,  Hao Tang*
    Springer International Journal of Computer Vision (IJCV), 2025
  26. Parameter Efficient Merging for Multimodal Large Language Models with Direction Robustness
     Fanhu Zeng,  Haiyang Guo,  Fei Zhu,  Li Shen,  Hao Tang*
    In NeurIPS 2025, San Diego, USA
  27. Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis
     Jing Hao,  Yuxuan Fan,  Yanpeng Sun,  Kaixin Guo,  Lizhuo Lin,  Jinrong Yang,  Qi Yong H. Ai,  Lun M. Wong,  Hao Tang*,  Kuo Feng Hung*
    In NeurIPS 2025, San Diego, USA
  28. Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment
     Kaixun Jiang,  Zhaoyu Chen,  HaiJing Guo,  Jinglun Li,  Jiyuan Fu,  Pinxue Guo,  Hao Tang,  Bo Li,  Wenqiang Zhang
    In NeurIPS 2025, San Diego, USA
  29. Boosting Adversarial Transferability with Spatial Adversarial Alignment
     Zhaoyu Chen,  Haijing Guo,  Kaixun Jiang,  Jiyuan Fu,  Xinyu Zhou,  Dingkang Yang,  Hao Tang,  Bo Li,  Wenqiang Zhang
    In NeurIPS 2025, San Diego, USA
  30. 3DS-VLA: A 3D Spatial-Aware Vision Language Action Model for Robust Multi-Task Manipulation
     Xiaoqi Li,  Liang Heng,  Jiaming Liu,  Yan Shen,  Chenyang Gu,  Zhuoyang Liu,  Hao Chen,  Nuowei Han,  Renrui Zhang,  Hao Tang,  Shanghang Zhang,  Hao Dong
    In CoRL 2025, Seoul, Korea
  31. GRADRobot: Geometry-Aware Rendering with Articulation and Diffusion for Robot Modeling
     Yunlong Li,  Boyuan Chen,  Chongjie Ye,  Bohan Li,  Zhaoxi Chen,  Shaocong Xu,  Hao Tang,  Hao Zhao
    In CoRL 2025, Seoul, Korea
  32. EventVAD: Training-free Event-aware Video Anomaly Detection
     Yihua Shao,  Haojin He,  Sijie Li,  Siyu Chen,  Xinwei Long,  Fanhu Zeng,  Yuxuan Fan,  Muyang Zhang,  Ziyang Yan,  Ao Ma,  Xiaochen Wang,  Hao Tang,  Yan Wang,  Shuyan Li
    In ACM MM 2025, Dublin, Ireland
  33. MaskSAM: Towards Auto-prompt SAM with Mask Classification for Medical Image Segmentation
     Bin Xie,  Hao Tang,  Bin Duan,  Dawen Cai,  Yan Yan,  Gady Agam
    In ICCV 2025, Honolulu, USA
  34. TTTFusion: A Test-Time Training-Based Strategy for Multimodal Medical Image Fusion in Surgical Robots
     Qinhua Xie,  Hao Tang*
    In IROS 2025, Hangzhou, China
  35. CRUISE: Cooperative Reconstruction and Editing in V2X Scenarios using Gaussian Splatting
     Haoran Xu,  Saining Zhang,  Peishuo Li,  Baijun Ye,  Xiaoxue Chen,  Huan-ang Gao,  Jv Zheng,  Xiaowei Song,  Ziqiao Peng,  Run Miao,  Jinrang Jia,  Yifeng Shi,  Guangqi Yi,  Hang Zhao,  Hao Tang,  Hongyang Li,  Kaicheng Yu,  Hao Zhao
    In IROS 2025, Hangzhou, China
  36. AutoViT: Achieving Real-Time Vision Transformers on Mobile via Latency-aware Coarse-to-Fine Search
     Zhenglun Kong,  Dongkuan Xu,  Zhengang Li,  Peiyan Dong,  Hao Tang,  Yanzhi Wang, Subhabrata Mukherjee
    Springer International Journal of Computer Vision (IJCV), 2025
  37. Semantic-Guided Diffusion Model for Single-Step Image Super-Resolution
     Zihang Liu,  Zhenyu Zhang,  Hao Tang*
    In IJCAI 2025, Montreal, Canada
  38. In-Context Meta LoRA Generation
     Yihua Shao,  Minxi Yan,  Yang Liu,  Siyu Chen,  Wenjie Chen,  Xinwei Long,  Ziyang Yan,  Lei Li,  Chenyu Zhang,  Nicu Sebe,  Hao Tang*,  Yan Wang,  Hao Zhao,  Mengzhu Wang,  Jingcai Guo*
    In IJCAI 2025, Montreal, Canada
  39. FairSMOE: Mitigating Multi-Attribute Fairness Problem with Sparse Mixture-of-Experts
     Changdi Yang,  Zheng Zhan,  Ci Zhang,  Yifan Gong,  Yize Li,  Zichong Meng,  Jun Liu,  Xuan Shen,  Hao Tang,  Geng Yuan,  Pu Zhao,  Xue Lin,  Yanzhi Wang
    In IJCAI 2025, Montreal, Canada
  40. Smart Organic–Inorganic Copolymer Nanoparticles Distinguish Between Microglia and Cancer Cells for Synergistic Immunotherapy in Glioma
     Shiming Zhang,  Kun Shang,  Lidong Gong,  Qian Xie,  Jianfei Sun,  Meng Xu,  Xunbin Wei,  Zhaoheng Xie,  Xinyu Liu,  Hao Tang,  Zhengren Xu,  Wei Wang,  Haihua Xiao,  Zhiqiang Lin,  Hongbin Han
    Advanced Science, 2025
  41. DiffFNO: Diffusion Fourier Neural Operator
     Xiaoyi Liu,  Hao Tang*
    In CVPR 2025, Nashville, USA
  42. MambaIC: State Space Models for High-Performance Learned Image Compression
     Fanhu Zeng,  Hao Tang,  Yihua Shao,  Siyu Chen,  Ling Shao,  Yan Wang
    In CVPR 2025, Nashville, USA
  43. PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
     Mingju Gao,  Yike Pan,  Huan-ang Gao,  Zongzheng Zhang,  Wenyi Li,  Hao Dong,  Hao Tang,  Li Yi,  Hao Zhao
    In CVPR 2025, Nashville, USA
  44. Toward Zero-Shot Learning for Visual Dehazing of Urological Surgical Robots
     Renkai Wu,  Xianjin Wang,  Pengchen Liang,  Zhenyu Zhang,  Qing Chang*,  Hao Tang*
    In ICRA 2025, Atlanta, USA
  45. From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks
     Xiaofeng Zhang,  Yihao Quan,  Chen Shen,  Xiaosong Yuan,  Shaotian Yan,  Liang Xie,  Wenxiao Wang,  Chaochen Gu,  Hao Tang, Jieping Ye
    In NAACL 2025, Albuquerque, USA
  46. Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment
     Jun Liu,  Zhenglun Kong,  Pu Zhao,  Changdi Yang,  Hao Tang*,  Xuan Shen,  Geng Yuan,  Wei Niu,  Wenbin Zhang,  Xue Lin,  Dong Huang*,  Yanzhi Wang*
    In AAAI 2025, Philadelphia, USA
  47. Stable-Hair: Real-World Hair Transfer via Diffusion Model
     Yuxuan Zhang,  Qing Zhang,  Yiren Song,  Jichao Zhang,  Hao Tang,  Jiaming Liu
    In AAAI 2025, Philadelphia, USA
  48. Towards Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling
     Jianan Jiang,  Hao Tang*,  Zhilin Jiang,  Weiren Yu,  Di Wu*
    In AAAI 2025, Philadelphia, USA
  49. Revisiting Adversarial Patches for Designing Camera-Agnostic Attacks against Person Detection
     Hui Wei,  Zhixiang Wang,  Kewei Zhang,  Jiaqi Hou,  Yuanwei Liu,  Hao Tang,  Zheng Wang
    In NeurIPS 2024, Vancouver, Canada
  50. ConsistentAvatar: Learning to Diffuse Fully Consistent Talking Head Avatar with Temporal Guidance
     Haijie Yang,  Zhenyu Zhang,  Hao Tang,  Jianjun Qian,  Jian Yang
    In ACM MM 2024, Melbourne, Australia
  51. CoIn: A Lightweight and Effective Framework for Story Visualization and Continuation
     Ming Tao,  Bingkun Bao,  Hao Tang,  Yaowei Wang,  Changsheng Xu
    In ACM MM 2024, Melbourne, Australia
  52. Physical Adversarial Attack Meets Computer Vision: A Decade Survey
     Hui Wei,  Hao Tang,  Xuemei Jia,  Zhixiang Wang,  Hanxun Yu,  Zhubo Li,  Shin'ichi Satoh,  Luc Van Gool,  Zheng Wang
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
  53. Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM
     Zeyu Zhang,  Akide Liu,  Ian Reid,  Richard Hartley,  Bohan Zhuang,  Hao Tang*
    In ECCV 2024, Milan, Italy
  54. 3D Weakly Supervised Semantic Segmentation with 2D Vision-Language Guidance
     Xiaoxu Xu,  Yitian Yuan,  Jinlong Li,  Qiudan Zhang,  Zequn Jie,  Lin Ma,  Hao Tang,  Nicu Sebe,  Xu Wang
    In ECCV 2024, Milan, Italy
  55. StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
     Ming Tao,  Bingkun Bao,  Hao Tang,  Yaowei Wang,  Changsheng Xu
    In ECCV 2024, Milan, Italy
  56. InstructGIE: Towards Generalizable Image Editing
     Zichong Meng,  Changdi Yang,  Jun Liu,  Hao Tang*,  Pu Zhao*,  Yanzhi Wang*
    In ECCV 2024, Milan, Italy
  57. SCP-Diff: Photo-Realistic Semantic Image Synthesis with Spatial-Categorical Joint Prior
     Huan-ang Gao,  Mingju Gao,  Jiaju Li,  Wenyi Li,  Rong Zhi,  Hao Tang,  Hao Zhao
    In ECCV 2024, Milan, Italy
  58. Dataset Growth
     Ziheng Qin,  Zhaopan Xu,  Yukun Zhou,  Zangwei Zheng,  Zebang Cheng,  Hao Tang,  Lei Shang,  Baigui Sun,  Xiaojiang Peng,  Radu Timofte,  Hongxun Yao,  Kai Wang,  Yang You
    In ECCV 2024, Milan, Italy
  59. HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud
     Wencan Cheng,  Hao Tang,  Luc Van Gool,  Jong Hwan Ko
    In CVPR 2024, Seattle, USA
  60. Versatile Navigation under Partial Observability via Value-guided Diffusion Policy
     Gengyu Zhang,  Hao Tang,  Yan Yan
    In CVPR 2024, Seattle, USA
  61. Towards Robust 3D Pose Transfer with Adversarial Learning
     Haoyu Chen,  Hao Tang,  Ehsan Adeli,  Guoying Zhao
    In CVPR 2024, Seattle, USA
  62. On the Faithfulness of Vision Transformer Explanations
     Junyi Wu,  Weitai Kang,  Hao Tang,  Yuan Hong,  Yan Yan
    In CVPR 2024, Seattle, USA
  63. Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer
     Junyi Wu,  Bin Duan,  Weitai Kang,  Hao Tang,  Yan Yan
    In CVPR 2024, Seattle, USA
  64. SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
     Yuxuan Zhang,  Jiaming Liu,  Yiren Song,  Rui Wang,  Hao Tang,  Jinpeng Yu,  Huaxia Li,  Xu Tang,  Yao Hu,  Han Pan,  Zhongliang Jing
    In CVPR 2024, Seattle, USA
  65. Towards Online Real-Time Memory-based Video Inpainting Transformers
     Guillaume Thiry,  Hao Tang*,  Radu Timofte,  Luc Van Gool
    In CVPR 2024, Seattle, USA
  66. G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model
     Pan Xie,  Qipeng Zhang,  Peng Taiying,  Hao Tang*,  Yao Du,  Zexian Li
    In AAAI 2024, Vancouver, Canada
  67. HotBEV: Hardware-oriented Transformer-based Multi-View 3D Detector for BEV Perception
     Peiyan Dong,  Zhenglun Kong,  Xin Meng,  Pinrui Yu,  Yifan Gong,  Geng Yuan,  Hao Tang*,  Yanzhi Wang
    In NeurIPS 2023, New Orleans, USA
  68. PackQViT: Faster Sub-8-bit Vision Transformers via Full and Packed Quantization on the Mobile
     Peiyan Dong,  Lei Lu,  Chao Wu,  Cheng Lyu,  Geng Yuan,  Hao Tang*,  Yanzhi Wang
    In NeurIPS 2023, New Orleans, USA
  69. LART: Neural Correspondence Learning with Latent Regularization Transformer for 3D Motion Transfer
     Haoyu Chen,  Hao Tang,  Radu Timofte,  Luc Van Gool,  Guoying Zhao
    In NeurIPS 2023, New Orleans, USA
  70. Does Graph Distillation See Like Vision Dataset Counterpart?
     Beining Yang,  Kai Wang,  Qingyun Sun,  Cheng Ji,  Xingcheng Fu,  Hao Tang,  Yang You,  Jianxin Li
    In NeurIPS 2023, New Orleans, USA
  71. Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis
     Kai Zhang,  Yawei Li,  Jingyun Liang,  Jiezhang Cao,  Yulun Zhang,  Hao Tang,  Dengping Fan,  Radu Timofte,  Luc Van Gool
    Springer Machine Intelligence Research (MIR), 2023
  72. Learning Concordant Attention via Target-aware Alignment for Visible-Infrared Person Re-identification
     Jianbing Wu,  Hong Liu,  Yuxin Su,  Wei Shi,  Hao Tang
    In ICCV 2023, Paris, France
  73. SpeedDETR: Speed-aware Transformers for End-to-end Object Detection
     Peiyan Dong,  Zhenglun Kong,  Xin Meng,  Peng Zhang,  Hao Tang*,  Yanzhi Wang,  Chih-Hsien Chou
    In ICML 2023, Hawaii, USA
  74. Data Level Lottery Ticket Hypothesis for Vision Transformers
     Xuan Shen,  Zhenglun Kong,  Minghai Qin,  Peiyan Dong,  Geng Yuan,  Xin Meng,  Hao Tang,  Xiaolong Ma,  Yanzhi Wang
    In IJCAI 2023, Macao, China
  75. Graph Transformer GANs for Graph-Constrained House Generation
     Hao Tang,  Zhenyu Zhang,  Humphrey Shi,  Bo Li,  Ling Shao,  Nicu Sebe,  Radu Timofte,  Luc Van Gool
    In CVPR 2023, Vancouver, Canada
  76. Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration
     Guofeng Mei,  Hao Tang,  Xiaoshui Huang,  Weijie Wang,  Juan Liu,  Jian Zhang,  Luc Van Gool,  Qiang Wu
    In CVPR 2023, Vancouver, Canada
  77. DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network
     Xuan Shen,  Yaohua Wang,  Ming Lin,  Yilun Huang,  Hao Tang,  Xiuyu Sun,  Yanzhi Wang
    In CVPR 2023, Vancouver, Canada
  78. GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
     Ming Tao,  Bingkun Bao,  Hao Tang,  Changsheng Xu
    In CVPR 2023, Vancouver, Canada
  79. Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis
     Hao Tang,  Xiaojuan Qi,  Guolei Sun,  Dan Xu,  Nicu Sebe,  Radu Timofte,  Luc Van Gool
    In ICLR 2023, Kigali, Rwanda
  80. 3D-Aware Semantic-Guided Generative Model for Human Synthesis
     Jichao Zhang,  Enver Sangineto,  Hao Tang,  Aliaksandr Siarohin,  Zhun Zhong,  Nicu Sebe,  Wei Wang
    In ECCV 2022, Tel Aviv, Israel
  81. Towards Interpretable Video Super-Resolution via Alternative Optimization
     Jiezhang Cao,  Jingyun Liang,  Kai Zhang,  Wenguan Wang,  Qin Wang,  Yulun Zhang,  Hao Tang,  Luc Van Gool
    In ECCV 2022, Tel Aviv, Israel
  82. MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation
     Wenhao Li,  Hong Liu,  Hao Tang,  Pichao Wang,  Luc Van Gool
    In CVPR 2022, New Orleans, USA
  83. DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis
     Ming Tao,  Hao Tang,  Fei Wu,  Xiaoyuan Jing,  Bingkun Bao,  Changsheng Xu
    In CVPR 2022, New Orleans, USA
  84. GestureGAN for Hand Gesture-to-Gesture Translation in the Wild
     Hao Tang,  Wei Wang,  Dan Xu,  Yan Yan,  Nicu Sebe
    In ACM MM 2018, Seoul, South Korea