Chunyuan Li

I am a senior researcher at Microsoft Research, Redmond. My recent research explores the connections between deep generative models and self-supervised pre-training with large-scale datasets and large-scale training, with applications to natural language modeling, image generation, and vision-and-language tasks. For an overview, please check the slides and blogs (in English and Chinese).

I completed my PhD in machine learning at Duke University, advised by Prof. Lawrence Carin. My PhD research studied the intersection of deep learning and Bayesian statistics, with each enriching the other: (1) Bayesian Deep Learning: scalable Bayesian learning methods for the weight uncertainty of deep neural networks, e.g., SG-MCMC. (2) Deep Bayesian Learning: deep neural networks as flexible representation methods in Bayesian models, e.g., GANs and VAEs. These tools have been applied to various domains, including computer vision and natural language processing.
[Email: lichunyuan24@gmail.com] [GitHub] [Google Scholar] [LinkedIn] [CV] [Dissertation]


    Recent papers
  • Deep Learning
 

ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching [Paper] [Poster] [Code]
Chunyuan Li, Hao Liu, Changyou Chen, Yunchen Pu, Liqun Chen, Ricardo Henao and Lawrence Carin
Neural Information Processing Systems (NIPS), 2017
1) Raise the non-identifiability issues in bidirectional adversarial learning
2) Propose ALICE algorithms: a conditional entropy framework to remedy the issues
3) Unify ALI/BiGAN, CycleGAN/DiscoGAN/DualGAN and Conditional GAN as joint distribution matching
  Measuring the Intrinsic Dimension of Objective Landscapes [Paper] [Blog] [YouTube] [Code] [Poster] [Reddit]
Chunyuan Li, Heerad Farkhoor, Rosanne Liu, Jason Yosinski
International Conference on Learning Representations (ICLR), 2018. 
Training neural nets in a random subspace to find the minimum number of trainable parameters for a solution
 



Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space [Paper] [Code] [Blog] [Demo]
Chunyuan Li, Xiang Gao, Yuan Li, Xiujun Li, Baolin Peng, Yizhe Zhang, Jianfeng Gao

The first BIG VAE language model, as an illustration of pre-training a compact latent space at scale
 


Feature Quantization Improves GAN Training [Paper] [Code] [Demo]
Yang Zhao*, Chunyuan Li*, Ping Yu, Jianfeng Gao, Changyou Chen (* Equal contribution)

International Conference on Machine Learning (ICML), 2020. 
A simple plug-in to improve BigGAN, StyleGAN, and U-GAT-IT for large-scale image generation/translation
 


Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training [Paper] [Code]
Weituo Hao*, Chunyuan Li*, Xiujun Li, Lawrence Carin, Jianfeng Gao (* Equal contribution)

Computer Vision and Pattern Recognition (CVPR), 2020.
The first generic agent for navigation tasks, pre-trained with synthesized samples of a generative model
 


Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks [Paper] [Code] [Blog]
Xiujun Li, Xi Yin, Chunyuan Li, Pengchuan Zhang, Xiaowei Hu, Lei Zhang, Lijuan Wang, Houdong Hu, Li Dong, Furu Wei, Yejin Choi, Jianfeng Gao

European Conference on Computer Vision (ECCV), 2020.
Objects as anchor points to align image-text, creating new SoTA on six vision-and-language tasks.

  Structure-Aware Human-Action Generation [Paper]
Ping Yu, Yang Zhao, Chunyuan Li, Junsong Yuan, Changyou Chen

European Conference on Computer Vision (ECCV), 2020.

 
Few-shot Natural Language Generation for Task-Oriented Dialog [Paper] [Code] [Project] [Demo]
Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao


  Soloist: Few-shot Task-Oriented Dialog with A Single Pre-trained Auto-regressive Model [Paper] [Project]
Baolin Peng, Chunyuan Li, Jinchao Li, Shahin Shayandeh, Lars Liden, Jianfeng Gao

A new and simplified paradigm for task-oriented dialog: grounded pre-training + few-shot fine-tuning

  Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning [Paper] [Code] [Video]
Ruqi Zhang, Chunyuan Li, Jianyi Zhang, Changyou Chen, Andrew Gordon Wilson

International Conference on Learning Representations (ICLR), 2020.   Oral Presentation

  RaCT: Towards Amortized Ranking-Critical Training for Collaborative Filtering [Paper] [Code] [Video]
Sam Lobel*, Chunyuan Li*, Jianfeng Gao, Lawrence Carin (* Equal contribution)

International Conference on Learning Representations (ICLR), 2020.    

  Complementary Auxiliary Classifiers for Label-Conditional Text Generation [Paper]
Y. Li, C. Li, Y. Zhang, X. Li, G. Zheng, L. Carin and J. Gao

AAAI Conference on Artificial Intelligence (AAAI), 2020

  Survival Cluster Analysis [Paper] [Code]
Paidamoyo Chapfuwa, Chunyuan Li, Nikhil Mehta, Lawrence Carin, Ricardo Henao

ACM Conference on Health, Inference, and Learning (ACM CHIL), 2020

  Twin Auxiliary Classifiers GAN [Paper] [Code]
Mingming Gong, Yanwu Xu, Chunyuan Li, Kun Zhang, Kayhan Batmanghelich

Neural Information Processing Systems (NeurIPS), 2019.   Spotlight Presentation

  Implicit Deep Latent Variable Models for Text Generation [Paper] [Code]
Le Fang, Chunyuan Li, Jianfeng Gao, Wen Dong, Changyou Chen

Empirical Methods in Natural Language Processing (EMNLP), 2019.
 
  Efficient Navigation with Language Pre-training and Stochastic Sampling [Paper] [Code]
Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah A. Smith, Yejin Choi

Empirical Methods in Natural Language Processing (EMNLP), 2019.
 
  Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing [Paper] [Code] [MSR Blog]
Hao Fu*, Chunyuan Li*, Xiaodong Liu, Jianfeng Gao, Asli Celikyilmaz, Lawrence Carin (* Equal contribution)
North American Chapter of the Association for Computational Linguistics (NAACL), 2019.  Oral Presentation
  Adversarial Learning of a Sampler Based on an Unnormalized Distribution [Paper] [Code] [Poster]
Chunyuan Li, Ke Bai, Jianqiao Li, Guoyin Wang, Changyou Chen, Lawrence Carin
Artificial Intelligence and Statistics (AISTATS), 2019. 
 
Communication-Efficient Stochastic Gradient MCMC for Neural Networks [Paper] [Appendix] [Poster]
Chunyuan Li, Changyou Chen, Yunchen Pu, Ricardo Henao and Lawrence Carin
AAAI Conference on Artificial Intelligence (AAAI), 2019.

Continuous-Time Flows for Efficient Inference and Density Estimation [Paper]
Changyou Chen, Chunyuan Li, Liqun Chen, Wenlin Wang, Yunchen Pu, Lawrence Carin
International Conference on Machine Learning (ICML), 2018. 

Policy Optimization as Wasserstein Gradient Flows [Paper]
Ruiyi Zhang, Changyou Chen, Chunyuan Li and Lawrence Carin
International Conference on Machine Learning (ICML), 2018. 

Adversarial Time-to-Event Modeling [Paper] [Code]
Paidamoyo Chapfuwa, Chenyang Tao, Chunyuan Li, C. Page, B. Goldstein, Lawrence Carin, Ricardo Henao
International Conference on Machine Learning (ICML), 2018. 

Joint Word and Label Embeddings for Text Classification [Paper] [Code]
G. Wang, C. Li, W. Wang, Y. Zhang, D. Shen, X. Zhang, R. Henao and L. Carin
Annual Meeting of the Association for Computational Linguistics (ACL), 2018. 

On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms [Paper] [Code]
D. Shen, G. Wang, W. Wang, M. Min, Q. Su, Y. Zhang, C. Li, R. Henao and L. Carin
Annual Meeting of the Association for Computational Linguistics (ACL), 2018. 

Learning Structural Weight Uncertainty for Sequential Decision-Making [Paper] [Code]
Ruiyi Zhang, Chunyuan Li, Changyou Chen, Lawrence Carin
Artificial Intelligence and Statistics (AISTATS), 2018. 

Symmetric Variational Autoencoder and Connections to Adversarial Learning [Paper]
Liqun Chen, Shuyang Dai, Yunchen Pu, Chunyuan Li, Qinliang Su, Lawrence Carin
Artificial Intelligence and Statistics (AISTATS), 2018. 

MIN1PIPE: A Miniscope 1-photon-based Calcium Imaging Signal Extraction Pipeline [Paper] [Code]
J. Lu, C. Li, J. Singh-Alvarado, Z. Zhou, F. Frohlich, R. Mooney and F. Wang
Cell Reports, 2018. (Impact factor: 8.282)

 

Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling [arXiv] [Code] [Slides]
Zhe Gan*, Chunyuan Li*, Changyou Chen, Yunchen Pu, Qinliang Su, Lawrence Carin (* Equal contribution)
Annual Meeting of the Association for Computational Linguistics (ACL), 2017. Oral Presentation

VAE Learning via Stein Variational Gradient Descent [Paper]
Yunchen Pu, Zhe Gan, Ricardo Henao, Chunyuan Li, Shaobo Han, Lawrence Carin
Neural Information Processing Systems (NIPS), 2017

Adversarial Symmetric Variational Autoencoder [Paper]
Yunchen Pu, Weiyao Wang, Ricardo Henao, Liqun Chen, Zhe Gan, Chunyuan Li and Lawrence Carin
Neural Information Processing Systems (NIPS), 2017

Triangle Generative Adversarial Networks [Paper] [Code]
Zhe Gan*, Liqun Chen*, Weiyao Wang, Yunchen Pu, Yizhe Zhang, Hao Liu, Chunyuan Li, Lawrence Carin
Neural Information Processing Systems (NIPS), 2017

Learning Generic Sentence Representations using Convolutional Neural Networks [Paper] [Code]
Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin
Empirical Methods in Natural Language Processing (EMNLP), 2017. Oral Presentation
 
Unsupervised Learning with Truncated Gaussian Graphical Models [Paper]
Qinliang Su, Xuejun Liao, Chunyuan Li, Zhe Gan, Lawrence Carin
AAAI Conference on Artificial Intelligence (AAAI), 2017. Oral Presentation

 






Learning Weight Uncertainty with SG-MCMC for Shape Classification [Paper] [Slides] [Poster] [Illustration]
Chunyuan Li, Andrew Stevens, Changyou Chen, Yunchen Pu, Zhe Gan and Lawrence Carin
Computer Vision and Pattern Recognition (CVPR), 2016. Spotlight Presentation
Equivalence between Dropout and SGLD; SG-MCMC for computer vision.

Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks [PDF] [arXiv] [Code] [Slides]
Chunyuan Li, Changyou Chen, David Carlson and Lawrence Carin
AAAI  Conference on Artificial Intelligence (AAAI), 2016. Oral Presentation
Any preconditioned optimization method (e.g., RMSprop/Adagrad/Adam) as a scalable sampling method

High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models [PDF] [arXiv] [Code] [Poster]
Chunyuan Li, Changyou Chen, Kai Fan and Lawrence Carin
AAAI Conference on Artificial Intelligence (AAAI), 2016

Stochastic Gradient MCMC with Stale Gradients [PDF] [Code]
Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang and Lawrence Carin
Neural Information Processing Systems (NIPS), 2016

Variational Autoencoders for Deep Learning with Images, Labels and Captions [PDF]
Yunchen Pu, Zhe Gan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew Stevens, Lawrence Carin
Neural Information Processing Systems (NIPS), 2016

 

  



  
Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization [PDF] [arXiv] [Code] [Slides]
Changyou Chen, David Carlson, Zhe Gan, Chunyuan Li and Lawrence Carin
Artificial Intelligence and Statistics (AISTATS), 2016. Oral Presentation

A Deep Generative Deconvolutional Image Model [PDF] [arXiv]
Yunchen Pu, Xin Yuan, Andrew Stevens, Chunyuan Li and Lawrence Carin
Artificial Intelligence and Statistics (AISTATS), 2016

Bayesian Dictionary Learning with Gaussian Processes and Sigmoid Belief Networks [PDF]
Yizhe Zhang, Ricardo Henao, Chunyuan Li and Lawrence Carin
International Joint Conference on Artificial Intelligence (IJCAI), 2016

 


Hierarchical Graph-Coupled HMM with Application to Influenza Infection [PDF]
Kai Fan, Chunyuan Li and Katherine Heller
AAAI Conference on Artificial Intelligence (AAAI), 2016

Deep Temporal Sigmoid Belief Networks for Sequence Modeling [PDF] [arXiv] [Code] [Poster]
Zhe Gan, Chunyuan Li, Ricardo Henao, David Carlson and Lawrence Carin
Neural Information Processing Systems (NIPS), 2015
  • Geometry and Topology Methods for Shape Analysis 
 
A comparison of 3D shape retrieval methods based on a large-scale benchmark supporting multimodal queries
Bo Li, Yijuan Lu, Chunyuan Li, Afzal Godil, Tobias Schreck, et al. 
Computer Vision and Image Understanding (CVIU), 2015
 
[PDF] [Dataset 1] [Dataset 2]

 

http://en.wikipedia.org/wiki/Topological_data_analysis
Persistence-based Structural Recognition [PDF] [Poster] [Code]
Chunyuan Li, Maks Ovsjanikov and Frederic Chazal
Computer Vision and Pattern Recognition (CVPR), 2014

Minimum Near-Convex Decomposition for Shape Representation [PDF]
Zhou Ren, Junsong Yuan, Chunyuan Li and Wenyu Liu
International Conference on Computer Vision (ICCV), 2011

 





Spatially Aggregating Spectral Descriptors for Non-Rigid 3D Shape Retrieval [PDF] [Code]
Chunyuan Li and A. Ben Hamza, Multimedia Systems, 2014 
A comparison of spectral descriptors: GPS, HKS, SIHKS, WKS, HMS

A Multi-Resolution Descriptor for Deformable 3D Shape Retrieval [PDF] [Code] [Slides] [Thesis]
Chunyuan Li and A. Ben Hamza, Visual Computer (Computer Graphics International, acceptance rate 18%), 2013 
SGWS: A general form of spectral descriptors from the perspective of spectral graph wavelet transform 

Shape retrieval of non-rigid 3D human models [PDF]
with David Pickup et al., International Journal of Computer Vision (IJCV), 2016
SGWS achieves the highest retrieval performance on a synthetic body shape dataset

    Courses taken/TA'ed at Duke
        ECE681 Pattern Classification: Introduction to Deep Neural Networks [Slides]
          STA561 Probabilistic Machine Learning [Link] [Overview]
          STA571 Advanced Machine Learning
          STA601 Bayesian and Modern Statistics 
          STA663 Statistical Computation [Link]
          
          ECE587 Information Theory
          ECE590 Graphical Models and Inference 
          ECE590 Discrete Optimization 

         [ Good Old Days in Canada ]