Chunyuan Li
I am a principal researcher at Microsoft Research, Redmond. My recent research focuses on large-scale pre-training in computer vision and natural language processing. Some recent works include:
- Building large multimodal models that follow human intents [1]
- Vision-and-language pre-training [1, 2, 3]
- Deep generative models at scale [1, 2, 3, 4]
I obtained my PhD in machine learning at Duke University, advised by Prof. Lawrence Carin. My PhD research studies probabilistic deep learning. I have served as an Area Chair for NeurIPS, ICML, ICLR, EMNLP & AAAI, and a Guest Editor of IJCV on "the promises and dangers of large vision models".
news
Oct/Nov, 2023 | LLaVA is upgraded: |
---|---|
September 20, 2023 | A 110-page paper is released to share our perspective on LMMs: "Multimodal Foundation Models: From Specialists to General-Purpose Assistants". This is based on our CVPR 2023 Tutorial. [Note on Large Multimodal Models] [Slides] [YouTube] [Bilibili] |
June 1, 2023 | LLaVA-Med: Training a large language-and-vision assistant for biomedicine in one day. NeurIPS 2023 Datasets and Benchmarks Track (Spotlight) |
April 17, 2023 | Visual Instruction Tuning with GPT-4! We release LLaVA, a Large Language-and-Vision Assistant towards multimodal GPT-4 level capabilities. NeurIPS 2023 (Oral Presentation) [Project] [Paper] [Github] [Demo] [Data] [Model] [Scaling Note] |
April 7, 2023 | Instruction Tuning with GPT-4! A "first attempt" to use GPT-4 data for LLM self-instruct tuning. [Paper] [Github] [My Learnings] |
March, 2023 | CVPR 2023: |
Feb, 2023 | CVPR 2023 Workshop and Challenge on the 2nd Computer Vision in the Wild (CVinW). For those who are new to this topic, please check out the CVinW Reading List. [Workshop] [SGinW Challenge] [RF100 Challenge] |
Oct 23, 2022 | ECCV 2022 Workshop and Challenge on the 1st Computer Vision in the Wild (CVinW). Please check out the videos of this event at [YouTube] [BiliBili]. [Workshop] [ICinW Challenge] [ODinW Challenge] |
Oct 17, 2022 | "Vision-Language Pre-Training: Basics, Recent Advances, and Future Trends", a 100-page survey paper in Foundations and Trends® in Computer Graphics and Vision. |
Sep 16, 2022 | NeurIPS 2022: K-LITE (Oral, 1%), ELEVATER and FocalNet. A team effort to push CVinW. [CVPR Tutorial] |
Mar 25, 2022 | Upcoming events as a co-organizer: |
Mar 1, 2022 | CVPR 2022: |
June 17, 2021 | EsViT achieves SoTA 81.3% top-1 on the ImageNet linear probe evaluation, outperforming prior art with an order of magnitude higher throughput. [GitHub] |
recent publications
- K-LITE
- ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models. NeurIPS (Datasets and Benchmarks Track) 2022