Yushun Zhang

Ph.D. student,
School of Data Science,
The Chinese University of Hong Kong, Shenzhen, China

Email: yushunzhang [AT] link.cuhk.edu.cn

Google Scholar    /    GitHub    /    Twitter    /    Weibo

About me

I'm a Ph.D. student in the School of Data Science at The Chinese University of Hong Kong, Shenzhen, China. I'm very proud to be advised by Prof. Zhi-Quan (Tom) Luo, and I'm also very fortunate to work closely with Prof. Ruoyu Sun. Previously, I did my undergraduate studies in the Department of Mathematics at Southern University of Science and Technology (SUSTech).

My research focuses on optimization, deep learning, and, in particular, large language models. I am interested in important and practical problems with an optimization flavor.

Biography

  • 2019 - Present: Ph.D. student at The Chinese University of Hong Kong, Shenzhen
  • 2015 - 2019: B.Sc., Southern University of Science and Technology
  • 2012 - 2015: Shenzhen Foreign Language School
  • 2009 - 2012: Shenzhen Foreign Language School, Branch

Research that I lead or co-lead

Finite Horizon Optimization: Framework and Applications
Yushun Zhang, Dmitry Rybin, Zhi-Quan Luo
Preprint

Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang*, Congliang Chen*, Ziniu Li, Tian Ding, Chenwei Wu, Diederik P. Kingma, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun
ICLR 2025

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
NeurIPS 2024

Adam Can Converge Without Any Modification on Update Rules
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo
NeurIPS 2022

Does Adam Converge and When?
Yushun Zhang, Congliang Chen, Zhi-Quan Luo
ICLR Blog Track 2022

When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Jiawei Zhang*, Yushun Zhang*, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo
NeurIPS 2021

Research that I participate in

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
ICML 2024

Provable Adaptivity of Adam under Non-Uniform Smoothness
Bohan Wang*, Yushun Zhang*, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Zhi-Quan Luo, Tie-Yan Liu, Wei Chen
KDD 2024

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
ICLR 2022
(This work was also selected for an oral presentation at a NeurIPS workshop, 2021)

Fast QLB algorithm and hypothesis tests in logistic model for ophthalmologic bilateral correlated data
Yiqi Lin*, Yushun Zhang*, Guoliang Tian, Changxing Ma
Journal of Biopharmaceutical Statistics 2020
(This work was done at SUSTech)

Invited Talks

Dec 2024: I gave a talk at Tsinghua University, hosted by Kaifeng Lyu. Thanks Kaifeng for the invitation!

  • Topic: Adam for Transformers: Why and Why Not

  • Slides are available here

Oct 2024: I gave a talk at the University of Minnesota, hosted by Prof. Mingyi Hong. Thanks Prof. Hong for the invitation!

Oct 2024: I gave a talk at the INFORMS Annual Meeting, Seattle, hosted by Jianhao Ma. Thanks Jianhao for the invitation!

Sep 2023: I gave a talk at Tsinghua University, hosted by Prof. Jian Li. Thanks Prof. Li for the invitation!

Jan 2023: I gave a talk at Google Brain, hosted by Dr. Diederik P. Kingma. Thanks Dr. Kingma for the invitation!

  • Topic: Adam Can Converge Without Any Modification on Update Rules

  • Slides are available here

Awards

Dec 2023: Duan Yongping Outstanding Research Award (1st place)

Dec 2023: Teaching Assistant Award, School of Data Science

Aug 2022: Best Paper Presentation Award (1st place), 2nd Doctoral and Postdoctoral Daoyuan Academic Forum

  • Topic: Does Adam Converge and When?

  • Slides are available here.

  • A short version of this talk can be viewed here.

Jul 2021: Best Paper Presentation Award (1st place), 3rd Tsinghua-Berkeley Workshop on Learning Theory

  • Topic: When Expressivity Meets Trainability: Width < n Can Work

  • A short version of this talk can be viewed here.

Jun 2019: Magna cum laude, SUSTech

Jun 2019: Outstanding graduation thesis, SUSTech

Sep 2018: Scholarship Award for Excellence, Department of Mathematics, SUSTech (top 10 students)

Services

Reviewer

I serve as a reviewer for machine learning conferences including NeurIPS, ICLR, ICML, COLT, AISTATS, as well as journals including JMLR and TMLR.

Social Activities

I hosted a session titled “Optimization Issues in Recent AI Models” at the INFORMS Annual Meeting, Oct 2024.

Teaching Assistant (in reverse chronological order)

  • DDA 4300: Optimization for Machine Learning, by Prof. Yinyu Ye (2023 Spring)

  • DDA 6060: Machine Learning, by Prof. Hongyuan Zha & Prof. Shuang Li (2022 Spring)

  • DDA 4002: Multivariate Statistics, by Prof. Zhaoyuan Li (2021 Autumn)

  • DDA 4250: Mathematics for Deep Learning, by Prof. Arnulf Jentzen (2021 Spring)

  • MFE 5100: Optimization, by Prof. Zizhuo Wang (2020 Autumn)

  • STA 2002: Probability and Statistics, by Prof. Xinyun Chen (2020 Summer)

  • CSC 4020: Fundamentals of Machine Learning, by Prof. Hongyuan Zha (2020 Spring)

  • MAT 2040: Linear Algebra, by Prof. Shenghao Yang (2019 Autumn)

  • MAT 7035: Computational Statistics, by Prof. Guoliang Tian (SUSTech) (2018 Autumn)

  • MA 204: Mathematical Statistics, by Prof. Guoliang Tian (SUSTech) (2018 Spring)

Experiences

2019 Spring: I had a great time as an exchange undergraduate student in the Department of Mathematics at UC San Diego.

2009 - 2012: I spent the best three years at the Shenzhen Foreign Language School, Branch.