Yushun Zhang

Yushun Zhang (张雨舜)
Ph.D. student,
School of Data Science,
The Chinese University of Hong Kong, Shenzhen, China

Email: yushunzhang AT link DOT cuhk DOT edu DOT cn

[Google Scholar] [Twitter] [Github] [Instagram]

About me

I'm a Ph.D. student in the School of Data Science at The Chinese University of Hong Kong, Shenzhen, China. I'm very proud to be advised by Prof. Zhi-Quan (Tom) Luo. I'm also very fortunate to work closely with Prof. Ruoyu Sun. Previously, I did my undergraduate study in the Department of Mathematics at Southern University of Science and Technology (SUSTech).

My research focuses on optimization, deep learning, and especially large language models. I aim to work on important and practical problems with an optimization flavor.

Research that I lead or co-lead

Adam-mini: Use Fewer Learning Rates To Gain More
Yushun Zhang*, Congliang Chen*, Ziniu Li, Tian Ding, Chenwei Wu, Diederik P. Kingma, Yinyu Ye, Zhi-Quan Luo, Ruoyu Sun
Preprint

Why Transformers Need Adam: A Hessian Perspective
Yushun Zhang, Congliang Chen, Tian Ding, Ziniu Li, Ruoyu Sun, Zhi-Quan Luo
NeurIPS 2024

Adam Can Converge Without Any Modification on Update Rules
Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo
NeurIPS 2022

Does Adam Converge and When?
Yushun Zhang, Congliang Chen, Zhi-Quan Luo
ICLR Blog Track 2022

When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Jiawei Zhang*, Yushun Zhang*, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo
NeurIPS 2021

Research that I proudly participate in

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo
ICML 2024

Provable Adaptivity of Adam under Non-Uniform Smoothness
Bohan Wang*, Yushun Zhang*, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Zhi-Quan Luo, Tie-Yan Liu, Wei Chen
KDD 2024

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
Ziniu Li, Yingru Li, Yushun Zhang, Tong Zhang, Zhi-Quan Luo
ICLR 2022
(This work was also selected for an oral presentation at a NeurIPS workshop, 2021)

Fast QLB algorithm and hypothesis tests in logistic model for ophthalmologic bilateral correlated data
Yiqi Lin*, Yushun Zhang*, Guoliang Tian, Changxing Ma
Journal of Biopharmaceutical Statistics 2020
(This work was done at SUSTech)

Invited Talks

Oct 2024: I gave a talk at the University of Minnesota, hosted by Prof. Mingyi Hong. Thanks to Prof. Hong for the invitation!

Oct 2024: I gave a talk at the INFORMS Annual Meeting, Seattle, hosted by Jianhao Ma. Thanks to Jianhao for the invitation!

Sep 2023: I gave a talk at Tsinghua University, hosted by Prof. Jian Li. Thanks Prof. Li for the invitation!

  • Topic: Converge or Diverge? A Story of Adam

  • Slides can be seen here

Jan 2023: I gave a talk at Google Brain, hosted by Dr. Diederik P. Kingma. Thanks Dr. Kingma for the invitation!

  • Topic: Adam Can Converge Without Any Modification on Update Rules

  • Slides can be seen here

Awards (By time)

Duan Yongping Outstanding Research Award (1st place), 2023

Teaching Assistant Award, School of Data Science, 2023

Best Paper Presentation Award (1st place), 2nd Doctoral and Postdoctoral Daoyuan Academic Forum, 2022

  • Topic: Does Adam Converge and When?

  • Slides can be seen here.

  • A short version of this talk can be viewed here.

Best Paper Presentation Award (1st place), 3rd Tsinghua-Berkeley workshop on Learning Theory, 2021

  • Topic: When Expressivity Meets Trainability: Width < n Can Work

  • A short version of this talk can be viewed here.

Magna cum laude of SUSTech, 2019

Outstanding graduation thesis, SUSTech, 2019

Scholarship Award for Excellence, Department of Mathematics, SUSTech (Top 10 students), 2018

Services

Reviewer

I serve as a reviewer for machine learning conferences including NeurIPS, ICLR, ICML, COLT, AISTATS, as well as journals including JMLR and TMLR.

Social Activities

I hosted a session named “Optimization Issues in Recent AI Models” at the INFORMS Annual Meeting, Oct 2024.

Teaching Assistant (by time)

  • DDA 4300: Optimization for Machine Learning, by Prof. Yinyu Ye (2023 Spring)

  • DDA 6060: Machine Learning, by Prof. Hongyuan Zha & Prof. Shuang Li (2022 Spring)

  • DDA 4002: Multivariate Statistics, by Prof. Zhaoyuan Li (2021 Autumn)

  • DDA 4250: Mathematics for Deep Learning, by Prof. Arnulf Jentzen (2021 Spring)

  • MFE 5100: Optimization, by Prof. Zizhuo Wang (2020 Autumn)

  • STA 2002: Probability and Statistics, by Prof. Xinyun Chen (2020 Summer)

  • CSC 4020: Fundamentals of Machine Learning, by Prof. Hongyuan Zha (2020 Spring)

  • MAT 2040: Linear Algebra, by Prof. Shenghao Yang (2019 Autumn)

  • MAT 7035: Computational Statistics, by Prof. Guoliang Tian (SUSTech) (2018 Autumn)

  • MA 204: Mathematical Statistics, by Prof. Guoliang Tian (SUSTech) (2018 Spring)

Experiences

I had a great time as an undergraduate exchange student in the Department of Mathematics at UC San Diego in Spring 2019.

I spent the best three years at the Shenzhen Foreign Language School, branch, 2009 - 2012.