About me
I am an LLM post-training lead on the Amazon Store Foundation Modeling team, working on post-training data, training paradigms (IFT/DPO/RL/Agentic), applications, tooling, and live-traffic behavior alignment. We build ultra-large-scale models (~100B dense and X00B MoE) from scratch and customize them for shopping (Rufus). I obtained my Ph.D. from HKUST CSE under the supervision of Prof. Qiang Yang. My research has been grounded in various Amazon products and has delivered significant improvements to conversational shopping (Rufus), e-commerce KG (COSMO), search query understanding, navigation, and discovery (A9), driving ~$0.84B in revenue gains since 2021.
Selected Projects
- Rufus: The world's first LLM-based shopping assistant, built from 0 to 1 (Founding Member)
- COSMO: The world’s largest LLM-powered shopping KG, built from 0 to 1 and grounded in Amazon Search. Its paper and blog post ranked No. 1 and No. 5 among Amazon's most-viewed publications and blog posts of 2024. (Science Lead)
- KDD Cup’24: LLMs for Shopping (Challenge Lead)
- KDD Cup’23: Multilingual Session-based Recommendation (Challenge Co-Lead)
- ACL 2023 Outstanding Paper Award
Research interests
My current interests focus on building reliable and responsible generative foundation models, specializing in post-training and its applications:
- RLHF; E2E Agentic RL; Shopping Agent with Thinking; Constitutional RL; Live Traffic Behavior Alignment;
- Supervised/Instruction Fine-Tuning; Synthetic Data Generation; LLM Application & Evaluation.
Experiences
- Amazon Search (A9), 2020-Present, Senior Applied Scientist, Bay Area, USA
- Google Research, Jun 2020-Sep 2020, Research intern, NLX Group, Mountain View, CA, USA. Host: Ji Young Lee
- Amazon Search (A9), Sep 2019-Dec 2019, Applied scientist intern, Search and NLP group, Palo Alto, CA, USA. Research topic: Meta Learning, Cross-lingual transfer. Host: Bing Yin
- HKUST Fok Ying Tung Research, Jun 2016-Aug 2016, Research intern. Host: Prof. Qiang Yang
- Microsoft Research Asia (MSRA), Jul 2015-Oct 2015, Research intern, Multimedia Search and Mining Group. Host: Dr. Tao Mei
News
- Oct 2025 - I will serve as an Area Chair for ICLR 2026.
- Oct 2025 - I will serve as an SPC member for AAAI 2026.
- Aug 2025 - Three papers were accepted by EMNLP 2025.
- Aug 2025 - One paper was accepted by COLM 2025.
- May 2025 - Two papers were accepted by ACL 2025.
- Jan 2025 - Two papers were accepted by NAACL 2025.
- Jan 2025 - One paper was accepted by Nature Reviews Bioengineering.
- Dec 2024 - Our COSMO [paper] and [blog] ranked 1st and 5th in Amazon's [10-most-viewed-publications-of-2024] and [10-most-viewed-blog-posts-of-2024], respectively.
- Oct 2024 - One paper was accepted by NeurIPS 2024.
- Sep 2024 - Four papers were accepted by EMNLP 2024.
- May 2024 - Two papers were accepted by ICML 2024.
- May 2024 - One paper was accepted by KDD 2024.
- May 2024 - One paper was accepted by ACL 2024.
- April 2024 - One paper was accepted by NAACL 2024.
- March 2024 - We are hosting the 🛍️Amazon KDD Cup 2024: Multi-Task Online Shopping Challenge for LLMs with plenty of awards. Click here to contribute ingenious solutions 🚀!
- Jan 2024 - One paper was accepted by SIGMOD 2024.
- Jan 2024 - One paper was accepted by WWW 2024.
Publications [Google Scholar]
(* denotes equal contribution, # denotes the corresponding author, + denotes interns/students I mentored)
2025
- Application of Large Language Models in Medicine [pdf][github]
Hongjian Zhou+, Fenglin Liu+, Zheng Li#, Jiebo Luo, David A. Clifton (Nature Reviews Bioengineering 2025)
- Aligning Large Language Models with Implicit Preferences from User-Generated Content [pdf]
Zhaoxuan Tan+, Zheng Li#, et al. (ACL 2025)
- UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations [pdf]
Fengran Mo+, Yifan Gao, Chuan Meng, Xin Liu, Zhuofeng Wu, Kelong Mao, Zhengyang Wang, Pei Chen, Zheng Li, Xian Li, Bing Yin, Meng Jiang (ACL 2025)
- DrAgent: Empowering Large Language Models as Medical Agents for Multi-hop Medical Reasoning [pdf]
Fenglin Liu+, Zheng Li#, et al. (EMNLP 2025)
- Can Language Models Follow Multiple Turns of Entangled Instructions? [pdf]
Chi Han+, Xin Liu, Haodong Wang, Shiyang Li, Jingfeng Yang, Haoming Jiang, Zhengyang Wang, Qingyu Yin, Liang Qiu, Changlong Yu, Yifan Gao, Zheng Li, Bing Yin, Jingbo Shang, Heng Ji (EMNLP 2025)
- GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [pdf]
Jialin Chen+, Houyu Zhang, Seongjun Yun, Alejandro Mottini, Rex Ying, Xiang Song, Vassilis N. Ioannidis, Zheng Li, Qingjun Cui (EMNLP 2025)
- Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only [pdf]
Qingru Zhang+, Liang Qiu, Ilgee Hong, Zhenghao Xu, Tianyi Liu, Shiyang Li, Rongzhi Zhang, Zheng Li, Lihong Li, Bing Yin, Chao Zhang, Jianshu Chen, Haoming Jiang, Tuo Zhao (COLM 2025)
- Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training [pdf]
Yuchen Zhuang, Jingfeng Yang, Haoming Jiang, Xin Liu, Kewei Cheng, Sanket Lokegaonkar, Yifan Gao, Qing Ping, Tianyi Liu, Binxuan Huang, Zheng Li, Zhengyang Wang, Pei Chen, Ruijie Wang, Rongzhi Zhang, Nasser Zalmout, Priyanka Nigam, Bing Yin, Chao Zhang (NAACL 2025)
- IHEval: Evaluating Language Models on Following the Instruction Hierarchy [pdf]
Zhihan Zhang, Shiyang Li, Zixuan Zhang, Xin Liu, Haoming Jiang, Xianfeng Tang, Yifan Gao, Zheng Li, Haodong Wang, Zhaoxuan Tan, Yichuan Li, Qingyu Yin, Bing Yin, Meng Jiang (NAACL 2025)
2024
- Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models [pdf][code & data][leaderboard][workshop][competition]
Yilun Jin+#, Zheng Li#, Chenwei Zhang, et al. (NeurIPS 2024, our KDD Cup’24 benchmark paper)
- Evolutionary Contrastive Distillation for Language Model Alignment [pdf]
Julian Katz-Samuels#, Zheng Li#, Hyokun Yun#, Priyanka Nigam, Yi Xu, Vaclav Petricek, Bing Yin, Trishul Chilimbi (EMNLP 2024)
- Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark [pdf][code][leaderboard]
Fenglin Liu+, Zheng Li#, et al. (EMNLP 2024)
- IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Large Language Models in E-commerce [pdf][code]
Wenxuan Ding, Weiqi Wang, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Xin Liu, Changlong Yu, Zheng Li, Chen Luo, Qingyu Yin, Bing Yin, Junxian He, Yangqiu Song (EMNLP 2024)
- MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding [pdf][code]
Baixuan Xu, Weiqi Wang, Haochen Shi, Wenxuan Ding, Huihao Jing, Tianqing Fang, Jiaxin Bai, Xin Liu, Changlong Yu, Zheng Li, Chen Luo, Qingyu Yin, Bing Yin, Long Chen, Yangqiu Song (EMNLP 2024)
- COSMO: A Large-Scale E-commerce Common Sense Knowledge Generation and Serving System at Amazon [pdf]
Changlong Yu*+, Xin Liu*+, …, Zheng Li*# (SIGMOD 2024)
- Language Models As Semantic Indexers [pdf]
Bowen Jin+, Hansi Zeng, Guoyin Wang, Xiusi Chen, Tianxin Wei, Ruirui Li, Zhengyang Wang, Zheng Li, Yang Li, Hanqing Lu, Suhang Wang, Jiawei Han, Xianfeng Tang (ICML 2024)
- MEMORYLLM: Toward Self-Updating Large Language Models [pdf]
Yu Wang+, Yifan Gao, Xiusi Chen, Haoming Jiang, Shiyang Li, Jingfeng Yang, Qingyu Yin, Zheng Li, Xian Li, Bing Yin, Jingbo Shang, Julian McAuley (ICML 2024)
- Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [pdf]
Bowen Jin+, Chulin Xie, Jiawei Zhang, Kashob Kumar Roy, Yu Zhang, Zheng Li, Ruirui Li, Xianfeng Tang, Suhang Wang, Yu Meng, Jiawei Han (ACL 2024)
- IterAlign: Iterative Constitutional Alignment of Large Language Models [pdf]
Xiusi Chen, Hongzhi Wen, Sreyashi Nag, Chen Luo, Qingyu Yin, Ruirui Li, Zheng Li, Wei Wang (NAACL 2024)
- Understanding Inter-Session Intentions via Complex Logical Reasoning [pdf]
Jiaxin Bai+, Chen Luo, Zheng Li, Qingyu Yin, Yangqiu Song (KDD 2024)
- Hierarchical Query Classification in E-commerce Search [pdf]
Bing He+, Sreyashi Nag, Limeng Cui, Suhang Wang, Zheng Li, Rahul Goutam, Zhen Li, Haiyang Zhang (WWW 2024)
2023
- Amazon-M2: A Multilingual Multi-locale Shopping Session Dataset for Recommendation and Text Generation [pdf][KDD Cup website]
Wei Jin+, Haitao Mao, Zheng Li, …, Xianfeng Tang (NeurIPS 2023, our KDD Cup’23 benchmark paper)
- Enhancing User Intent Capture in Session-Based Recommendation with Attribute Patterns [pdf][code]
Xin Liu+, Zheng Li#, Yifan Gao, Jingfeng Yang, Tianyu Cao, Zhengyang Wang, Bing Yin, Yangqiu Song (NeurIPS 2023)
- Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning [pdf][code]
Ruijie Wang+, Zheng Li#, Jingfeng Yang, Tianyu Cao, Bing Yin, Tarek Abdelzaher (WWW 2023)
- SCOTT: Self-Consistent Chain-of-Thought Distillation [pdf][code]
Peifeng Wang+, Zhengyang Wang, Zheng Li#, Yifan Gao, Bing Yin, Xiang Ren (ACL 2023, Outstanding Paper Award)
- Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels [pdf][code]
Fenglin Liu+, Bang Yang, Zheng Li#, Qingyu Yin, Chenyu You, Xuewei Ma, Bing Yin and Yuexian Zou (ACL 2023)
- FolkScope: Intention Knowledge Graph Construction for E-commerce Commonsense Discovery [pdf][code]
Changlong Yu+, Weiqi Zhang, Xin Liu+, Jiaxin Bai+, Yangqiu Song, Zheng Li, Yifan Gao, Tianyu Cao, Bing Yin (ACL 2023)
- Graph Reasoning for Question Answering with Triplet Retrieval [pdf]
Shiyang Li+, Yifan Gao, Haoming Jiang, Qingyu Yin, Zheng Li, Xifeng Yan, Chao Zhang and Bing Yin (ACL 2023)
- Improving Consistency for Text Summarization with Energy Functions [pdf][code]
Qi Zeng, Qingyu Yin, Zheng Li, Yifan Gao, Sreyashi Nag, Zhengyang Wang, Bing Yin, Heng Ji, Chao Zhang (EMNLP 2023)
- Knowledge-Selective Pretraining for Attribute Value Extraction [pdf][code]
Hui Liu, Qingyu Yin, Zhengyang Wang, Chenwei Zhang, Haoming Jiang, Yifan Gao, Zheng Li, Xian Li, Chao Zhang, Bing Yin, William Yang Wang, Xiaodan Zhu (EMNLP 2023)
- Knowledge Graph Reasoning over Entities and Numerical Values [pdf][code]
Jiaxin Bai, Chen Luo, Zheng Li, Qingyu Yin, Bing Yin, Yangqiu Song (KDD 2023)
- HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers [pdf][code]
Chen Liang+, Haoming Jiang, Zheng Li, Xianfeng Tang, Bing Yin, Tuo Zhao (ICLR 2023)
2022
- Learning to Sample and Aggregate: Few-shot Reasoning over Temporal Knowledge Graph [pdf][code]
Ruijie Wang+, Zheng Li#, Dachun Sun, Shengzhong Liu, Jinning Li, Bing Yin, Tarek Abdelzaher (NeurIPS 2022)
- Multilingual Knowledge Graph Completion with Self-Supervised Adaptive Graph Alignment [pdf][code][data][media]
Zijie Huang+, Zheng Li#, Haoming Jiang, Tianyu Cao, Hanqing Lu, Bing Yin, Karthik Subbian, Yizhou Sun, Wei Wang (ACL 2022, Long paper)
- RETE: Retrieval-Enhanced Temporal Event Forecasting on Unified Query Product Evolutionary Graph [pdf][code]
Ruijie Wang+, Zheng Li#, Danqing Zhang, Qingyu Yin, Tong Zhao, Bing Yin and Tarek Abdelzaher (WWW 2022, Long paper, Research track)
- Disentangling Task Relations for Few-shot Text Classification via Self-Supervised Hierarchical Task Clustering [pdf]
Juan Zha*, Zheng Li*, Ying Wei and Yu Zhang (EMNLP 2022, Long paper)
- Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training [pdf][code][academic data][e-commerce data]
Yifan Gao+, Qingyu Yin#, Zheng Li#, Rui Meng, Tong Zhao, Bing Yin, Irwin King, Michael Lyu (NAACL 2022, Long paper)
- Condensing Graphs via One-Step Gradient Matching [pdf][code]
Wei Jin+, Xianfeng Tang, Haoming Jiang, Zheng Li, Danqing Zhang, Jiliang Tang, Bing Yin (KDD 2022, Long paper, Research Track)
- Query Attribute Recommendation at Amazon Search [pdf]
Chen Luo, William Headden, Neela Avudaiappan, Haoming Jiang, Tianyu Cao, Qingyu Yin, Yifan Gao, Zheng Li, Rahul Goutam, Haiyang Zhang, Bing Yin (RecSys 2022, Industry Track)
2021 & Before
- Meta Teacher Student Network for Multilingual Sequence Labeling with Minimal Supervision [pdf][code]
Zheng Li, Danqing Zhang, Tianyu Cao, Yiwei Song, Bing Yin (EMNLP 2021, Long paper, poster)
- QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction [pdf]
Danqing Zhang*, Zheng Li*, Tianyu Cao, Chen Luo, Tony Wu, Hanqing Lu, Yiwei Song, Bing Yin, Tuo Zhao, Qiang Yang (CIKM 2021, Long paper, Applied science track)
- Learn to Cross-lingual Transfer with Meta Graph Learning Across Heterogeneous Languages [pdf]
Zheng Li, Mukul Kumar, William Headden, Bing Yin, Ying Wei, Yu Zhang, Qiang Yang (EMNLP 2020, Long paper, oral)
- Transferable End-to-End Aspect-based Sentiment Analysis with Selective Adversarial Learning [pdf][slides][code]
Zheng Li, Xin Li, Ying Wei, Lidong Bing, Yu Zhang, Qiang Yang (EMNLP 2019, Long paper, oral)
- Exploiting Coarse-to-Fine Task Transfer for Aspect-level Sentiment Classification [pdf][slides][data]
Zheng Li, Ying Wei, Yu Zhang, Xiang Zhang, Xin Li, Qiang Yang (AAAI 2019, oral)
- Hierarchical Attention Transfer Network for Cross-domain Sentiment Classification [pdf][slides][code][demo]
Zheng Li, Ying Wei, Yu Zhang, Qiang Yang (AAAI 2018, oral)
- End-to-End Adversarial Memory Network for Cross-domain Sentiment Classification [pdf][slides][code]
Zheng Li, Yu Zhang, Ying Wei, Yuxiang Wu, Qiang Yang (IJCAI 2017, oral)
- Compressive Perceptual Hashing Tracking [pdf]
Zheng Li, Long Chen, Jian-Fei Yang (Neurocomputing 2017)
- Online Visual Tracking via Correlation Filter with Convolutional Networks [pdf][slides][demo]
Zheng Li, Jianfei Yang, Juan Zha, Chang-Dong Wang, Weishi Zheng (VCIP 2016, oral)
- Compressive Perceptual Hashing Tracking with Online Foreground Learning [pdf][slides][demo]
Zheng Li, Jian-Fei Yang, Long Chen, Juan Zha (ROBIO 2015, oral)
- Robust Vehicle Tracking Using Perceptual Hashing Algorithm [pdf]
Zheng Li, Jian-Fei Yang, Long Chen, Juan Zha (ICMLA 2015, oral)
- Long-Term Revenue Maximization Pricing Scheme for Cloud
Wen-Kai Huan, Chang-Dong Wang, Shao-Shu Huan, Zheng Li, Jian-Huang Lai, Ling Huang (IJSSE journal 2015)
Professional Activities
- Program Organizer/Chair: KDD Cup 2023, 2024.
- Senior Program Committee/Area Chair (Meta-Reviewer): ICLR 2026, AAAI (2023-2025), IJCAI (2021)
- Program Committee (Reviewer):
  - ICLR, NeurIPS, ICML, KDD, ACL, NAACL, AAAI, IJCAI (2022)
  - NeurIPS, ICLR, ACL, EMNLP, NAACL, AAAI, IJCAI (2021)
  - ACL, EMNLP, ICLR, AAAI, IJCAI (2020)
- Conference Secondary Reviewer: AAAI, IJCAI (2019)
- Journal Reviewer: PAMI, TBD, Neurocomputing
Honors & Awards
- Jul 2023, ACL 2023 Outstanding Paper Award
- Nov 2018, Baidu PhD Fellowship Nomination Awards, about 20/5,000 applicants worldwide.
- 2017-2019, AAAI 2019, AAAI 2018, and IJCAI 2017 student travel awards
- Jun 2016, Excellent Graduates Awards, Sun Yat-sen University
- May 2016, Excellent Undergraduate Thesis Awards, Sun Yat-sen University
- Sep 2015, “YongSheng Liu” Excellent Undergraduate Scholarship
- Sep 2015, First-class Merit Scholarship, Sun Yat-sen University
- Aug 2015, “HUAWEI” Cup China Intelligent Design Competition, Second Prize
- Sep 2014, Second-class Merit Scholarship, Sun Yat-sen University
- Sep 2013, Third-class Merit Scholarship, Sun Yat-sen University
Mentorship
- Ruijie Wang, UIUC Ph.D., topic: Temporal Event Forecasting. Achievement: NeurIPS 2022, WWW 2022, WWW 2023. Now: Assistant Professor, Beihang University.
- Zijie Huang, UCLA Ph.D., topic: Multilingual KG Completion. Achievement: ACL 2022. Now: Research Scientist, Google DeepMind.
- Peifeng Wang, USC Ph.D., topic: Self-Consistent Chain-of-Thought Distillation. Achievement: ACL 2023 Outstanding Paper Award. Now: Research Scientist, Meta.
- Wei Jin, Michigan State University Ph.D., topic: Graph Condensation. Achievement: KDD 2022, KDD Cup 2023. Now: Assistant Professor, Emory University.
- Fenglin Liu, Oxford Ph.D., topic: Benchmarking LLMs on Healthcare, Few-shot Product Title Generation. Achievement: ACL 2023, EMNLP 2025, Nature Reviews Bioengineering. Now: Applied Scientist, Amazon.
- Chen Liang, Gatech Ph.D., topic: Knowledge Distillation. Achievement: ICLR 2023. Now: Senior Researcher, Microsoft.
- Changlong Yu, HKUST Ph.D., topic: Commonsense Knowledge Graph Construction. Achievement: ACL 2023, SIGMOD 2024. Now: Applied Scientist, Amazon.
- Yifan Gao, CUHK Ph.D., topic: Multilingual Keyphrase Generation. Achievement: NAACL 2022. Now: Senior Applied Scientist, Amazon.
- Jiaxin Bai, HKUST Ph.D., topic: Knowledge Graph Reasoning. Achievement: KDD 2023, KDD 2024.
- Shiyang Li, UCSB Ph.D., topic: Retrieval-Augmented Question Answering. Achievement: ACL 2023. Now: Applied Scientist, Amazon.
- Xin Liu, HKUST Ph.D., topic: Session-based Recommendation. Achievement: NeurIPS 2023, SIGMOD 2024. Now: Senior Applied Scientist, Amazon.
- Hui Liu, HKUST Ph.D., topic: Knowledge-Selective Pretraining for Attribute Value Extraction. Achievement: EMNLP 2023. Now: Applied Scientist, Amazon.
- Qi Zeng, UIUC Ph.D., topic: Document Summarization. Achievement: EMNLP 2023. Now: Research Scientist, Meta.
- Yilun Jin, HKUST Ph.D., topic: Comprehensive Benchmark of Large Language Models for E-Commerce Applications. Achievement: KDD Cup 2024.
- Xiusi Chen, UCLA Ph.D., topic: Iterative Constitutional Alignment of Large Language Models. Achievement: NAACL 2024.
- Bing He, Gatech Ph.D., topic: Hierarchical Classification. Achievement: WWW 2024. Now: Applied Scientist, Amazon.
- Bowen Jin, UIUC Ph.D. student, topic: Semantic ID Generation by Large Language Model.
- Yu Wang, UCSD Ph.D. student, topic: Emulating Human Memory: Towards Autonomous, Lifelong Learning Large Language Model.
- Jie Huang, UIUC Ph.D., topic: Explainable Complementary Concept Generation in E-Commerce. Now: Research Scientist, xAI.
- Enyan Dai, Penn State Ph.D., topic: Event Extraction. Now: Assistant Professor, HKUST.
- Yujia Xie, Gatech Ph.D., topic: Extreme Multi-label Classification. Achievement: Amazon Post-internship Fellowship. Now: Principal Researcher, Microsoft.
- Xutang Peng, University of Sheffield Ph.D., topic: Multilingual KG Pretraining.
- Xiaotian Han, Texas A&M Ph.D., topic: Large Language Model Evaluation. Now: Applied Scientist, Amazon.
- Kewei Cheng, UCLA Ph.D., topic: Can Large Language Models (LLMs) Generalize a Model based on Few-shot Examples? Now: Applied Scientist, Amazon.
- Zhaoxuan Tan, University of Notre Dame Ph.D. student, topic: LLM for Personalization.
- Fengran Mo, Université de Montréal Ph.D. student.
- Fan Wei, HKUST Ph.D. student, topic: Hierarchical RL for Deep Researcher.
- Feng Yao, UCSD Ph.D. student, topic: Async RL Training Infra.
- Yupeng Hou, UCSD Ph.D. student, topic: Reasoning Tokenization.