Wentao Zhang is currently a postdoctoral research fellow working with Prof. Jian Tang at the Montreal Institute for Learning Algorithms (Mila). He received his Ph.D. degree in computer science from Peking University in June 2022, supervised by Prof. Bin Cui, and worked with Prof. Lei Chen as a visiting scholar at HKUST in 2019. Wentao also has 4 years of industrial experience at Tencent and Apple Research.
Motivated by industrial demand, Wentao’s research focuses on large-scale graph learning from three perspectives: data, model, and system. He has published 30+ papers, including 10+ first-author papers at top DB (SIGMOD, VLDB, ICDE), DM (KDD, WWW), and ML (ICML, NeurIPS, ICLR) conferences. He is also a contributor to or designer of several system projects, including Angel, SGL, MindWare, and OpenBox. His research powers several billion-scale applications at Tencent, and some of it has been recognized by prestigious awards, including the Outstanding Doctoral Dissertation Award and the Best Student Paper Award at WWW’22.
Feel free to reach out if you are interested in discussing ideas or working together. 😜
Email: wentao.zhang@mila.quebec
WeChat (微信): z1299799152
Office: 6666 St-Urbain Street, #200. Montreal, QC.
I’m on the job market now!
If you are interested, contact me via email or WeChat.
Research Interests
Data: annotation, augmentation, imbalance, noise, generation, out-of-distribution, heterogeneity, and privacy.
Model: scalable, fast, memory-efficient, and powerful graph learning.
System: large-scale distributed training, AutoML, and data-centric AI platform.
Application: AI4Science (e.g., drugs and proteins) and AI4Industry (e.g., recommender systems and anomaly detection).
A Summary of My Research on Large-scale Graph Learning:
- Data: how to improve the quality and quantity of large-scale graph data?
– Data-centric Graph Learning
- Model-free annotation [Grain, VLDB 21]
- Annotation with noisy oracle [RIM, NeurIPS 21, Spotlight]
- Annotation with relaxed queries [IGP, ICLR 22]
- Reliable data distillation [RDD, SIGMOD 20]
- Reception-aware online distillation [ROD, KDD 21]
- Model: how to build scalable and deep graph models for large-scale graph data?
– Scalable and Deep Graph Learning
- Node dependent local smoothing [NDLS, NeurIPS 21, Spotlight]
- Model-free graph representation learning [NAFS, ICML 22]
- Graph decomposition [DeGNN, KDD 21]
- Node-aware layer aggregators [Lasagne, TKDE 21]
- Graph-based MLP deployed at Tencent [GAMLP, KDD 22]
- Deep GNN evaluation [GNN EA, KDD 22]
- System: how to make large-scale graph learning faster and easier?
– Automated and Distributed Graph Learning
- Scalable graph NAS [PaSca, WWW 22, Best Student Paper Award]
- Deep and flexible graph NAS [DF-GNAS, ICML 22]
- Fast ensemble learning [EDDE, ICDE 20]
- Scalable graph learning [SGL]
- Distributed graph learning [Angel Graph]
- End-to-End AutoML [MindWare, VLDB 21]
- Black box optimization [OpenBox, KDD 21]
- Large-scale hyper-parameter tuning [Hyper-Tune, VLDB 22]
- Application: how to use graph learning in large-scale industrial graphs?
- GNN-based recommendation [The First Survey of GNN-based RS, CSUR 22]
- GNN-based recommendation system deployed at Taobao [Zoomer, ICDE 22]
What's New
- 2022-11: One paper is accepted by AAAI 2023.
- 2022-11: One paper is accepted by ICDE 2023.
- 2022-10: One paper is accepted by VLDBJ 2022.
- 2022-09: One paper is accepted by NeurIPS 2022.
- 2022-09: I am awarded the Rising Star (云帆奖-明日之星) at the World AI Conference 2022.
- 2022-06: I am honored to deliver the valedictorian speech for the class of 2022 in CS at PKU.
- 2022-06: I received my Ph.D. degree in computer science from Peking University with the Outstanding Doctoral Dissertation Award.
- 2022-05: One paper is accepted by the journal VLDBJ 2022.
- 2022-05: Four papers are accepted by the conference SIGKDD 2022.
- 2022-05: Two papers as first author have been accepted by ICML 2022.
- 2022-05: One paper related to AutoML has been accepted by Bioinformatics 2022.
- 2022-04: 🏆 We won the Best Student Paper Award at WWW 2022!
- 2022-04: We released the first version of our scalable graph learning toolkit, SGL.
- 2022-03: One paper is selected as a Best Paper Award nominee at WWW 2022. The corresponding PaSca system (integrated into SGL) will be open-sourced next month!
- 2022-03: One paper as corresponding author, related to GNN-based recommendation, has been accepted by the journal ACM Computing Surveys 2022.
- 2022-01: One paper related to graph-based recommendation has been accepted by the conference ICDE 2022.
- 2022-01: One paper as first author, related to graph data annotation, has been accepted by the conference ICLR 2022.
- 2022-01: One paper related to our large-scale hyper-parameter tuning system has been accepted by the conference VLDB 2022.
- 2022-01: I accepted the invitation to serve as a Program Committee member of the Research Track of ACM SIGKDD 2022.
- 2022-01: One paper as first author, related to our scalable graph NAS system, has been accepted by the conference WWW 2022.
- 2021-12: Our OpenBox team won the “Outstanding Winner” at the openGCC contest in CCF ChinaSoft 2021. Congratulations!
- 2021-09: Two papers as first author, related to scalable graph learning and graph data annotation, have been accepted by the conference NeurIPS 2021 with Spotlight (< 3%).
- 2021-08: We propose GAMLP, a scalable and efficient graph model, which achieves #1 performance on the three largest public ogbn graphs (i.e., ogbn-papers100M, ogbn-products, and ogbn-mag)! See the leaderboards here.
- 2021-07: One paper as first author, related to large-scale graph data selection, has been accepted by the conference VLDB 2021.
- 2021-07: One paper as co-first author, related to deep GNN, has been accepted by the journal TKDE 2021.
- 2021-06: One paper as third author, related to our AutoML system, VolcanoML, has been accepted by the conference VLDB 2021.
- 2021-05: Three papers, related to sparse graph, graph decomposition and our blackbox optimization (BBO) system – OpenBox, are accepted by the conference SIGKDD 2021.
- 2021-03: I was awarded the Apple Scholars in AI/ML PhD fellowship, as the only recipient in China. Many thanks to Apple!
- 2021-03: One paper as first author has been accepted by the conference SIGMOD 2021. Looking forward to the meeting in Xi’an this summer!
Contributed Open-source Projects
- Angel: a high-performance distributed machine learning and graph computing platform, jointly designed by Tencent and PKU.
- SGL: a scalable graph learning toolkit for extremely large graph datasets.
- MindWare: a powerful AutoML system, which automates feature engineering, algorithm selection and hyperparameter tuning.
- OpenBox: an efficient open-source system designed for solving generalized black-box optimization (BBO) problems.
Selected Awards
- Rising Star (云帆奖-明日之星), World AI Conference, 2022.
- 🏆 Best Student Paper Award of WWW 2022 (1/1822, the second WWW Best Student Paper from China), 2022
- IVADO Postdoctoral Fellowship, Canada
- Outstanding Doctoral Dissertation Award, Peking University (Sole winner in Computer Software and Theory), 2022
- Outstanding Graduate of Beijing, China, 2022
- Candidate of May 4th Medal (Each School recommends 1 candidate, highest honor in PKU), 2022
- The Big Data Expo Leading Technology Achievement Award, China International Big Data Industry Expo (Angel Graph project), 2022
- Candidate of People of the Year (1 person in EECS, and 42 people in PKU), 2021
- Merit Student of Beijing (2 people in EECS, and 58 people in PKU), 2021
- Apple PhD Fellowship (1 person in China, and 15 people in the world), 2021
- National Scholarship (Top 1% in PKU), 2019, 2021
- Baidu Scholarship Nominee (20 people in the world), 2021
Selected Competitions
- Outstanding Winner of the openGCC contest in CCF ChinaSoft (1/3814), 2021
- Rank #1 in Open Graph Benchmark, 2021
- Outstanding Winner of the BDIC Big Data Competition (1/575), 2018
Program Committee Member and Reviewer
KDD, ICML, CVPR, WWW, DASFAA, TKDE, TNNLS, PAKDD, Machine Learning, etc.
Invited Talks
I am happy to give a talk if you are interested in my work. 😊
- Model Degradation Hinders Deep Graph Neural Networks.
KDD’22, 2022.08
- Graph Attention Multi-Layer Perceptron.
KDD’22, 2022.08
- NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning.
AI Time [News]
ICML’22, Virtual, 2022.07
Jiqizhixin, Virtual, 2022.07 [News] [Slides]
- Deep and Flexible Graph Neural Architecture Search.
ICML’22, Virtual, 2022.07
Jiqizhixin, Virtual, 2022.07
- Towards Large Scale Graph Learning: Data, Model and System. 《大规模图学习:数据、模型与系统》
SUSTech, 2023.01
HKUST (Guangzhou), Virtual, 2022.04 [News]
Stanford, Virtual, 2021.11
Mila, Virtual, 2021.09
- Towards Automated Graph Learning. 《自动化图机器学习》 [Doc]
HKUST, Virtual, 2022.11 [News]
NUDT, Virtual, 2022.07
HUST, 2022.08
Zhejiang University, 2022.08
- Information Gain Propagation: A New Way to Graph Active Learning with Soft Labels. 《软标签场景下的图主动学习》
AI Time, Virtual, 2022.06 [News]
ICLR’22, Virtual, 2022.04
- Towards Data-Centric ML. 《数据驱动的机器学习》
Apple Research, 2022.06
- Valedictorian Speech. 《北京大学计算机系2022级毕业生代表致辞》
CS of PKU, 2022.06 [News]
- PaSca: A Graph Neural Architecture Search System under the Scalable Paradigm. 《可扩展性的图神经结构搜索系统》
DGL Team, Amazon, Virtual, 2022.07
CSU, Virtual, 2022.07
CCF, Virtual, 2022.06 [News] [Slides]
DataFun, Virtual, 2022.06 [Slides]
MLNLP, Virtual, 2022.06 [News] [Slides] [Video]
InfoQ, Tencent Cloud, Virtual, 2022.06 [News]
WWW’22, Virtual, 2022.04 [Slides]
Data Platform, Tencent, Virtual, 2022.05
- Towards Large-scale Graph Machine Learning. 《大规模图机器学习》 [Doc]
HKUST, Virtual, 2022.08 (in preparation)
LOGs, Virtual, 2022.07 [Video]
- How to Do Research? 《浅谈科研》
Apple Research, Virtual, 2021.12
PKU, Virtual, 2021.12 [News-1, News-2] [Slides]
- The Scalability of Large-scale Graph Machine Learning. 《大规模图机器学习的可扩展性》
Tencent Big Data, Virtual, 2022.04
NeurIPS, Virtual, 2021.12
4Paradigm, Virtual, 2021.12
AI Drive, 2021.12 [Video] [News] [Slides]
- RIM: Reliable Influence-based Active Learning on Graphs.
NeurIPS, Virtual, 2021.12
NeurIPS MeetUp China, 2021.12 [News] [Slides]
- A Survey of GNN Systems. 《GNN系统调研》
Tencent, Virtual, 2021.12 [Slides]
- Graph Attention Multi-Layer Perceptron. 《图注意力多层感知器》
DataFun, Virtual, 2021.10 [News] [Slides]