Kaize Ding

(丁凯泽)

Data Mining and Machine Learning Laboratory
School of Computing and Augmented Intelligence
Arizona State University

About me: I am a PhD student in Computer Science and Engineering at Arizona State University. I work as a research assistant at the Data Mining and Machine Learning Laboratory, advised by Professor Huan Liu. Before that, I obtained my master's and bachelor's degrees from Beijing University of Posts and Telecommunications (BUPT).

Research Interests: My research interests generally lie in data mining and machine learning. Currently, I focus on minimally-supervised learning (e.g., few-shot learning, weakly-supervised learning, and self-supervised learning), with particular emphasis on graph and text data.

I am on the job market for faculty or research positions this year. Please feel free to reach out if you have potential opportunities.


News

11/2022
Our tutorial Graph Minimally-supervised Learning has been accepted at SDM'23.
11/2022
Check out our SIGKDD Explorations survey on Data Augmentation for Deep Graph Learning.
10/2022
One paper accepted at WSDM 2023.
12/2021
One paper accepted at AAAI 2022.
11/2021
Invited to serve as a PC member for KDD 2022 and IJCAI 2022.
10/2021
One paper accepted at WSDM 2022.
08/2021
One paper accepted at EMNLP 2021.
08/2021
Two papers accepted at CIKM 2021.
08/2021
Invited to serve as a PC member for WSDM 2022.
01/2021
One paper accepted at WWW 2021.
12/2020
One paper accepted at SDM 2021.
12/2020
One paper accepted at AAAI 2021.
12/2020
Invited to serve as a PC member for NAACL 2021 and ACL 2021.
09/2017
Enrolled as a PhD student at ASU.

Selected Papers

[Google Scholar] [Full List]

Meta Propagation Networks for Graph Few-shot Semi-supervised Learning
Kaize Ding, Jianling Wang, James Caverlee and Huan Liu
AAAI Conference on Artificial Intelligence (AAAI) 2022.
@InProceedings{ding2022meta,
  title     = {Meta Propagation Networks for Graph Few-shot Semi-supervised Learning},
  author    = {Ding, Kaize and Wang, Jianling and Caverlee, James and Liu, Huan},
  booktitle = {AAAI},
  year      = {2022},
}

Inspired by the extensive success of deep learning, graph neural networks (GNNs) have been proposed to learn expressive node representations and have demonstrated promising performance in various graph learning tasks. However, existing endeavors predominantly focus on the conventional semi-supervised setting, where relatively abundant gold-labeled nodes are provided. This setting is often impractical, as data labeling is laborious and requires intensive domain knowledge, especially when considering the heterogeneity of graph-structured data. Under the few-shot semi-supervised setting, the performance of most existing GNNs is inevitably undermined by overfitting and oversmoothing, largely owing to the shortage of labeled data. In this paper, we propose a decoupled network architecture equipped with a novel meta-learning algorithm to solve this problem. In essence, our framework Meta-PN infers high-quality pseudo labels on unlabeled nodes via a meta-learned label propagation strategy, which effectively augments the scarce labeled data while enabling large receptive fields during training. Extensive experiments demonstrate that our approach offers easy and substantial performance gains compared to existing techniques on various benchmark datasets.
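The label-propagation idea at the heart of this abstract can be illustrated with a minimal sketch: soft labels on the few labeled nodes are iteratively spread to their neighbors, producing pseudo labels for the unlabeled ones. The adjacency format, `alpha`, and step count below are illustrative assumptions, not the paper's actual meta-learned design.

```python
# Hypothetical sketch of pseudo-label propagation on a graph.
# adj maps each node to its neighbor list; labels maps a node to a
# class distribution (for gold-labeled nodes) or None (unlabeled).

def propagate_labels(adj, labels, num_steps=10, alpha=0.9):
    nodes = list(adj)
    num_classes = len(next(d for d in labels.values() if d is not None))
    # Unlabeled nodes start from a uniform distribution.
    state = {v: labels[v][:] if labels[v] is not None
             else [1.0 / num_classes] * num_classes for v in nodes}
    for _ in range(num_steps):
        new_state = {}
        for v in nodes:
            neigh = adj[v] or [v]
            # Average the neighbors' current label distributions.
            agg = [sum(state[u][c] for u in neigh) / len(neigh)
                   for c in range(num_classes)]
            if labels[v] is not None:
                # Labeled nodes stay anchored to their gold labels.
                new_state[v] = labels[v][:]
            else:
                new_state[v] = [alpha * a + (1 - alpha) * p
                                for a, p in zip(agg, state[v])]
        state = new_state
    return state
```

On a chain 0-1-2-3 with node 0 labeled class 0 and node 3 labeled class 1, node 1's pseudo label leans toward class 0 and node 2's toward class 1, which is the kind of augmented supervision the framework exploits.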

Learning to Selectively Learn for Weakly-supervised Paraphrase Generation
Kaize Ding, Dingcheng Li, Alexander Hanbo Li, Xing Fan, Chenlei Guo, Yang Liu, and Huan Liu
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021.
@InProceedings{ding2021learning,
  title     = {Learning to Selectively Learn for Weakly-supervised Paraphrase Generation},
  author    = {Ding, Kaize and Li, Dingcheng and Li, Alexander Hanbo and Fan, Xing and Guo, Chenlei and Liu, Yang and Liu, Huan},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2021},
}

Paraphrase generation is a longstanding NLP task with diverse applications for downstream NLP tasks. However, the effectiveness of existing efforts predominantly relies on large amounts of gold-labeled data. Though unsupervised endeavors have been proposed to address this issue, they may fail to generate meaningful paraphrases due to the lack of supervision signals. In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases with weak supervision data. Specifically, we tackle the weakly-supervised paraphrase generation problem by: (1) obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion; and (2) developing a meta-learning framework to progressively select valuable samples for fine-tuning a pre-trained language model, i.e., BART, on the sentential paraphrasing task. We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art methods.

Few-shot Network Anomaly Detection via Cross-network Meta-learning
Kaize Ding*, Qinghai Zhou*, Hanghang Tong, and Huan Liu (*equal contribution)
The Web Conference (formerly WWW) 2021.
@InProceedings{ding2021few,
  title     = {Few-shot Network Anomaly Detection via Cross-network Meta-learning},
  author    = {Ding, Kaize and Zhou, Qinghai and Tong, Hanghang and Liu, Huan},
  booktitle = {Proceedings of the Web Conference 2021},
  year      = {2021}
}

In general, graph neural networks (GNNs) adopt the message-passing scheme to capture the information of a node (i.e., nodal attributes and local graph structure) by iteratively transforming and aggregating the features of its neighbors. Nonetheless, recent studies show that the performance of GNNs can be easily hampered by the existence of abnormal or malicious nodes, due to the vulnerability of neighborhood aggregation. It is thus necessary to learn anomaly-resistant GNNs without prior knowledge of ground-truth anomalies, given that labeling anomalies is costly and requires intensive domain knowledge. To preserve the effectiveness of GNNs on anomaly-contaminated graphs, in this paper we propose a new framework named RARE-GNN (Reinforced Anomaly-Resistant Graph Neural Networks), which can detect anomalies from the input graph and learn anomaly-resistant GNNs simultaneously. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed framework.
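The message-passing scheme this abstract refers to can be sketched in a few lines. This generic mean-aggregation step is only an illustration under simplified assumptions (no learned weights, no nonlinearity); it is not the paper's RARE-GNN architecture.

```python
# Minimal sketch of one message-passing step: each node's features are
# updated with the mean over its neighbors plus itself (a self-loop).

def message_passing_step(adj, features):
    """adj: {node: [neighbors]}; features: {node: feature vector}."""
    updated = {}
    for v, feats in features.items():
        neighborhood = adj.get(v, []) + [v]  # include the node itself
        dim = len(feats)
        # Mean-aggregate features over the self-inclusive neighborhood.
        updated[v] = [sum(features[u][d] for u in neighborhood) / len(neighborhood)
                      for d in range(dim)]
    return updated
```

Note how a single neighbor with an extreme feature value drags the whole average with it; this is precisely the vulnerability of neighborhood aggregation that motivates anomaly-resistant GNNs.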

Be More with Less: Hypergraph Attention Networks for Inductive Text Classification
Kaize Ding, Jianling Wang, Jundong Li, Dingcheng Li, and Huan Liu
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020.
@InProceedings{ding2020more,
  title     = {Be more with less: Hypergraph attention networks for inductive text classification},
  author    = {Ding, Kaize and Wang, Jianling and Li, Jundong and Li, Dingcheng and Liu, Huan},
  booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2020},
}

Text classification is a critical research topic with broad applications in natural language processing. Recently, graph neural networks (GNNs) have received increasing attention in the research community and demonstrated promising results on this canonical task. Despite the success, their performance could be largely jeopardized in practice since they are: (1) unable to capture high-order interactions between words; and (2) inefficient at handling large datasets and new documents. To address those issues, in this paper we propose a principled model -- hypergraph attention networks (HyperGAT), which can obtain more expressive power with less computational consumption for text representation learning. Extensive experiments on various benchmark datasets demonstrate the efficacy of the proposed approach on the text classification task.
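The hypergraph formulation underlying this abstract can be illustrated roughly: a hyperedge (e.g., a document or sentence) connects many word nodes at once, so updating a word aggregates over whole hyperedges rather than pairwise edges. In this sketch the attention weights are replaced by plain mean pooling, a simplifying assumption; it is not the HyperGAT model itself.

```python
# Illustrative two-stage hypergraph aggregation step.
# hyperedges: {edge_id: [member nodes]}; features: {node: vector}.

def hypergraph_step(hyperedges, features):
    dim = len(next(iter(features.values())))
    # 1) Each hyperedge summarizes its member nodes (mean pooling).
    edge_feats = {
        e: [sum(features[v][d] for v in nodes) / len(nodes) for d in range(dim)]
        for e, nodes in hyperedges.items()
    }
    # 2) Each node aggregates over the hyperedges it belongs to.
    updated = {}
    for v in features:
        member_of = [e for e, nodes in hyperedges.items() if v in nodes]
        if not member_of:
            updated[v] = features[v][:]
            continue
        updated[v] = [sum(edge_feats[e][d] for e in member_of) / len(member_of)
                      for d in range(dim)]
    return updated
```

Because a hyperedge pools an arbitrary number of words in one step, this captures the high-order word interactions that pairwise graph edges miss.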

Graph Prototypical Networks for Few-shot Learning on Attributed Networks
Kaize Ding, Jianling Wang, Jundong Li, Kai Shu, Chenghao Liu, and Huan Liu
ACM International Conference on Information and Knowledge Management (CIKM) 2020.
@InProceedings{ding2020graph,
  title     = {Graph prototypical networks for few-shot learning on attributed networks},
  author    = {Ding, Kaize and Wang, Jianling and Li, Jundong and Shu, Kai and Liu, Chenghao and Liu, Huan},
  booktitle = {Proceedings of the 29th ACM International Conference on Information \& Knowledge Management},
  year      = {2020},
}

Attributed networks are now ubiquitous in a myriad of high-impact applications, such as social network analysis, financial fraud detection, and drug discovery. As a central analytical task on attributed networks, node classification has received much attention in the research community. In real-world attributed networks, a large portion of node classes only contain limited labeled instances, rendering a long-tail node class distribution. Existing node classification algorithms are unequipped to handle the few-shot node classes. As a remedy, few-shot learning has attracted a surge of attention in the research community. Yet, few-shot node classification remains a challenging problem as we need to address the following questions: (i) How to extract meta-knowledge from an attributed network for few-shot node classification? (ii) How to identify the informativeness of each labeled instance for building a robust and effective model? To answer these questions, in this paper, we propose a graph meta-learning framework -- Graph Prototypical Networks (GPN). By constructing a pool of semi-supervised node classification tasks to mimic the real test environment, GPN is able to perform meta-learning on an attributed network and derive a highly generalizable model for handling the target classification task. Extensive experiments demonstrate the superior capability of GPN in few-shot node classification.
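The prototypical-network step at the core of GPN can be sketched as follows. The GNN encoder and node-informativeness weighting from the paper are omitted; the plain class-mean prototypes and Euclidean distance below are illustrative simplifications.

```python
# Illustrative nearest-prototype classification for a few-shot task.
# support: {class: [embedding, ...]} holds the few labeled examples;
# query is a single embedding to classify.

def nearest_prototype(support, query):
    # A prototype is the mean of a class's support embeddings.
    prototypes = {
        c: [sum(e[d] for e in embs) / len(embs) for d in range(len(embs[0]))]
        for c, embs in support.items()
    }
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Assign the query to the class with the closest prototype.
    return min(prototypes, key=lambda c: sq_dist(prototypes[c], query))
```

With only a couple of labeled nodes per class, the prototype average already gives a usable decision rule, which is why this family of methods suits long-tail, few-shot node classes.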

Deep Anomaly Detection on Attributed Networks
Kaize Ding, Jundong Li, Rohit Bhanushali, and Huan Liu
SIAM International Conference on Data Mining (SDM) 2019.
@InProceedings{ding2019deep,
  title     = {Deep anomaly detection on attributed networks},
  author    = {Ding, Kaize and Li, Jundong and Bhanushali, Rohit and Liu, Huan},
  booktitle = {Proceedings of the 2019 SIAM International Conference on Data Mining},
  year      = {2019},
}

Attributed networks are ubiquitous and form a critical component of modern information infrastructure, where additional node attributes complement the raw network structure in knowledge discovery. Recently, detecting anomalous nodes on attributed networks has attracted an increasing amount of research attention, with broad applications in various high-impact domains, such as cybersecurity, finance, and healthcare. Most of the existing attempts, however, tackle the problem with shallow learning mechanisms by ego-network or community analysis, or through subspace selection. Undoubtedly, these models cannot fully address the computational challenges on attributed networks. For example, they often suffer from the network sparsity and data nonlinearity issues, and fail to capture the complex interactions between different information modalities, thus negatively impacting the performance of anomaly detection. To tackle the aforementioned problems, in this paper, we study the anomaly detection problem on attributed networks by developing a novel deep model. In particular, our proposed deep model: (1) explicitly models the topological structure and nodal attributes seamlessly for node embedding learning with the prevalent graph convolutional network (GCN); and (2) is customized to address the anomaly detection problem by virtue of a deep autoencoder that leverages the learned embeddings to reconstruct the original data. The synergy between GCN and autoencoder enables us to spot anomalies by measuring the reconstruction errors of nodes from both the structure and the attribute perspectives. Extensive experiments on real-world attributed network datasets demonstrate the efficacy of our proposed algorithm.
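The reconstruction-error scoring described in this abstract can be sketched as follows. The GCN encoder and the decoders are abstracted away (we take their reconstructions as given), and the balance weight `alpha` is an illustrative parameter, not a value from the paper.

```python
# Illustrative anomaly scoring from structure and attribute
# reconstruction errors. Each node's score is a weighted sum of the
# squared error on its adjacency row and on its attribute vector.

def anomaly_scores(adj_rows, adj_recon, attrs, attrs_recon, alpha=0.5):
    def sq_err(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Higher score = worse reconstruction = more anomalous.
    return [alpha * sq_err(a, ar) + (1 - alpha) * sq_err(x, xr)
            for a, ar, x, xr in zip(adj_rows, adj_recon, attrs, attrs_recon)]
```

Nodes that the model reconstructs poorly, either structurally or attribute-wise, receive high scores and are ranked as anomalies.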


Professional Experiences

  • Google Brain, Research Intern, 2022
  • Microsoft Research, Research Intern, 2021
  • Amazon Alexa AI, Applied Scientist Intern, 2020
  • Microsoft Research Asia, Research Intern, 2016, 2017
  • Chinese University of Hong Kong, Research Assistant, 2017
  • Meituan, Intern, 2015
  • Sogou, Intern, 2014

Services

Program Committee

KDD'22, WSDM'22, IJCAI'22, CIKM'21, ECML-PKDD'21, EMNLP'21, ACL'21, IJCAI'21, IJCAI'20, ECML-PKDD'20

External Reviewer

KDD'19, WWW'19, SIGIR'18, ASONAM'18


Honors and Awards

  • SDM Best Poster Award (Runner-up), 2022
  • ASU Graduate College Completion Fellowship, 2022
  • ASU GPSA Outstanding Research Award, 2022
  • ASU CIDSE Doctoral Fellowship, 2021
  • ASU Engineering Graduate Fellowship, 2019, 2020
  • Microsoft Research Asia Stars of Tomorrow (Excellent Intern Award), 2017

  Last update 2022