General Information

I am now a computer science PhD student at Arizona State University under the supervision of Professor Jieping Ye. I also work as a graduate research assistant at the Center for Evolutionary Medicine and Informatics (CEMI) of the Biodesign Institute at ASU. You can find my CV here.

Research Interests

My research interests include structured sparse learning and data mining, and their applications to biomedical problems, especially Alzheimer's disease and Drosophila embryo image analysis.

    Publications

    • Lei Yuan, Yalin Wang, Paul M. Thompson, Vaibhav A. Narayan and Jieping Ye, "Multi-Source Learning for Joint Analysis of Incomplete Multi-Modality Neuroimaging Data", The 18th ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (SIGKDD 2012), Full Presentation.
    • Sen Yang, Lei Yuan, Peter Wonka and Jieping Ye, "Feature Grouping and Feature Selection Over an Undirected Graph", The 18th ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (SIGKDD 2012), Full Presentation.
    • Lei Yuan, Yalin Wang, Paul M. Thompson, Vaibhav A. Narayan and Jieping Ye, for the Alzheimer's Disease Neuroimaging Initiative, "Multi-source Feature Learning for Joint Analysis of Incomplete Multiple Heterogeneous Neuroimaging Data", NeuroImage 2012 Jul 2; 61(3):622-632. PubMed
    • Lei Yuan, Alexander Woodard, Shuiwang Ji, Yuan Jiang, Zhi-Hua Zhou, Sudhir Kumar and Jieping Ye, "Learning Sparse Representations for Fruit-Fly Gene Expression Pattern Image Annotation and Retrieval", BMC Bioinformatics 2012, 13:107. BiomedCentral
    • Lei Yuan, Jun Liu and Jieping Ye, "Efficient Methods for Overlapping Group Lasso", Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS 2011). PDF
    • Jiayu Zhou, Lei Yuan, Jun Liu and Jieping Ye, "A Multi-Task Learning Formulation for Predicting Disease Progression", The 17th ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (SIGKDD 2011). PDF
    • Jun Liu, Lei Yuan and Jieping Ye, "An Efficient Algorithm for a Class of Fused Lasso Problems", The Sixteenth ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (SIGKDD 2010), Full Presentation. PDF CODE
    • Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, Sudhir Kumar and Jieping Ye, "Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-term Interactions", The Fifteenth ACM SIGKDD International Conference On Knowledge Discovery and Data Mining (SIGKDD 2009). PDF CODE

    Poster Presentations

    • Lei Yuan, Cheng Pan, Shuiwang Ji, Sudhir Kumar and Jieping Ye, "Machine Learning Approaches for Drosophila Expression Image Analysis", Annual Drosophila Research Conference, 2012. Poster
    • Lei Yuan, "Multi-Source Data Fusion Using Cox Model for Predicting MCI to AD Conversion", SDM 2011, SIAM Data Mining Conference - The SDM 2011 Doctoral Forum. Poster

    Awards

    • Conference Travel Grant Award, Graduate and Professional Student Association (GPSA), Arizona State University, 2012
    • SIAM Data Mining Student Travel Award, 2011
    • University Graduate Fellowship, Arizona State University, 2008

    Ongoing Projects

    Learning with Incomplete Data in the ADNI Cohort

    Illustration of Incomplete Data

    Missing data present a special challenge when integrating large-scale biomedical data. For example, in the data set provided by the Alzheimer's Disease Neuroimaging Initiative (ADNI), over half of the subjects lack CSF measurements, and an independent half of the subjects do not have FDG-PET. This results in the scenario shown in the figure above, where large blocks of missing data are marked by the white areas. A simple and popular approach is to remove all subjects with missing values, but this greatly reduces the number of samples and fails to use much of the information in the data set. The goal of this project is to develop novel tools that make full use of the available information while avoiding the difficulty of guessing entire missing blocks of data.
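To make the cost of dropping incomplete subjects concrete, here is a minimal Python sketch on toy data. It is only an illustration of the block-wise missing pattern described above, not the method developed in this project. The helper names (`availability_patterns`, `complete_cases`), the source names, and the column layout are all hypothetical.

```python
import numpy as np

def availability_patterns(X, blocks):
    """Group subject indices by which data sources are fully observed.

    X      : (n_subjects, n_features) array, NaN marks a missing value.
    blocks : dict mapping a source name to its list of column indices
             (a hypothetical layout chosen for this example).
    Returns a dict: tuple of available source names -> list of subject indices.
    """
    patterns = {}
    for i, row in enumerate(X):
        key = tuple(name for name, cols in blocks.items()
                    if not np.isnan(row[cols]).any())
        patterns.setdefault(key, []).append(i)
    return patterns

def complete_cases(X):
    """Indices of subjects with no missing value in any source."""
    return np.where(~np.isnan(X).any(axis=1))[0]

nan = np.nan
# Toy data: 6 subjects; columns 0-1 stand for MRI, column 2 for CSF, column 3 for PET.
X = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, nan, 4.0],
    [1.0, 2.0, 3.0, nan],
    [1.0, 2.0, nan, nan],
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, nan, 4.0],
])
blocks = {"MRI": [0, 1], "CSF": [2], "PET": [3]}

patterns = availability_patterns(X, blocks)
kept = complete_cases(X)
print(patterns)
print("complete cases:", kept.tolist())
```

In this toy example only two of the six subjects survive complete-case deletion, while every subject has at least one fully observed source. Grouping subjects by availability pattern is one simple way to see how much information the remaining subjects still carry.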

    Learning with Sparse Structure

    Structures in Real World Data

    Structure is ubiquitous in real-world data sets. It is often modeled as relationships between features, such as groups or graphs. Using this prior knowledge lets us greatly reduce the solution space and obtain more meaningful models. In this project, we aim to use sparse learning techniques to exploit the underlying structure in the data and build effective model selection methods.
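As a small illustration of how group structure can shape a sparse model, the sketch below implements the standard proximal operator of the group lasso penalty for non-overlapping groups (block soft-thresholding). This is a textbook building block, shown here under simplifying assumptions; it is not the overlapping-group or graph-based method studied in this project. The function name `prox_group_lasso` and the toy values are chosen for this example only.

```python
import numpy as np

def prox_group_lasso(w, groups, lam):
    """Block soft-thresholding for non-overlapping feature groups.

    Solves  argmin_x 0.5 * ||x - w||^2 + lam * sum_g ||x_g||_2 .
    Each group is either shrunk toward zero as a whole or set exactly to zero.

    w      : 1-D float array of coefficients.
    groups : list of index lists, assumed to be disjoint.
    lam    : non-negative regularization strength.
    """
    out = np.zeros_like(w, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm > lam:
            # Keep the group, scaled so its norm decreases by lam.
            out[idx] = (1.0 - lam / norm) * w[idx]
        # Otherwise the whole group stays at zero.
    return out

w = np.array([3.0, 4.0, 0.1, 0.2])
groups = [[0, 1], [2, 3]]
shrunk = prox_group_lasso(w, groups, lam=1.0)
print(shrunk)
```

Here the first group has norm 5 and is scaled by 0.8, while the second group has norm below the threshold and is removed entirely. Selecting or discarding features group by group is what narrows the solution space compared with treating every feature independently.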

    Constructing Effective Representations for Drosophila Embryo Images

    SPM

    Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. In this project, we aim to develop a novel approach for the automated annotation and retrieval of Drosophila melanogaster images. Inspired by the spatial bag-of-words (BoW) approach, we propose an image representation model that takes advantage of the spatial information provided by the Berkeley Drosophila Genome Project (BDGP) images while remaining robust against distortions.
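The general idea of a spatial bag-of-words representation can be sketched as follows. This is a generic, simplified version and not the representation model proposed in this project. It assumes local descriptors have already been extracted and quantized into visual-word indices, and the function name `spatial_bow`, the grid size, and the vocabulary size are hypothetical choices for the example.

```python
import numpy as np

def spatial_bow(word_ids, positions, image_shape, grid=(2, 2), vocab_size=4):
    """Concatenate per-cell visual-word histograms over a regular grid.

    word_ids    : (n,) integer visual-word index of each local patch.
    positions   : (n, 2) array of (row, col) patch locations in pixels.
    image_shape : (height, width) of the image.
    grid        : number of grid cells along rows and columns.
    vocab_size  : number of visual words in the (assumed) codebook.
    Returns a vector of length grid[0] * grid[1] * vocab_size.
    """
    rows, cols = grid
    height, width = image_shape
    hist = np.zeros((rows, cols, vocab_size))
    for (r, c), k in zip(positions, word_ids):
        # Map the patch location to its grid cell, clamping at the border.
        gi = min(int(r * rows / height), rows - 1)
        gj = min(int(c * cols / width), cols - 1)
        hist[gi, gj, k] += 1
    # Normalize each cell so it sums to 1; empty cells stay all zero.
    totals = hist.sum(axis=2, keepdims=True)
    hist = np.divide(hist, totals, out=np.zeros_like(hist), where=totals > 0)
    return hist.reshape(-1)

positions = np.array([[0, 0], [0, 9], [9, 0], [9, 9], [1, 1]])
word_ids = np.array([0, 1, 2, 3, 1])
feature = spatial_bow(word_ids, positions, image_shape=(10, 10))
print(feature.reshape(2, 2, 4))
```

Because each histogram only records which visual words occur inside a cell, small shifts of a pattern within a cell leave the feature unchanged, while the grid still preserves coarse spatial layout. That trade-off is the usual motivation for spatial variants of bag-of-words.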