CSE 572 DATA MINING

(January 16 – May 1, Spring 2007)

I hear, I forget; I see, I remember; I do, I understand. - Proverb

We will work together to hear, see, and do in this class in many forms, including lectures, invited talks, discussions, a research paper reading assignment, a project, and a presentation, in addition to homework, quizzes, and exam(s). We will create opportunities to learn from each other. If you find anything interesting about data mining, please share it with us all. The ultimate goal is to provide an environment conducive to student creativity, one that leads to new work with impact.

In both passive and active learning, we want to strike a balance between exploiting what we already know and exploring what is new.

Your suggestions are most welcome. Please email them to huan.liu at asu.edu.

OUTLINE

  1. Introduction to Data Mining
  2. Classification Methods (ensemble methods, SVMs, skewed data, cost-sensitive classification)
  3. A Brief Review of Probability and Entropy
  4. Performance Evaluation (measures, comparison between two algorithms)
  5. Data, Data Preparation, and Data Preprocessing (feature selection, discretization, sampling, instance selection)
  6. Clustering Methods (subspace clustering, CLIQUE)
  7. Association Rules
  8. Current Challenges
    1. The Shark Toothed Elephant – an invited talk by Bill Rose, VP GIS of Avnet
    2. USuggest – an invited talk on data mining in a Web application
  9. Some Thought-Provoking Applications (steganography and steganalysis, streaming data extraction, gene selection)

ASSIGNMENTS (To be updated)

More on Paper Reading Assignment, Project and its Due Dates

   

EXAMS

LINKS

Prepared by Huan Liu on January 9, 2007
Last updated: April 17, 2007