Topics of CSE 591d
Block 1 Introduction (1 week)
-
About CSE 591d (Spring 2002)
-
Format (Learning, Thinking, Doing, Presenting, Reporting)
-
Textbooks and reference materials
-
What students are expected to do:
-
2 exams (20%, 20%)
-
One group topic presentation and slides (15%, 10%)
-
One project presentation and report (with demo) (10%, 15%)
-
Class participation (10%)
-
Survey (student backgrounds, views, expectations, new areas of research)
-
Data mining
-
What is it
-
Why now and what can we gain
-
Data warehousing and Web mining
-
What is to be covered
-
Classification
-
Clustering
-
Association
-
Evaluation
-
Data Preparation
-
New topics (Web Mining, Machine Learning, Data Selection, Data Streaming, ...)
Block 2 Classification (2 weeks)
-
Data and its format
-
What is the problem of classification?
-
How to learn a classifier?
-
What are the representative approaches?
-
What are the key issues?
-
Group presentations
-
k-NN and Naive Bayes Classification
-
Case-Based Reasoning
-
Neural Networks
-
Rule Extraction from Trees and Networks
-
MetaCost: cost-sensitive classification
Block 3 Performance Evaluation and Comparison (1 week)
-
What to evaluate?
-
How to evaluate?
-
Group presentation
Block 4 Data Preparation and Preprocessing (1 week)
-
Noise, missing values, data types
-
Data reduction: feature selection, instance selection, database selection
-
Paper presentations
-
Feature discretization
-
Sampling
Block 5 Clustering (2 weeks)
-
What is clustering
-
Types of clustering
-
Issues of clustering
-
Paper presentations
-
Partitioning: K-means and EM
-
Hierarchical clustering: BIRCH
-
Density-based: DBSCAN
-
Grid-based: STING
Block 6 Association (2 weeks)
-
Market basket analysis
-
Principles of association analysis
-
Group presentations
-
APRIORI in detail
-
Frequent Itemsets without Candidate Generation
-
Multi-level association rules
-
Constraint-based Association Mining
-
Parallel association rule mining
Block 7 Data Warehousing (1 week)
-
What is a data warehouse
-
Basic concepts and operations
-
Schemas
-
Metadata
-
Creating a data warehouse
-
Group presentation
Block 8 Web Data and Mining (2 weeks)
-
Semi-structured data (XML)
-
Resource description framework (RDF)
-
Data Streaming
-
Web mining
-
Group presentations
-
Search
-
Mining
-
Personalization
Block 9 Real-World Applications and Challenges (1 week)
-
Image Mining
-
Customer retention
Block 10 Project Presentation (2 weeks)
-
Project presentations (2 weeks)
Last updated Jan. 14, 2002