ECE 594N: DATA MINING

Fall 2006

Dec 7, 2006:

Instructor: Manjunath, 805 893 7112

Data mining refers to tools and techniques for processing and managing large collections of data, with the main objective of detecting significant "patterns" or associations in such data sets. As such, it has a wide range of applications to problems in the natural and social sciences, medicine, finance, and marketing. This introductory course covers some of the basic principles of data mining, with emphasis on data mining tasks and algorithms. These include, for example, tools for classification and clustering, data structures for organizing high-dimensional data, association rule mining, and retrieval by content.

Note that the lecture slides and project links require authentication, as they contain copyrighted or limited-access materials.

Recommended Book(s)

Grading

H/W + Class participation/discussions: 25%; Project: 40%; Final exam (take home): 35%.

Paper Presentations 

Project Proposals: 
Oct 19: Retinal Detachment (Joshi, Mangiat, Ni and Sargin)
Oct 24: Netflix project (Ranu, K Singh and Sakarya), Delay Testing Project (CY Chen)
Oct 26: Social Networks (P Wu, S Wu, and WY Chen)
Oct 31: Cortina Project (De Guzman, Moxley and Xu), Folksonomy project (Petko and Imran), PersonalAlbums (Yeh, Zhu)
Nov 02: Multisensor EEG (Choi, Kleban, Sarkar, Rahimi), Atomic Motifs (Sturm, Gauglitz)

Tentative Outline

Slides01 (09/28/2006): Introduction (~4MB)
Slides02 (10/02/2006): Data (~4MB)

Classification Methods
Slides03 (10/17/2006): Decision Trees (~2MB)

Slides04 (11/09/2006): Other classification methods (~2MB)

Association Mining

Slides05 (11/09/2006): Association mining methods (~3MB) (FINAL)

Clustering Methods

Slides06 (12/02/2006): Clustering (~3MB)

Final Notes (12/07/2006)

Homeworks:

HW #1: Due on Oct 17. (solutions)

HW #2: Due on Oct 26. (solutions)

HW #3: Due on Nov 9.

HW #4: Due on Nov 21. (solutions)