Data Mining PPT

Data Mining


Prerequisites: an upper-level undergraduate course in algorithms and data structures, and a basic course on probability and statistics. This is a first course on data mining, and no prior knowledge of data mining or machine learning is assumed. Homework assignments will require programming in Java, which can sometimes be substituted with C++.

Textbook

There is no single textbook for the course. For each topic, we will list the relevant chapters from various books and papers, and we will make sure that you can access those book chapters. In addition, students will scribe two-page summaries of the lectures given in class.

Reading Material

Chapters from the following textbooks could be useful:
[Bis07]
Pattern Recognition and Machine Learning by Christopher Bishop
T. Mitchell. Machine Learning. McGraw-Hill, 1997.
[HTF]
The Elements of Statistical Learning by Hastie, Tibshirani and Friedman, Springer Verlag
[Han00]
Data Mining: Concepts and Techniques by Jiawei Han and Micheline Kamber, Morgan Kaufmann Publishers
[MVS]
Applied Multivariate Statistical Analysis by Johnson and Wichern, 3rd Edition, PHI
[PRS]
Probability, Random Variables and Stochastic Processes by Papoulis and Pillai, 4th Edition, Tata McGraw Hill Edition

[BV]
Convex Optimization by Boyd and Vandenberghe. Book available online.


A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, 1988.

Lecture Slides

These are old slides, made available as is. Most of the instruction will be done on the board, and the material covered may differ substantially from what is in these slides. Also, remember that studying from slides alone is not very useful.