CS246: Mining Massive Data Sets
Winter 2011
Handouts:
Basic Math recitation: Slides
Hadoop session: Slides, Example, Hadoop Installation Instructions
1/3/2011 Introduction [PDF], MapReduce [PDF]
Reading:
Ch2: Large-Scale File Systems and Map-Reduce
1/5/2011 Association Rules: Frequent itemsets and Association rules
[PDF]
Reading:
Ch6: Frequent itemsets
1/10/2011 Near Neighbor Search in High Dimensional Data
[PDF]
Reading:
Ch3: Finding Similar Items
1/12/2011 Locality Sensitive Hashing (LSH)
[PDF]
Reading:
Ch3: Finding Similar Items
1/17/2011 Martin Luther King, Jr., Day (no class)
1/19/2011 Theory of Locality Sensitive Hashing
[PDF]
Reading:
Ch3: Finding Similar Items
1/24/2011 Dimensionality reduction: SVD and CUR
[PDF]
Reading:
Some uses of spectral methods, by Abhiram Ranade
1/26/2011 Clustering
[PDF]
Reading:
Ch7: Clustering
1/31/2011 Recommendation Systems
[PDF]
Reading:
Ch9: Recommendation systems, and "The Long Tail" in Wired
2/2/2011 Recommendation Systems (Netflix challenge)
[PDF]
Reading:
Ch9: Recommendation systems
2/7/2011 Link Analysis, PageRank, Hubs and Authorities
[PDF]
Reading:
Ch5: Link Analysis
2/9/2011 Web spam and TrustRank, Random Walks with Restarts
[PDF]
Reading:
Ch5: Link Analysis
2/14/2011 Large scale supervised machine learning (1): k-nearest neighbor, Perceptron
[PDF]
2/16/2011 Large scale supervised machine learning (2): Support Vector Machines
[PDF]
2/21/2011 Presidents' Day (no class)
2/23/2011 Large scale supervised machine learning (3): Classification and regression trees
[PDF]
Reading:
PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce, by Panda, Herbach, Basu and Bayardo. VLDB 2009.
2/28/2011 Mining data streams (1)
[PDF]
Reading:
Ch4: Mining data streams
3/2/2011 Mining data streams (2)
[PDF]
Reading:
Ch4: Mining data streams
3/7/2011 Web Advertising
[PDF]
Reading:
Ch8: Advertising on the Web
3/9/2011 Review session
[PDF]