CS246
Mining Massive Data Sets
Winter 2017
Handouts
Sample Final Exams
Assignments
Gradiance (no late periods allowed):
- GHW 1: Due on 1/19 at 11:59pm.
- GHW 2: Due on 1/26 at 11:59pm.
- GHW 3: Due on 2/02 at 11:59pm.
- GHW 4: Due on 2/09 at 11:59pm.
- GHW 5: Due on 2/16 at 11:59pm.
- GHW 6: Due on 2/23 at 11:59pm.
- GHW 7: Due on 3/02 at 11:59pm.
- GHW 8: Due on 3/09 at 11:59pm.
- GHW 9: Due on 3/16 at 11:59pm.
Homeworks (1 late period allowed):
- HW0 (Hadoop tutorial) to help you set up Hadoop: Due on 1/19 at 11:59pm.
- HW1: Due on 1/26 at 11:59pm. Submission Templates: [pdf | tex].
- HW2: Due on 2/09 at 11:59pm. Submission Templates: [pdf | tex].
- HW3: Due on 2/23 at 11:59pm. Submission Templates: [pdf | tex].
- HW4: Due on 3/09 at 11:59pm. Submission Templates: [pdf | tex].
Lecture notes (future schedule is tentative)
- 01/10: Introduction; MapReduce
Slides: [pptx], [pdf]
Reading: Ch1: Data Mining and Ch2: Large-Scale File Systems and Map-Reduce (Sect. 2.1-2.4)
- 01/12: Frequent Itemsets Mining
Slides: [pptx], [pdf]
Reading: Ch6: Frequent Itemsets
- 01/17: Locality-Sensitive Hashing I
Slides: [pptx], [pdf]
Reading: Ch3: Finding Similar Items (Sect. 3.1-3.4)
- 01/19: Locality-Sensitive Hashing II
Slides: [pptx], [pdf]
Reading: Ch3: Finding Similar Items (Sect. 3.5-3.8)
- 01/24: Clustering
Slides: [pptx], [pdf]
Reading: Ch7: Clustering (Sect. 7.1-7.4)
- 01/26: Dimensionality Reduction
Slides: [pptx], [pdf]
Reading: Ch11: Dimensionality Reduction; Ch9: Recommendation Systems (Sect. 9.4)
- 01/31: PageRank
Slides: [pptx], [pdf]
Reading: Ch5: Link Analysis (Sect. 5.1-5.3, 5.5)
- 02/02: Link Spam and Introduction to Social Networks
Slides: [pptx], [pdf]
Reading: Ch5: Link Analysis (Sect. 5.4)
Reading: Ch10: Analysis of Social Networks (Sect. 10.1-10.2, 10.6)
- 02/07: Social Networks
Slides: [pptx], [pdf]
Reading: Ch10: Analysis of Social Networks (Sect. 10.3-10.5)
- 02/09: Algorithms on Large Graphs
Slides: [pptx], [pdf]
Reading: Ch10: Analysis of Social Networks (Sect. 10.7-10.8)
- 02/14: Recommender Systems I
Slides: [pdf]
Reading: Ch9: Recommendation Systems
- 02/16: Recommender Systems II
Slides: [pdf]
Reading: Ch9: Recommendation Systems
- 02/21: Large-Scale Machine Learning I
Slides: [pptx], [pdf]
Reading: Ch12: Large-Scale Machine Learning
- 02/23: Large-Scale Machine Learning II
Slides: [pptx], [pdf]
Reading: Ch12: Large-Scale Machine Learning
- 02/28: Mining Data Streams I
Slides: [pptx], [pdf]
Reading: Ch4: Mining Data Streams (Sect. 4.1-4.3)
- 03/02: Mining Data Streams II
Slides: [pptx], [pdf]
Reading: Ch4: Mining Data Streams (Sect. 4.4-4.7)
- 03/07: Computational Advertising
Slides: [pptx], [pdf]
Reading: Ch8: Advertising on the Web
- 03/09: Complexity Theory for MapReduce Algorithms
Slides: [pptx], [pdf]
Reading: Ch2: Large-Scale File Systems and Map-Reduce (Sect. 2.5-2.6)
- 03/14: Large-Scale Recommender Systems at Pinterest (Jure Leskovec)
- 03/16: Review
Slides: [pdf]
All readings are from Mining of Massive Datasets by J. Leskovec, A. Rajaraman, and J. Ullman.
Recitation session documents