Mining Massive Data Sets
Winter 2016
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce as a tool for creating parallel algorithms that can process data at this scale.


Final Exam

The final exam for this class will be held in Dinkelspiel Auditorium from 8:30AM to 11:30AM on Wednesday, March 16. For more information about the exam, refer to post @638 on Piazza.

Other Announcements

Course information:


Lectures: Tuesday & Thursday 9AM - 10:20AM in NVIDIA Auditorium, Jen-Hsun Huang Engineering Center.
Watch video lectures on SCPD. Stanford students can see them here.


Jeff Ullman
Office: 425 Gates
Email: lastname @ gmail.com
Office Hours: Tuesday 10:30AM-Noon, Friday 10:30AM-Noon

Companion course CS246H:

There is a companion course, CS246H, which is completely independent of CS246 and covers Hadoop programming. It meets Tuesdays 3PM - 4:20PM, also in NVIDIA Auditorium.

Office hours:

Note: Jeff Ullman will not hold office hours on 1/26, 1/29, and 2/9.

Name                  Day        Time             Location
Jeff Ullman           Tuesday    10:30AM-noon     425 Gates
Jeff Ullman           Friday     10:30AM-noon     425 Gates
Caroline Suen         Monday     1:30PM-3:30PM    414 Gates
Duyun Chen            Monday     5PM-7PM          Huang Basement
Shubham Gupta         Tuesday    10:30AM-11:30AM  Huang Basement
Ivaylo Bahtchevanov   Tuesday    1PM-3PM          Huang Basement
Jacky Wang            Tuesday    4PM-6PM          Huang Basement
Leon Yao              Wednesday  11AM-1PM         Huang Basement
Himabindu Lakkaraju   Wednesday  2PM-4PM          448 Gates
Jeff Hwang            Wednesday  6PM-8PM          Huang Basement
Shubham Gupta         Thursday   10:30AM-11:30AM  Huang Basement
Tim Althoff           Thursday   3PM-5PM          414 Gates
Sameep Bagadia        Friday     3PM-5PM          Huang Basement
You Zhou              Friday     9AM-11AM         Huang Basement
Nihit Desai           Friday     1PM-3PM          Huang Basement

Course materials:

Automated Quizzes: We will be using Gradiance. Everyone should create an account there (passwords must contain at least 10 letters and digits, with at least one of each) and enter the class code 62B99A55. Please use your real first and last name with standard capitalization, e.g., "Jeffrey Ullman", so we can match your Gradiance score report to your other class grades.

Book: Mining of Massive Datasets, by Leskovec, Rajaraman, and Ullman, can be downloaded for free. It can also be purchased from Cambridge University Press, but you are not required to do so.

MOOC: There is a Coursera MOOC that is similar to this course. You may find it useful to view some of the videos there.

Piazza: Piazza Discussion Group for this class (access code "mmds").

Course handouts: Available here.