CS246
Mining Massive Data Sets
Winter 2017
The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis will be on MapReduce as a tool for creating parallel algorithms that can process data at this scale.
Announcements:
- 1/9: The first class will be held at 3pm on Tuesday, January 10, in NVIDIA Auditorium, Huang Engineering Center.
- 1/9: HW0 (Hadoop tutorial) is out, due on January 19 at 11:59pm.
- 1/10: GHW1 has been assigned on Gradiance, due on January 19 at 11:59pm.
- 1/10: Time and location of linear algebra review session: January 13, 3:00pm to 4:20pm in Gates B03.
- 1/10: Time and location of probability and statistics review session: January 20, 3:00pm to 4:20pm in Gates B03.
- 1/10 [for SCPD students]: We'd like to clarify that SCPD students should use Gradescope (for assignments) and Gradiance (for automated quizzes) just like the on-campus students. Please don't submit your work through SCPD. For more details, please see the course information page.
- 1/12: We are organizing a VM clinic to help students set up their VMs. Daniel Templeton will be at the session, assisted by several other TAs. Time and Location: January 16 (coming Monday), 6PM to 9PM in Gates 415.
- 1/12: HW1 is out, due on January 26 at 11:59pm.
- 1/24: GHW2 and GHW3 have been assigned on Gradiance. GHW2 is due on January 26 at 11:59pm; GHW3 is due on February 2 at 11:59pm.
- 1/27: HW2 is out, due on February 09 at 11:59pm.
- 1/31: GHW4 has been assigned on Gradiance, due on February 9 at 11:59pm.
- 1/31: GHW5 has been assigned on Gradiance, due on February 16 at 11:59pm.
- 2/09: HW3 is out, due on February 23 at 11:59pm.
- 2/14: GHW6 has been assigned on Gradiance, due on February 23 at 11:59pm.
- 2/22: GHW7 has been assigned on Gradiance, due on March 2 at 11:59pm.
- 2/23: HW4 is out, due on March 09 at 11:59pm.
- 2/24: 2016 final exam released for practice. You can find solutions to this exam here.
- 2/28: GHW8 has been assigned on Gradiance, due on March 9 at 11:59pm.
- 3/07: GHW9 has been assigned on Gradiance, due on March 16 at 11:59pm.
Course information:
Lectures:
Tuesday & Thursday 3:00pm - 4:20pm in NVIDIA Auditorium, Jen-Hsun Huang Engineering Center.
Watch video lectures on SCPD. Stanford students can see them here.
Instructor:
Jeff Ullman
Office: 425 Gates
Email: lastname @ gmail.com
Office Hours: Tuesday 1:00-2:30pm, Friday 10:30am-12:00pm
Companion course CS246H:
There is a companion course, CS246H, which is completely independent from CS246 and covers Hadoop programming. It meets Wednesdays 11:30am - 1:20pm in Skilling Auditorium.
Office hours:
SCPD students can join the office hours via Google Hangouts. The hangouts link is available on Piazza.
Note: Jeff Ullman will not be able to hold office hours on 3/14 as he is traveling during this time. On 3/17, his office hours are by appointment only. He will hold extra office hours the morning of the final exam 3/21 from 10AM-noon by appointment.
Name | Day | Hours | Location |
Jeff Ullman | Tuesday | 1:15pm-2:30pm | 425 Gates |
Jeff Ullman | Friday | 10:30am-12:00pm | 425 Gates |
Michael Zhu | Monday | 10:00am-12:00pm | Huang Basement |
Rishabh Bhargava | Monday | 3:00pm-5:00pm | Huang Basement |
Jessica Su | Monday | 5:00pm-7:00pm | Huang Basement |
Naveen Arivazhagan | Monday | 7:00pm-9:00pm | Huang Basement |
Anthony Kim | Tuesday | 9:30am-11:30am | Huang Basement |
Nihit Desai | Tuesday | 4:30pm-6:30pm | Huang Basement |
Leon Yao | Wednesday | 12:00pm-2:00pm | Huang Basement |
Yixin Wang | Wednesday | 2:00pm-4:00pm | Huang Basement |
Vinaya Polamreddi | Thursday | 10:30am-12:30pm | Huang Basement |
Yixin Cai | Thursday | 1:00pm-3:00pm | Huang Basement |
Junwei Yang | Friday | 9:00am-11:00am | Huang Basement |
Sachin Padmanabhan | Friday | 11:00am-1:00pm | Huang Basement |
Luda Zhao | Friday | 1:00pm-3:00pm | Huang Basement |
Course materials:
Automated Quizzes: We will be using Gradiance. Everyone (on-campus as well as SCPD students) should create an account there (passwords are at least 10 letters and digits with at least one of each) and enter the class code 380CE054. Please use your real first and last name, with the standard capitalization, e.g., "Jeffrey Ullman". Also, please register using your Stanford email or the same email you used for Gradescope so we can match your Gradiance score report to other class grades.
Books: Leskovec-Rajaraman-Ullman: Mining of Massive Datasets can be downloaded for free. It can be purchased from Cambridge University Press, but you are not required to do so.
MOOC: You can watch videos from a past Coursera MOOC (similar to this course) on YouTube.
Piazza: Piazza Discussion Group for this class (access code "mmds").
Course handouts: Available here.
Staff Email: You can reach us at cs246-win1617-staff@lists.stanford.edu
Previous versions of the course
CS246: Winter 2016
CS246: Winter 2015
CS246: Winter 2014
CS246: Winter 2013
CS246: Winter 2012
CS246: Winter 2011
CS345a: Winter 2010