Research Positions in the SNAP Group
Autumn Quarter 2014-15

Status Update:

The application process is now closed, and we are no longer accepting new applications. Thank you to everyone who applied for a research position in our group!

We were overwhelmed with applications, and in the end we were able to select only a few candidates. The competition was exceptionally tough and the decisions were very hard to make. If you have not heard from us, you were not selected for a position. We will have more openings in the future, and we encourage you to apply again at that time.


Welcome to the application page for research positions in the SNAP group, Autumn Quarter 2014-15!

Our group has several open positions for Research Assistants as well as for students interested in independent studies (CS191, CS199, CS399). These positions are available to Stanford University students only. Below is a list of relevant research projects. All the projects will lead to research publications and working demos/systems. We are looking for highly motivated students with any combination of skills: data mining, machine learning, algorithms, human-computer interaction, and computer systems.

Please apply by filling out and submitting the form below. Apply quickly since the positions usually get filled early in the quarter. Thanks for your interest!

If you have any questions, please contact Prof. Leskovec at jure@cs.stanford.edu.

Application form

First and Last Name

SUNetID

SUNetID is your Stanford CS login name and contact email address, <your_SUNetID>@cs.stanford.edu. If you don't have a SUNetID, use <your_last_name>_<your_first_name>, so if your last name is Smith and your first name is John, use smith_john.

Email

Department

Student Status

Project(s)

Please select all the projects that you are interested in. You can find the project descriptions below.

Machine Learning for Social Media Recommender Systems [description]
Missing Link Prediction on Wikipedia [description]
Do Birds of Feather Flock Together [description]
SNAP: Stanford Network Analysis Platform [description]
Ringo: In-memory Graph Exploration Engine [description]
Snapworld: A System for Processing Tera-Scale Graphs [description]

Position

Please select the position(s) you are interested in. Select all that apply.

25% RA
50% RA
Independent study (CS399, CS199, CS191)

Statement of Purpose

Briefly explain why you would like to participate in this project and why you think you are qualified to work on it.

Your Resume

Your Transcript

Click the button below to submit.


Projects

Machine Learning for Social Media Recommender Systems

Our social networks overload us with information, bombarding us with thousands of tweets, blog posts, and status updates every day. To cope with this "information overload", there is a need to identify content that users will find relevant, interesting, and important. This requires us to develop statistical models of user behavior in order to discover their preferences. The problem also involves large-scale machine learning and optimization tools in order to recommend meaningful content.

We are searching for students interested in machine learning, optimization, and statistical modeling, with strong algorithmic backgrounds. Strong coding experience in Python/C++ is a plus.

Go to the application form.

Missing Link Prediction on Wikipedia

Wikispeedia is an online human-computation game in which the goal is to find a short path between two given Wikipedia articles by clicking existing Wikipedia links. However, important links are often missing, and identifying them is known as the network completion, or link prediction, problem. The goal of this project is to predict missing links on Wikipedia using data collected through Wikispeedia. For example, if many users went through article A when looking for target T, but A has no link to T, then the method will suggest a new link from A to T. The project will then develop a framework for gamifying website navigation beyond Wikipedia.
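The "went through A when looking for T" idea can be illustrated with a simple counting heuristic. The Python sketch below is only an illustration under stated assumptions, not the project's actual method: the function name, the input format (each game as a list of article titles ending at the target), the min_count threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def suggest_missing_links(paths, links, min_count=2):
    """Return ((article, target), count) pairs seen in at least min_count games.

    paths: list of games, each a list of article titles whose last entry
           is the target article (assumed format, for illustration only).
    links: set of (source, destination) pairs for links that already exist.
    """
    counts = Counter()
    for path in paths:
        if len(path) < 2:
            continue  # a game needs at least a start article and a target
        target = path[-1]
        # Count each visited article once per game, even if it was revisited.
        for article in set(path[:-1]):
            if article != target and (article, target) not in links:
                counts[(article, target)] += 1
    # most_common() yields pairs in order of decreasing count.
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Toy data (made up for this example).
example_links = {
    ("Cat", "Mammal"), ("Mammal", "Animal"), ("Dog", "Mammal"),
    ("Pet", "Dog"), ("Cat", "Pet"),
}
example_paths = [
    ["Cat", "Mammal", "Animal"],
    ["Cat", "Pet", "Dog", "Mammal", "Animal"],
    ["Dog", "Mammal", "Animal"],
]
suggestions = suggest_missing_links(example_paths, example_links)
```

In this toy data, two games pass through "Cat" and two through "Dog" on the way to "Animal", and neither article links to "Animal" directly, so both pairs reach the threshold. A realistic approach would likely need to account for article popularity and path length rather than rely on raw counts.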

Programming experience in Java is desirable, since some existing code is written in Java. As the project will involve the gamification of Web-browsing, creative thinking and Web-programming experience (HTML, JavaScript, PHP, SQL) are a big plus.

Go to the application form.

Do Birds of Feather Flock Together: Exploring the Similarities and Differences in Online Behaviour Between Facebook Friends

Social scientists claim that friendship groups are usually homogeneous, that is, their members share similar traits and behaviours. This is usually explained by people being attracted to similar others and encouraging others to conform to the norms accepted in a given group. In reality, however, the extent to which people select and influence their friends is unknown. Social media makes it possible to observe people at an unprecedented scale, which allows this issue to be explored further. Which behaviours and preferences are shared between friends and which are not? Are some people 'incompatible' with each other?

We are searching for students interested in exploring these questions using a Facebook-based dataset of 6 million people. Programming experience in Python, C++ or R is necessary. Experience with databases and large sparse matrices is a big advantage. We expect you to write up and publish your results.

Go to the application form.

SNAP: Stanford Network Analysis Platform

Stanford Network Analysis Platform (SNAP) is a general-purpose, high-performance network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is constantly being expanded with new graph and network algorithms for big-memory multi-core machines with 1TB RAM and 80 CPU cores.

We are looking for students interested in developing sequential or parallel graph algorithms. SNAP is written in C++, so extensive experience with this language is a plus.

Go to the application form.

Ringo: In-memory Graph Exploration Engine

What data analysis engine would you build if your computer had unlimited RAM and CPU? Large-scale data analysis is transforming science and industry. However, tools and solutions for data analysts are bulky and cumbersome to use. The goal of the Ringo research project is to build an interactive system for the analysis of large datasets with billions of items. The system will implement strong primitives to handle relational tables as well as networks, that is, huge graphs with node and edge attributes. Ringo will be based on the SNAP platform. We will run Ringo on machines with 1TB RAM and 80 CPU cores.

We are looking for students with strong programming skills and a desire to build computer systems. Ringo is written in C++ and Python, so extensive experience with those languages is a plus.

Go to the application form.

Snapworld: A System for Processing Tera-Scale Graphs

Large graphs are fundamental to big data science and analytics. Processing such graphs is challenging because it pushes the limits of current computing systems. Snapworld is a distributed framework for executing large computations on a compute cluster with over 1000 cores, based on the BSP (Bulk Synchronous Parallel) model. The goal of the project is to advance Snapworld and develop graph algorithms that can handle tera-scale graphs, that is, graphs with trillions of edges.

We are looking for students with strong programming skills and a desire to build distributed computer systems and algorithms. Most of Snapworld is written in Python, with some time-sensitive modules in C++ using SNAP, so extensive experience with those languages is a plus.

Go to the application form.