Research Positions in the SNAP Group
Autumn Quarter 2015-16

Welcome to the application page for research positions in the SNAP group, Autumn Quarter 2015-16!

Our group has several open positions for Research Assistants as well as for students interested in independent studies (CS191, CS199, CS399). These positions are available to Stanford University students only. Below is a list of relevant research projects. All the projects will lead to research publications and working demos/systems. We are looking for highly motivated students with any combination of skills: data mining, machine learning, algorithms, human-computer interaction, and computer systems.

Please apply by filling out and submitting the form below. Apply quickly, since the positions usually get filled early in the quarter. Thanks for your interest!

If you have any questions please contact Prof. Leskovec at

Application form

First and Last Name


SUNetID is your Stanford CS login name and contact email address, <your_SUNetID> If you don't have a SUNetID, use <your_last_name>_<your_first_name>, so if your last name is Smith and your first name is John, use smith_john.



Student Status


Please select all the projects that you are interested in. You can find the project descriptions below.

SnapVX: A Network-Based Convex Optimization Solver [description]
Combining Online Social Networks with Offline Physical Activity [description]
Using Machine Learning to Analyze and Complement Human Decision Making [description]
SNAP: Stanford Network Analysis Platform [description]
Ringo: In-memory Graph Exploration Engine [description]


Please select the position you are interested in. Please select all that apply.

25% RA
50% RA
Independent study (CS399, CS199, CS191)

Statement of Purpose

Briefly explain why you would like to participate in this project and why you think you are qualified to work on it.

Your Resume

Your Transcript

Click on the button below to Submit


SnapVX: A Network-Based Convex Optimization Solver

Convex optimization has become a widely used approach to modeling and solving problems in many different fields. However, as applications get larger and more intricate, classical methods of convex analysis begin to fail due to a lack of scalability. The challenge of large-scale optimization lies in developing methods general enough to work well independent of the input and capable of scaling to the immense datasets that today's applications require. We are building SnapVX, a general solver for large-scale convex problems defined on networks, which can be applied to a variety of problems in machine learning, graph analysis, and more!

We are looking for students with strong programming experience and an interest in convex optimization. Knowledge of Python is a plus.

Go to the application form.

Combining Online Social Networks with Offline Physical Activity

The growing popularity of smartphones and wearable sensors provides us with an unprecedented view of physical activity and health outcomes across millions of individuals. The data collected from such devices (e.g., exercise, food intake, hydration, stress, sleep, heart rate, weight) provides a unique opportunity to understand how to encourage healthy physical activity and weight management to stem the inactivity epidemic that more and more countries are facing.

We are social creatures and enjoy sharing our lives with our peers who also influence our behavior through encouragement, support, competition, or peer pressure. Online social networks capture human interactions on a grand scale and could enable us to better understand the behavioral and social factors that influence individuals' decisions to engage in regular physical activity. We are combining activity tracking data with online social interaction data to improve our understanding of how to make users and communities more successful and healthy.

We are searching for students interested in exploring these questions using an activity tracking dataset covering several million people. The work includes data exploration, data cleaning, data visualization, feature extraction, and machine learning. Experience with Python, large datasets, data visualization, social network analysis, data mining, and machine learning is a plus.

Go to the application form.

Using Machine Learning to Analyze and Complement Human Decision Making

Understanding human decision making is a challenging and exciting endeavor. Diverse domains such as health care, the judiciary, and insurance provide invaluable insights into human decision-making capabilities. The high-level goals of this project are to identify interesting patterns in collective as well as individual decision-making behavior, analyze how these patterns evolve over time, and build frameworks that help us evaluate the quality of decisions. We also aim to build interpretable machine learning models that allow us to understand as well as predict the future decisions of experts such as doctors, judges, and insurance underwriters.

More specifically, for the Autumn quarter of 2015, we will be looking at data from a large insurance company that insures small to mid-scale businesses. The dataset consists of reports written to analyze the risk associated with a business, the decisions on whether a particular business was insured, and the outcomes of each insurance policy after its approval.

We will analyze this dataset to answer several questions: How do the risk analysis teams of an insurance company make decisions, and what features do they consider? Can machine learning algorithms do a better job than humans at predicting the risks associated with insuring a business? How do we evaluate the decision-making ability of the risk analysis teams? How accurately can machine learning algorithms predict the future decisions of risk analysis teams?

We are looking for students who are interested in and have some experience with data exploration, data cleaning, visualization, feature extraction, and machine learning. Working knowledge of Python or R is required. Experience with Java and with large datasets is a plus.

Go to the application form.

SNAP: Stanford Network Analysis Platform

Stanford Network Analysis Platform (SNAP) is a general-purpose, high-performance network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is constantly being expanded with new graph and network algorithms for big-memory multi-core machines with 1TB RAM and 80 CPU cores.

We are looking for students with an interest in developing sequential or parallel graph algorithms. SNAP is written in C++, so extensive experience with this language is a plus.

Go to the application form.

Ringo: In-memory Graph Exploration Engine

What data analysis engine would you build if your computer had unlimited RAM and CPU? Large-scale data analysis is transforming science and industry. However, tools and solutions for data analysts are bulky and cumbersome to use. The goal of the Ringo research project is to build an interactive system for analysis of large datasets with billions of items. The system will implement strong primitives to handle relational tables as well as networks -- huge graphs with node and edge attributes. Ringo will be based on the SNAP platform. We will run Ringo on machines with 1TB RAM and 80 CPU cores.

We are looking for students with strong programming skills and a desire to build computer systems. Ringo is written in C++ and Python, so extensive experience with those languages is a plus.

Go to the application form.