Research Positions in the SNAP Group
Winter Quarter 2017-18

Welcome to the application page for research positions in the SNAP group, Winter Quarter 2017-18!

Our group has open positions for students interested in independent studies or research (CS191, CS199, CS399). These positions are available for Stanford University students only. Below are some of the possible research projects. All projects are high-impact, allowing participants to perform research and work on real-world problems and data, and leading to research publications or working systems. We are looking for highly motivated students with a combination of the following skills: data mining, machine learning, algorithms, network analysis, and computer systems.

Please apply by filling out and submitting the form below. Thanks for your interest!

If you have any questions, please contact Prof. Leskovec at jure@cs.stanford.edu.

Application form

First and Last Name

SUNetID

SUNetID is your Stanford CS login name and contact email address, <your_SUNetID>@cs.stanford.edu. If you don't have a SUNetID, use <your_last_name>_<your_first_name>, so if your last name is Smith and your first name is John, use smith_john.

Email

Department

Student Status

Project(s)

Please select all the projects that you are interested in. You can find the project descriptions below.

Gender Bias in Medical Care [description]
Keywords: data mining, statistical analysis, NLP
Mining Data Science Patterns [description]
Keywords: big data analysis, data science, recommender systems
SNAP: Stanford Network Analysis Platform [description]
Keywords: network analysis, open-source software, graph algorithms, parallel algorithms

Statement of Purpose

Briefly explain why you would like to participate in this project, why you think you are qualified to work on it, and how you would like to contribute.

Your Resume

Your Transcript

Click the button below to submit.


Projects

Gender Bias in Medical Care

Keywords: data mining, statistical analysis, NLP

Using a dataset of 19,000 comments made by patients about roughly 300 physicians, this project aims to analyze what patients think of their physicians and whether gender bias exists in how male and female physicians are evaluated. To that end, we have a year's worth of patient comments about physicians for whom we have demographic data (gender, ethnicity, rank, track), as well as the gender and age of the evaluators (patients).

We are looking for students with an interest in social science and gender issues. Desired skills include statistical analysis and basic NLP-based methods.

Go to the application form.

Mining Data Science Patterns

Keywords: big data analysis, data science, deep learning

Data scientists often develop a standard set of software patterns to analyze data and gather insights from it. However, these patterns are often repetitive, and best practices are scattered across Stack Overflow, GitHub, and IPython notebooks. The goal of this project is to identify frequent data science patterns, understand their semantics, and develop a platform that guides a data scientist through the analysis of a given dataset. The challenge lies in identifying and extracting common patterns, developing a code similarity function, and building a code recommendation engine that leverages machine learning.
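To give applicants a concrete feel for what a "code similarity function" might look like, here is a minimal sketch in Python. It is an illustrative baseline only, not the project's method: it assumes that two snippets are similar when they share many lexical tokens, and it measures that overlap with the Jaccard index. The function names `code_tokens` and `jaccard_similarity` are hypothetical.

```python
import io
import tokenize


def code_tokens(source):
    """Return the set of lexical tokens in a Python snippet.

    Comments and layout tokens (newlines, indentation) are skipped so
    that only names, literals, and operators are compared.
    """
    skip = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return {tok.string for tok in tokens if tok.type not in skip}


def jaccard_similarity(snippet_a, snippet_b):
    """Jaccard index of the two token sets: |A & B| / |A | B|."""
    tokens_a = code_tokens(snippet_a)
    tokens_b = code_tokens(snippet_b)
    if not tokens_a and not tokens_b:
        return 1.0  # two empty snippets are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Toy snippets, invented for illustration.
load_csv = "import pandas as pd\ndf = pd.read_csv('data.csv')\n"
make_array = "import numpy as np\nx = np.zeros(3)\n"

same = jaccard_similarity(load_csv, load_csv)
different = jaccard_similarity(load_csv, make_array)
```

A token-set measure like this ignores token order and program semantics, which is exactly the gap a learned similarity function would aim to close.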

We are searching for students with a strong programming background (especially in Python), and an interest in data science and machine learning.

Go to the application form.

SNAP: Stanford Network Analysis Platform

Keywords: network analysis, open-source software, graph algorithms, parallel algorithms

Stanford Network Analysis Platform (SNAP) is a general-purpose, high-performance network analysis and graph mining library that scales to massive networks with billions of nodes and edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. SNAP is constantly being expanded with new graph and network algorithms for big-memory multi-core machines with 12 TB of RAM and 288 CPU cores.

We are looking for students with an interest in contributing to the SNAP codebase or in developing sequential or parallel graph algorithms. SNAP is written mostly in C++, so experience with this language is a plus.
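As a rough illustration of two of the operations mentioned above (generating a random graph and calculating a structural property), here is a small sketch in plain Python. It does not use SNAP or mirror its API; the function names `random_gnm_graph` and `degree_histogram` are hypothetical, and the approach is only practical for small graphs.

```python
import random
from collections import Counter


def random_gnm_graph(num_nodes, num_edges, seed=0):
    """Sample an undirected graph with exactly num_edges distinct edges.

    Nodes are the integers 0..num_nodes-1. Self-loops and duplicate
    edges are not allowed. Returns a set of (u, v) pairs with u < v.
    """
    max_edges = num_nodes * (num_nodes - 1) // 2
    if num_edges > max_edges:
        raise ValueError("too many edges for a simple graph")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < num_edges:
        u, v = rng.sample(range(num_nodes), 2)  # two distinct nodes
        edges.add((min(u, v), max(u, v)))
    return edges


def degree_histogram(num_nodes, edges):
    """Map each degree value to the number of nodes with that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Counter returns 0 for nodes that have no incident edges.
    return Counter(degree[node] for node in range(num_nodes))


edges = random_gnm_graph(num_nodes=10, num_edges=15)
histogram = degree_histogram(10, edges)
```

In any undirected graph the degrees sum to twice the number of edges, which gives a quick sanity check on the histogram.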

Go to the application form.