Research Positions in the SNAP Group
Autumn Quarter 2021-22

Welcome to the application page for research positions in the SNAP group under Prof. Jure Leskovec, Autumn Quarter 2021-22!

Our group has open positions for Research Assistants and students interested in independent studies and research (CS191, CS195, CS199, CS399). These positions are available to Stanford University students only. Below are some of the possible research projects. All projects are high-impact: participants perform research on real-world problems and data, leading to research publications or open-source software. Positions are often extended over several quarters. We are looking for highly motivated students with any combination of the following skills: machine learning, data mining, network analysis, algorithms, and computer systems.

Please apply by filling out and submitting the form below. Positions usually fill early in the quarter, so apply soon. Thanks for your interest!

If you have any questions, please contact Sue George at smgeorge@stanford.edu.

Application form

First and Last Name

SUNetID

SUNetID is your Stanford CS login name and contact email address, <your_SUNetID>@cs.stanford.edu. If you don't have a SUNetID, use <your_last_name>_<your_first_name>, so if your last name is Smith and your first name is John, use smith_john.

Email

Department

Student Status

Project(s)

Please select all the projects that you are interested in. You can find the project descriptions below.

Robust Embeddings for Multiple Downstream Tasks [description]
Keywords: representation learning, multi-task learning
Learning to Speed Up Simulations of Large-Scale Systems [description]
Keywords: large-scale simulation, speed up, graph neural networks
Graph-Based Architecture for Time-Aware Machine Meta-Learning [description]
Keywords: representation learning, temporal reasoning, meta-learning
Neural-Symbolic Visual Concept Reasoning [description]
Keywords: concept reasoning, neural-symbolic, program synthesis, few-shot learning

Position

Please select the position(s) you are interested in. Select all that apply.

25% RA
50% RA
Independent study (CS399, CS199, CS191, CS195)

Statement of Purpose

Briefly explain why you would like to participate in the selected project(s), why you think you are qualified to work on them, and how you would like to contribute.

Your Resume

Your Transcript

Click the button below to submit.


Projects

Robust Embeddings for Multiple Downstream Tasks

Keywords: representation learning, multi-task learning

A common paradigm for machine learning in industrial settings is to (1) learn general-purpose entity embeddings, then (2) use these embeddings as input to downstream machine learning models, which may have diverse goals, from product rating prediction to fraudulent user detection. In such settings, embeddings are updated over time to incorporate new data trends or improved model architectures, giving rise to temporal distribution shifts. Traditional multi-task learning techniques handle such shifts by assuming that (1) and (2) are tightly coupled, e.g., by training both steps end-to-end. In reality, the process is highly asynchronous: the set of entity embeddings constantly evolves while downstream models are updated infrequently, making such methods infeasible for real-world use. Our goals are to (1) develop a realistic experimental setting that captures the unique issues arising in industrial multi-task settings, and (2) develop a multi-task pipeline for updating embeddings over time so that downstream models can benefit from improved embeddings while minimizing retraining and coupling between upstream and downstream models.
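To make the decoupling problem concrete, here is a minimal sketch of the two-step paradigm on synthetic data: a downstream model trained on one generation of embeddings, and a lightweight least-squares alignment that lets it consume a newer embedding generation without retraining. All data, names, and the linear "downstream model" are hypothetical stand-ins, not the project's actual pipeline (which would use PyTorch and real task heads).

```python
import numpy as np

rng = np.random.default_rng(0)

# Step (1): two generations of entity embeddings (synthetic data).
# v1 was used to train the downstream model; v2 is a later refresh.
n_entities, d = 100, 16
emb_v1 = rng.normal(size=(n_entities, d))
R = rng.normal(size=(d, d))                     # unknown drift between generations
emb_v2 = emb_v1 @ R + 0.01 * rng.normal(size=(n_entities, d))

# Step (2): a frozen downstream model trained on v1 embeddings
# (a fixed linear scorer stands in for an arbitrary task head).
w_downstream = rng.normal(size=d)

def downstream_score(x):
    return x @ w_downstream

# Instead of retraining the downstream model on v2, fit a light
# alignment map from v2 back to v1 space by least squares on
# entities shared across both generations.
W_align, *_ = np.linalg.lstsq(emb_v2, emb_v1, rcond=None)

aligned = emb_v2 @ W_align
err = np.abs(downstream_score(aligned) - downstream_score(emb_v1)).mean()
print(f"mean score drift after alignment: {err:.4f}")
```

The point of the sketch is the asynchrony described above: the embedding table changes, the downstream model does not, and only a small adapter is (re)fit between them.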

We are looking for highly motivated students who have experience in machine learning and deep learning (e.g., CS224W, CS229, CS224N, CS231N), and are familiar with PyTorch.

Go to the application form.

Learning to Speed Up Simulations of Large-Scale Systems

Keywords: large-scale simulation, speed up, graph neural networks

Simulating the evolution of large-scale systems with millions of nodes and complicated interactions over time is pivotal in many scientific and engineering domains, e.g., fluid dynamics, laser-plasma interaction, and weather prediction, and usually requires high-performance computing (HPC). In this project, our goal is to develop machine learning methods that can speed up such simulations by at least a hundredfold. We will build on our prior work on learning to simulate complex physics with graph neural networks, and design techniques and architectures that exploit the multiscale characteristics of the system under study to achieve a good accuracy-versus-speedup tradeoff. Interested students will participate in one of two applications: speeding up the simulation of laser-plasma interactions, a key problem in plasma physics, or speeding up the simulation of large-scale oil reservoir dynamics, which is crucial in energy resource engineering.
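For a rough sense of what "learning to simulate with graph neural networks" means, here is a toy sketch of one message-passing simulation step on a particle system: build a radius graph, compute per-edge messages, aggregate them per node, and integrate. The random linear map stands in for a trained message MLP; everything here is illustrative, not the project's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy system: N particles with 2-D positions and (initially zero) velocities.
N = 8
pos = rng.normal(size=(N, 2))
vel = np.zeros((N, 2))

# Build edges between particles within a cutoff radius (the interaction graph).
cutoff = 1.5
diff = pos[:, None, :] - pos[None, :, :]
dist = np.linalg.norm(diff, axis=-1)
src, dst = np.nonzero((dist < cutoff) & (dist > 0))

# One message-passing step: each edge carries a "learned" message
# (a fixed random linear map stands in for a trained MLP), and each
# node aggregates its incoming messages, as in learned simulators.
W_msg = 0.1 * rng.normal(size=(2, 2))
messages = diff[src, dst] @ W_msg          # per-edge features
agg = np.zeros_like(pos)
np.add.at(agg, dst, messages)              # sum messages per receiving node

# Integrate: the predicted per-node update acts like an acceleration.
dt = 0.1
vel = vel + dt * agg
pos_next = pos + dt * vel
print("max displacement:", np.abs(pos_next - pos).max())
```

A trained version of this loop replaces expensive per-timestep physics solves with a cheap learned update, which is where the targeted speedups come from.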

We are looking for students who are interested in large-scale machine learning, want to work in the area of machine learning with graphs, and are proficient with PyTorch. Experience with GNNs (e.g., CS224W), CNNs (e.g., CS231N), and time-series/sequence models is recommended.

Go to the application form.

Graph-Based Architecture for Time-Aware Machine Meta-Learning

Keywords: representation learning, temporal reasoning, meta-learning

Although neural network-based approaches have achieved great success, their ability to deal with time and temporal reasoning is severely limited. They lack an understanding of the temporal aspects of learned knowledge, so they cannot perform even simple time-based tasks, such as changing weights or attention over time. The goal of this project is to develop a new time-aware meta-learning neural architecture that can "think in and about time", exercise its learned task knowledge to exceed the state of the art in task performance, and support long-term learning. We aim to achieve this by modeling complex time-dependent event dependencies and time-based causal awareness via temporal graphical relations. The project plans to develop the following novel capabilities: maintaining long-lasting knowledge about relevant past events; time-based causal awareness of actions; transfer learning of time-invariant and time-conditioned knowledge; and mental time travel that supports recall of past events as well as reasoning about future events.

We are looking for self-motivated students with experience and background knowledge in machine learning and deep learning, especially GNNs and meta-learning (CS224W, CS330, CS230/CS231N/CS224N), who are familiar with PyTorch.

Go to the application form.

Neural-Symbolic Visual Concept Reasoning

Keywords: concept reasoning, neural-symbolic, program synthesis, few-shot learning

Humans are good at forming succinct representations of raw visual inputs, performing complicated reasoning over those representations, and then solving a wide variety of tasks or learning new tasks from very few examples. To enable such capabilities in machines, we believe a key component is to learn visual concepts that are compositional and transferable across tasks, along with an efficient way to manipulate such concepts. In this project, we combine tools from neural networks, e.g., graph neural networks, with program synthesis via reinforcement learning to demonstrate the feasibility of such a system, and use it to solve two challenging tasks that require both learning a good representation and reasoning from few examples.

We are looking for highly motivated students who have experience in machine learning and deep learning (e.g., CS224W, CS229, CS224N, CS231N) and are proficient with PyTorch.

Go to the application form.