Social Network and Social Media Analysis: Methods, Models and Applications

Time	Topic
08:00 - 08:15	Opening Remarks
08:15 - 09:00	Invited Speaker:	Jennifer Chayes (Microsoft Research)
09:00 - 09:30	Coffee Break
09:30 - 10:30	Poster Spotlights
10:30 - 15:30	Poster Session/Ski Break
15:30 - 16:15	Invited Speaker:	Zoubin Ghahramani (University of Cambridge)
16:15 - 17:00	Invited Speaker:	Andrew Tomkins (Google Research)
17:00 - 17:30	Coffee Break
17:30 - 18:15	Invited Speaker:	Sharad Goel (Microsoft Research)
18:15 - 19:00	Closing Panel

Jennifer Chayes (Microsoft Research)

The Power of Locality for Network Algorithms

Abstract:

Given the massive size of most networks of current interest, conventional algorithms, which scale as polynomials in the network size, are woefully inadequate. This talk focuses on how to use locality to deliver much more efficient algorithms (quasilinear or even sublinear in the network size), even if this requires that we accept some degradation in the approximation relative to optimal algorithms. In first part of the talk, I consider systems that have only local data access. Many large networks are encoded locally, e.g., with adjacency lists. How does the nature of the data access impact the efficiency of algorithms? Surprisingly, we show that small differences in data access can lead to very large differences in efficiency and approximability of network algorithms. As an example, we consider a covering problem which arises naturally for recruiters on social networks, and show how small differences between the information on neighbors in LinkedIn and Facebook lead to large differences in their utility to recruiters. In the second part of the talk, I consider problems where there is no a priori limit on data access. Nevertheless, we show how for a host of questions - pagerank, coverage, diffusion, determining the most influential nodes in the independent cascade model - we can exploit locality to design highly efficient algorithms, often with quasilinear or sublinear runtimes. This talk represents several projects with Christian Borgs, Mickey Brautbar, Brendan Lucier and Shanghua Teng.

Bio:

Jennifer Tour Chayes is Distinguished Scientist and Managing Director of Microsoft Research New England in Cambridge, Massachusetts, which she co-founded in 2008, and Microsoft Research New York City, which she co-founded in 2012. Chayes was Research Area Manager for Mathematics, Theoretical Computer Science and Cryptography at Microsoft Research Redmond until 2008. Chayes joined Microsoft Research in 1997, when she co-founded the Theory Group. Her research areas include phase transitions in discrete mathematics and computer science, structural and dynamical properties of self-engineered networks, and algorithmic game theory. She is the co-author of over 110 scientific papers and the co-inventor of more than 25 patents.

Zoubin Ghahramani (University of Cambridge)

Probabilistic Models for Social Networks and Other Relational Data

Abstract:

Network data and more generally relational data encoding the pairwise relations between objects appear in many fields. For instance in biology, a protein network connects interacting partners, while in a social network, links among people indicate social relationships. The problems of analysing, understanding and modelling such networks have attracted interest from many research communities. I will briefly review some probabilistic approaches to modelling networks. The key idea behind many such models is that each object has certain latent features, and that observed links in the network depend on these latent features. Probabilistic inference allows one to discover the potentially unbounded number of latent features (including discovering communities as a special case), predict missing links, and generally learn about the statistical properties of the networks. Many of these models can be cast within the theoretical framework of exchangeable arrays established by Aldous, Hoover and Kallenberg. I will describe our work on a general network model (the Random Function Model) that instantiates this theory using Gaussian processes, and relate it to existing models. I will also discuss our work on the Infinite Latent Attribute (ILA) model which allows for a highly structured nonparametric latent variable representation of nodes in a network. Finally, I will describe our Latent Feature Propagation model for dynamic networks. What ties these models together is the idea that rich latent representations underlie the structure of networks, and that these can be discovered via Bayesian inference.

Joint work with Creighton Heaukulani, David A. Knowles, James Lloyd, Peter Orbanz, Konstantina Palla, and Dan Roy.

Bio:

Zoubin Ghahramani is Professor of Information Engineering at the University of Cambridge, UK, and is also Associate Research Professor of Machine Learning at Carnegie Mellon University, USA. He obtained BA and BSE degrees from University of Pennsylvania, and a PhD in 1995 from MIT working with Prof Mike Jordan. He did a postdoc in Computer Science at University of Toronto working with Prof Geoff Hinton. His work has included research on human sensorimotor control, cognitive science, statistics, and machine learning. His current focus is on Bayesian approaches to statistical machine learning, and has recently started a research programme on machine learning applications to information retrieval. He has published over 100 peer reviewed papers, and serves on the editorial boards of several leading journals in the field, including JMLR, JAIR, Annals of Statistics, Machine Learning, and Bayesian Analysis. He is Associate Editor in Chief of IEEE Transactions on Pattern Analysis and Machine Intelligence, currently the IEEE's highest impact journal. He also serves on the Board of the International Machine Learning Society, and was Program Chair of the 2007 International Machine Learning Conference.

Sharad Goel (Microsoft Research)

"Going Viral": The Structure of Online Diffusion

Abstract:

Viral products and ideas are intuitively understood to grow through a person-to-person diffusion process analogous to that of an infectious disease. Until recently, however, it has been prohibitively difficult to directly observe such viral events, and thus to rigorously quantify or characterize their structural properties. Here we propose a formal measure of what we label ``structural virality'' that interpolates between two extremes: content that gains its popularity through a single, large broadcast, and that which grows through multiple generations with any one individual directly responsible for only a fraction of total adoption. We use this notion of structural virality to analyze a unique dataset of several billion diffusion events---including the propagation of news stories, videos, images, and petitions---on a large online social network. We find that the very largest observed events nearly always exhibit high structural virality, providing some of the first direct evidence that many of the most popular products and ideas grow through person-to-person diffusion. However, medium-sized events---having thousands of adopters---exhibit surprising structural diversity, and are as likely to grow via broadcast as through viral means. These empirical findings are consistent with a contagion having subcritical infection rate spreading on a network with heavy-tailed degree distribution, which, notably, is a non-viral mechanism in the traditional, theoretical sense.

Bio:

Sharad Goel holds a PhD in Applied Mathematics and a Masters in Computer Science from Cornell, and a BS in Mathematics from the University of Chicago. Following postdoctoral positions in the math departments at Stanford and the University of Southern California, he joined the Microeconomics and Social Systems group at Yahoo! Research. Sharad is currently a Senior Researcher at Microsoft Research – New York City.

Andrew Tomkins (Google Research)

Social Network Modeling and Task Completion

Abstract

In this talk I'll describe two classical statistical estimation problems facing social network providers: estimation of edge weight, and prediction of engagement. I'll then offer some predictions about problems of interest to the next generation of social networks, related to the provision of assistance in completing tasks. Finally, I'll close with some questions about how network modeling algorithms could or should interact with the significant literature around network characterization.

Andrés Corrada-Emmanuel (Nanigans)

Accepted Papers

Learning Mixed Membership Stochastic Blockmodels

Anima Anandkumar, Rong Ge, Daniel Hsu and Sham Kakade
Identifying Influenza Trends via Twitter

Dena Asta and Cosma Shalizi
A Scalable Approach to Size-Independent Network Similarity

Michele Berlingerio, Danai Koutra, Tina Eliassi-Rad and Christos Faloutsos
Persistent Homology of Collaboration Networks

Corrie Jacobien Carstens and Kathy Horadam
Consistency of Co-clustering for Nonparametric Bipartite Graph(on) Data

David Choi and Patrick Wolfe
Consistent Estimation of Counting Processes on Sparse Networks

Alexander D'Amour and Edoardo Airoldi
Capturing Unobserved Correlated Effects in Diffusion in Large Virtual Networks

Elenna Dugundji, Ate Poorthuis and Michiel Van Meeteren
Mapping the Geographical Diffusion of New Words

Jacob Eisenstein, Brendan O'Connor, Noah A. Smith and Eric P. Xing
Sampling Strategies for the Scalable Inference of Communities

Prem Gopalan and David Blei
Winner of Best Student Poster Award
Learning Latent Factor Models of Human Travel

Michael Guerzhoy and Aaron Hertzmann
Social Group Modeling with Probabilistic Soft Logic

Bert Huang, Stephen Bach, Eric Norris, Jay Pujara and Lise Getoor
Causal Estimation of Peer Influence Effects

Edward Kao, Panos Toulis, Edoardo Airoldi and Donald Rubin
SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel and Zoubin Ghahramani
Valid Inference from Non-Ignorable Network Sampling Designs

Simon Lunagomez and Edoardo Airoldi
Learning to Discover Social Circles in Ego Networks

Julian McAuley and Jure Leskovec
Order Statistics of Observed Network Degrees

Sofia C. Olhede and Patrick Wolfe
Predicting US Primary Elections with Twitter

Lei Shi, Neeraj Agarwal, Ankur Agarwal, Rahul Garg and Jacob Spoelstra
Waiting for a Retweet: Modeling Waiting Times in Information Propagation

Emma Spiro, Christopher Dubois and Carter Butts
Winner of Best Student Paper Award
Network Exposure to Multiple Universes

Johan Ugander, Brian Karrer, Lars Backstrom and Jon Kleinberg
Modeling Dynamic Social Networks with Vertex Evolution via Latent Graphical Models

Ragupathyraj Valluvan, Zack Almquist, Animashree Anandkumar and Carter Butts
Model Selection for Degree-corrected Block Models

Xiaoran Yan, Jacob Jensen, Florent Krzakala, Cristopher Moore, Cosma Shalizi, Lenka Zdeborova, Pan Zhang and Yaojia Zhu
Oriented and Degree-generated Block Models: Generating and Inferring Communities with Inhomogeneous Degree Distributions

Yaojia Zhu, Xiaoran Yan and Cristopher Moore

Social Network and Social Media Analysis: Methods, Models and Applications

NIPS 2012 Workshop

December 7, 2012

Lake Tahoe, Nevada

Room: Glenbrook & Emerald Bay, Harrah’s

Organizers:

Motivation and Goals

Schedule

Opening Remarks

Coffee Break

Poster Spotlights

Poster Session/Ski Break

Coffee Break

Closing Panel

Confirmed Speakers

Jennifer Chayes (Microsoft Research)

The Power of Locality for Network Algorithms

Abstract:

Bio:

Zoubin Ghahramani (University of Cambridge)

Probabilistic Models for Social Networks and Other Relational Data

Abstract:

Bio:

Sharad Goel (Microsoft Research)

"Going Viral": The Structure of Online Diffusion

Abstract:

Bio:

Andrew Tomkins (Google Research)

Social Network Modeling and Task Completion

Abstract

Andrés Corrada-Emmanuel (Nanigans)

Accepted Papers