MLG 2013, Eleventh Workshop on Mining and Learning with Graphs

Invited Speakers:

Evimaria Terzi (Boston University)

Evimaria Terzi is an assistant professor in the Computer Science Department at Boston University. Before joining BU in 2009, she was a research scientist at IBM Almaden Research Center. Evimaria received her PhD from the University of Helsinki, Finland, and her MSc from Purdue University. Evimaria is a recipient of the Microsoft Faculty Fellowship (2010) and the NSF CAREER award (2013). Her research interests span a wide range of data-mining topics, including algorithmic problems arising in recommendation systems, online social networks, and social media.

The Dynamics of Opinion Formation in Social Networks
The process of opinion formation through synthesis and contrast of different viewpoints has been the subject of many studies in economics and social sciences. Today, this process manifests itself also in online social networks and social media. The key characteristic of successful promotion campaigns is that they take into consideration such opinion-formation dynamics in order to create an overall favorable opinion about a specific information item, such as a person, a product, or an idea. In this talk, we will review models of opinion dynamics and give a game-theoretic viewpoint to the opinion-formation process. Moreover, we will formalize the campaign-design problem as the problem of identifying a set of target individuals whose positive opinion about an information item will maximize the overall positive opinion for the item in the social network. From the technical point of view, we will discuss different variants of such campaign-design problems and analyze their computational difficulties as well as their applicability in practical settings.

Sam Shah (LinkedIn)

Sam Shah is responsible for many of LinkedIn's large-scale recommendation and analytics systems, which analyze hundreds of terabytes of data daily to produce products and insights that serve LinkedIn's members. His work involves pure research, product-focused features, and infrastructure development, including social network analysis, recommendation engines, distributed systems, and grid computing.

Large-Scale Graph Mining for Recommendations
The availability and affordability of large-scale data processing is transforming graph mining into a core production use case, especially in the consumer web space. At LinkedIn, the largest professional online social network with 225+ million members, a crucial characteristic is the use of static and temporal network features for many applications, particularly recommendations. These include "People You May Know", a link prediction system to find other members on the network; "Endorsements", a lightweight skill reputation product; "Related Searches", query recommendations in our search engine; and more. How do we perform this graph mining at scale? What are some of the challenges we face? Besides the social graph, what about other interesting, but potentially more complex and larger graphs? In this talk, I will illustrate several of LinkedIn's solutions in large-scale graph mining.

Jure Leskovec (Stanford)

Jure Leskovec is an assistant professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. The problems he investigates are motivated by large-scale data, the Web, and online media. This research has won several awards, including best paper awards at KDD, WSDM, ICDM, and WWW, the ACM KDD dissertation award, a Microsoft Research Faculty Fellowship, and an Alfred P. Sloan Fellowship. Jure received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, and his Ph.D. in machine learning from Carnegie Mellon University, and completed postdoctoral training at Cornell University. You can follow him on Twitter @jure.

Analyzing and Influencing the Evolution of Online Communities
The activity of millions of humans on the Web leaves massive digital traces that can be naturally represented and analyzed as complex dynamic networks of human interactions. Today the Web is a "sensor" that captures the pulse of humanity and allows us to observe phenomena that were once essentially invisible to us: the social interactions and collective behavior of hundreds of millions of people. In this talk we discuss how large-scale data analytics can be applied to model user behavior in online networks and to inform the design of future online computing applications: How will a community or a social network evolve in the future? How do friends in the network shape one's opinions? How can we create incentives to influence the evolution of an online community? We discuss algorithmic methods that scale to massive networks and mathematical models that seek to abstract some of the underlying phenomena.

Tina Eliassi-Rad (Rutgers)

Tina Eliassi-Rad is an Associate Professor of Computer Science at Rutgers University. Before joining academia, she was a Member of Technical Staff and Principal Investigator at Lawrence Livermore National Laboratory. Tina earned her Ph.D. in Computer Sciences (with a minor in Mathematical Statistics) at the University of Wisconsin-Madison. Within data mining and machine learning, Tina's research has been applied to the World-Wide Web, text corpora, large-scale scientific simulation data, complex networks, and cyber situational awareness. She has published over 50 peer-reviewed papers (including a best paper runner-up award at ICDM'09 and a best interdisciplinary paper award at CIKM'12) and has given over 80 invited presentations. Tina is an action editor for the Data Mining and Knowledge Discovery Journal. In 2010, she received an Outstanding Mentor Award from the US DOE Office of Science and a Directorate Gold Award from Lawrence Livermore National Laboratory for work on cyber situational awareness. Visit http://eliassi.org for more details.

Measuring Tie Strength in Implicit Social Networks
Given a set of people and a set of events attended by them, we address the problem of measuring connectedness or tie strength between each pair of persons. The underlying assumption is that attendance at mutual events gives an implicit social network between people. We take an axiomatic approach to this problem. Starting from a list of axioms that a measure of tie strength must satisfy, we characterize functions that satisfy all the axioms. We then show that there is a range of tie-strength measures that satisfy this characterization. A measure of tie strength induces a ranking on the edges of the social network (and on the set of neighbors for every person). We show that for applications where the ranking, and not the absolute value of the tie strength, is the important aspect of the measure, the axioms are equivalent to a natural partial order. To settle on a particular measure, we must make a non-obvious decision about extending this partial order to a total order. This decision is best left to particular applications. We also classify existing tie-strength measures according to the axioms that they satisfy, and observe that none of the "self-referential" tie-strength measures satisfy the axioms. In our experiments, we demonstrate the efficacy of our approach, show the completeness and soundness of our axioms, and present the Kendall tau rank correlation between various tie-strength measures.
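
As a rough illustration of the setting described in this abstract, the sketch below derives an implicit network from event attendance, scores each pair of people with two simple tie-strength measures, and compares the rankings they induce with a Kendall tau statistic. The attendance data, the two measures (`common_events` and `size_weighted`), and the tie-ignoring tau-a variant are all assumptions made for illustration; they are not claimed to be the measures or axioms analyzed in the talk.

```python
from itertools import combinations

# Hypothetical attendance data: event name -> set of attendees.
events = {
    "seminar": {"ann", "bob", "cat"},
    "dinner": {"ann", "bob"},
    "conference": {"ann", "bob", "cat", "dan"},
}

def common_events(u, v):
    """Tie strength as the number of events both people attended."""
    return sum(1 for members in events.values() if u in members and v in members)

def size_weighted(u, v):
    """Tie strength where each shared event counts 1 / (number of attendee pairs),
    so that small events contribute more than large ones."""
    total = 0.0
    for members in events.values():
        if u in members and v in members:
            n = len(members)
            total += 1.0 / (n * (n - 1) / 2)
    return total

def kendall_tau_a(xs, ys):
    """Kendall tau-a: (concordant - discordant) / total pairs; ties count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs

people = sorted(set().union(*events.values()))
pairs = list(combinations(people, 2))
scores_a = [common_events(u, v) for u, v in pairs]
scores_b = [size_weighted(u, v) for u, v in pairs]
tau = kendall_tau_a(scores_a, scores_b)
print(pairs)
print(scores_a, scores_b, tau)
```

On this toy data the two measures never rank a pair in opposite orders, so the statistic is positive; measures that disagree on the ordering of edges would pull it toward zero or below.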

Evgeniy Gabrilovich (Google)

Dr. Evgeniy Gabrilovich is a senior staff research scientist at Google, where he works on knowledge discovery from the web. Prior to joining Google in 2012, he was a director of research and head of the natural language processing and information retrieval group at Yahoo! Research. Evgeniy is an ACM Distinguished Scientist (2012), and is a recipient of the 2010 Karen Sparck Jones Award for his contributions to natural language processing and information retrieval. He served as an area chair or senior program committee member at numerous major conferences, including SIGIR, WWW, WSDM, AAAI, IJCAI, ACL, EMNLP, CIKM, ICDM and ICWSM. He has organized a number of workshops and taught multiple tutorials at SIGIR, ACL, WWW, WSDM, ICML, IJCAI, AAAI, CIKM and EC. Evgeniy earned his PhD in computer science from the Technion - Israel Institute of Technology.

Understanding the Web using Big Knowledge
Google's Knowledge Graph contains over half a billion entities and over 18 billion facts and connections. The Knowledge Graph can grow via human contributions, linking to existing knowledge repositories, and automatic acquisition of knowledge from the Internet. In this talk, we will discuss the frontiers of research in knowledge discovery on the Web. We will also discuss new functionalities that become possible due to deeper, knowledge-based text understanding, including proactively fetching relevant information and entity-based services.

David Gleich (Purdue)

Professor Gleich is interested in how we can utilize matrix algebra to express -- and improve -- algorithms in network analysis and data-based simulation analysis. Matrix algebra is a particularly attractive paradigm to study these procedures as it often gives rise to efficient computational procedures in a variety of settings (serial, parallel, streaming). This research straddles a few different areas and often involves working with large datasets on high performance computing architectures (e.g. MPI clusters) and data computing architectures (e.g. MapReduce).

Personalized PageRank based Community Detection
Personalized PageRank is a reasonably well-known technique to find a community in a network starting from a single node. It works by approximating the stationary distribution of a resetting random walk and using that stationary distribution to estimate the presence of nearby cuts in the graph. I'll discuss recent work on how to use a personalized PageRank community to quickly estimate the sets of best conductance anywhere in the graph, as well as how to find a good set of seeds to cover the entire graph with personalized PageRank communities.
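
The two steps named in this abstract, approximating the stationary distribution of a resetting random walk and then using it to look for a nearby low-conductance cut, can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's method: it uses plain power iteration rather than a faster local approximation, a standard sweep over degree-normalized scores, and a hypothetical toy graph (two triangles joined by one edge), reset probability `alpha`, and function names chosen for this example. The graph is assumed to be undirected with no self-loops or isolated nodes.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power iteration for a random walk that resets to `seed` with probability alpha."""
    p = {v: 0.0 for v in adj}
    p[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[seed] += alpha  # reset mass returns to the seed
        for u, mass in p.items():
            share = (1.0 - alpha) * mass / len(adj[u])
            for v in adj[u]:
                nxt[v] += share  # remaining mass spreads evenly to neighbors
        p = nxt
    return p

def sweep_cut(adj, p):
    """Order nodes by p[v] / degree(v) and return the prefix set of lowest conductance."""
    order = sorted(adj, key=lambda v: p[v] / len(adj[v]), reverse=True)
    total_vol = sum(len(adj[v]) for v in adj)
    in_set, vol, cut = set(), 0, 0
    best_set, best_phi = None, float("inf")
    for v in order[:-1]:  # skip the full vertex set, whose complement is empty
        in_set.add(v)
        vol += len(adj[v])
        inside = sum(1 for w in adj[v] if w in in_set)
        # Edges from v into the set stop being cut; its other edges become cut.
        cut += len(adj[v]) - 2 * inside
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_phi, best_set = phi, set(in_set)
    return best_set, best_phi

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}
scores = personalized_pagerank(adj, seed=0)
community, phi = sweep_cut(adj, scores)
print(sorted(community), phi)
```

Seeding the walk at node 0 concentrates probability on the left triangle, so the sweep recovers that triangle as the community, cut off from the rest of the graph by the single bridging edge.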