** MAPPR** (Motif-based Approximate Personalized PageRank)
is an algorithmic framework for local higher-order clustering. Given any graph, a motif of interest,
and a target node, it can find a local cluster around this node with minimal motif conductance.

Local graph clustering methods are attractive because they enable
targeted clustering around a given seed node and are faster than traditional
global graph clustering methods because their runtime does not depend on the
size of the input graph.
However, current local graph partitioning methods are not designed to account
for the higher-order structures crucial to the network, nor can they effectively
handle directed networks.
In this project, we introduce a new class of local graph clustering methods that address
these issues by incorporating higher-order network information captured by small
subgraphs, also called *network motifs*.
We develop the Motif-based Approximate Personalized PageRank (MAPPR)
algorithm that finds clusters containing a seed node with minimal *motif conductance*.
We also generalize existing theory to prove the fast running time (independent of
the size of the graph) and obtain theoretical guarantees on the
cluster quality.

Local Higher-Order Graph Clustering.

Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich.

To appear in KDD, 2017.

Our goal is to find a subset of nodes around the seed node with small *motif conductance*. The
motif conductances of a set S of nodes with respect to a motif M is
φ_{M}(S) = cut_{M}(S, S) / min(vol_{M}(S),
vol_{M}(S)), where
cut_{M}(S, S) is the
number of instances of motif M with at least one node in S and one in
S,
and vol_{M}(S) is the number of nodes in instances of M that reside in S.

Our MAPPR method for finding sets S with small motif conductance is:

- Given a network and a motif M of interest, form the motif adjacency matrix whose (i, j) entries are the number of times that nodes i and j co-occur in instances of M.
- Compute ordering on nodes based on the approximated Personalized PageRank vector seeded at the target node.
- Find the prefix set S of σ with the smallest motif conductance; formally:
S = argmin φ
_{M}(S_{r}), where S_{r}= {σ_{1}, ..., σ_{r}}.

We also compare the recovery performance of the two methods on real-world networks with ground truth communities. Although these graphs have as many as 1.8 billion edges, the MAPPR method takes at most a few seconds per seed once the graph is in memory. Details of these experiments are explained in the paper.