Quick Introduction to SNAP

Graph and Network Types

SNAP supports graphs and networks. Graphs describe topologies. That is nodes with unique integer ids and directed/undirected/multiple edges between the nodes of the graph. Networks are graphs with data on nodes and/or edges of the network. Data types that reside on nodes and edges are simply passed as template parameters which provides a very fast and convenient way to implement various kinds of networks with rich data on nodes and edges.

Graph types in SNAP:

TUNGraph: undirected graph (single edge between an unordered pair of nodes) TNGraph: directed graph (single directed edge between an ordered pair of nodes) TNEGraph: directed multi-graph (any number of directed edges between a pair of nodes)

Network types in SNAP:

TNodeNet<TNodeData>: like TNGraph but with TNodeData object for each node TNodeEDatNet<TNodeData, TEdgeData>: like TNGraph but with TNodeData on each node and TEdgeData on each edge TNodeEdgeNet<TNodeData, TEdgeData>: like TNEGraph but with TNodeData on each node and TEdgeData on each edge TBigNet<TNodeData>: memory efficient implementation of TNodeNet that avoids memory fragmentation and handles billions of edges (depending on how much RAM you have)

An example of how to create a network with rich data on its nodes can be found in imdbnet.h, which is part of SNAP. imdbnet.h provides an example of a IMDB actors-to-movies bipartite network with movie and actor data on the nodes of the network.

Graph Creation

Example of how to create and use a directed graph:

// create a graph PNGraph Graph = TNGraph::New(); Graph->AddNode(1); Graph->AddNode(5); Graph->AddNode(32); Graph->AddEdge(1,5); Graph->AddEdge(5,1); Graph->AddEdge(5,32);

Nodes have explicit (and arbitrary) node ids. There is no restriction for node ids to be contiguous integers starting at 0. In a multi-graph TNEGraph edges have explicit integer ids. In TUNGraph and TNGraph edges have no explicit ids -- edges are identified by a pair node ids.

SNAP uses smart-pointers (TPt) so there is not need to explicitly free (delete) graph objects. They self-destruct when they are not needed anymore. Prefix P in the class name stands for a pointer, while T means a type.

Networks are created in the same way as graphs.

Iterators

Many SNAP operations are based on node and edge iterators which allow for efficient implementation of algorithms that work on networks regardless of their type (directed, undirected, graphs, networks) and specific implementation.

Some examples of iterator usage are shown below:

// create a directed random graph on 100 nodes and 1k edges PNGraph Graph = TSnap::GenRndGnm<PNGraph>(100, 1000); // traverse the nodes for (TNGraph::TNodeI NI = Graph->BegNI(); NI < Graph->EndNI(); NI++) { printf("node id %d with out-degree %d and in-degree %d\n", NI.GetId(), NI.GetOutDeg(), NI.GetInDeg()); } // traverse the edges for (TNGraph::TEdgeI EI = Graph->BegEI(); EI < Graph->EndEI(); EI++) { printf("edge (%d, %d)\n", EI.GetSrcNId(), EI.GetDstNId()); } // we can traverse the edges also like this for (TNGraph::TNodeI NI = Graph->BegNI(); NI < Graph->EndNI(); NI++) { for (int e = 0; e < NI.GetOutDeg(); e++) { printf("edge (%d %d)\n", NI.GetId(), NI.GetOutNId(e)); } }

All graph and network datatypes define node and edge iterators. In general graph/network data types use the following functions to return various iterators:

BegNI(): iterator to first node EndNI(): iterator to one past last node GetNI(u): iterator to node with id u BegEI(): iterator to first edge EndEI(): iterator to one past last edge GetEI(u,v): iterator to edge (u,v) GetEI(e): iterator to edge with id e (only for multigraphs)

In general node iterators provide the following functionality:

GetId(): return node id GetOutDeg(): return out-degree of a node GetInDeg(): return in-degree of a node GetOutNId(e): return node id of the endpoint of e-th out-edge GetInNId(e): return node id of the endpoint of e-th in-edge IsOutNId(int NId): do we point to node id n IsInNId(n): does node id n point to us IsNbhNId(n): is node n our neighbor

In addition, iterators over networks also allow for easy access to the data:

GetDat(): return data type TNodeData associated with the node GetOutNDat(e): return data associated with node at endpoint of e-th out-edge GetInNDat(e): return data associated with node at endpoint of e-th in-edge GetOutEDat(e): return data associated with e-th out-edge GetInEDat(e): return data associated with e-th in-edge

For additional information on node (TNodeI) and edge (TEdgeI) iterators see the implementation of various graph and network data types in graph.h and network.h.

Input/Output

With SNAP it is easy to save and load networks in various formats. Internally SNAP saves networks in compact binary format but functions for loading and saving networks in various other text and XML formats are also available (see gio.h).

For example:

// generate a network using Forest Fire model PNGraph G = TSnap::GenForestFire(1000, 0.35, 0.35); // save and load binary { TFOut FOut("test.graph"); G->Save(FOut); } { TFIn FIn("test.graph"); PNGraph G2 = TNGraph::Load(FIn); } // save and load from a text file TSnap::SaveEdgeList(G2, "test.txt", "Save as tab-separated list of edges"); PNGraph G3 = TSnap::LoadEdgeList("test.txt", 0, 1);

Manipulating Graphs and Networks

SNAP provides rich functionality to efficiently manipulate graphs and networks. Functions are implemented as a part of namespace TSnap. All functions support any graph/network type.

For example:

// generate a network using Forest Fire model PNGraph G = TSnap::GenForestFire(1000, 0.35, 0.35); // convert to undirected graph TUNGraph PUNGraph UG = TSnap::ConvertGraph<PUNGraph>(G); // get largest weakly connected component of G PNGraph WccG = TSnap::GetMxWcc(G); // get a subgraph induced on nodes {0,1,2,3,4,5} PNGraph SubG = TSnap::GetSubGraph(G, TIntV::GetV(0,1,2,3,4)); // get 3-core of G PNGraph Core3 = TSnap::GetKCore(G, 3); // delete nodes of degree 10 TSnap::DelDegKNodes(G, 10);

For more examples of how to work with networks see example applications in SnapSamples directory of the SNAP distribution. Also refer to description of files to see what functionality is implemented in various SNAP files.

Computing Structural Properties

SNAP provides rich functionality to efficiently compute structural properties of networks. Functions are implemented as a part of namespace TSnap. All functions support any graph/network type.

For example

// generate a Preferential Attachment graph on 1000 nodes and node out degree of 3 PUNGraph G = TSnap::GenPrefAttach(1000, 3); // get distribution of connected components (component size, count) TVec<TPair<TInt, TInt> > CntV; // vector of pairs of integers (size, count) TSnap::GetWccSzCnt(G, CntV); // get degree distribution pairs (degree, count) TSnap::GetOutDegCnt(G, CntV); // get first eigenvector of graph adjacency matrix TFltV EigV; // vector of floats TSnap::GetEigVec(G, EigV); // get diameter of G TSnap::GetBfsFullDiam(G); // count the number of triads in G, get the clustering coefficient of G TSnap::GetTriads(G); TSnap::GetClustCf(G);

For more examples of how to work with networks see example applications in SnapSamples directory of the SNAP distribution. Also refer to description of files to see what functionality is implemented in various SNAP files.