Snap.py - SNAP for Python

About Snap.py

Snap.py is a Python interface for SNAP. SNAP is a general-purpose, high-performance system for analysis and manipulation of large networks. SNAP is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes and billions of edges.

Snap.py provides the performance benefits of SNAP, combined with the flexibility of Python. Most of the SNAP functionality is available via Snap.py in Python.

The latest version of Snap.py is 6.0 (Dec 28, 2020), available for macOS, Linux, and Windows 64-bit. This version is a major release with a large number of new features, most notably a significantly improved way to call Snap.py functions in Python, a NetworkX compatibility layer, standard Python functions to handle SNAP vector and hash types, new functions for egonets and graph union, and a completely revised package building infrastructure with better support for various versions of Python (see Release Notes for details). These enhancements are backward compatible, so existing Snap.py based programs should continue to work.

System Requirements

Snap.py supports Python 2.x and Python 3.x on macOS, Linux, and Windows 64-bit. Snap.py requires that Python is installed on your machine. Make sure that your operating system is 64-bit and that your Python is a 64-bit version.
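The 64-bit requirement can be checked from the interpreter itself. The following is a minimal sketch that uses only the standard library; the helper name python_bits is made up for this illustration and is not part of Snap.py.

```python
import struct
import sys

def python_bits():
    """Return the pointer width of the running interpreter in bits."""
    # A C pointer ("P") occupies 8 bytes on a 64-bit build and 4 on a 32-bit one.
    return struct.calcsize("P") * 8

bits = python_bits()
print("Python %d.%d, %d-bit build" % (sys.version_info[0], sys.version_info[1], bits))
if bits != 64:
    print("Snap.py needs a 64-bit Python; this interpreter will not work.")
```

Run the script with the same python executable that you plan to use for Snap.py, since several interpreters with different word sizes can coexist on one machine.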

Snap.py is self-contained: it does not require any additional packages for its basic functionality. However, it requires external packages to support plotting and visualization functionality. If you want to use plotting and visualizations in Snap.py, install Gnuplot and Graphviz in addition to Snap.py.

Set the system PATH variable, so that Gnuplot and Graphviz are available, or put their executables in the working directory.
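To see whether the two tools are reachable through PATH, a short standard-library check such as the one below may help. It is only a sketch: it assumes Python 3 (shutil.which is not available in Python 2), it assumes the executables are named gnuplot and dot (dot being the Graphviz layout program), and the helper name find_plot_tools is hypothetical.

```python
import shutil

def find_plot_tools(names=("gnuplot", "dot")):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

for name, path in find_plot_tools().items():
    print("%-8s %s" % (name, path or "not found on PATH"))
```

A result of "not found on PATH" for either tool means the PATH variable still needs to be adjusted, or the executable has to be placed in the working directory as described above.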

Installing Snap.py

Snap.py can be installed via the pip module. To install Snap.py, execute pip from the command line as follows:


python -m pip install snap-stanford

If you have more than one version of Python installed on the system, make sure that python refers to the executable that you want to install Snap.py for. You might also need to add --user after install, if pip complains about your administrative rights. The most recent notes about installing Snap.py on various systems are available in this document: Snap.py Installation Matrix.
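When several Python versions are installed, it can be unclear which one python refers to. One way to remove the ambiguity is to let the interpreter print its own path and the matching pip command. This is a sketch using the standard library only; pip_install_command is a hypothetical helper, not something shipped with Snap.py or pip.

```python
import sys

def pip_install_command(package="snap-stanford", user=False):
    """Build the pip command line for the interpreter running this script."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if user:
        # --user installs into the per-user site directory, no admin rights needed
        cmd.append("--user")
    cmd.append(package)
    return cmd

print("Interpreter: %s" % sys.executable)
print("Command:     %s" % " ".join(pip_install_command()))
```

Copying the printed command into the shell installs Snap.py for exactly the interpreter that ran the script.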

Manual Install of Snap.py

If you want to use Snap.py in a local directory without installing it, then download the corresponding Snap.py package for your system, unpack it, and copy the files snap.py and _snap.so (or _snap.pyd) to your working directory. The working directory must be different from the install directory.

Documentation and Support

Snap.py Tutorial and Manual are available.

Snap.py is a Python interface for SNAP, which is written in C++. Most of the SNAP functionality is supported. For more details on SNAP C++, check out SNAP C++ documentation.

A tutorial on Large Scale Network Analytics with SNAP with a significant Snap.py specific component was given at the WWW2015 conference in Florence.

Use the SNAP and Snap.py users mailing list for any questions or a discussion about Snap.py installation, use, and development. To post to the group, send your message to snap-discuss at googlegroups dot com.

Quick Introduction to Snap.py

This document gives a quick introduction to a range of Snap.py operations.

Several programs are available to demonstrate the use of Snap.py. The programs are also useful as tests to confirm that your installation of Snap.py is working correctly.

The code from intro.py is explained in more detail below.

All the code assumes that Snap.py has been imported by the Python program. Make sure that you execute this line in Python before running any of the code below:


import snap
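Because every example depends on this import succeeding, a guarded version of it can serve as a first installation check. The sketch below assumes that the module exposes a Version attribute; if it does not, the code falls back to "unknown". The function name snap_status is hypothetical.

```python
try:
    import snap
except ImportError:
    snap = None  # Snap.py is not installed for this interpreter

def snap_status():
    """Return a one-line description of the Snap.py installation."""
    if snap is None:
        return "Snap.py is not importable; check the installation steps"
    # Assumption: the module exposes `Version`; fall back if it does not.
    return "Snap.py version %s" % getattr(snap, "Version", "unknown")

print(snap_status())
```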

Graph and Network Types

Snap.py supports graphs and networks. Graphs describe topologies, that is, nodes with unique integer ids and directed, undirected, or multiple edges between the nodes of the graph. Networks are graphs with data on nodes and/or edges of the network. Data types that reside on nodes and edges are simply passed as template parameters, which provides a very fast and convenient way to implement various kinds of networks with rich data on nodes and edges.

Graph types in SNAP:


TUNGraph: undirected graph (single edge between an unordered pair of nodes)
TNGraph: directed graph (single directed edge between an ordered pair of nodes)

Network types in SNAP:


TNEANet: directed multigraph with attributes for nodes and edges
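The difference between the three types shows up when the same sequence of edges is added to each of them. The sketch below adds the pairs (1, 5), (5, 1), and (1, 5) again; the helper name edge_counts is made up, and the import guard only exists so that the script still loads when Snap.py is missing.

```python
try:
    import snap
except ImportError:
    snap = None  # lets the sketch load even without Snap.py

def edge_counts():
    """Add the same three edges to each type and report how many are stored."""
    if snap is None:
        return None
    counts = {}
    for name in ("TUNGraph", "TNGraph", "TNEANet"):
        G = getattr(snap, name).New()
        G.AddNode(1)
        G.AddNode(5)
        G.AddEdge(1, 5)
        G.AddEdge(5, 1)  # same unordered pair, opposite direction
        G.AddEdge(1, 5)  # exact repeat of the first edge
        counts[name] = G.GetEdges()
    return counts

print(edge_counts())
```

With Snap.py installed, TUNGraph should report one edge (the pair is unordered), TNGraph two (one per direction), and TNEANet three (parallel edges are kept).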

Graph Creation

Example of how to create and use a directed graph:


# create a graph TNGraph
G1 = snap.TNGraph.New()
G1.AddNode(1)
G1.AddNode(5)
G1.AddNode(32)
G1.AddEdge(1,5)
G1.AddEdge(5,1)
G1.AddEdge(5,32)

Nodes have explicit (and arbitrary) node ids. There is no restriction for node ids to be contiguous integers starting at 0. In TUNGraph and TNGraph, edges have no explicit ids; edges are identified by a pair of node ids.

Networks are created in the same way as graphs.
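The consequences of arbitrary node ids and of edges being identified by node pairs can be probed with membership tests. This sketch rebuilds the small directed graph from this section and queries it; the wrapper function probe_ids is hypothetical, and the import guard only keeps the script loadable without Snap.py.

```python
try:
    import snap
except ImportError:
    snap = None  # lets the sketch load even without Snap.py

def probe_ids():
    """Rebuild the three-node directed graph and test a few ids and pairs."""
    if snap is None:
        return None
    G = snap.TNGraph.New()
    for nid in (1, 5, 32):
        G.AddNode(nid)
    for src, dst in ((1, 5), (5, 1), (5, 32)):
        G.AddEdge(src, dst)
    return {
        "node 0 exists": bool(G.IsNode(0)),        # ids need not start at 0
        "node 32 exists": bool(G.IsNode(32)),
        "edge 5->32 exists": bool(G.IsEdge(5, 32)),
        "edge 32->5 exists": bool(G.IsEdge(32, 5)),  # direction matters in TNGraph
    }

print(probe_ids())
```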

Iterators

Many SNAP operations are based on node and edge iterators, which allow for efficient implementation of algorithms that work on networks regardless of their type (directed, undirected, graphs, networks) and specific implementation.

Some examples of iterator usage in Snap.py are shown below:


# create a directed random graph on 100 nodes and 1k edges
G2 = snap.GenRndGnm(snap.TNGraph, 100, 1000)
# traverse the nodes
for NI in G2.Nodes():
    print("node id %d with out-degree %d and in-degree %d" % (
        NI.GetId(), NI.GetOutDeg(), NI.GetInDeg()))
# traverse the edges
for EI in G2.Edges():
    print("edge (%d, %d)" % (EI.GetSrcNId(), EI.GetDstNId()))
# traverse the edges by nodes
for NI in G2.Nodes():
    for Id in NI.GetOutEdges():
        print("edge (%d %d)" % (NI.GetId(), Id))

In general node iterators provide the following functionality:


GetId(): return node id
GetOutDeg(): return out-degree of a node
GetInDeg(): return in-degree of a node
GetOutNId(e): return node id of the endpoint of e-th out-edge
GetInNId(e): return node id of the endpoint of e-th in-edge
IsOutNId(n): do we point to node id n
IsInNId(n): does node id n point to us
IsNbrNId(n): is node n our neighbor

For additional information on node and edge iterators, check out the Graph and Network Classes section in the Snap.py reference manual.
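The node iterator methods listed above can be tried on a single node. The sketch below fetches the iterator for node 5 of the small directed graph built earlier (edges 1->5, 5->1, 5->32) through GetNI; the function name describe_node is made up for this illustration.

```python
try:
    import snap
except ImportError:
    snap = None  # lets the sketch load even without Snap.py

def describe_node():
    """Query the node iterator of node 5 in a three-node directed graph."""
    if snap is None:
        return None
    G = snap.TNGraph.New()
    for nid in (1, 5, 32):
        G.AddNode(nid)
    for src, dst in ((1, 5), (5, 1), (5, 32)):
        G.AddEdge(src, dst)
    NI = G.GetNI(5)  # node iterator positioned at node 5
    return {
        "out_deg": NI.GetOutDeg(),
        "in_deg": NI.GetInDeg(),
        # e-th out-edge endpoints, for e = 0 .. out-degree - 1
        "out_nbrs": [NI.GetOutNId(e) for e in range(NI.GetOutDeg())],
        "points_to_32": bool(NI.IsOutNId(32)),
        "pointed_by_32": bool(NI.IsInNId(32)),
        "nbr_of_1": bool(NI.IsNbrNId(1)),
    }

print(describe_node())
```

Node 5 has two out-edges (to 1 and 32) and one in-edge (from 1), so the degree and neighbor queries should reflect exactly that.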

Input/Output

With SNAP it is easy to save and load networks in various formats. Internally, SNAP saves networks in a compact binary format, but functions for loading and saving networks in various other text and XML formats are also available.

For example, Snap.py code for saving and loading graphs looks as follows:


# generate a network using Forest Fire model
G3 = snap.GenForestFire(1000, 0.35, 0.35)
# save and load binary
FOut = snap.TFOut("test.graph")
G3.Save(FOut)
FOut.Flush()
FIn = snap.TFIn("test.graph")
G4 = snap.TNGraph.Load(FIn)
# save and load from a text file
snap.SaveEdgeList(G4, "test.txt", "Save as tab-separated list of edges")
G5 = snap.LoadEdgeList(snap.TNGraph, "test.txt", 0, 1)
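The text edge list is simple enough to read without Snap.py, which helps when inspecting a saved file. The following plain-Python sketch is not SNAP's own loader. It assumes that description and header lines start with "#" and that each remaining line holds whitespace-separated columns, with the source and destination ids in columns 0 and 1 (the same column indices passed to LoadEdgeList above). The sample lines and the function name read_edge_list are hypothetical.

```python
def read_edge_list(lines, src_col=0, dst_col=1):
    """Parse (source, destination) id pairs from edge-list text lines."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        edges.append((int(fields[src_col]), int(fields[dst_col])))
    return edges

# Hypothetical file content, written inline to keep the sketch self-contained.
sample = [
    "# Save as tab-separated list of edges",
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "2\t1",
]
print(read_edge_list(sample))
```

To read a real file, pass an open file object instead of the sample list, for example read_edge_list(open("test.txt")).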

Manipulating Graphs and Networks

SNAP provides rich functionality to efficiently manipulate graphs and networks. Most functions support all graph/network types.

For example:


# generate a network using Forest Fire model
G6 = snap.GenForestFire(1000, 0.35, 0.35)
# convert to undirected graph
G7 = G6.ConvertGraph(snap.TUNGraph)
# get the largest weakly connected component
WccG = G6.GetMxWcc()
# get a subgraph induced on nodes [0,1,2,3,4]
SubG = G6.GetSubGraph([0,1,2,3,4])
# get 3-core of G6
Core3 = G6.GetKCore(3)
# delete nodes of out degree 10 and in degree 5
G6.DelDegKNodes(10, 5)

For more details on Snap.py functionality, check out the Snap.py Manuals.

Computing Structural Properties of Networks

SNAP provides rich functionality to efficiently compute structural properties of networks. Most functions support all graph/network types.

For example:


# generate a Preferential Attachment graph on 1000 nodes and node out degree of 3
G8 = snap.GenPrefAttach(1000, 3)
# get distribution of connected components (component size, count)
CntV = G8.GetWccSzCnt()
# get degree distribution pairs (degree, count)
CntV = G8.GetOutDegCnt()
# get first eigenvector of graph adjacency matrix
EigV = G8.GetLeadEigVec()
# get diameter of G8
G8.GetBfsFullDiam(100)
# count the number of triads in G8, get the clustering coefficient of G8
G8.GetTriads()
G8.GetClustCf()
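To make the meaning of the (degree, count) pairs concrete, the same quantity can be computed by hand for a tiny graph. The sketch below is a plain-Python illustration of the idea, not SNAP's implementation, and it assumes that nodes with out-degree 0 are counted as well. The function name out_degree_counts is hypothetical.

```python
from collections import Counter

def out_degree_counts(nodes, edges):
    """Return sorted (out-degree, number of nodes with that out-degree) pairs."""
    out_deg = dict.fromkeys(nodes, 0)  # every node starts with out-degree 0
    for src, _dst in edges:
        out_deg[src] += 1
    # Count how many nodes share each out-degree value.
    return sorted(Counter(out_deg.values()).items())

# The three-node directed graph used earlier: 1->5, 5->1, 5->32
pairs = out_degree_counts([1, 5, 32], [(1, 5), (5, 1), (5, 32)])
print(pairs)
```

Here node 32 has out-degree 0, node 1 has out-degree 1, and node 5 has out-degree 2, so each degree value occurs exactly once.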

For more details on Snap.py functionality, check out the Snap.py Manuals.