Network datasets

A collection of 40 large network datasets:


Social networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| soc-Epinions1 | Directed | 75,879 | 508,837 | Who-trusts-whom network of Epinions.com |
| soc-LiveJournal1 | Directed | 4,847,571 | 68,993,773 | LiveJournal online social network |
| soc-Slashdot0811 | Directed | 77,360 | 905,468 | Slashdot social network from November 2008 |
| soc-Slashdot0902 | Directed | 82,168 | 948,464 | Slashdot social network from February 2009 |
| wiki-Vote | Directed | 7,115 | 103,689 | Wikipedia who-votes-on-whom network |

Communication networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| email-EuAll | Directed | 265,214 | 420,045 | Email network from an EU research institution |
| email-Enron | Directed | 36,692 | 367,662 | Email communication network from Enron |
| wiki-Talk | Directed | 2,394,385 | 5,021,410 | Wikipedia talk (communication) network |

Citation networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| cit-HepPh | Directed, Temporal, Labeled | 34,546 | 421,578 | arXiv High Energy Physics Phenomenology paper citation network |
| cit-HepTh | Directed, Temporal, Labeled | 27,770 | 352,807 | arXiv High Energy Physics Theory paper citation network |
| cit-Patents | Directed, Temporal, Labeled | 3,774,768 | 16,518,948 | Citation network among US patents |

Collaboration networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| ca-AstroPh | Undirected | 18,772 | 396,160 | Collaboration network of arXiv Astro Physics |
| ca-CondMat | Undirected | 23,133 | 186,936 | Collaboration network of arXiv Condensed Matter |
| ca-GrQc | Undirected | 5,242 | 28,980 | Collaboration network of arXiv General Relativity |
| ca-HepPh | Undirected | 12,008 | 237,010 | Collaboration network of arXiv High Energy Physics Phenomenology |
| ca-HepTh | Undirected | 9,877 | 51,971 | Collaboration network of arXiv High Energy Physics Theory |

Web graphs

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| web-BerkStan | Directed | 685,230 | 7,600,595 | Web graph of Berkeley and Stanford |
| web-Google | Directed | 875,713 | 5,105,039 | Web graph from Google |
| web-NotreDame | Directed | 325,729 | 1,497,134 | Web graph of Notre Dame |
| web-Stanford | Directed | 281,903 | 2,312,497 | Web graph of Stanford.edu |

Product co-purchasing networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| amazon0302 | Directed | 262,111 | 1,234,877 | Amazon product co-purchasing network from March 2 2003 |
| amazon0312 | Directed | 400,727 | 3,200,440 | Amazon product co-purchasing network from March 12 2003 |
| amazon0505 | Directed | 410,236 | 3,356,824 | Amazon product co-purchasing network from May 5 2003 |
| amazon0601 | Directed | 403,394 | 3,387,388 | Amazon product co-purchasing network from June 1 2003 |

Internet peer-to-peer networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| p2p-Gnutella04 | Directed | 10,876 | 39,994 | Gnutella peer to peer network from August 4 2002 |
| p2p-Gnutella05 | Directed | 8,846 | 31,839 | Gnutella peer to peer network from August 5 2002 |
| p2p-Gnutella06 | Directed | 8,717 | 31,525 | Gnutella peer to peer network from August 6 2002 |
| p2p-Gnutella08 | Directed | 6,301 | 20,777 | Gnutella peer to peer network from August 8 2002 |
| p2p-Gnutella09 | Directed | 8,114 | 26,013 | Gnutella peer to peer network from August 9 2002 |
| p2p-Gnutella24 | Directed | 26,518 | 65,369 | Gnutella peer to peer network from August 24 2002 |
| p2p-Gnutella25 | Directed | 22,687 | 54,705 | Gnutella peer to peer network from August 25 2002 |
| p2p-Gnutella30 | Directed | 36,682 | 88,328 | Gnutella peer to peer network from August 30 2002 |
| p2p-Gnutella31 | Directed | 62,586 | 147,892 | Gnutella peer to peer network from August 31 2002 |

Road networks

| Name | Type | Nodes | Edges | Description |
|---|---|---|---|---|
| roadNet-CA | Undirected | 1,965,206 | 5,533,214 | Road network of California |
| roadNet-PA | Undirected | 1,088,092 | 3,083,796 | Road network of Pennsylvania |
| roadNet-TX | Undirected | 1,379,917 | 3,843,320 | Road network of Texas |

Network types

Network statistics

Dataset statistics

| Statistic | Description |
|---|---|
| Nodes | Number of nodes in the network |
| Edges | Number of edges in the network |
| Nodes in largest WCC | Number of nodes in the largest weakly connected component |
| Edges in largest WCC | Number of edges in the largest weakly connected component |
| Nodes in largest SCC | Number of nodes in the largest strongly connected component |
| Edges in largest SCC | Number of edges in the largest strongly connected component |
| Average clustering coefficient | Mean of the local clustering coefficients of all nodes |
| Number of triangles | Number of triples of mutually connected nodes (considering the network as undirected) |
| Fraction of closed triangles | Number of connected triples of nodes / number of length-2 paths |
| Diameter (longest shortest path) | Maximum shortest path length (sampled over 1,000 random nodes) |
| 90-percentile effective diameter | 90th percentile of the shortest path length distribution (sampled over 1,000 random nodes) |
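
To make the definitions above concrete, here is a minimal Python sketch that computes several of the statistics on a small made-up edge list. It is an illustration under stated assumptions, not the code used to produce the published numbers: the function names are hypothetical, edge directions are ignored throughout, every node of the largest WCC is used as a BFS source (rather than 1,000 sampled nodes), the 90th percentile is taken without interpolation, and "fraction of closed triangles" is read as closed length-2 paths divided by all length-2 paths. SCC statistics are left out for brevity.

```python
from collections import deque
from itertools import combinations


def undirected_adjacency(edges):
    """Build an undirected adjacency map, dropping self-loops and duplicates."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def largest_wcc(adj):
    """Return the node set of the largest weakly connected component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def triangle_stats(adj):
    """Return (triangles, average clustering coefficient, closed fraction)."""
    closed = 0   # length-2 paths whose endpoints are also linked
    paths = 0    # all length-2 paths, counted once per centre node
    local = []   # local clustering coefficient of each node
    for nbrs in adj.values():
        k = len(nbrs)
        possible = k * (k - 1) // 2
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        closed += links
        paths += possible
        local.append(links / possible if possible else 0.0)
    triangles = closed // 3  # each triangle is seen from its three corners
    return triangles, sum(local) / len(local), (closed / paths if paths else 0.0)


def path_length_stats(adj, sources):
    """Return (diameter, 90th-percentile distance) over BFS runs from `sources`."""
    dists = []
    for s in sources:
        depth = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in depth:
                    depth[v] = depth[u] + 1
                    queue.append(v)
        dists.extend(d for d in depth.values() if d > 0)
    if not dists:
        return 0, 0
    dists.sort()
    # Smallest d such that at least 90% of the sampled pairs are within d hops.
    idx = (9 * len(dists) + 9) // 10 - 1
    return dists[-1], dists[idx]


# Toy edge list (made up for illustration): a triangle with a tail,
# plus a separate two-node component.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (6, 7)]

adj = undirected_adjacency(edges)
wcc = largest_wcc(adj)
triangles, avg_cc, closed_fraction = triangle_stats(adj)
diameter, effective_diameter = path_length_stats(adj, sorted(wcc))

print("nodes in largest WCC:", len(wcc))
print("triangles:", triangles)
print("average clustering coefficient:", round(avg_cc, 4))
print("fraction of closed triangles:", closed_fraction)
print("diameter:", diameter, "90-percentile distance:", effective_diameter)
```

For the networks listed above, running a BFS from every node would be far too slow, which is presumably why the diameter figures are sampled over 1,000 random nodes; passing a random sample as `sources` would be the corresponding change in this sketch.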