English Wikipedia hyperlink network

Dataset information

This is a network of hyperlinks from a snapshot of English Wikipedia in 2013. An edge from i to j indicates a hyperlink on page i to page j. As part of this dataset, we also include the titles of the pages.
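As a minimal sketch of the edge convention described above, the snippet below builds an out-link map from a list of (i, j) pairs. The helper name `build_out_links` and the toy ids are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def build_out_links(edges):
    """Map each source page id to the set of page ids it links to."""
    out_links = defaultdict(set)
    for i, j in edges:
        # Edge i -> j: page i contains a hyperlink to page j.
        out_links[i].add(j)
    return out_links


# Toy edge list with made-up ids, for illustration only.
toy_edges = [(0, 1), (0, 2), (1, 0)]
links = build_out_links(toy_edges)
```

Because the network is directed, the pair (0, 1) does not imply (1, 0); here both happen to be present, while page 2 has no outgoing links.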

Dataset statistics
Nodes 4203323
Edges 101311614
Nodes in largest WCC 4203294 (1.000)
Edges in largest WCC 101311589 (1.000)
Nodes in largest SCC 3744228 (0.891)
Edges in largest SCC 95530079 (0.943)
Average clustering coefficient 0.2559
Number of triangles 304083160
Fraction of closed triangles 0.002106
Diameter (longest shortest path) 8
90-percentile effective diameter 3.8
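The values in parentheses in the table appear to be each component's share of the whole graph, rounded to three decimals. A small sketch that recomputes them from the counts listed above (the variable and function names are illustrative):

```python
# Counts copied from the statistics table.
nodes, edges = 4203323, 101311614
wcc_nodes, wcc_edges = 4203294, 101311589
scc_nodes, scc_edges = 3744228, 95530079


def fraction(part, whole):
    """Share of the whole graph, rounded to three decimals."""
    return round(part / whole, 3)


wcc_node_share = fraction(wcc_nodes, nodes)
wcc_edge_share = fraction(wcc_edges, edges)
scc_node_share = fraction(scc_nodes, nodes)
scc_edge_share = fraction(scc_edges, edges)
```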

Source (citation)


Files

File Description
enwiki-2013.txt.gz Directed English Wikipedia hyperlink network
enwiki-2013-names.csv.gz Names of web pages
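The edge file could be streamed without decompressing it to disk. The sketch below assumes one edge per line as two whitespace-separated integer ids, with optional comment lines starting with `#`; this layout is an assumption and should be checked against the actual file. The function name `iter_edges` is hypothetical.

```python
import gzip


def iter_edges(path):
    """Yield (source, target) integer pairs from a gzipped edge list.

    Assumed layout (not confirmed by this page): two whitespace-separated
    integer ids per line, with '#' marking comment lines.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            src, dst = line.split()[:2]
            yield int(src), int(dst)
```

With roughly 100 million edges, iterating lazily like this avoids holding the whole list in memory; for example, `sum(1 for _ in iter_edges("enwiki-2013.txt.gz"))` would count the edges in one pass.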