Berkeley-Stanford web graph

Dataset information

Nodes represent pages from the berkeley.edu and stanford.edu domains, and directed edges represent hyperlinks between them. The data was collected in 2002.

Dataset statistics
Nodes 685230
Edges 7600595
Nodes in largest WCC 654782 (0.956)
Edges in largest WCC 7499425 (0.987)
Nodes in largest SCC 334857 (0.489)
Edges in largest SCC 4523232 (0.595)
Average clustering coefficient 0.5967
Number of triangles 64690980
Fraction of closed triangles 0.002746
Diameter (longest shortest path) 514
90-percentile effective diameter 9.9
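
The values in parentheses above appear to be the share of the whole graph that falls inside each component: the component's count divided by the total node or edge count, rounded to three decimals. A minimal sketch of that arithmetic, using only the counts listed above (the helper name `fraction` is illustrative, not part of the dataset):

```python
# Totals copied from the dataset statistics.
TOTAL_NODES = 685230
TOTAL_EDGES = 7600595


def fraction(part, whole):
    """Return part / whole rounded to three decimals (hypothetical helper)."""
    return round(part / whole, 3)


# Largest weakly connected component (WCC).
wcc_node_fraction = fraction(654782, TOTAL_NODES)
wcc_edge_fraction = fraction(7499425, TOTAL_EDGES)

# Largest strongly connected component (SCC).
scc_node_fraction = fraction(334857, TOTAL_NODES)
scc_edge_fraction = fraction(4523232, TOTAL_EDGES)

print(wcc_node_fraction, wcc_edge_fraction)
print(scc_node_fraction, scc_edge_fraction)
```

Under this reading, roughly half of the pages sit in the largest SCC, while almost all of them belong to the largest WCC.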

Source (citation)


File Description
web-BerkStan.txt.gz Berkeley-Stanford web graph from 2002
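
A minimal sketch of reading the file, assuming it is a gzipped text edge list with one whitespace-separated "source target" pair of integer node IDs per line and comment lines starting with "#". That layout is an assumption, not something stated above, and the function names are illustrative:

```python
import gzip


def read_edge_list(path):
    """Yield (source, target) integer pairs from a gzipped edge list.

    Assumes one whitespace-separated pair per line; blank lines and
    lines starting with '#' are skipped.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            source, target = line.split()[:2]
            yield int(source), int(target)


def count_nodes_and_edges(path):
    """Return (distinct node count, edge line count) for the file."""
    nodes = set()
    edge_count = 0
    for source, target in read_edge_list(path):
        nodes.add(source)
        nodes.add(target)
        edge_count += 1  # counts lines, so repeated edges are counted again
    return len(nodes), edge_count
```

If the file follows the assumed layout, `count_nodes_and_edges("web-BerkStan.txt.gz")` should give counts comparable to the Nodes and Edges figures in the statistics. Streaming the file line by line avoids holding the raw text in memory, although the node set still grows with the graph.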