Google web graph
Dataset information
Nodes represent web pages, and directed edges represent hyperlinks between them. Google released the data in 2002 as part of the Google Programming Contest.
| Dataset statistics |
| Nodes | 875713 |
| Edges | 5105039 |
| Nodes in largest WCC | 855802 (0.977) |
| Edges in largest WCC | 5066842 (0.993) |
| Nodes in largest SCC | 434818 (0.497) |
| Edges in largest SCC | 3419124 (0.670) |
| Average clustering coefficient | 0.6047 |
| Number of triangles | 13391903 |
| Fraction of closed triangles | 0.05523 |
| Diameter (longest shortest path) | 22 |
| 90-percentile effective diameter | 8.1 |
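As a rough illustration of how the node, edge, and largest-WCC rows can be derived from a directed edge list, here is a minimal sketch using union-find. The function name `largest_wcc_stats` and the toy edge list are hypothetical and are not part of the dataset. The sketch treats edge direction as irrelevant for weak connectivity and counts every listed edge, so duplicate edges would need to be removed first if they occur.

```python
from collections import Counter


def largest_wcc_stats(edges):
    """Return (nodes, edges, nodes_in_largest_wcc, edges_in_largest_wcc).

    `edges` is a non-empty list of (source, target) pairs. Direction is
    ignored when grouping nodes, which is what "weakly connected" means.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the components of the two endpoints of every edge.
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    # Count nodes and edges per component, keyed by the component root.
    node_sizes = Counter(find(n) for n in list(parent))
    edge_sizes = Counter(find(u) for u, _ in edges)
    root, wcc_nodes = node_sizes.most_common(1)[0]
    return len(parent), len(edges), wcc_nodes, edge_sizes[root]


# Hypothetical toy graph: a 3-cycle plus one separate edge.
toy_edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
stats = largest_wcc_stats(toy_edges)
print(stats)
```

The fraction shown in parentheses in the table corresponds to dividing the component count by the total, for example `stats[2] / stats[0]` for nodes.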
Source (citation)
Files
| File | Description |
| web-Google.txt.gz | Webgraph from the Google programming contest, 2002 |
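A minimal sketch of reading the compressed file into a list of directed edges follows. It assumes, without confirmation from this page, that each data line holds two whitespace-separated integer node IDs (source, then target) and that lines starting with `#` are comments. The helper names `parse_edge_lines` and `read_edge_list` and the sample lines are hypothetical.

```python
import gzip


def parse_edge_lines(lines):
    """Yield (source, target) integer pairs from an iterable of text lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank lines and '#' comment lines carry no edges
        src, dst = line.split()[:2]
        yield int(src), int(dst)


def read_edge_list(path):
    """Read a gzip-compressed edge list into a list of (source, target) pairs."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return list(parse_edge_lines(handle))


# Hypothetical sample lines in the assumed format, not taken from the dataset.
sample = ["# source\ttarget", "1\t2", "1\t3", ""]
edges = list(parse_edge_lines(sample))
print(edges)
```

With the file downloaded locally, `read_edge_list("web-Google.txt.gz")` would return the full edge list under the same format assumptions.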