Wikipedia Talk network

Dataset information

Wikipedia is a free encyclopedia written collaboratively by volunteers around the world. Each registered user has a talk page that they and other users can edit in order to communicate and discuss updates to various articles on Wikipedia. Using the latest complete dump of the Wikipedia page edit history (from January 3, 2008), we extracted all user talk page changes and created a network.

The network contains all users and discussions from the inception of Wikipedia until January 2008. Nodes in the network represent Wikipedia users, and a directed edge from node i to node j means that user i edited the talk page of user j at least once.
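The edge definition above can be illustrated with a minimal sketch that builds the directed graph from an edge list. It assumes a layout that this page does not specify: one edge per line as two whitespace-separated integer node ids (source first, then target), with comment lines starting with `#`. The function name `build_talk_graph` and the sample lines are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_talk_graph(lines: Iterable[str]) -> Dict[int, Set[int]]:
    """Map each user i to the set of users j whose talk page i edited.

    Assumed format (not confirmed by the dataset page): "i j" per line,
    with '#' marking comment lines.
    """
    graph: Dict[int, Set[int]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        src, dst = line.split()[:2]
        # A set keeps one edge i -> j no matter how often i edited j's page,
        # matching the "at least once" definition.
        graph[int(src)].add(int(dst))
    return graph


# Hypothetical sample: user 0 edits the talk page of user 1 twice.
sample = [
    "# FromNodeId\tToNodeId",
    "0\t1",
    "0\t2",
    "1\t0",
    "0\t1",
]
graph = build_talk_graph(sample)
edge_count = sum(len(targets) for targets in graph.values())
print(edge_count)
```

If the real file follows the assumed layout, the same function should accept the compressed file directly via `gzip.open("Wiki-Talk.txt.gz", "rt")`, since it only iterates over text lines.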

Dataset statistics
Nodes 2394385
Edges 5021410
Nodes in largest WCC 2388953 (0.998)
Edges in largest WCC 5018445 (0.999)
Nodes in largest SCC 111881 (0.047)
Edges in largest SCC 1477893 (0.294)
Average clustering coefficient 0.0526
Number of triangles 9203519
Fraction of closed triangles 0.001112
Diameter (longest shortest path) 9
90-percentile effective diameter 4
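
The values in parentheses appear to be the share of all nodes or edges that fall inside the given component. A short check under that assumption, using only the counts listed above (the dictionary keys and the helper name `fraction` are made up for illustration):

```python
# Counts copied from the statistics above.
stats = {
    "nodes": 2394385,
    "edges": 5021410,
    "wcc_nodes": 2388953,
    "wcc_edges": 5018445,
    "scc_nodes": 111881,
    "scc_edges": 1477893,
}


def fraction(part: int, whole: int) -> float:
    """Share of the whole graph covered by a component, rounded to 3 places."""
    return round(part / whole, 3)


wcc_node_share = fraction(stats["wcc_nodes"], stats["nodes"])
wcc_edge_share = fraction(stats["wcc_edges"], stats["edges"])
scc_node_share = fraction(stats["scc_nodes"], stats["nodes"])
scc_edge_share = fraction(stats["scc_edges"], stats["edges"])

print(wcc_node_share, wcc_edge_share, scc_node_share, scc_edge_share)
```

Under this reading, almost every user belongs to the largest weakly connected component, while fewer than 5% of users sit in the largest strongly connected component, which nevertheless holds about 29% of the edges.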

Source (citation)


File Description
Wiki-Talk.txt.gz Wikipedia talk graph until January 2008