Wikispeedia navigation paths

Dataset information

This dataset contains human navigation paths on Wikipedia, collected through the human-computation game Wikispeedia. In Wikispeedia, users are asked to navigate from a given source article to a given target article by clicking only on Wikipedia links. The game uses a condensed version of Wikipedia (4,604 articles). In addition to the navigation paths, we provide the full HTML package of this version of Wikipedia, as well as all articles in plaintext.

Dataset statistics
Finished paths 51,318
Unfinished paths 24,875
Articles 4,604
Links 119,882

Sources (citations)


File Description Size
wikispeedia_paths-and-graph.tar.gz Navigation paths and Wikipedia hyperlink graph (without article content) 9.5 MB
wikispeedia_articles_plaintext.tar.gz Plaintext content of the Wikipedia articles 35 MB
wikispeedia_articles_html.tar.gz Full HTML package of the Wikipedia version used by Wikispeedia 755 MB

Data format

Each file in wikispeedia_paths-and-graph.tar.gz begins with a header that describes its row format.
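
Because the row format is documented inside the files themselves, a quick way to learn it is to print the first lines of each file in the archive. The following is a minimal sketch, not an official loader: the function name peek_headers and the 20-line preview length are hypothetical choices, and the archive is assumed to have been downloaded to the working directory and to contain UTF-8 text.

```python
import itertools
import os
import tarfile


def peek_headers(archive_path, max_lines=20):
    """Return {member name: first `max_lines` lines} for each regular file in a tar archive.

    `max_lines` is an arbitrary preview length (an assumption); raise it if a
    header turns out to be longer.
    """
    previews = {}
    # "r:*" lets tarfile detect the compression (gzip here) on its own.
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and links
            handle = archive.extractfile(member)
            # Read only the leading lines; decode leniently in case the
            # encoding assumption (UTF-8) does not hold for some bytes.
            previews[member.name] = [
                raw.decode("utf-8", errors="replace").rstrip("\r\n")
                for raw in itertools.islice(handle, max_lines)
            ]
    return previews


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive_path = "wikispeedia_paths-and-graph.tar.gz"
    if os.path.exists(archive_path):
        for name, lines in peek_headers(archive_path).items():
            print(f"== {name}")
            print("\n".join(lines))
```

Reading the headers this way avoids hard-coding column names or separators before they have been confirmed from the files.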