Facebook Large Page-Page Network

Dataset information

This webgraph is a page-page graph of verified Facebook sites. Nodes represent official Facebook pages, and edges are mutual likes between pages. Node features are extracted from the site descriptions that the page owners wrote to summarize the purpose of the site. The graph was collected through the Facebook Graph API in November 2017 and is restricted to pages from four categories defined by Facebook: politicians, governmental organizations, television shows and companies. The task associated with this dataset is multi-class node classification over these four site categories.
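As a rough illustration of the structure described above, the sketch below builds an undirected adjacency map and a node-to-category map from CSV text. The column names (`id_1`, `id_2`, `id`, `page_type`), the category strings and the sample rows are assumptions made for this example; they are not taken from this page, so check them against the files in the archive before use.

```python
import csv
import io
from collections import defaultdict


def load_graph(edge_file, label_file):
    """Build an undirected adjacency map and a node -> category map.

    The column names used here are hypothetical; adjust them to the
    headers found in the actual files.
    """
    adjacency = defaultdict(set)
    for row in csv.DictReader(edge_file):
        u, v = int(row["id_1"]), int(row["id_2"])
        # A mutual like is symmetric, so store the edge in both directions.
        adjacency[u].add(v)
        adjacency[v].add(u)
    labels = {int(row["id"]): row["page_type"]
              for row in csv.DictReader(label_file)}
    return dict(adjacency), labels


# Tiny in-memory stand-in for the real files (made-up rows, not real data).
edges_csv = io.StringIO("id_1,id_2\n0,1\n1,2\n")
labels_csv = io.StringIO("id,page_type\n0,politician\n1,government\n2,tvshow\n")

adjacency, labels = load_graph(edges_csv, labels_csv)
print(adjacency)
print(labels)
```

Passing open file handles instead of the `io.StringIO` objects would work the same way, provided the headers match.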

MUSAE paper: arXiv:1909.13021
MUSAE Project: GitHub


Dataset statistics
Directed No.
Node features Yes.
Edge features No.
Node labels Yes. Multi-class (4 categories).
Temporal No.
Nodes 22,470
Edges 171,002
Density 0.001
Transitivity 0.232
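The density figure can be sanity-checked from the node and edge counts, assuming the standard definition for a simple undirected graph (edges divided by the number of possible node pairs). This page does not say how its statistics were computed, so self-loops or duplicate edges could cause small differences. The same sketch includes a plain-Python transitivity function (closed connected triples divided by all connected triples), shown only on a made-up toy graph and not on the real data.

```python
from itertools import combinations


def undirected_density(num_nodes, num_edges):
    """Fraction of possible undirected node pairs that are connected."""
    possible_pairs = num_nodes * (num_nodes - 1) / 2
    return num_edges / possible_pairs


def transitivity(adjacency):
    """Closed connected triples divided by all connected triples.

    `adjacency` maps each node to the set of its neighbours and is
    assumed to be symmetric and free of self-loops.
    """
    closed = 0
    triples = 0
    for neighbours in adjacency.values():
        # Every pair of neighbours forms a triple centred on this node.
        for u, v in combinations(neighbours, 2):
            triples += 1
            if v in adjacency[u]:
                closed += 1
    return closed / triples if triples else 0.0


# Counts taken from the statistics table.
density = undirected_density(22_470, 171_002)
print(round(density, 3))  # 0.001

# Toy graph: a triangle a-b-c with one extra node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
print(transitivity(toy))  # 0.6
```

The unrounded density is roughly 0.00068, which matches the listed value of 0.001 once rounded to three decimals.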

Possible tasks
Multi-class node classification
Link prediction
Community detection
Network visualization

Source (citation)

  • B. Rozemberczki, C. Allen and R. Sarkar. Multi-scale Attributed Node Embedding. 2019.

    @misc{rozemberczki2019multiscale,
        title={Multi-scale Attributed Node Embedding},
        author={Benedek Rozemberczki and Carl Allen and Rik Sarkar},
        year={2019},
        eprint={1909.13021},
        archivePrefix={arXiv},
        primaryClass={cs.LG}
    }

Files

File Description
facebook_large.zip Facebook Large Page-Page Network