Stanford Biomedical Network Dataset Collection

Mambo is a tool for construction, representation, and analysis of large and multimodal biomedical network data.

Networks and relationships

Name                Edges          Entities                        Description
CC-Neuron           49,471,006     cell, cell                      Similarity network between cells in embryonic mouse brain
ChCh-Miner          96,137         drug, drug                      Interactions between FDA-approved drugs
ChChSe-Decagon      4,649,441      drug, drug, side-effect         Side effects of drug combinations
ChG-InterDecagon    131,034        drug, gene                      Chemical-gene interaction network
ChG-Miner           15,424         drug, gene                      Drug-target interaction network
ChG-TargetDecagon   18,690         drug, gene                      Drug-target interaction network
ChSe-Decagon        174,977        drug, side-effect               Drug side-effect association network
DCh-Miner           1,334,088      disease, drug                   Disease-drug association network
DD-Miner            6,877          disease, disease                Hierarchical ontology of diseases
DF-Miner            802,760        disease, function               Disease-function association network
DG-AssocMiner       21,357         disease, gene                   Disease-gene association network
DG-Miner            42,475,361     disease, gene                   Disease-gene association network
FF-Miner            119,464        function, function              Relations between biological processes, molecular functions, and cellular components
GF-Miner            16,628         gene, function                  Gene-function association network
GG-EnhancedTissue   3,642,834,333  gene, gene                      Enhanced tissue-specific gene-gene interaction networks
GP-Miner            102,450        gene, protein                   Protein-coding gene associations
GrGr-EnhancedHiC1K  7,224,824      genomic-region, genomic-region  Enhanced Hi-C interaction network
GrGr-EnhancedHiC5K  682,566        genomic-region, genomic-region  Enhanced Hi-C interaction network
PP-Decagon          715,612        protein, protein                Physical and functional protein-protein association network for human
PP-Miner            1,847,117,370  protein, protein                Protein-protein association networks for many different species
PP-Pathways         342,353        protein, protein                Physical protein-protein interaction network for human
PPT-Ohmnet          70,338         protein, protein, tissue        Tissue-specific protein-protein interaction network
SS-Butterfly        832            species, species                Similarity network between butterflies
TFG-Ohmnet          20,619         tissue, function, gene          Tissue-specific protein-function association networks

Entities and feature tables

Name              Size    Entity       Description
D-DoMiner         9,247   disease      Disease synopses
D-DoPathways      301     disease      Mapping of diseases to disease categories
D-MeshMiner       11,332  disease      Disease synopses
D-MtfPathways     519     disease      Network motifs of disease pathways
D-OmimMiner       1,191   disease      Disease synopses
D-StructPathways  520     disease      Network structural features of disease pathways
G-HumanEssential  18,529  gene         Information on experimentally tested essential and non-essential genes
G-MtfPathways     22,552  gene         Network motifs of genes
G-SynMiner        35,654  gene         Gene synopses
Se-DoDecagon      562     side-effect  Mapping of side effects to side-effect categories

Entity types

Network statistics

Dataset statistics
Nodes                             Number of nodes in the network
Edges                             Number of edges in the network
Nodes in largest WCC              Number of nodes in the largest weakly connected component
Edges in largest WCC              Number of edges in the largest weakly connected component
Nodes in largest SCC              Number of nodes in the largest strongly connected component
Edges in largest SCC              Number of edges in the largest strongly connected component
Average clustering coefficient    Mean of the local clustering coefficients over all nodes
Number of triangles               Number of triples of connected nodes (considering the network as undirected)
Fraction of closed triangles      Number of connected triples of nodes / number of (undirected) length 2 paths
Diameter (longest shortest path)  Maximum undirected shortest path length (sampled over 1,000 random nodes)
90-percentile effective diameter  90th percentile of the undirected shortest path length distribution (sampled over 1,000 random nodes)
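
The statistics above can be illustrated with a minimal pure-Python sketch on an undirected edge list. This is not the code used to produce the published numbers. All function names here (build_undirected, largest_wcc, triangle_stats, sampled_diameters) are hypothetical, and several choices are assumptions: nodes with fewer than two neighbors contribute 0 to the average clustering coefficient, the fraction of closed triangles is read as closed length-2 paths over all length-2 paths, and the 90th percentile is a simple nearest-rank cut with no interpolation.

```python
import random
from collections import Counter, deque
from itertools import combinations


def build_undirected(edges):
    """Return {node: set(neighbors)}; self-loops and duplicate edges are ignored."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Return {node: hop count} for every node reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def largest_wcc(adj):
    """Return (nodes, edges) of the largest connected component."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            if len(component) > len(best):
                best = component
    return len(best), sum(len(adj[n]) for n in best) // 2


def triangle_stats(adj):
    """Return (triangles, fraction of closed length-2 paths, average clustering)."""
    closed = paths = 0
    local_sum = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        possible = k * (k - 1) // 2  # length-2 paths centred on this node
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        closed += links
        paths += possible
        if possible:  # assumption: degree < 2 contributes 0
            local_sum += links / possible
    triangles = closed // 3  # each triangle is seen once from each corner
    fraction = closed / paths if paths else 0.0
    average = local_sum / len(adj) if adj else 0.0
    return triangles, fraction, average


def sampled_diameters(adj, samples=1000, seed=0):
    """Return (diameter, 90th-percentile path length) over sampled BFS sources."""
    nodes = list(adj)
    rng = random.Random(seed)
    sources = nodes if len(nodes) <= samples else rng.sample(nodes, samples)
    hist = Counter()  # path length -> number of (source, target) pairs
    for s in sources:
        hist.update(d for d in bfs_distances(adj, s).values() if d > 0)
    if not hist:
        return 0, 0
    total, running, effective = sum(hist.values()), 0, max(hist)
    for length in sorted(hist):
        running += hist[length]
        if 10 * running >= 9 * total:  # nearest-rank 90th percentile
            effective = length
            break
    return max(hist), effective


if __name__ == "__main__":
    # Toy graph: a triangle with one pendant node, plus a separate edge.
    graph = build_undirected([(1, 2), (2, 3), (1, 3), (3, 4), (5, 6)])
    print("largest WCC (nodes, edges):", largest_wcc(graph))
    print("triangles, closed fraction, avg clustering:", triangle_stats(graph))
    print("diameter, effective diameter:", sampled_diameters(graph))
```

Because the sketch treats every edge as undirected, it covers only the weakly connected component rows; the strongly connected component rows need edge directions and are left out. On networks with more than 1,000 nodes the two diameter values depend on which source nodes are sampled, so they are estimates rather than exact values.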

Citing BioSNAP

We encourage you to cite our datasets if you have used them in your work. You can use the following BibTeX citation:

@misc{biosnapnets,
  author       = {Marinka Zitnik and Rok Sosi\v{c} and Sagar Maheshwari and Jure Leskovec},
  title        = {{BioSNAP Datasets}: {Stanford} Biomedical Network Dataset Collection},
  howpublished = {\url{http://snap.stanford.edu/biodata}},
  month        = aug,
  year         = 2018
}

The following people also contributed to BioSNAP: Monica Agrawal, Agrim Gupta, Nina Mrzelj, Priyanka Nigam, Sheila Ramaswamy, and Viswajith Venugopal.