
Reddit User and Subreddit Embeddings

Dataset information

This dataset contains two files: embeddings of Reddit users and embeddings of subreddits. (A subreddit is a community on Reddit.) Each embedding is a vector representation of one user or one subreddit. The data is extracted from publicly available Reddit data covering Jan 2014 to April 2017. The vectors are generated from the user-to-subreddit posting network using a word2vec-style objective function. Please see the reference paper below for details on how the vectors are generated.

User embeddings: This file contains one numerical vector in a low-dimensional space (a.k.a. an embedding) for each user. Each embedding has 300 dimensions. Two user embeddings are similar if the users post in similar subreddits.

Subreddit embeddings: This file contains one numerical vector in a low-dimensional space (a.k.a. an embedding) for each subreddit. Each embedding has 300 dimensions. Two subreddit embeddings are similar if the users who post in them are similar.
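
The page does not say which similarity measure is intended. As a minimal sketch, assuming cosine similarity (a common choice for word2vec-style vectors, not something this page confirms), two embeddings could be compared as follows. The function name and the toy vectors are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors for illustration only; the real embeddings
# have 300 values each. These numbers are made up, not from the dataset.
u = [1.0, 0.0, 1.0]
v = [1.0, 0.0, 1.0]
w = [0.0, 1.0, 0.0]

same_direction = cosine_similarity(u, v)  # close to 1: identical direction
orthogonal = cosine_similarity(u, w)      # close to 0: nothing in common
```

Under this assumption, values near 1 would indicate users who post in similar subreddits (or subreddits with similar posters), and values near 0 would indicate little overlap.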

Project website: These files have been generated as part of the research project on how subreddits attack one another. The details of the project can be found here.

Other related datasets: We have also released two other datasets that are closely related:


Dataset statistics
Number of users: 118,381
Number of subreddits: 51,278
Embedding length: 300
Timespan of data: Jan 2014 - April 2017

Source (citation)

The following BibTeX citation can be used:
@inproceedings{kumar2018community,
  title={Community interaction and conflict on the web},
  author={Kumar, Srijan and Hamilton, William L and Leskovec, Jure and Jurafsky, Dan},
  booktitle={Proceedings of the 2018 World Wide Web Conference on World Wide Web},
  pages={933--943},
  year={2018},
  organization={International World Wide Web Conferences Steering Committee}
}

Files

File: Description
web-redditEmbeddings-users.csv: Embedding vectors of users on Reddit.
web-redditEmbeddings-subreddits.csv: Embedding vectors of subreddits (communities on Reddit).
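
A minimal sketch of how either file might be read, assuming the layout described under Data format: an optional header row, then one row per identifier followed by its vector values. The helper name `load_embeddings`, its `expected_dim` parameter, and the in-memory sample are hypothetical and not part of the dataset.

```python
import csv
import io


def load_embeddings(handle, expected_dim=None):
    """Read rows of the form `identifier,v1,v2,...` into a dict of float lists."""
    embeddings = {}
    for row_number, row in enumerate(csv.reader(handle), start=1):
        if not row:
            continue  # skip blank lines
        name, values = row[0], row[1:]
        try:
            vector = [float(v) for v in values]
        except ValueError:
            if row_number == 1:
                continue  # assumed to be a header row such as USER_ID,VECTOR
            raise
        if expected_dim is not None and len(vector) != expected_dim:
            raise ValueError(
                f"row {row_number}: expected {expected_dim} values, got {len(vector)}"
            )
        embeddings[name] = vector
    return embeddings


# Tiny in-memory stand-in for a real file; identifiers and values are made up.
sample = io.StringIO("USER_ID,VECTOR\nuser_a,0.1,-0.2,0.3\nuser_b,0.4,0.5,-0.6\n")
vectors = load_embeddings(sample, expected_dim=3)

# With a downloaded file, the call might look like:
#   with open("web-redditEmbeddings-users.csv", newline="") as f:
#       users = load_embeddings(f, expected_dim=300)
```

Passing `expected_dim=300` is one way to catch truncated or wrapped rows early, since every embedding is stated to have 300 dimensions.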

Data format

The data file is in comma-separated format.
USER_ID,VECTOR
rotoreuters,-0.224305,0.034301,-0.082651,0.004676,0.00696,0.892179,-0.309423,0.570185,0.49211,0.667661,0.379927,-0.701833,0.494844,-0.112651,-0.499859,-0.03113,-0.17902,-0.307026,0.804202,-0.126007,0.298278,0.699318,-0.122089,-0.147698,0.347853,-0.171306,-0.324271,-0.599804,0.423248,-0.56949,-0.824675,-0.568197,-0.515359,-0.281378,-0.631208,0.31375,0.43415,0.314626,0.219685,0.177992,0.476424,-0.303418,-0.40719,-0.099023,0.12914,0.437157,0.19942,-0.400879,-0.83451,-0.399204,-0.735938,0.633666,-0.195332,0.006758,-1.091519,0.41688,0.055319,0.40614,-0.087184,0.721328,0.585882,-0.441858,0.011837,-0.358463,-0.323385,0.573054,-0.008566,-0.110555,-0.111838,-0.628141,-0.37604,0.539726,0.022366,0.479097,0.043697,0.132671,0.765249,0.700398,0.493926,0.241689,0.128558,0.253161,0.082354,0.247792,0.15935,-0.504183,0.283101,0.11646,0.109912,0.016254,-0.635325,0.083934,0.400957,0.33653,0.080672,-0.712021,0.02349,-0.499163,0.142773,-0.779104,-0.167535,-0.282673,0.417065,-0.253296,-0.041676,-0.220045,0.491036,-0.031163,0.355421,-0.912913,-0.132866,0.15732,-0.062805,-0.160181,0.041099,-0.245248,-0.054643,0.322623,-0.548176,0.166469,-0.057362,-0.230725,-0.419439,0.18926,-0.664602,-0.567163,-0.546665,-0.244773,-0.021004,-0.403838,-0.029683,-0.02533,0.350426,0.07477,0.065412,0.241725,-0.336525,-0.901883,0.534846,0.030413,-0.63059,-0.361515,-0.630254,-0.002442,-0.144353,0.318511,0.998885,-0.993112,0.701324,-0.352901,-0.257294,0.388479,0.109291,-0.57535,-0.510159,-0.638403,-0.549713,-0.415056,0.247532,0.066906,-0.676021,-0.39411,0.599426,0.896202,0.476426,0.496846,0.5276,-0.144111,-0.240765,0.49653,0.408169,0.165807,-0.210979,0.326131,0.538052,-0.368556,-0.378118,-0.221417,0.038478,-0.326394,-0.623636,-0.045483,-0.35498,0.024394,-0.134996,0.248642,0.708362,0.768013,-0.269403,-0.586033,-0.551153,-0.038667,-0.288946,0.030872,-0.229663,0.43991,-0.58382,-0.764331,0.49603,0.02332,-0.018123,-0.785993,0.336409,0.329915,0.019162,-0.156693,-0.046217,0.341809,0.216982,0.361256,0.765107,0.09945,0.566142,-0.380906,0.073389,-0.833633,-0.444517,-0.529169,-0.350931,-0.112044,0.032254,-0.314222,-0.670453,-0.003535,0.757898,-0.547555,0.356095,-0.237955,-0.169256,0.361111,-0.695178,0.128437,-1.013242,-0.038218,0.192656,-0.044316,0.413002,-0.112519,0.438106,-0.163539,-0.288049,1.116224,0.125394,0.456745,0.619035,-0.194935,0.393341,0.931975,0.101569,-0.384092,0.225502,-0.29988,-0.682437,0.208696,-0.343127,-0.132798,-0.565871,0.261739,-0.560174,-0.000564,0.299804,-0.120867,0.849765,-0.337365,-0.418125,-0.084188,-0.248032,0.35677,0.028407,-0.21356,0.06294,-0.188042,0.431441,-0.472865,0.222936,0.076625,0.285511,0.222161,0.284596,-0.158964,0.182507,0.711164,0.423767,-0.486449,0.403645,-0.716357,-0.359746,0.063134,0.646768,-0.287045,-0.380348,-0.14416,-0.289317,0.471727,-0.174092,0.534364,0.218821,0.269216,-0.412621,-0.469088
fiplefip,-0.306765,0.259314,-0.950335,0.560013,-0.364981,0.073359,-0.256642,-0.348088,-0.030323,-0.284338,0.377343,-0.358473,0.559322,0.062051,0.099554,0.46136,-0.273855,-0.274918,0.725871,-0.230823,-0.436114,0.186223,-0.004017,0.297142,-0.066631,0.16217,-0.364509,0.229731,-0.151828,0.22865,0.171403,-0.334804,-0.408777,-0.165566,0.274575,-0.265074,0.429774,-0.217675,0.195341,-0.343059,-0.232225,0.013402,-1.047794,-0.202717,0.275221,0.022242,0.409946,0.062818,0.061196,-0.250688,-0.633233,0.872642,-0.409842,0.481186,-0.799223,0.050954,-0.54631,0.381634,0.052297,0.402593,-0.433905,-0.739268,0.238811,0.229997,0.274061,-0.264081,0.040341,0.33944,-0.060263,-0.42728,-0.18308,0.076269,-0.233166,-0.049634,0.091474,-0.04185,-0.659204,0.075952,-0.137962,0.525735,-0.363403,-0.270721,-0.286186,0.75313,-0.251231,-0.05558,0.133998,-0.922978,-0.681682,0.379896,-0.114465,-0.403521,0.572923,0.437024,0.191971,-0.145903,0.161456,-0.463453,-0.683026,0.161966,-0.38077,-0.64148,0.344847,-0.537787,-0.515634,0.291856,1.349782,0.622313,0.377038,-0.213636,0.413977,0.6242,0.104531,-0.581911,-0.276961,-0.101371,0.624383,0.504247,0.561515,0.117927,0.614386,0.839709,0.20462,-0.480569,-0.068113,-0.11683,-1.055569,-0.629379,-0.158954,0.287536,0.780041,0.63561,-0.010422,0.192075,0.167964,-0.443677,-0.045857,-0.497096,0.202096,0.280315,-0.439252,0.113552,-0.177334,0.02243,-0.612739,-0.357007,1.117971,-0.476095,-0.036811,-0.524293,-0.441786,-0.076792,0.496846,-0.045843,0.284069,0.137884,0.029353,0.16189,-0.007264,-0.399116,-0.363432,0.110552,-0.224124,-0.134213,0.023973,-0.017608,-0.495291,0.71312,-0.507308,0.123047,-0.084328,0.531125,-0.105674,0.537314,0.325989,0.22315,-0.562248,-0.337955,-0.212933,-0.042747,-0.445113,-0.017054,-0.09796,0.11615,-0.295329,-0.008426,-0.12945,-0.557655,0.267662,-0.038402,-0.643162,0.027822,0.275612,0.260406,-0.157455,-0.505367,0.119877,-0.422747,-0.542611,-0.03355,0.768213,0.08271,0.752942,0.043498,-0.159485,-0.598905,0.219367,0.072352,-0.390551,-0.361184,0.074472,0.103684,0.892971,0.168095,0.359259,0.151044,-0.794605,-0.623525,-0.003719,-0.200213,0.867815,-0.881989,-0.765666,0.24489,0.198795,-0.012775,0.492104,0.911354,-0.553588,0.42052,-0.407224,-0.646224,-0.34462,-0.037624,-0.36168,0.669453,0.435865,0.409443,0.472808,0.175568,0.17398,0.295725,0.133073,-0.141865,0.166259,-0.305745,-0.306207,0.11455,-0.212828,-0.12571,0.241662,0.008276,0.435264,0.50296,-0.220416,0.385967,-0.198875,0.250335,-0.337965,0.026384,0.854753,0.323354,0.050374,-0.007571,-0.811709,0.228903,-0.525756,0.513215,-0.197298,-0.706438,-0.42467,0.410469,0.16759,0.003496,0.130472,-0.59432,-0.076453,0.671613,-0.53084,0.171624,-0.14149,0.264164,-0.338885,-0.125357,-0.206496,0.660746,-0.327274,0.188642,0.439133,-0.158999,-0.432182,-0.769793,-0.434484,0.268733,-0.163076,-0.455654,0.41656,-0.219805,-0.568944,-0.477788

where