Web data: Reddit submissions
Dataset information
This dataset is a collection of 132,308 reddit.com submissions. Each submission is of an image that has been submitted to Reddit multiple times. For each submission, we collect features such as the number of ratings (positive/negative), the submission title, and the number of comments it received. We also include the HTML of the comment pages themselves.
Dataset statistics |
Number of submissions | 132,308 |
Number of unique images | 16,736 |
Average number of times an image is resubmitted | 7.9 |
Timespan | July 2008 - Jan 2013 |
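The average in the table appears to be the number of submissions divided by the number of unique images. A quick check under that assumption (the variable names are illustrative only):

```python
# Counts taken from the dataset statistics table.
num_submissions = 132_308
num_unique_images = 16_736

# Assumed definition: average submissions per unique image.
avg_resubmissions = num_submissions / num_unique_images
print(f"{avg_resubmissions:.1f}")
```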
Source (citation)
Files
Data format
#image_id,unixtime,rawtime,title,total_votes,reddit_id,number_of_upvotes,subreddit,
number_of_downvotes,localtime,score,number_of_comments,username
1005,1335861624,2012-05-01T15:40:24.968266-07:00,I immediately regret this decision,27,
t296r,20,pics,7,1335886824,13,0,ninjaroflmaster
1005,1336470481,2012-05-08T16:48:01.418140-07:00,"Pushing your friend into the water,
Level: 99",18,tds4i,16,funny,2,1336495681,14,0,hme4
1005,1339566752,2012-06-13T12:52:32.371941-07:00,I told him. He Didn't Listen,6,v0cma,4,
funny,2,1339591952,2,0,HeyPatWhatsUp
1005,1342200476,2012-07-14T00:27:56.857805-07:00,Don't end up as this guy.,16,wjivx,7,
funny,9,1342225676,-2,2,catalyst24
1005,1342485280,2012-07-17T07:34:40.225147-07:00,last one in is a rot...oh shit,9987,
wpar7,5633,pics,4354,1342510480,1279,153,phillythebeaut
1005,1342499862,2012-07-17T11:37:42-07:00,Dat t[o]ngue,139,wpq35,134,whalebait,5,
1342525062,129,10,clicksnd
1005,1342788847,2012-07-20T19:54:07.898464-07:00,When I try to break the ice with a
smooth joke,6,wwocv,3,funny,3,1342814047,0,0,LHeeezy
1005,1343926973,2012-08-03T00:02:53-07:00,Whalebait (x-post from r/whalebait),53,xlyyr,
45,PerfectTiming,8,1343952173,37,5,thatseffedup
where
- image_id: id of the image, submissions with the same id are of the same image
- unixtime: time of the submission (unix time)
- rawtime: raw text of the time
- title: submission title
- total_votes: number of upvotes + number of downvotes
- reddit_id: id of the submission on reddit, e.g. reddit.com/14c3ls
- number_of_upvotes: number of upvotes
- subreddit: subreddit, e.g. reddit.com/r/pics/
- number_of_downvotes: number of downvotes
- localtime: local time of the submission (unix time); in the sample rows above it equals unixtime + 25,200 seconds (7 hours)
- score: number of upvotes - number of downvotes
- number_of_comments: number of comments the submission received
- username: name of the user who submitted the image, e.g. www.reddit.com/user/thatseffedup
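The format described above can be read with a standard CSV parser. The following is a minimal sketch, not the authors' own tooling. It assumes that each record sits on a single line in the actual file (the sample rows above are shown wrapped), that the header line starts with `#`, and that titles containing commas are double-quoted. The names `Submission`, `parse_submissions`, `FIELDS`, `INT_FIELDS`, and `SAMPLE` are hypothetical.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass

# Column order as given in the header line of the data format.
FIELDS = [
    "image_id", "unixtime", "rawtime", "title", "total_votes", "reddit_id",
    "number_of_upvotes", "subreddit", "number_of_downvotes", "localtime",
    "score", "number_of_comments", "username",
]
# Columns that hold integers in the sample rows.
INT_FIELDS = {
    "image_id", "unixtime", "total_votes", "number_of_upvotes",
    "number_of_downvotes", "localtime", "score", "number_of_comments",
}


@dataclass
class Submission:
    image_id: int
    unixtime: int
    rawtime: str
    title: str
    total_votes: int
    reddit_id: str
    number_of_upvotes: int
    subreddit: str
    number_of_downvotes: int
    localtime: int
    score: int
    number_of_comments: int
    username: str


def parse_submissions(lines):
    """Yield one Submission per data line, skipping the '#' header."""
    data_lines = (line for line in lines if not line.startswith("#"))
    for row in csv.reader(data_lines):
        if len(row) != len(FIELDS):
            continue  # skip blank or malformed lines
        record = dict(zip(FIELDS, row))
        for name in INT_FIELDS:
            record[name] = int(record[name])
        yield Submission(**record)


# Three sample rows from above, re-joined onto one line each.
# Assumption: the wrapped title in the second row is joined with a space.
SAMPLE = """\
#image_id,unixtime,rawtime,title,total_votes,reddit_id,number_of_upvotes,subreddit,number_of_downvotes,localtime,score,number_of_comments,username
1005,1335861624,2012-05-01T15:40:24.968266-07:00,I immediately regret this decision,27,t296r,20,pics,7,1335886824,13,0,ninjaroflmaster
1005,1336470481,2012-05-08T16:48:01.418140-07:00,"Pushing your friend into the water, Level: 99",18,tds4i,16,funny,2,1336495681,14,0,hme4
1005,1342485280,2012-07-17T07:34:40.225147-07:00,last one in is a rot...oh shit,9987,wpar7,5633,pics,4354,1342510480,1279,153,phillythebeaut
"""

submissions = list(parse_submissions(io.StringIO(SAMPLE)))

# Check the relationships stated in the field descriptions.
for s in submissions:
    assert s.total_votes == s.number_of_upvotes + s.number_of_downvotes
    assert s.score == s.number_of_upvotes - s.number_of_downvotes

# Submissions sharing an image_id are resubmissions of the same image.
per_image = Counter(s.image_id for s in submissions)
print(per_image)
```

Using `csv.reader` rather than splitting on commas keeps quoted titles such as the second sample row intact. Real data may contain empty or non-numeric fields, in which case the `int` conversion would need guarding.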