Skip to main content

SHOW DETAILS
up-solid down-solid
eye
Title
Date Favorited
Creator
The Dataset Collection
data
eye 4,185
favorite 2
comment 0
I took the Reddit comment archive and converted all the JSON into one SQLite database using this program that I wrote: https://gist.github.com/ers35/3b615a75fa0ed5e6d5cc I ran a few tests to make sure the number of database rows matches the number of JSON records. "SELECT MAX(rowid) FROM comment" and "SELECT COUNT(id) FROM comment" both return 1659361605. This gives me some confidence as to the integrity of the dataset, but I cannot be 100% sure. The compressed size is 163G....