Cool Datasets

A place to find cool datasets. Follow us on Twitter for updates! @cooldatasets

Now anyone can submit datasets! Submit


City of Chicago Employee Salaries
This file contains employee salaries for the City of Chicago
Toxic Inventory Chemicals
The Toxics Release Inventory (TRI) makes available information for more than 600 toxic chemicals
2016 Election Results by State
2016 National popular vote tracker compiled by David Wasserman
Crime in the United States.csv
by Volume and Rate per 100,000 Inhabitants, 1994–2013. Includes Violent Crimes, Murders, Rapes, Bu
Hip and Knee Complications Dataset from the CDC
This data set includes provider data for the hip/knee complication measure, and the Agency for Healthcare Research and Quality (AHRQ) measures of serious complications.
Payment and Value of Hospital Care
This data set includes provider data for the payment measures and value of care displays associated with a 30-day episode of care for heart attack, heart failure, and pneumonia patients.
City of Phoenix Employee Salaries
City officials' salaries for the City of Phoenix, Arizona.
Construction Activity in the United States
United States Department of Commerce dataset of total value of construction currently put in place.
Amendments in America
11,000 proposed amendments to the United States Constitution from 1787-2014
Louisville Crime Statistics
Crime in Louisville, Kentucky from 2003 to 2016
Police Cruiser Districts
Dataset of police cruiser district locations in Columbus, Ohio
FCC Complaint Calls
List of informal consumer complaint calls regarding unwanted robocalls and telemarketing calls.
White House Staff Salaries Dataset
Information on the salaries of staff at the White House
EU Climate Change Mitigation Policies
This dataset contains a number of climate change mitigation policies and measures (PAM) implemented or planned by European countries to reduce greenhouse gas emissions.
Officer Involved Shootings Austin Texas
Officer-involved shootings in Austin, Texas from 2000 to 2014
Hillary Clinton Income Taxes
Hillary Clinton's adjusted gross income and taxes owed for each year from 2000 to 2015.
Presidential Debate Tweets
2,000 tweets posted immediately after the first Presidential Debate in September 2016
The Open Data Dataset
A dataset containing the Open Data Portals of 100 of America's largest cities
White House Nominations
800 White House nominations and appointments



Top 100 Rotten Tomatoes Movies
Movies with 40 or more critic reviews vie for their place in history at Rotten Tomatoes. Eligible movies are ranked based on their Adjusted Scores.
Spotify Songs
50 Most Streamed Spotify Songs
Bookie Backer Football Datasets
Weekly updated football datasets.
TED Talks Dataset
Master list of 2,600 TED Talks and descriptions
Top 500 Albums
Dataset of Rolling Stone's 500 greatest albums of all time
Popular Toys 2017
The most searched toys and games of this year's Christmas season


Stanford Drone Dataset
Images and videos of various types of agents (not just pedestrians, but also bicyclists, skateboarders, cars, buses, and golf carts) that navigate in a real world outdoor environment
20 Newsgroups Dataset
This data set consists of 20,000 messages taken from 20 Usenet newsgroups.
Hate Speech Identification
A sampling of Twitter posts that have been judged based on whether they are offensive or contain hate speech, as a training set for text analysis.
Forest Fire Dataset
The aim of this data is to predict the burned area of forest fires, in the northeast region of Portugal, by using meteorological and other data.
Image Processing Datasets
Curated datasets from Computer Vision Online
Natural Language Question and Answer Dataset
The largest human-created question-answer dataset for natural language processing
Microsoft MARCO Dataset
A reading comprehension dataset for AI research
2000 Positive Words Sentiment Dataset
2000 positive words used for sentiment analysis
YouTube's 8M Dataset
8 million video URLs, 500K hours of video
Comma.AI Driving Dataset
7 hours of self-driving training data from
Uber Movement
Anonymized data from over 2 billion Uber trips.
Standard Reimbursement Rates for Travel
200,000 standard reimbursement rates for travel among various U.S. destinations
Galton's Pea Dataset
Francis Galton introduced the correlation coefficient with an analysis of the similarities of the parent and child generation of 700 sweet peas.
Diamond Quality
Sample dataset of 350 diamonds, their color, size, clarity, and price
Deep Fashion
Categorized database of 800,000 fashion images
Wells Fargo Deposits
Wells Fargo branch deposits by US states and counties
Instacart Orders and Customers
3 million Instacart orders, open-sourced
[Video] Lane Detection
Highway recording for self-driving car lane detection


Titanic Passengers Dataset
Passenger information from the Titanic
United States Patents
United States patent information dating from 1790-2015
Global Airports Dataset
Name, City, Country, and Lat/Lon of 5000 Airports Around the World.
Oil Prices
Historical dataset of nominal and inflation-adjusted oil prices since 1918
NYC Restaurant Inspections
This dataset provides restaurant inspections, violations, grades and adjudication information
NYC High School SAT Scores
A dataset containing average SAT scores in reading, writing, and math for all NYC high schools
Air Traffic Dataset
San Francisco International Airport Report on Monthly Passenger Traffic Statistics by Airline.
EPA Fuel Economy
Fuel economy data are the result of vehicle testing done at the Environmental Protection Agency's National Vehicle and Fuel Emissions Laboratory in Ann Arbor, Michigan, and by vehicle manufacturers with oversight by EPA.
Beer Styles Dataset
A crowdsourced database of how well beer styles (stout, pale ale, etc.) and additions (chocolate, bacon, cherry) go with each other.
Water Use Dataset
Monthly residential water usage by zip code. Numbers represent usage in hundred cubic feet (HCF). Records from 2005 to 2013.
United States Birth Rates
Birth Rates, by Age of Mother in the United States from 1940
Popular Baby Boy Names in Illinois
Top 25 boy names, each year from 1980-2013 including frequency.
Food Recalls by Brand
Most common food recalls by brand since 2009.
Homeless Population Dataset
Population of homeless in New York City Neighborhoods by year
EU Bank Interest Rates
This dataset covers euro-denominated deposits with an agreed maturity from euro area households (percentages per annum, rates on new business).
Jail Bookings Dataset
Miami-Dade Corrections jail bookings from May 29, 2015 to current.
Valet Parking Dataset
Valet Parking by District, Facility, and Locations in Philadelphia
Los Angeles Businesses
Listing of 470,000-PLUS business names and locations in Los Angeles
Street Trees
List of street trees maintained by the San Francisco Department of Public Works (DPW), including planting date, species, and location
Public Libraries
Dataset of all public libraries in the United States
Death Probability Since 1900
Historical and projected probabilities of death by single year of age, gender, and year for the period 1900 through 2010. Death probabilities for males.
STDs Nationally Ranked
U.S. states ranked by reported cases of chlamydia, gonorrhea, and primary and secondary syphilis.
The World's Telephones by Year
The world's telephones by continent in the years 1951, 1956, 1957, 1958, 1959, 1960, 1961
International Energy Consumption
These data list total primary energy consumption by country and region in quadrillion Btu. Figures are annual totals for the years 1980 through 2008.
NYC Subways
1900 New York City subway entrance locations
Immigration to Ellis Island (1892-1924)
Dataset by trip, dates, ports, ships, and passengers.
Sovereign Bond Holdings Dataset
Data on sectorial holdings of sovereign bonds for 12 countries
1 million digits of Pi
Not necessarily a dataset but still cool
Kickstarter Datasets
Monthly datasets of all campaigns from
World Internet Users
A yearly look at the number of internet users around the world.