GAISE Data Sources
[&] - easy to access [&&] - may need some work [&&&] - may require wrangling/digging
Data Repositories
DASL - Dataset and Story Library Repository of datasets searchable by keyword or statistical technique. Maintained by Data Description, Inc., the creators of Data Desk. [&]
Kaggle Lots of interesting datasets, some tied to data competitions. You may need to register for a (free) account to download data. [&&-&&&]
JSE Data Archive Datasets with associated articles from the “Datasets and Stories” section of the Journal of Statistics Education (1993-2014) [&]
eeps data zoo Datasets from eeps media. These load automatically in CODAP (an online data analysis tool) and can then be exported as a .csv file. Use the information link in the CODAP window for more documentation on the data. [&&]
TidyTuesday Weekly dataset projects from the Data Science Learning Community. This links to a GitHub repo. Check the readme files for a specific year (e.g. 2024) for a list and links to the weekly data topics for that year. [&&-&&&]
FiveThirtyEight Data Repository Datasets related to articles on the FiveThirtyEight website. [&&]
TSHS Resources Portal A collection of resources maintained by the ASA’s Section on Teaching Statistics in the Health Sciences [&]
Datahub Links to tons of curated datasets organized into “collections” by application topics. [&-&&&]
Our World in Data Lots of interactive charts with options to download data “to make progress against the world’s largest problems”. [&&]
Stanford Open Policing Project Data on police stops from communities around the US. [&&&]
Mendeley Data Searchable repository of hundreds of shared datasets with documentation. [&-&&&]
R Datasets Links to datasets from various R packages in .csv format with data documentation. [&-&&]
OzDASL - Australian Data and Story Library Datasets for teaching organized by statistical topics. [&]
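Several of the repositories above (for example R Datasets, or a CODAP export from the eeps data zoo) deliver data as .csv files. The following is a minimal sketch, using only the Python standard library, of a first look at such a file to judge how much wrangling it might need. The sample text and its column names are made up for illustration and do not come from any of the listed sources.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded .csv file (not real data).
SAMPLE_CSV = """name,height_cm,group
A,170,control
B,,treatment
C,182,control
"""

def summarize_csv(text):
    """Return the number of data rows and, per column, the count of empty cells."""
    reader = csv.DictReader(io.StringIO(text))
    missing = {col: 0 for col in reader.fieldnames}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for col in reader.fieldnames:
            # A short row yields None for its trailing columns; treat it as empty.
            if not (row[col] or "").strip():
                missing[col] += 1
    return n_rows, missing

n_rows, missing = summarize_csv(SAMPLE_CSV)
print(n_rows, missing)
```

For a real download, the same function could be given the contents of the file read from disk; many empty cells or unexpected column names suggest a dataset closer to the [&&] or [&&&] end of the scale.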
Instructor Pages of Data Links
Shonda Kuiper’s Links to Data Sources Links provided by Shonda Kuiper as part of the Stat2Labs project. [&-&&&]
Sports Data
SCORE Data Repository Sports-related datasets produced by the SCORE network [&]
Sports Reference Links to sites such as Baseball-Reference.com and Pro-Football-Reference.com, with current and historical data on teams and players. Other links go to Hockey, College Basketball, College Football, and Soccer databases. [&&]
Places to search for data
These usually take some more digging to get at usable data…
Google - Dataset Search A Google search specifically for datasets. [&&-&&&]
Data.gov Links to sources for US government data. [&&-&&&]
Open Data Network “A global search engine that allows you to search across tens of thousands of datasets from hundreds of open data catalogs.” [&&-&&&]
Centers for Disease Control Searchable access to lots of health-related datasets. [&&-&&&]
UNdata Searchable access to datasets provided by the United Nations [&&-&&&]
World Bank Data Links to datasets with global perspectives provided by the World Bank. [&&-&&&]
US Census Data Links to tools to access various forms of data from the US Census Bureau
Census Reporter “An independent project to make it easier for journalists to write stories using information from the U.S. Census bureau. Place profiles and comparison pages provide a friendly interface for navigating data, including visualizations for a more useful first look.” [&&]
IPUMS Integrated Public Use Microdata Series is a collection of standardized microdata from a wide variety of national and international surveys and censuses. [&&-&&&]
Knoema World and national datasets (requires an account). [&&-&&&]
Open data from Montgomery County County-level datasets on a variety of topics. [&&-&&&]
Open Data from Baltimore City-level public government datasets. [&&-&&&]
ICPSR Inter-university Consortium for Political and Social Research [&&-&&&]
World Health Organization Data Public Health data for many countries. [&&-&&&]
QSIDE Institute Datasets related to the QSIDE Data4Justice Curriculum [&&]
Textbook Data
Webpages with links to data from textbooks. Many are also accompanied by R packages containing the datasets.
Statistics - UnLocking the Power of Data and in R package Lock5Data [&]
OpenIntro and in R package openintro [&]
Stat2 and in R package Stat2Data [&]