redirect to new tools #460

Merged
merged 2 commits into from
Jul 29, 2024
64 changes: 2 additions & 62 deletions csv_download.md
@@ -1,64 +1,4 @@
---
layout: default
title: CSV Bulk Download
nav_order: 70
has_children: false
layout: redirect
redirect: https://datacommons.org/tools/download
---

# CSV Bulk Download

We provide access to some of our data in a relational format in a public Google Cloud Storage bucket, which is available for CSV download. The tables are constructed such that each row represents a [Place](https://datacommons.org/browser/Place) and each column represents a [Statistical Variable](https://datacommons.org/browser/StatisticalVariable).

These relational tables are organized by vertical, each within a different zip folder, which can be downloaded from the links below:
* [Agriculture](https://storage.googleapis.com/relational_tables/agriculture.zip)
* [Climate](https://storage.googleapis.com/relational_tables/climate.zip)
* [Crime](https://storage.googleapis.com/relational_tables/crime.zip)
* [Demographics](https://storage.googleapis.com/relational_tables/demographics.zip)
* [Economics](https://storage.googleapis.com/relational_tables/economics.zip)
* [Education](https://storage.googleapis.com/relational_tables/education.zip)
* [Employment](https://storage.googleapis.com/relational_tables/employment.zip)
* [Energy](https://storage.googleapis.com/relational_tables/energy.zip)
* [Health](https://storage.googleapis.com/relational_tables/health.zip)
* [Household](https://storage.googleapis.com/relational_tables/household.zip)
* [Housing](https://storage.googleapis.com/relational_tables/housing.zip)

Each vertical zip folder contains tables for various Place categories: `all` (all places), `us` (US places), `non_us` (non-US places), `county` (US counties), and `zip` (US zip codes). For each vertical and Place category, there are three types of tables:

* `value`: Each cell contains the value of the latest observation for a given Statistical Variable and Place.
* `date`: Each cell contains the date of the latest observation for a given Statistical Variable and Place.
* `provenance`: Each cell contains the provenance URL of the latest observation for a given Statistical Variable and Place, as well as the [measurement method](https://docs.datacommons.org/glossary.html), if provided. Measurement methods that are prefixed with `dcAggregate/` represent Data Commons aggregated values.

The table names follow the pattern `[vertical]_[place_category]_[type]` and are sharded into multiple CSV files. (For example, the file [`demographics_all_date-00000-of-00456.csv`](https://storage.googleapis.com/relational_tables/demographics/demographics_all_date-00000-of-00456.csv) contains a portion of the observation `dates` for `demographics` Statistical Variables and `all` Places. In this case, the table has been sharded into 456 files.)
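As a rough illustration of the naming scheme, the sketch below splits a shard file name into its parts. The helper `parse_shard_name` is hypothetical, and the five-digit `-NNNNN-of-NNNNN.csv` suffix is inferred only from the single example file name above. It also assumes that vertical names contain no underscore, which holds for the verticals listed on this page.

```python
import re

# Assumed shard suffix format, inferred from the one example file name.
SHARD_RE = re.compile(r"^(?P<table>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.csv$")
TABLE_TYPES = ("value", "date", "provenance")


def parse_shard_name(filename):
    """Split '[vertical]_[place_category]_[type]-NNNNN-of-NNNNN.csv' into parts."""
    match = SHARD_RE.match(filename)
    if match is None:
        raise ValueError(f"not a recognized shard file name: {filename!r}")
    # The type is the last underscore-separated token of the table name.
    rest, _, table_type = match.group("table").rpartition("_")
    if table_type not in TABLE_TYPES:
        raise ValueError(f"unknown table type: {table_type!r}")
    # Place categories such as 'non_us' contain an underscore, so only the
    # first underscore separates the vertical from the place category.
    vertical, _, place_category = rest.partition("_")
    return {
        "vertical": vertical,
        "place_category": place_category,
        "type": table_type,
        "shard_index": int(match.group("index")),
        "shard_count": int(match.group("total")),
    }
```

For the example file above, this yields the vertical `demographics`, the place category `all`, the type `date`, shard index 0, and a shard count of 456.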

The corresponding `value`, `date`, and `provenance` tables can be joined using the first three columns, which contain information about the place:
* `place_name`: The name(s) of the Place.
* `place_dcid`: The [Data Commons ID](https://docs.datacommons.org/glossary.html) for the Place.
* `place_type`: The type(s) of the Place.
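The join described above could look roughly like the following, using only the Python standard library. This is a minimal sketch, not an official client: it assumes each CSV shard starts with a header row, the function names are made up for illustration, and the inline sample rows are a small excerpt of the `housing_county` tables documented on this page.

```python
import csv
import io

KEY_COLUMNS = ("place_name", "place_dcid", "place_type")

# Small inline samples standing in for downloaded shards (assumed to have headers).
VALUE_CSV = """place_name,place_dcid,place_type,Count_HousingUnit
Nuckolls County,geoId/31129,County,2445
Wells County,geoId/38103,County,2422
"""
DATE_CSV = """place_name,place_dcid,place_type,Count_HousingUnit
Nuckolls County,geoId/31129,County,2019
Wells County,geoId/38103,County,2019
"""


def index_by_place(csv_text):
    """Map (place_name, place_dcid, place_type) to the full row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[k] for k in KEY_COLUMNS): row for row in reader}


def join_value_and_date(value_text, date_text):
    """Pair each value cell with the date cell for the same place and variable."""
    values = index_by_place(value_text)
    dates = index_by_place(date_text)
    joined = []
    for key, value_row in values.items():
        date_row = dates.get(key)
        if date_row is None:
            continue  # place missing from the date table; skip it
        for column, value in value_row.items():
            if column in KEY_COLUMNS:
                continue
            joined.append({
                "place_dcid": key[1],
                "variable": column,
                "value": value,
                "date": date_row.get(column),
            })
    return joined
```

Because a table is sharded across many files, a real join would first need to load every shard of both tables; the sketch keeps everything in memory for brevity.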

## Example Table Structure

Below is a subset of the `housing_county_value` table:

| place_name | place_dcid | place_type | Count_HousingUnit | Count_HousingUnit_NoCashRent | ... |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Nuckolls County | geoId/31129 | County | 2445 | 74 | ... |
| Wells County | geoId/38103 | County | 2422 | 74 | ... |
| ... | ... | ... | ... | ... | ... |

And the corresponding subset of the `housing_county_date` table:

| place_name | place_dcid | place_type | Count_HousingUnit | Count_HousingUnit_NoCashRent | ... |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Nuckolls County | geoId/31129 | County | 2019 | 2019 | ... |
| Wells County | geoId/38103 | County | 2019 | 2019 | ... |
| ... | ... | ... | ... | ... | ... |

And for the `housing_county_provenance` table:

| place_name | place_dcid | place_type | Count_HousingUnit | Count_HousingUnit_NoCashRent | ... |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Nuckolls County | geoId/31129 | County | https://www.census.gov/\|CensusACS5yrSurvey | https://www.census.gov/\|CensusACS5yrSurvey | ... |
| Wells County | geoId/38103 | County | https://www.census.gov/\|CensusACS5yrSurvey | https://www.census.gov/\|CensusACS5yrSurvey | ... |
| ... | ... | ... | ... | ... | ... |

The provenance value `https://www.census.gov/|CensusACS5yrSurvey` indicates that the observation comes from [https://www.census.gov/](https://www.census.gov/) using the [CensusACS5yrSurvey](https://datacommons.org/browser/CensusACS5yrSurvey) measurement method.
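A provenance cell of this form can be split into its two parts with a few lines of Python. The helper name `parse_provenance` is hypothetical, and the sketch assumes the URL itself never contains a `|` character and that the measurement method may be absent.

```python
def parse_provenance(cell):
    """Split 'URL|measurementMethod' into its parts (the method is optional)."""
    # Assumes the first '|' separates the URL from the measurement method.
    url, _, method = cell.partition("|")
    method = method or None
    return {
        "url": url,
        "measurement_method": method,
        # Methods prefixed with 'dcAggregate/' mark Data Commons aggregated values.
        "is_dc_aggregate": bool(method) and method.startswith("dcAggregate/"),
    }
```

Applied to `https://www.census.gov/|CensusACS5yrSurvey`, this returns the URL `https://www.census.gov/`, the measurement method `CensusACS5yrSurvey`, and `is_dc_aggregate` set to `False`.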
21 changes: 3 additions & 18 deletions statistical_variables.md
@@ -1,19 +1,4 @@
---
layout: default
title: Statistical Variables
nav_order: 100
---

# Statistical Variables

<div markdown="span" class="alert alert-success" role="alert">
<span class="material-icons md-16">info</span> <b>Note:</b>
The previous list of curated statistical variables on this page has been migrated to the <a href="https://datacommons.org/tools/statvar#">Statistical Variable Explorer</a>.
</div>

Many of the Data Commons APIs deal with nodes of the type
[StatisticalVariable](https://datacommons.org/browser/StatisticalVariable). A StatisticalVariable is a node in the graph that captures any type of metric, statistic, or measure that can be measured at a place and time. Examples include the [number of males in a population](https://datacommons.org/browser/Count_Person_Male), the [population without health insurance](https://autopush.datacommons.org/tools/statvar#Count_Person_NoHealthInsurance), and the [annual carbon dioxide emissions from iron and steel production](https://autopush.datacommons.org/tools/statvar#Annual_Emissions_GreenhouseGas_IronAndSteelProduction_NonBiogenic).

The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) tool makes it easy for you to find information about each variable, such as the different data sources, metadata and places for which the statistics are available.

[Click here to access the tool](https://datacommons.org/tools/statvar#){: .btn .btn-primary }
layout: redirect
redirect: https://datacommons.org/tools/statvar
---