new overview docs #421
Staged at bullie.svl.corp.google.com:4000
Thanks for sharing this Kara. Has this been reviewed by Prem and Guha?
+Prem Ramaswami ***@***.***>
It's been reviewed by Prem, who told me to "just publish it". :-) Prem,
should Guha review?
…On Tue, Jun 18, 2024 at 11:23 AM Carolyn Au ***@***.***> wrote:
------------------------------
In how_to_use.md:
> +nav_order: 1
+---
+
+{: .no_toc}
+# How to use Data Commons
+
+* TOC
+{:toc}
+
+## Learn about the data in Data Commons
+
+To find out what data is available in Data Commons, check out the [Statistical Variable Explorer](https://datacommons.org/tools/statvar) and see the [Data sources](/datasets/index.html) pages.
+
+## Issue interactive data queries
+
+For quick analysis, use the search query bar on the home page or any of the visualization tools, such as the [Timeline](https://datacommons.org/tools/visualization#visType=timeline), [Scatter](https://datacommons.org/tools/visualization#visType%3Dscatter), and [Map](https://datacommons.org/tools/visualization#visType%3Dmap) explorers.
It could be helpful to also link to the homepage.
Also, to me, it reads as if they're also supposed to use the search query
bar on the vis tools. Should it be ... or use any of the ... or ... or
explore using any of the visualization tools ...
------------------------------
In how_to_use.md:
> +
+## Issue interactive data queries
+
+For quick analysis, use the search query bar on the home page or any of the visualization tools, such as the [Timeline](https://datacommons.org/tools/visualization#visType=timeline), [Scatter](https://datacommons.org/tools/visualization#visType%3Dscatter), and [Map](https://datacommons.org/tools/visualization#visType%3Dmap) explorers.
+
+If you want to issue SQL queries, and you have a Google Cloud Platform account, use BigQuery Studio on Data Commons data in [Analytics Hub](https://cloud.google.com/analytics-hub). See the [Data Commons in BigQuery](/bigquery/index.html) page for more details.
+
+## Issue programmatic data queries
+
+Data Commons publishes REST, Python, Pandas, Google Sheets, and SPARQL [APIs](/api/index.html). The APIs support both low-level exploration of the knowledge graph as well as higher-level statistical analysis of variables.
+
+The Python and pandas APIs provide convenient wrappers for calling the APIs; we have developed a set of [Google Colab tutorials](/tutorials/index.html) to help you get started with analysis. We have also developed a [Data science curriculum](/courseware/intro_data_science.html) featuring our API and data, currently in use at MIT.
+
+## Embed data analyses and visualizations in your site
+
+If you would like to include Data Commons result visualizations in your own website, we provide a [Javascript API](/api/web_components/index.html) that makes it a snap to embed various chart elements, such as scatter plots, maps, pie charts, and many more.
could we label the link Web Components API (it's not quite javascript
only...)
------------------------------
In how_to_use.md:
> +The Python and pandas APIs provide convenient wrappers for calling the APIs; we have developed a set of [Google Colab tutorials](/tutorials/index.html) to help you get started with analysis. We have also developed a [Data science curriculum](/courseware/intro_data_science.html) featuring our API and data, currently in use at MIT.
+
+## Embed data analyses and visualizations in your site
+
+If you would like to include Data Commons result visualizations in your own website, we provide a [Javascript API](/api/web_components/index.html) that makes it a snap to embed various chart elements, such as scatter plots, maps, pie charts, and many more.
+
+## Download data for offline analysis
+
+Data Commons provides several tools for downloading its data:
+
+- To preview and download for selected places and statistical variables, use the standalone [Data Download Tool](https://datacommons.org/tools/download) or click the **Download** link in any of the results pages of the visualization tools.
+- To load data into Google Sheets for analysis and charting, install and run the Data Commons Google [Sheets add-on](/api/sheets/index.html).
+
+## Build machine learning models
+
+Data Commons data sets provide ideal training data for developing machine learning models and other data science applications. The [data science tutorials](/courseware/intro_data_science.html) show you how to use our APIs and data to get started.
We try not to use "data sets". Perhaps just "Data Commons provides ideal
training ..."
------------------------------
In how_to_use.md:
> +## Embed data analyses and visualizations in your site
+
+If you would like to include Data Commons result visualizations in your own website, we provide a [Javascript API](/api/web_components/index.html) that makes it a snap to embed various chart elements, such as scatter plots, maps, pie charts, and many more.
+
+## Download data for offline analysis
+
+Data Commons provides several tools for downloading its data:
+
+- To preview and download for selected places and statistical variables, use the standalone [Data Download Tool](https://datacommons.org/tools/download) or click the **Download** link in any of the results pages of the visualization tools.
+- To load data into Google Sheets for analysis and charting, install and run the Data Commons Google [Sheets add-on](/api/sheets/index.html).
+
+## Build machine learning models
+
+Data Commons data sets provide ideal training data for developing machine learning models and other data science applications. The [data science tutorials](/courseware/intro_data_science.html) show you how to use our APIs and data to get started.
+
+## Contribute data to Data Commons
Can we point to this page instead, which lists other ways to contribute:
https://docs.datacommons.org/contributing/
------------------------------
In index.md:
>
-One of the salient aspects of Data Commons is that it is not a repository of data sets. There are many great repositories (Dataverse, BQ public datasets, data.gov) that more than adequately address that need. Instead, it is a single unified database created by normalizing/aligning the schemas and entity references across these different datasets (to the extent possible). So, for example, if a researcher wants the population, violent crime rate and unemployment rate of a county, the researcher does not have to go to three different datasets (Census, FBI and BLS), but can instead, get it from a single database, using one schema, one API. Of course, she would want to know the provenance of the data, which is recorded with every data point, something enabled in the APIs.
+Behind the scenes, Data Commons does the tedious work of finding data, understanding the data collection methodologies, cleaning the data, reconciling different formats and schemas, figuring out how to merge data about the same entity from different sources, and so on – saving organizations months of tedious, costly and error-prone work. Data Commons is not a repository of public datasets (such as Kaggle or Google Cloud BigQuery Public Datasets). Instead, it is a single unified data source created by normalizing and aligning schemas and references to the same entities (such as cities, counties, organizations, etc.) across different datasets.
+
+For example, if you wanted to get [population stats, poverty and unemployment rates of a specific county](https://datacommons.org/place/geoId/06081), you don't need to go to three different datasets; instead, you can get the data from a single data source, using one schema, and one API. Data Commons is also used by Google Search whenever it can provide the most relevant statistical results to a query. For example, the top Google Search result for the query "what is the life expectancy of Vietnam" returns a Data Commons timeline graph and a link to the [Place page](https://datacommons.org/place/country/VNM?utm_medium=explore&mprop=lifeExpectancy&popt=Person&hl=en) for Vietnam:
+
+![Google Search query result](/assets/images/dcoverview1.png){: width="800"}
+
+## A standards-based knowledge graph, schema, and APIs
+
+Data Commons needs to be able to stitch together data from disparate data sets in different formats and encodings, in a wide range of domains, from time series about demographics and employment, to hurricanes, to protein structures. To do so, it models the world as a [knowledge graph](https://blog.google/products/search/introducing-knowledge-graph-things-not/) consisting of nodes, or entities, with properties (attributes) and relationships between them forming directed edges between the nodes. The data model is based on the [Schema.org](https://www.schema.org) framework, an open framework used by over 40M websites; Data Commons makes heavy use of [Schema.org](https://www.schema.org/docs/schemas.html) constructs and extends the model as required, introducing both general constructs (such as intervals) and values for common properties.
rather than "makes heavy use of schema.org constructs", data commons is
an extension of schema.org
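
For anyone following this thread, the graph model described in the hunk above (nodes with properties, joined by directed edges) can be pictured as a list of subject-predicate-object triples. The following is only a minimal sketch under that assumption: the `Triple` type, the `build_index` helper, and the hand-written triples are hypothetical illustrations, not the Data Commons API or its stored data.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One directed edge in a toy knowledge graph (hypothetical type)."""
    subject: str    # ID of the node the edge starts from
    predicate: str  # property name, used as the edge label
    obj: str        # ID of the target node, or a literal value


# Hand-written triples for illustration; nothing here is fetched from Data Commons.
TRIPLES = [
    Triple("geoId/06081", "typeOf", "County"),
    Triple("geoId/06081", "containedInPlace", "geoId/06"),
    Triple("geoId/06", "typeOf", "State"),
    Triple("country/VNM", "typeOf", "Country"),
]


def build_index(triples):
    """Index triples by subject (outgoing edges) and by object (incoming edges)."""
    out_edges = defaultdict(list)
    in_edges = defaultdict(list)
    for t in triples:
        out_edges[t.subject].append((t.predicate, t.obj))
        in_edges[t.obj].append((t.predicate, t.subject))
    return out_edges, in_edges


out_edges, in_edges = build_index(TRIPLES)
print(out_edges["geoId/06081"])  # properties and outgoing edges of one node
print(in_edges["geoId/06"])      # nodes that point at geoId/06
```

Indexing in both directions mirrors the idea that an edge can be followed from either end, which is what "directed edges between the nodes" implies for a graph browser.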
------------------------------
In index.md:
> +
+## A standards-based knowledge graph, schema, and APIs
+
+Data Commons needs to be able to stitch together data from disparate data sets in different formats and encodings, in a wide range of domains, from time series about demographics and employment, to hurricanes, to protein structures. To do so, it models the world as a [knowledge graph](https://blog.google/products/search/introducing-knowledge-graph-things-not/) consisting of nodes, or entities, with properties (attributes) and relationships between them forming directed edges between the nodes. The data model is based on the [Schema.org](https://www.schema.org) framework, an open framework used by over 40M websites; Data Commons makes heavy use of [Schema.org](https://www.schema.org/docs/schemas.html) constructs and extends the model as required, introducing both general constructs (such as intervals) and values for common properties.
+
+The Data Commons [Knowledge Graph browser](https://datacommons.org/browser/) allows you to peek into the structure of the graph, and the APIs allow you to directly query the parts of the graph (e.g. nodes, triples, etc.).
+
+Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
+
+The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
+
+<!--To learn more about the data model and key concepts, see [Key concepts](). -->
+
+## An open-source project and website platform
+
+Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
Can we drop "intended" and just say that it is a community-based resource?
We do have external contributions...
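
The same hunk treats a statistical variable and an observation as first-class entities. A rough way to picture that, assuming an observation carries a variable, an entity, a date, a value, and a provenance (the last taken from the removed paragraph in index.md), is sketched below. The `Observation` class, the `time_series` helper, the variable name, and all values are hypothetical placeholders, not real statistics or real Data Commons identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Value of one statistical variable for one entity at one time (toy model)."""
    variable: str    # ID of the statistical variable (the metric definition)
    entity: str      # ID of the place or other entity being observed
    date: str        # year or ISO-8601 date of the observation
    value: float
    provenance: str  # source the value came from


# Placeholder values for illustration only; these are NOT real statistics.
OBSERVATIONS = [
    Observation("Example_Variable", "geoId/06081", "2021", 110.0, "example-source"),
    Observation("Example_Variable", "geoId/06081", "2020", 100.0, "example-source"),
    Observation("Example_Variable", "country/VNM", "2021", 50.0, "example-source"),
]


def time_series(observations, variable, entity):
    """Return (date, value) pairs for one variable and entity, sorted by date."""
    return sorted(
        (o.date, o.value)
        for o in observations
        if o.variable == variable and o.entity == entity
    )


series = time_series(OBSERVATIONS, "Example_Variable", "geoId/06081")
print(series)
```

Keeping the variable and the observation as separate records is what lets one metric definition be reused across many places and dates.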
------------------------------
In index.md:
> +
+The Data Commons [Knowledge Graph browser](https://datacommons.org/browser/) allows you to peek into the structure of the graph, and the APIs allow you to directly query the parts of the graph (e.g. nodes, triples, etc.).
+
+Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
+
+The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
+
+<!--To learn more about the data model and key concepts, see [Key concepts](). -->
+
+## An open-source project and website platform
+
+Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
+
+Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
+
+In addition, Data Commons makes its data and visualizations accessible to any website through [REST](/api/rest/v2/index.html) and [Javascript](/api/web_components/index.html) APIs.
Javascript --> Web Components
------------------------------
In index.md:
> +
+Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
+
+The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
+
+<!--To learn more about the data model and key concepts, see [Key concepts](). -->
+
+## An open-source project and website platform
+
+Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
+
+Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
+
+In addition, Data Commons makes its data and visualizations accessible to any website through [REST](/api/rest/v2/index.html) and [Javascript](/api/web_components/index.html) APIs.
+
+Finally, Data Commons provides an open-source, [customizable website](/custom_dc/index.html) implementation, for organizations that want to host their own version of a Data Commons website, using their own data and user interfaces.
is "customizable website" the best way to describe custom dc? could we
call it [customizable implementation]
------------------------------
In index.md:
> +
+Data Commons needs to be able to stitch together data from disparate data sets in different formats and encodings, in a wide range of domains, from time series about demographics and employment, to hurricanes, to protein structures. To do so, it models the world as a [knowledge graph](https://blog.google/products/search/introducing-knowledge-graph-things-not/) consisting of nodes, or entities, with properties (attributes) and relationships between them forming directed edges between the nodes. The data model is based on the [Schema.org](https://www.schema.org) framework, an open framework used by over 40M websites; Data Commons makes heavy use of [Schema.org](https://www.schema.org/docs/schemas.html) constructs and extends the model as required, introducing both general constructs (such as intervals) and values for common properties.
+
+The Data Commons [Knowledge Graph browser](https://datacommons.org/browser/) allows you to peek into the structure of the graph, and the APIs allow you to directly query the parts of the graph (e.g. nodes, triples, etc.).
+
+Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
+
+The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
+
+<!--To learn more about the data model and key concepts, see [Key concepts](). -->
+
+## An open-source project and website platform
+
+Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
+
+Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
how about "data coverage" instead of "available data sources"
------------------------------
In index.md:
> ---
-# Why Data Commons?
+{: .no_toc}
+# What is Data Commons?
Should we point to the About page at all, for the Why?
thanks for the quick updates!
Feel free to point this at Guha once before publishing.
> +
> +<!--To learn more about the data model and key concepts, see [Key concepts](). -->
> +
> +## An open-source project and website platform
> +
> +Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
> +
> +Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
> +
> +In addition, Data Commons makes its data and visualizations accessible to any website through [REST](/api/rest/v2/index.html) and [Javascript](/api/web_components/index.html) APIs.
>
> Javascript --> Web Components
> ------------------------------
>
> In index.md
> <#421 (comment)>
> :
>
> > +
> +Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
> +
> +The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
> +
> +<!--To learn more about the data model and key concepts, see [Key concepts](). -->
> +
> +## An open-source project and website platform
> +
> +Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
> +
> +Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
> +
> +In addition, Data Commons makes its data and visualizations accessible to any website through [REST](/api/rest/v2/index.html) and [Javascript](/api/web_components/index.html) APIs.
> +
> +Finally, Data Commons provides an open-source, [customizable website](/custom_dc/index.html) implementation, for organizations that want to host their own version of a Data Commons website, using their own data and user interfaces.
>
> is "customizable website" the best way to describe custom dc? could we
> call it [customizable implementation]
> ------------------------------
>
> In index.md
> <#421 (comment)>
> :
>
> > +
> +Data Commons needs to be able to stitch together data from disparate data sets in different formats and encodings, in a wide range of domains, from time series about demographics and employment, to hurricanes, to protein structures. To do so, it models the world as a [knowledge graph](https://blog.google/products/search/introducing-knowledge-graph-things-not/) consisting of nodes, or entities, with properties (attributes) and relationships between them forming directed edges between the nodes. The data model is based on the [Schema.org](https://www.schema.org) framework, an open framework used by over 40M websites; Data Commons makes heavy use of [Schema.org](https://www.schema.org/docs/schemas.html) constructs and extends the model as required, introducing both general constructs (such as intervals) and values for common properties.
> +
> +The Data Commons [Knowledge Graph browser](https://datacommons.org/browser/) allows you to peek into the structure of the graph, and the APIs allow you to directly query the parts of the graph (e.g. nodes, triples, etc.).
> +
> +Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
> +
> +The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
> +
> +<!--To learn more about the data model and key concepts, see [Key concepts](). -->
> +
> +## An open-source project and website platform
> +
> +Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
> +
> +Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
>
> how about "data coverage" instead of "available data sources"
> ------------------------------
>
> In index.md
> <#421 (comment)>
> :
>
> > ---
>
> -# Why Data Commons?
> +{: .no_toc}
> +# What is Data Commons?
>
> Should we point to the About page at all, for the Why?
>
--
==================================
Prem Ramaswami
Product Manager
DataCommons.org
Make Data Sing
***@***.***
Phone: +18579981598
==================================
|
BTW, one of the odd things about the "About" page is that there is a
sentence that says, "I am bullish about...". But there is no indication of
who "I" is. Should there be some kind of byline about Guha there?
|
We should change this to third person and remove the first person.
On Tue, Jun 18, 2024 at 5:42 PM Kara Moscoe ***@***.***> wrote:
BTW, one of the odd things about the "About" page is that there is a
sentence that says, "I am bullish about...". But there is no indication of
who "I" is. Should there be some kind of byline about Guha there?
|
You mean second person plural, i.e. "we"?
On Tue, Jun 18, 2024 at 3:40 PM Prem Ramaswami ***@***.***> wrote:
We should change this to third person and remove the first person.
>>>> +
>>>> +Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
>>>> +
>>>> +The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
>>>> +
>>>> +<!--To learn more about the data model and key concepts, see [Key concepts](). -->
>>>> +
>>>> +## An open-source project and website platform
>>>> +
>>>> +Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
>>>> +
>>>> +Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
>>>> +
>>>> +In addition, Data Commons makes its data and visualizations accessible to any website through [REST](/api/rest/v2/index.html) and [Javascript](/api/web_components/index.html) APIs.
>>>>
>>>> Javascript --> Web Components
>>>> ------------------------------
>>>>
>>>> In index.md
>>>> <#421 (comment)>
>>>> :
>>>>
>>>> > +
>>>> +Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
>>>> +
>>>> +The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
>>>> +
>>>> +<!--To learn more about the data model and key concepts, see [Key concepts](). -->
>>>> +
>>>> +## An open-source project and website platform
>>>> +
>>>> +Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
>>>> +
>>>> +Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
>>>> +
>>>> +In addition, Data Commons makes its data and visualizations accessible to any website through [REST](/api/rest/v2/index.html) and [Javascript](/api/web_components/index.html) APIs.
>>>> +
>>>> +Finally, Data Commons provides an open-source, [customizable website](/custom_dc/index.html) implementation, for organizations that want to host their own version of a Data Commons website, using their own data and user interfaces.
>>>>
>>>> is "customizable website" the best way to describe custom dc? could we
>>>> call it [customizable implementation]
>>>> ------------------------------
>>>>
>>>> In index.md
>>>> <#421 (comment)>
>>>> :
>>>>
>>>> > +
>>>> +Data Commons needs to be able to stitch together data from disparate data sets in different formats and encodings, in a wide range of domains, from time series about demographics and employment, to hurricanes, to protein structures. To do so, it models the world as a [knowledge graph](https://blog.google/products/search/introducing-knowledge-graph-things-not/) consisting of nodes, or entities, with properties (attributes) and relationships between them forming directed edges between the nodes. The data model is based on the [Schema.org](https://www.schema.org) framework, an open framework used by over 40M websites; Data Commons makes heavy use of [Schema.org](https://www.schema.org/docs/schemas.html) constructs and extends the model as required, introducing both general constructs (such as intervals) and values for common properties.
>>>> +
>>>> +The Data Commons [Knowledge Graph browser](https://datacommons.org/browser/) allows you to peek into the structure of the graph, and the APIs allow you to directly query the parts of the graph (e.g. nodes, triples, etc.).
>>>> +
>>>> +Importantly, numeric time series data are first-class entities, with "(statistical) variable" being an entity that represents a metric definition, and "observation" being an entity that represents the value of a variable at a specific time. The [Statistical Variable Explorer](https://datacommons.org/tools/statvar) allows you to browse existing variables, and the [Visualization tools](https://datacommons.org/tools/visualization) provide aggregated views of this data over time, geography, or 2-dimensional space. The APIs also allow you to directly query observations.
>>>> +
>>>> +The knowledge graph is also mapped to relational tables that allow for [SQL querying](https://docs.datacommons.org/bigquery/) (requiring a [Google Cloud BigQuery](https://cloud.google.com/bigquery) account) and easier joining to other datasets outside of Data Commons.
>>>> +
>>>> +<!--To learn more about the data model and key concepts, see [Key concepts](). -->
>>>> +
>>>> +## An open-source project and website platform
>>>> +
>>>> +Data Commons is intended to be a community-based resource, where individuals and organizations can contribute data, code, documentation and educational materials. Source code, schemas, and documentation are publicly available at [https://github.com/datacommonsorg](https://github.com/datacommonsorg).
>>>> +
>>>> +Google has partnerships with the [United Nations](https://unstats.un.org/UNSDWebsite/undatacommons/sdgs), the [World Health Organization](https://unstats.un.org/UNSDWebsite/undatacommons/areas/1471028664), [One.org](https://datacommons.one.org/), [TechSoup](https://publicdata.techsoup.org/), and many other non-profit, academic, and governmental organizations across the world. We are always looking to expand the available data sources and welcome contributions from data owners around the world.
>>>>
>>>> how about "data coverage" instead of "available data sources"
>>>> ------------------------------
>>>>
>>>> In index.md
>>>> <#421 (comment)>
>>>> :
>>>>
>>>> > ---
>>>>
>>>> -# Why Data Commons?
>>>> +{: .no_toc}
>>>> +# What is Data Commons?
>>>>
>>>> Should we point to the About page at all, for the Why?
>>>>
>>>> —
>>>> Reply to this email directly, view it on GitHub
>>>> <#421 (review)>,
>>>> or unsubscribe
>>>> <https://github.com/notifications/unsubscribe-auth/BHMM7UGITGPZI56JUXP5233ZIB3ILAVCNFSM6AAAAABJOTB4EWVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCMRWGE3DOMZRGI>
>>>> .
>>>> You are receiving this because you authored the thread.Message ID:
>>>> <datacommonsorg/docsite/pull/421/review/2126167312 <(212)%20616-7312>@
>>>> github.com>
>>>>
--
==================================
Prem Ramaswami
Product Manager
DataCommons.org
Make Data Sing
***@***.***
Phone: +18579981598
==================================
|
Yes. That too 😂
On Tue, Jun 18, 2024, 6:56 PM Kara Moscoe ***@***.***> wrote:
You mean second person plural, i.e. "we"?
On Tue, Jun 18, 2024 at 3:40 PM Prem Ramaswami ***@***.***> wrote:
> We should change this to 3rd person and remove the first person.
>
> On Tue, Jun 18, 2024 at 5:42 PM Kara Moscoe ***@***.***> wrote:
>
>> BTW, one of the odd things about the "About" page is that there is a
>> sentence that says, "I am bullish about...". But there is no indication of
>> who "I" is. Should there be some kind of byline about Guha there?
>>
>> On Tue, Jun 18, 2024 at 1:54 PM Prem Ramaswami ***@***.***> wrote:
>>
>>> Feel free to point this at Guha once before publishing.
>>>
>>> On Tue, Jun 18, 2024 at 2:31 PM Kara Moscoe ***@***.***> wrote:
>>>
>>>> +Prem Ramaswami ***@***.***>
>>>> It's been reviewed by Prem, who told me to "just publish it". :-)
>>>> Prem, should Guha review?
|
oh my - we should drop that sentence |
thanks for the updates!
Co-authored-by: Carolyn Au <[email protected]>