Update custom DC docs for single env.list file #456
Merged
Changes from all commits (22 commits)
6af1ba7  integrate custom docs with new UI (kmoscoe)
cd0fe76  more edits (kmoscoe)
c45d61b  use website wording for intro (kmoscoe)
f5cbbca  fix numbering in table (kmoscoe)
90c57e7  Merge branch 'master' into custom_dc (kmoscoe)
57ca62f  rename and some edits (kmoscoe)
2277cb1  Merge branch 'custom_dc' of https://github.com/kmoscoe/docsite into c… (kmoscoe)
fb33722  rename manage_repo file, per Bo (kmoscoe)
0ebdd67  Merge. (kmoscoe)
e3148c4  merge (kmoscoe)
ee5f580  Merge branch 'datacommonsorg:master' into master (kmoscoe)
5993fb7  Merge branch 'master' of https://github.com/datacommonsorg/docsite (kmoscoe)
61ce06d  Merge branch 'custom_dc' (kmoscoe)
2b37137  Merge branch 'datacommonsorg:master' into master (kmoscoe)
37f3e87  formatting edits (kmoscoe)
d83db88  updates per Keyur's feedback (kmoscoe)
03b906f  Fix typos (kmoscoe)
03538af  fix nav order (kmoscoe)
d7fb58f  fix link to API key request form (kmoscoe)
c746a7e  update form link (kmoscoe)
d5e04af  update key request form and output dir env var (kmoscoe)
0183130  Merge branch 'master' into custom_dc (kmoscoe)
@@ -21,6 +21,7 @@ Once you have tested locally, you need to get your data into Google Cloud so you

 You will upload your CSV and JSON files to [Google Cloud Storage](https://cloud.google.com/storage), and the custom Data Commons importer will transform, store, and query the data in a [Google Cloud SQL](https://cloud.google.com/sql) database.

 ## Prerequisites

 - A [GCP](https://console.cloud.google.com/welcome) billing account and project.
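For context on the upload step this hunk describes: once the bucket and folder exist, the CSV/JSON files can be copied up with `gsutil`. A minimal sketch, assuming hypothetical bucket and folder names and the `custom_dc/sample` data mentioned later in the doc:

```shell
# Copy the sample CSV/JSON data into the Cloud Storage folder.
# gs://my-bucket/my-folder is a hypothetical placeholder; use the
# gs://BUCKET_NAME/FOLDER_PATH recorded when the folder was created.
gsutil cp custom_dc/sample/*.csv custom_dc/sample/*.json gs://my-bucket/my-folder/
```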
@@ -57,22 +58,22 @@ While you are testing, you can start with a single Google Cloud region; to be cl
 1. For the **Location type**, choose the same regional options as for Cloud SQL above.
 1. When you have finished setting all the configuration options, click **Create**.
 1. In the **Bucket Details** page, click **Create Folder** to create a new folder to hold your data.
-1. Name the folder as desired. Record the folder path as <code>gs://<var>BUCKET_NAME</var>/<var>FOLDER_PATH</var></code> for setting environment variables below. You can start with the sample data provided under `custom_dc/sample` and update to your own data later.
+1. Name the folder as desired. Record the folder path as <code>gs://<var>BUCKET_NAME</var>/<var>FOLDER_PATH</var></code> for setting the `OUTPUT_DIR` environment variable below.
[Review comment] Should we mention the deprecated
[Reply] No, it's mentioned in the file.
-### Set up environment variables
+### Set environment variables

-1. Using your favorite editor, open `custom_dc/cloudsql_env.list`.
-1. Enter the relevant values for `DC_API_KEY` and `MAPS_API_KEY`.
+1. Using your favorite editor, open `custom_dc/env.list`.
+1. Set `USE_SQLITE=false` and `USE_CLOUDSQL=true`.
 1. Set values for all of the following:

-   - `GCS_DATA_PATH`
    - `CLOUDSQL_INSTANCE`
    - `GOOGLE_CLOUD_PROJECT`
    - `DB_NAME`
    - `DB_USER`
    - `DB_PASS`
+   - `OUTPUT_DIR`

-   See comments in the [`cloudsql_env.list`](https://github.com/datacommonsorg/website/blob/master/custom_dc/cloudsql_env.list) file for the correct format for each option.
+   See comments in the [`env.list`](https://github.com/datacommonsorg/website/blob/master/custom_dc/env.list) file for the correct format for each option.

 1. Optionally, set an `ADMIN_SECRET` to use when loading the data through the `/admin` page later.
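To make the new single-file setup concrete, here is a minimal sketch of what the relevant `custom_dc/env.list` entries might look like after this change. All values are hypothetical placeholders; the comments in the real `env.list` file remain the authoritative reference for each option's format. Note that `docker --env-file` only treats whole lines starting with `#` as comments:

```shell
# custom_dc/env.list -- example values only, all hypothetical
USE_SQLITE=false
USE_CLOUDSQL=true
GOOGLE_CLOUD_PROJECT=my-project-id
# Cloud SQL connection name, in project:region:instance form
CLOUDSQL_INSTANCE=my-project-id:us-central1:my-instance
DB_NAME=datacommons
DB_USER=datacommons
DB_PASS=my-db-password
OUTPUT_DIR=gs://my-bucket/my-folder
# Optional: gates data loading through the /admin page
ADMIN_SECRET=my-admin-secret
```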
@@ -117,23 +118,23 @@ If you are prompted to install the Cloud Resource Manager API, press `y` to acce

 If you have not made changes that require a local build, and just want to run the pre-downloaded image, from your repository root, run:

-```shell
+<pre>
 docker run -it \
---env-file $PWD/custom_dc/cloudsql_env.list \
+--env-file $PWD/custom_dc/env.list \
 -p 8080:8080 \
 -e DEBUG=true \
 -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
 -v $HOME/.config/gcloud/application_default_credentials.json:/gcp/creds.json:ro \
 gcr.io/datcom-ci/datacommons-website-compose:stable
-```
+</pre>

 #### Run with a locally built repo

-If you have made local changes and have a [locally built repo](/custom_dc/manage_repo.html#build-repo), from the root of the repository, run the following:
+If you have made local changes and have a [locally built repo](/custom_dc/build_image.html#build-repo), from the root of the repository, run the following:

 <pre>
 docker run -it \
---env-file $PWD/custom_dc/cloudsql_env.list \
+--env-file $PWD/custom_dc/env.list \
 -p 8080:8080 \
 -e DEBUG=true \
 -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/creds.json \
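A side note on the credentials mount in both commands: the `-v` flag assumes Google Cloud application default credentials already exist at `$HOME/.config/gcloud/application_default_credentials.json`. If they do not, a one-time `gcloud` login creates them; a hedged sketch:

```shell
# One-time setup: create application default credentials locally so the
# container can read them via the bind mount shown above.
gcloud auth application-default login

# Then start the container and confirm the site is serving;
# the port mapping (-p 8080:8080) exposes it on localhost:8080.
curl -I http://localhost:8080
```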
@@ -149,7 +150,7 @@ Each time you upload new versions of the source CSV and JSON files, you need to

 You can load the new/updated data from Cloud Storage using the `/admin` page on the site:

-1. Optionally, in the `cloudsql_env.list` file, set the `ADMIN_SECRET` environment variable to a string that authorizes users to load data.
+1. Optionally, in the `env.list` file, set the `ADMIN_SECRET` environment variable to a string that authorizes users to load data.
 1. Start the Docker container as described above.
 1. With the services running, navigate to the `/admin` page. If a secret is required, enter it in the text field, and click **Load**.
    This runs a script inside the Docker container that converts the CSV data in Cloud Storage into SQL tables and stores them in the Cloud SQL database you created earlier. It also generates embeddings in the Google Cloud Storage folder into which you uploaded the CSV/JSON files, in a `datacommons/nl/` subfolder.
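As a quick sanity check after a load completes, the generated embeddings described above should be visible in Cloud Storage. A sketch, again with hypothetical bucket and folder names:

```shell
# List the embeddings generated by the /admin data load; they land in a
# datacommons/nl/ subfolder of the upload folder.
gsutil ls gs://my-bucket/my-folder/datacommons/nl/
```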
[Review comment] Consider adding a short note to explain this dichotomy, something along the lines of a separate utility for managing data coming soon, where input and output dirs can be different.
[Reply] Nah, let's not bother. Too much to explain now.