-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Documented Bulk IO issues #2340
Conversation
v2.0/backup.md
Outdated
{{site.data.alerts.callout_info}}Because CockroachDB is a distributed system, you cannot meaningfully store backups "locally" on nodes. The entire backup file must be stored in a single location, so attempts to store backups locally must point to an NFS drive to be useful.{{site.data.alerts.end}} | ||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. | ||
- [<sup>1</sup>](#import-file-temp-directory-urls) Because CockroachDB is a distributed system, you cannot meaningfully store backups "locally" on nodes. The entire backup file must be stored in a single location, so attempts to store backups locally must point to an NFS drive to be useful. | ||
- [<sup>2</sup>](#import-file-temp-directory-urls) <span class="version-tag">New in v2.0</span> The file system backup location on a local node or NFS drive is relative to the path specified by the `--external-io-dir` flag set while [starting the node](start-a-node.html). If the flag is set to `disabled`, then backups to local directories and NFS drives are disabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt Please review the bullet point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct, though we could include in the last one (which defaults a subdirectory of the first configured store)
or similar if we wanted to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also mention that, if you do not want to restart your nodes to change this flag, you can create symlinks to the locations you want to use from within the extern dir to read/write to paths outside the extern dir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt Does this sound right: "The file system backup location on a local node or NFS drive is relative to the path specified by the --external-io-dir
flag set while starting the node. If the --external-io-dir
flag is set to disabled
, then backups to local directories and NFS drives are disabled. By default, the flag is set to the extern
subdirectory of the first configured store
. To avoid restarting your nodes to set the --external-io-dir
flag to the locations you want to use, create symlinks to the desired locations from within the extern
directory."
|
||
{{site.data.alerts.callout_info}}Because CockroachDB is a distributed system, you cannot meaningfully store backups "locally" on nodes. The entire backup file must be stored in a single location, so attempts to store backups locally must point to an NFS drive to be useful.{{site.data.alerts.end}} | ||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benesch Please review this paragraph.
v2.0/import.md
Outdated
@@ -172,14 +172,17 @@ URLs for the file you want to import and your temp directory must use the follow | |||
| Google Cloud [<sup>1</sup>](#notes) | `gs` | Bucket name | None | | |||
| HTTP [<sup>2</sup>](#notes) | `http` | Remote host | N/A | | |||
| NFS/Local [<sup>3</sup>](#notes) | `nodelocal` | File system location | N/A | | |||
| S3-compatible services | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `AWS_ENDPOINT` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjibson Please review this row.
@@ -109,8 +109,13 @@ The URL for your backup's locations must use the following format: | |||
| Google Cloud Storage | `gs` | Bucket name | None––currently only supports instance auth, but we can build non-instance auth at a customer's request | | |||
| HTTP | `http` | Remote host | N/A | | |||
| NFS | `nodelocal` | File system location | N/A | | |||
| S3-compatible services | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `AWS_ENDPOINT` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjibson Please review this row.
{{site.data.alerts.callout_info}}Backups stored on NFSes work only if each node in the cluster accesses the drive in the same way.{{site.data.alerts.end}} | ||
#### Considerations | ||
|
||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benesch Please review this section
#### Considerations | ||
|
||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. | ||
- Backups stored on NFSes work only if each node in the cluster accesses the drive in the same way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt Please review this section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The --external-io-dir
mitigates this limitation to some extent -- previously, since the path was relative to /
, every node had to mount the NFS share in the same place, like /mnt/nfs/backups
or whatever. Now that the backup path is relative to the node's configured external-io-dir, nodes can have the NFS mount at whatever location they want as long as they all point to the same share using their external-io-dir
flag.. e.g. one node might be started with --external-io-dir=/mnt/nfs/backups
while another might use --external-io-dir=/Volumes/backups
or whatever.
The important thing is that all nodes use --external-io-dir
to point to the same shared storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt So I can delete the second bullet (Backups stored on NFSes...), right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Amruta-Ranade, @dt, has this conversation been resolved?
v2.0/start-a-node.md
Outdated
@@ -42,6 +42,7 @@ Flag | Description | |||
`--background` | Set this to start the node in the background. This is better than appending `&` to the command because control is returned to the shell only once the node is ready to accept requests. | |||
`--cache` | The total size for caches, shared evenly if there are multiple storage devices. This can be a percentage or any bytes-based unit, for example: <br><br>`--cache=25%`<br>`--cache=1000000000 ----> 1000000000 bytes`<br>`--cache=1GB ----> 1000000000 bytes`<br>`--cache=1GiB ----> 1073741824 bytes` <br><br><strong>Note:</strong> If you enter the cache size as a percentage, you might need to escape the `%` sign, for instance, while configuring CockroachDB through systemd service files.<br><br><span class="version-tag">Changed in v1.1: </span>**Default:** `128MiB`<br><br>The default cache size is reasonable for local development clusters. For production deployments, this should be increased to 25% or higher. See [Recommended Production Settings](recommended-production-settings.html#cache-and-sql-memory-size-changed-in-v1-1) for more details. | |||
`--certs-dir` | The path to the [certificate directory](create-security-certificates.html). The directory must contain valid certificates if running in secure mode.<br><br>**Default:** `${HOME}/.cockroach-certs/` | |||
`--external-io-dir` | <span class="version-tag">New in v2.0</span> The path of the external IO directory with which the local file access paths are prefixed while performing backup and restore operations using local node directories or NFS drives. If set to `disabled`, backups and restores using local node directories and NFS drives are disabled.<br><br>**Default:** `extern` directory on the first storage device specified by the `--store` flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt Please review this section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we just call them stores? though we generally only advise configuring one per physical storage device. To be a bit pedantic, since we default to cockroach-data
, it isn't always explicitly specified by --store
.
I might go for something like:
extern
subdirectory of the first configured store (see the--store
flag).
| Backup Location | scheme | host | parameters | | ||
|-----------------|--------|------|------------| | ||
| Amazon S3 | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` | | ||
| Azure | `azure` | Container name | `AZURE_ACCOUNT_KEY`, `AZURE_ACCOUNT_NAME` | | ||
| Google Cloud Storage | `gs` | Bucket name | None––currently only supports instance auth, but we can build non-instance auth at a customer's request | | ||
| HTTP | `http` | Remote host | N/A | | ||
| NFS | `nodelocal` | File system location | N/A | | ||
| NFS [<sup>1</sup>](#considerations) | `nodelocal` | File system location [<sup>2</sup>](#considerations) | N/A | | ||
| S3-compatible services | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `AWS_ENDPOINT` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mjibson Please review this row.
v2.0/import.md
Outdated
|
||
#### Notes | ||
|
||
[<sup>1</sup>](#import-file-temp-directory-urls) Only supports instance auth. | ||
|
||
[<sup>2</sup>](#import-file-temp-directory-urls) You can easily create your own HTTP server with [Caddy or nginx](create-a-file-server.html). | ||
|
||
[<sup>3</sup>](#import-file-temp-directory-urls) If using NFS for your temp directory, each node in the cluster must have access to the NFS using the same URL. | ||
[<sup>3</sup>](#import-file-temp-directory-urls) If using NFS for your temp directory, each node in the cluster must have access to the NFS using the same URL. The file system backup location on the NFS drive is relative to the path specified by the `--external-io-dir` flag set while [starting the node](start-a-node.html). If the flag is set to `disabled`, then imports from local directories and NFS drives are disabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dt Please review this bullet point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First sentence of this is no longer quite true (due to the second).
Now it is important only that each node in your cluster configure its --external-io-dir
flag to point to its local mount point of the that NFS share (or, as an alternative, symlink from the default extern
subdirectory to the NFS-backed directory).
v2.0/import.md
Outdated
[<sup>3</sup>](#import-file-temp-directory-urls) If using NFS for your temp directory, each node in the cluster must have access to the NFS using the same URL. | ||
[<sup>3</sup>](#import-file-temp-directory-urls) If using NFS for your temp directory, each node in the cluster must have access to the NFS using the same URL. The file system backup location on the NFS drive is relative to the path specified by the `--external-io-dir` flag set while [starting the node](start-a-node.html). If the flag is set to `disabled`, then imports from local directories and NFS drives are disabled. | ||
|
||
The location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benesch Please review this paragraph
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My parts LGTM
Reviewed 4 of 4 files at r1. Comments from Reviewable |
Review status: 1 of 4 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed. v2.0/backup.md, line 135 at r1 (raw file): Previously, Amruta-Ranade (Amruta Ranade) wrote…
👍 that last sentence could be prefixed with v2.0/restore.md, line 117 at r1 (raw file): Previously, Amruta-Ranade (Amruta Ranade) wrote…
yes. Comments from Reviewable |
Review status: 1 of 4 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed. Comments from Reviewable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, @Amruta-Ranade. Looks good overall. Just a few comments/questions.
v2.0/backup.md
Outdated
{{site.data.alerts.callout_info}}Because CockroachDB is a distributed system, you cannot meaningfully store backups "locally" on nodes. The entire backup file must be stored in a single location, so attempts to store backups locally must point to an NFS drive to be useful.{{site.data.alerts.end}} | ||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. | ||
- [<sup>1</sup>](#import-file-temp-directory-urls) Because CockroachDB is a distributed system, you cannot meaningfully store backups "locally" on nodes. The entire backup file must be stored in a single location, so attempts to store backups locally must point to an NFS drive to be useful. | ||
- [<sup>2</sup>](#import-file-temp-directory-urls) <span class="version-tag">New in v2.0</span> The file system backup location on a local node or NFS drive is relative to the path specified by the `--external-io-dir` flag set while [starting the node](start-a-node.html). If the `--external-io-dir` flag is set to `disabled`, then backups to local directories and NFS drives are disabled. By default, the flag is set to the `extern` subdirectory of the first configured [`store`](#store). To avoid restarting your nodes to set the `--external-io-dir` flag to the locations you want to use, create symlinks to the desired locations from within the `extern` directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
The
<span class="version-tag">New in v2.0</span>
doesn't behave well will bulleted lists. Renders weird:So let's just remove it here. It's called out as new in the
cockroach start
docs, which is sufficient. -
(#store)
should be(start-a-node.html#store)
. -
In the last sentence, "To avoid restarting your nodes to set the
--external-io-dir
flag to the locations you want to use` is confusing me. Can you rephrase?
v2.0/import.md
Outdated
@@ -172,14 +172,17 @@ URLs for the file you want to import and your temp directory must use the follow | |||
| Google Cloud [<sup>1</sup>](#notes) | `gs` | Bucket name | None | | |||
| HTTP [<sup>2</sup>](#notes) | `http` | Remote host | N/A | | |||
| NFS/Local [<sup>3</sup>](#notes) | `nodelocal` | File system location | N/A | | |||
| S3-compatible services | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `AWS_ENDPOINT` | | |||
|
|||
#### Notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is called Considerations
in backup.md
, perhaps use that heading here as well? Also use bullets like in backup.md
?
#### Considerations | ||
|
||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. | ||
- Backups stored on NFSes work only if each node in the cluster accesses the drive in the same way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Amruta-Ranade, @dt, has this conversation been resolved?
v2.0/restore.md
Outdated
|
||
- The backup location parameters often contain special characters that need to be URI-encoded. Use Javascript's [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) function or Go language's [url.QueryEscape](https://golang.org/pkg/net/url/#QueryEscape) function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. | ||
- Backups stored on NFSes work only if each node in the cluster accesses the drive in the same way. | ||
- <span class="version-tag">New in v2.0</span> The file system backup location on a local node or NFS drive is relative to the path specified by the `--external-io-dir` flag set while [starting the node](start-a-node.html). If the flag is set to `disabled`, then backups and restores from local directories and NFS drives are disabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same feedback as for backup.md
: Let's remove <span class="version-tag">New in v2.0</span>
here.
v2.0/start-a-node.md
Outdated
@@ -42,6 +42,7 @@ Flag | Description | |||
`--background` | Set this to start the node in the background. This is better than appending `&` to the command because control is returned to the shell only once the node is ready to accept requests. | |||
`--cache` | The total size for caches, shared evenly if there are multiple storage devices. This can be a percentage or any bytes-based unit, for example: <br><br>`--cache=25%`<br>`--cache=1000000000 ----> 1000000000 bytes`<br>`--cache=1GB ----> 1000000000 bytes`<br>`--cache=1GiB ----> 1073741824 bytes` <br><br><strong>Note:</strong> If you enter the cache size as a percentage, you might need to escape the `%` sign, for instance, while configuring CockroachDB through systemd service files.<br><br><span class="version-tag">Changed in v1.1: </span>**Default:** `128MiB`<br><br>The default cache size is reasonable for local development clusters. For production deployments, this should be increased to 25% or higher. See [Recommended Production Settings](recommended-production-settings.html#cache-and-sql-memory-size-changed-in-v1-1) for more details. | |||
`--certs-dir` | The path to the [certificate directory](create-security-certificates.html). The directory must contain valid certificates if running in secure mode.<br><br>**Default:** `${HOME}/.cockroach-certs/` | |||
`--external-io-dir` | <span class="version-tag">New in v2.0</span> The path of the external IO directory with which the local file access paths are prefixed while performing backup and restore operations using local node directories or NFS drives. If set to `disabled`, backups and restores using local node directories and NFS drives are disabled.<br><br>**Default:** `extern` subdirectory of the first configured `store` (see [`store`](#store) below).<br><br>To avoid restarting your nodes to set the `--external-io-dir` flag to the locations you want to use, create symlinks to the desired locations from within the `extern` directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Add colon to "new" callout, i.e.,
<span class="version-tag">New in v2.0:</span>
. -
Let's remove "see
store
below" and just make the previous instance ofstore
a link, i.e.,the first configured
store
. -
As I mentioned in my review of
backup.md
, I don't understand what this is saying:To avoid restarting your nodes to set the
--external-io-dir
flag to the locations you want to use, create symlinks to the desired locations from within theextern
directory.Can you rephrase, or explain it to me in other terms so I can figure out what I'm not getting?
LGTM |
…to bulk_io_small_issues
1 similar comment
TC is failing on a link to www.bottlecaps.de/rr/ui, which is an attribution at the bottom of our full sql grammar page. Merging in the expectation that the site comes back eventually. |
Closes #2318 #2238 #2254 #2253 #2255
Because the PR has multiple reviewers, I have pointed out the sections to be reviewed for each reviewer. Hope that helps!