Add etcd operational guide #1354

Merged
merged 7 commits into from
Feb 7, 2019
2 changes: 1 addition & 1 deletion docs/how_to/query.md
@@ -25,7 +25,7 @@ All namespaces that you wish to query from must be configured when [setting up M

### etcd

-The configuration file linked above uses an embedded etcd cluster, which is fine for development purposes. However, if you wish to use this in production, you will want an external etcd cluster.
+The configuration file linked above uses an embedded etcd cluster, which is fine for development purposes. However, if you wish to use this in production, you will want an [external etcd](../operational_guide/etcd.md) cluster.

<!-- TODO: link to etcd operational guide -->

83 changes: 83 additions & 0 deletions docs/operational_guide/etcd.md
@@ -0,0 +1,83 @@
# etcd

The M3 stack uses `etcd` as a distributed key-value store to:

1. Update cluster configuration in real time
2. Manage placements for our distributed / sharded tiers like M3DB and M3Aggregator
3. Perform leader-election in M3Aggregator

and much more!

## Overview

`M3DB` ships with support for running embedded `etcd` (called `seed nodes`), and while this is convenient for testing and development, we don't recommend running with this setup in production.

Both `M3` and `etcd` are complex distributed systems, and trying to operate both within the same binary is challenging and dangerous for production workloads.

Instead, we recommend running an external `etcd` cluster that is isolated from the `M3` stack so that operations like node adds, removes, and replaces are easier to perform.

While M3 relies on `etcd` to provide strong consistency, the operations we use it for are all low-throughput, so you should be able to operate a very low-maintenance `etcd` cluster. [A 3-node setup for high availability](https://github.com/etcd-io/etcd/blob/v3.3.11/Documentation/faq.md#what-is-failure-tolerance) should be more than sufficient for most workloads.
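
The failure-tolerance arithmetic behind the 3-node recommendation can be sketched as follows. This is only an illustration of the majority-quorum rule described in the linked etcd FAQ; the function names are hypothetical and are not part of M3 or etcd.

```python
def quorum(cluster_size: int) -> int:
    """Smallest majority of a cluster with `cluster_size` members."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def failure_tolerance(cluster_size: int) -> int:
    """Number of members that can fail while a majority is still available."""
    return cluster_size - quorum(cluster_size)


# Print size, quorum, and tolerated failures for a few cluster sizes.
for size in (1, 3, 4, 5):
    print(size, quorum(size), failure_tolerance(size))
```

A 3-node cluster needs 2 members for quorum and so tolerates one failure. Adding a fourth node raises the quorum to 3 without raising the tolerance, which is why odd cluster sizes are the usual choice.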

## Configuring an External etcd Cluster

### M3DB

Most of our documentation demonstrates how to run `M3DB` with embedded etcd nodes. Once you're ready to switch to an external `etcd` cluster, all you need to do is modify the `M3DB` config to remove the `seedNodes` field entirely and then change the `endpoints` under `etcdClusters` to point to your external `etcd` nodes instead of the `M3DB` seed nodes.

For example, this portion of the config

```yaml
config:
    service:
        env: default_env
        zone: embedded
        service: m3db
        cacheDir: /var/lib/m3kv
        etcdClusters:
            - zone: embedded
              endpoints:
                  - http://m3db_seed1:2379
                  - http://m3db_seed2:2379
                  - http://m3db_seed3:2379
    seedNodes:
        initialCluster:
            - hostID: m3db_seed1
              endpoint: http://m3db_seed1:2380
            - hostID: m3db_seed2
              endpoint: http://m3db_seed2:2380
            - hostID: m3db_seed3
              endpoint: http://m3db_seed3:2380
```

would become

```yaml
config:
    service:
        env: default_env
        zone: embedded
        service: m3db
        cacheDir: /var/lib/m3kv
        etcdClusters:
            - zone: embedded
              endpoints:
                  - http://external_etcd1:2379
                  - http://external_etcd2:2379
                  - http://external_etcd3:2379
```
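
The before/after edit shown above could also be applied programmatically. The sketch below works on the config as a plain Python dict; reading and writing the YAML file is left out. It assumes `seedNodes` sits next to `service` under `config`, and `switch_to_external_etcd` is a hypothetical helper, not something M3 ships.

```python
import copy


def switch_to_external_etcd(m3db_config, external_endpoints):
    """Return a copy of an M3DB config dict that targets external etcd nodes."""
    updated = copy.deepcopy(m3db_config)
    inner = updated["config"]
    # Assumption: seedNodes is a sibling of service under config.
    inner.pop("seedNodes", None)
    # The example has a single etcdClusters entry; every entry is repointed here.
    for cluster in inner["service"]["etcdClusters"]:
        cluster["endpoints"] = list(external_endpoints)
    return updated


embedded = {
    "config": {
        "service": {
            "env": "default_env",
            "zone": "embedded",
            "service": "m3db",
            "cacheDir": "/var/lib/m3kv",
            "etcdClusters": [
                {
                    "zone": "embedded",
                    "endpoints": [
                        "http://m3db_seed1:2379",
                        "http://m3db_seed2:2379",
                        "http://m3db_seed3:2379",
                    ],
                }
            ],
        },
        "seedNodes": {
            "initialCluster": [
                {"hostID": "m3db_seed1", "endpoint": "http://m3db_seed1:2380"},
                {"hostID": "m3db_seed2", "endpoint": "http://m3db_seed2:2380"},
                {"hostID": "m3db_seed3", "endpoint": "http://m3db_seed3:2380"},
            ]
        },
    }
}

external = switch_to_external_etcd(
    embedded,
    [
        "http://external_etcd1:2379",
        "http://external_etcd2:2379",
        "http://external_etcd3:2379",
    ],
)
print(external["config"]["service"]["etcdClusters"][0]["endpoints"])
```

The helper returns a modified copy and leaves the input untouched, so the embedded config stays available for comparison or rollback.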

**Note**: `M3DB` placements and namespaces are stored in `etcd`, so switching to an external `etcd` cluster means you'll need to recreate all your placements and namespaces there. You can do this manually or use `etcdctl`'s [Mirror Maker](https://github.com/etcd-io/etcd/blob/v3.3.11/etcdctl/doc/mirror_maker.md) functionality.
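
The Mirror Maker route mentioned in the note could look roughly like the sketch below, which only builds the `etcdctl make-mirror` command line and prints it rather than running it. It assumes the v3 `etcdctl` API (selected with `ETCDCTL_API=3` on etcd v3.3) and the host names from the example config; `make_mirror_command` is a hypothetical helper. Check the linked Mirror Maker document for the options that apply to your setup.

```python
import shlex


def make_mirror_command(source_endpoints, destination_endpoint):
    """Build an `etcdctl make-mirror` invocation as an argv list."""
    return [
        "etcdctl",
        # Source cluster: here, the embedded etcd on the M3DB seed nodes.
        "--endpoints=" + ",".join(source_endpoints),
        "make-mirror",
        # Destination cluster: one node of the external etcd cluster.
        destination_endpoint,
    ]


# Assumption: etcd v3.3 needs this variable to select the v3 API.
mirror_env = {"ETCDCTL_API": "3"}

argv = make_mirror_command(
    [
        "http://m3db_seed1:2379",
        "http://m3db_seed2:2379",
        "http://m3db_seed3:2379",
    ],
    "external_etcd1:2379",
)
print(" ".join(shlex.quote(arg) for arg in argv))
```

`make-mirror` keeps running and syncing until it is interrupted, so the sketch leaves the decision of when to start and stop it to the operator.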

### M3Coordinator

`M3Coordinator` does not run embedded `etcd`, so configuring it to use an external `etcd` cluster is simple. Just replace the `endpoints` under `etcdClusters` in the YAML config to point to your external `etcd` nodes instead of the `M3DB` seed nodes. See the `M3DB` example above for a detailed before/after comparison of the YAML config.
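
Before repointing `endpoints`, it can help to confirm that each external node answers. The sketch below assumes the nodes expose etcd's `/health` HTTP endpoint on the client port and that a healthy node replies with a JSON body whose `health` field is `true`. The function names are hypothetical, and TLS or authentication are not handled.

```python
import json
import urllib.request


def health_url(endpoint):
    """Build the health-check URL for an etcd client endpoint."""
    return endpoint.rstrip("/") + "/health"


def is_healthy_response(body):
    """Return True if a /health response body reports a healthy member."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    if not isinstance(payload, dict):
        return False
    # Accept both the string "true" and a JSON boolean.
    return str(payload.get("health")).lower() == "true"


def check_endpoint(endpoint, timeout=2.0):
    """Query one endpoint; unreachable or unhealthy nodes return False."""
    try:
        with urllib.request.urlopen(health_url(endpoint), timeout=timeout) as resp:
            return is_healthy_response(resp.read())
    except (OSError, ValueError):
        return False
```

Calling `check_endpoint("http://external_etcd1:2379")` for each configured endpoint gives a quick yes/no per node. Connection errors and non-2xx responses are treated as unhealthy rather than raised.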

## etcd Operations

### Embedded etcd

If you're running `M3DB seed nodes` with embedded `etcd` (which we do not recommend for production workloads) and need to perform a node add/replace/remove, follow our [placement configuration guide](./placement_configuration.md) and pay special attention to the instructions for `seed nodes`.

### External etcd

Just follow the instructions in the [etcd docs](https://github.com/etcd-io/etcd/tree/master/Documentation).
2 changes: 1 addition & 1 deletion docs/operational_guide/placement_configuration.md
@@ -139,7 +139,7 @@ After sending the delete command you will need to wait for the M3DB cluster to r

#### Adding / Removing Seed Nodes

-If you find yourself adding or removing etcd seed nodes then we highly recommend setting up an external etcd cluster, as
+If you find yourself adding or removing etcd seed nodes then we highly recommend setting up an [external etcd](../etcd.md) cluster, as
the overhead of operating two stateful systems at once is non-trivial. As this is not a recommended production setup,
this section is intentionally brief.

1 change: 1 addition & 0 deletions mkdocs.yml
@@ -71,6 +71,7 @@ pages:
- "Namespace Configuration": "operational_guide/namespace_configuration.md"
- "Bootstrapping": "operational_guide/bootstrapping.md"
- "Kernel Configuration": "operational_guide/kernel_configuration.md"
- "etcd": "operational_guide/etcd.md"
- "Integrations":
- "Prometheus": "integrations/prometheus.md"
- "Graphite": "integrations/graphite.md"