
Commit

content transferred to hugo site
gibbscullen committed Apr 27, 2020
1 parent ff7d60c commit 21dc3af
Showing 53 changed files with 2,785 additions and 103 deletions.
8 changes: 5 additions & 3 deletions docs-beta/content/about_m3/_index.md
@@ -1,11 +1,13 @@
+++
title = "About the project"
date = 2020-04-01T19:43:46-04:00
- weight = 1
+ weight = 2
chapter = true
- pre = "<b>1. </b>"
+ pre = "<b>2. </b>"
+++

### Section 1

# Overview


177 changes: 176 additions & 1 deletion docs-beta/content/about_m3/contributing.md

Large diffs are not rendered by default.

18 changes: 17 additions & 1 deletion docs-beta/content/about_m3/glossary.md
@@ -1,6 +1,22 @@
---
- title: "Glossary"
+ title: "I. Glossary"
date: 2020-04-21T20:45:40-04:00
draft: true
---

## Glossary

- **Bootstrapping:** The process by which an M3DB node is brought up. Bootstrapping consists of determining the integrity of the data the node holds, replaying writes from the commit log, and/or streaming missing data from its peers.
- **Cardinality:** The number of unique metrics within the M3DB index. Cardinality increases with the number of unique tag/value combinations being emitted.
- **Datapoint:** A single timestamp/value pair. Timeseries are composed of multiple datapoints and a series of tag/value pairs.
- **Labels:** Pairs of descriptive words that give meaning to a metric. "Tags" and "labels" are interchangeable terms.
- **Metric:** A collection of uniquely identifiable tags.
- **M3:** A highly scalable, distributed metrics platform comprising a native distributed time series database, a highly dynamic and performant aggregation service, a query engine, and other supporting infrastructure.
- **M3Coordinator:** A service within M3 that coordinates reads and writes between upstream systems, such as Prometheus, and downstream systems, such as M3DB.
- **M3DB:** A distributed time series database, influenced by Gorilla and Cassandra, released as open source by Uber Technologies.
- **M3Query:** A distributed query engine for M3DB. Unlike M3Coordinator, M3Query provides support for reads only.
- **Namespace:** Similar to a table in other types of databases. Namespaces in M3DB have a unique name and a set of configuration options, such as data retention and block size.
- **Placement:** A map of the M3DB cluster's shard replicas to nodes. Each M3DB cluster has exactly one placement. "Placement" and "topology" are interchangeable terms.
- **Shard:** Effectively the same as a "virtual shard" in Cassandra, in that it provides an arbitrary distribution of time series data via a simple hash of the series ID.
- **Tags:** Pairs of descriptive words that give meaning to a metric. "Tags" and "labels" are interchangeable terms.
- **Timeseries:** A series of datapoints tracking a particular metric over time.
- **Topology:** A map of the M3DB cluster's shard replicas to nodes. Each M3DB cluster has exactly one placement. "Placement" and "topology" are interchangeable terms.
2 changes: 1 addition & 1 deletion docs-beta/content/about_m3/release_notes.md
@@ -1,5 +1,5 @@
---
- title: "Release_notes"
+ title: "II. Release notes"
date: 2020-04-21T20:45:33-04:00
draft: true
---
13 changes: 7 additions & 6 deletions docs-beta/content/getting_started/_index.md
@@ -1,13 +1,14 @@
+++
title = "Getting Started"
date = 2020-04-01T19:26:56-04:00
- weight = 2
+ weight = 4
chapter = true
- pre = "<b>2. </b>"
+ pre = "<b>4. </b>"
+++

- ### Section 2
+ ### Getting Started

# Getting Started


Getting started with M3 is as easy as following one of the How-To guides:
* Getting started from the M3 Binary
* Getting started in Kubernetes
* Getting started in Docker
42 changes: 41 additions & 1 deletion docs-beta/content/getting_started/docker.md
@@ -1,6 +1,46 @@
---
- title: "Docker"
+ title: "In Docker"
date: 2020-04-21T20:47:48-04:00
draft: true
---

## Docker & Kernel Configuration

This document lists the kernel tweaks M3DB needs to run well. If you are running on Kubernetes, you may use our sysctl-setter DaemonSet, which will set these values for you. Please read the comment in that manifest to understand the implications of applying it.

### Running with Docker

When running M3DB inside Docker, it is recommended to add the `SYS_RESOURCE` capability to the container (using the `--cap-add` argument to `docker run`) so that it can raise its file limits:

```
docker run --cap-add SYS_RESOURCE quay.io/m3/m3dbnode:latest
```

If M3DB is being run as a non-root user, M3's setcap images are required:

```
docker run --cap-add SYS_RESOURCE -u 1000:1000 quay.io/m3/m3dbnode:latest-setcap
```

More information on Docker's capability settings can be found here.

### vm.max_map_count

M3DB uses a large number of mmap-ed files for performance. As a result, you might need to bump `vm.max_map_count`. We suggest setting this value to 3000000 so you don't have to come back and debug issues later.

On Linux, you can increase the limit by running the following command as root:

```
sysctl -w vm.max_map_count=3000000
```

To set this value permanently, update the `vm.max_map_count` setting in `/etc/sysctl.conf`.

### vm.swappiness

`vm.swappiness` controls how aggressively the virtual memory subsystem swaps to disk. By default, the kernel sets this value to 60 and will try to swap out items in memory even when there is plenty of RAM available to the system.

We recommend sizing clusters such that M3DB runs on a substrate (hosts or containers) where no swapping is necessary, i.e. the process uses only 30-50% of the maximum available memory. We therefore recommend setting `vm.swappiness` to 1, which tells the kernel to swap as little as possible without disabling swapping altogether.

On Linux, you can configure this by running the following as root:

```
sysctl -w vm.swappiness=1
```

To set this value permanently, update the `vm.swappiness` setting in `/etc/sysctl.conf`.

### rlimits

M3DB can also use a large number of files, so we suggest setting a high maximum open-file limit due to per-partition fileset volumes.

You may need to override the system- and process-level limits set by the kernel. To check the existing values, run:

```
sysctl -n fs.file-max
```

and

```
sysctl -n fs.nr_open
```

to see the kernel and process limits respectively. If either value is less than three million (our minimum recommended value), you can update them with the following commands:

```
sysctl -w fs.file-max=3000000
sysctl -w fs.nr_open=3000000
```

To set these values permanently, update the `fs.file-max` and `fs.nr_open` settings in `/etc/sysctl.conf`.
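For reference, a minimal sketch of what the permanent settings described above might look like in `/etc/sysctl.conf`, using the minimum values this guide recommends (after editing, apply the file with `sysctl -p`):

```
# /etc/sysctl.conf — kernel settings recommended for M3DB in this guide
vm.max_map_count = 3000000
vm.swappiness = 1
fs.file-max = 3000000
fs.nr_open = 3000000
```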
Alternatively, if you wish to run M3DB under systemd, you can use our service example, which will set sane defaults. Keep in mind that you'll still need to configure the kernel and process limits, because systemd will not allow a process to exceed them and will silently fall back to a default value, which could cause M3DB to crash by hitting the file descriptor limit. Also note that systemd has a `system.conf` file and a `user.conf` file that may contain limits which service-specific configuration files cannot override; be sure those files aren't configured with values lower than the value you set at the service level.

Before running the process, make sure the limits are set. If running manually, you can raise the limit for the current user with `ulimit -n 3000000`.
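As an illustration only (this is not the project's actual service file, and the unit name `m3dbnode.service` is an assumption), a systemd drop-in that raises the service's file-descriptor limit might look like:

```
# /etc/systemd/system/m3dbnode.service.d/limits.conf (hypothetical unit name)
[Service]
LimitNOFILE=3000000
```

Note that systemd cannot raise the limit above the kernel's `fs.nr_open`, so the sysctl settings above still need to be in place.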

209 changes: 208 additions & 1 deletion docs-beta/content/getting_started/kube.md
@@ -1,6 +1,213 @@
---
- title: "Kube"
+ title: "In Kubernetes"
date: 2020-04-21T20:47:43-04:00
draft: true
---

## M3DB on Kubernetes

Please note: if at all possible, use the operator to deploy to Kubernetes. It is a considerably more streamlined setup. The operator leverages custom resource definitions (CRDs) to automatically handle operations such as managing cluster topology.

The guide below provides static manifests to bootstrap a cluster on Kubernetes. Treat it as a guide to running M3 on Kubernetes only if you have significant custom requirements not satisfied by the operator.

### Prerequisites

M3DB performs better when it has access to fast disks. Every incoming write goes to a commit log, which at high write volumes can be sensitive to spikes in disk latency. Additionally, the random seeks into files when loading cold data benefit from lower random read latency.

Because of this, the included manifests reference a StorageClass named `fast`. Manifests are provided to create such a StorageClass on AWS, Azure, and GCP using the respective cloud provider's premium disk class.

If you do not already have a StorageClass named `fast`, create one using one of the provided manifests:

```
# AWS EBS (class io1)
kubectl apply -f https://raw.githubusercontent.com/m3db/m3/master/kube/storage-fast-aws.yaml

# Azure premium LRS
kubectl apply -f https://raw.githubusercontent.com/m3db/m3/master/kube/storage-fast-azure.yaml

# GCE Persistent SSD
kubectl apply -f https://raw.githubusercontent.com/m3db/m3/master/kube/storage-fast-gcp.yaml
```

If you wish to use your cloud provider's default remote disk, or another disk class entirely, you'll have to modify the manifests.

If your Kubernetes cluster spans multiple availability zones, it's important to specify a volume binding mode of `WaitForFirstConsumer` in your StorageClass to delay binding of the PersistentVolume until the Pod is created.
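As a sketch of what such a StorageClass could look like on GCP (the provisioner and disk type shown are illustrative assumptions; the manifests in the repo are authoritative):

```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
# Delay volume binding until a Pod using the PVC is scheduled, so the
# PersistentVolume is created in the same availability zone as the Pod.
volumeBindingMode: WaitForFirstConsumer
```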
### Kernel Configuration

We provide a Kubernetes DaemonSet that can make setting host-level sysctls easier. Please see the kernel docs for more.

Note that our default StatefulSet spec will give the M3DB container `CAP_SYS_RESOURCE` so it may raise its file limits. Uncomment the `securityContext` on the `m3db` container in the StatefulSet if running with a Pod Security Policy or similar enforcement mechanism that prevents adding capabilities to containers.
### Deploying

Apply the following manifest to create your cluster:

```
kubectl apply -f https://raw.githubusercontent.com/m3db/m3/master/kube/bundle.yaml
```

Applying this bundle will create the following resources:

* An `m3db` Namespace for all M3DB-related resources.
* A 3-node etcd cluster in the form of a StatefulSet backed by persistent remote SSDs. This cluster stores the DB topology and other runtime configuration data.
* A 3-node M3DB cluster in the form of a StatefulSet.
* Headless services for the etcd and m3db StatefulSets to provide stable DNS hostnames per pod.

Wait until all created pods are listed as ready:

```
$ kubectl -n m3db get po
NAME         READY     STATUS    RESTARTS   AGE
etcd-0       1/1       Running   0          22m
etcd-1       1/1       Running   0          22m
etcd-2       1/1       Running   0          22m
m3dbnode-0   1/1       Running   0          22m
m3dbnode-1   1/1       Running   0          22m
m3dbnode-2   1/1       Running   0          22m
```
You can now proceed to initialize a namespace and placement for the cluster, the same as you would for our other how-to guides:

```
# Open a local connection to the coordinator service:
$ kubectl -n m3db port-forward svc/m3coordinator 7201
Forwarding from 127.0.0.1:7201 -> 7201
Forwarding from [::1]:7201 -> 7201
```

```
# Create an initial cluster topology
curl -sSf -X POST localhost:7201/api/v1/placement/init -d '{
  "num_shards": 1024,
  "replication_factor": 3,
  "instances": [
    {
      "id": "m3dbnode-0",
      "isolation_group": "pod0",
      "zone": "embedded",
      "weight": 100,
      "endpoint": "m3dbnode-0.m3dbnode:9000",
      "hostname": "m3dbnode-0.m3dbnode",
      "port": 9000
    },
    {
      "id": "m3dbnode-1",
      "isolation_group": "pod1",
      "zone": "embedded",
      "weight": 100,
      "endpoint": "m3dbnode-1.m3dbnode:9000",
      "hostname": "m3dbnode-1.m3dbnode",
      "port": 9000
    },
    {
      "id": "m3dbnode-2",
      "isolation_group": "pod2",
      "zone": "embedded",
      "weight": 100,
      "endpoint": "m3dbnode-2.m3dbnode:9000",
      "hostname": "m3dbnode-2.m3dbnode",
      "port": 9000
    }
  ]
}'
```

```
# Create a namespace to hold your metrics
curl -X POST localhost:7201/api/v1/namespace -d '{
  "name": "default",
  "options": {
    "bootstrapEnabled": true,
    "flushEnabled": true,
    "writesToCommitLog": true,
    "cleanupEnabled": true,
    "snapshotEnabled": true,
    "repairEnabled": false,
    "retentionOptions": {
      "retentionPeriodDuration": "720h",
      "blockSizeDuration": "12h",
      "bufferFutureDuration": "1h",
      "bufferPastDuration": "1h",
      "blockDataExpiry": true,
      "blockDataExpiryAfterNotAccessPeriodDuration": "5m"
    },
    "indexOptions": {
      "enabled": true,
      "blockSizeDuration": "12h"
    }
  }
}'
```

Shortly after, you should see your nodes finish bootstrapping:

```
$ kubectl -n m3db logs -f m3dbnode-0
21:36:54.831698[I] cluster database initializing topology
21:36:54.831732[I] cluster database resolving topology
21:37:22.821740[I] resolving namespaces with namespace watch
21:37:22.821813[I] updating database namespaces [{adds [metrics]} {updates []} {removals []}]
21:37:23.008109[I] node tchannelthrift: listening on 0.0.0.0:9000
21:37:23.008384[I] cluster tchannelthrift: listening on 0.0.0.0:9001
21:37:23.217090[I] node httpjson: listening on 0.0.0.0:9002
21:37:23.217240[I] cluster httpjson: listening on 0.0.0.0:9003
21:37:23.217526[I] bootstrapping shards for range starting [{run bootstrap-data} {bootstrapper filesystem} ...
...
21:37:23.239534[I] bootstrap data fetched now initializing shards with series blocks [{namespace metrics} {numShards 256} {numSeries 0}]
21:37:23.240778[I] bootstrap finished [{namespace metrics} {duration 23.325194ms}]
21:37:23.240856[I] bootstrapped
21:37:29.733025[I] successfully updated topology to 3 hosts
```
You can now write and read metrics using the API on the DB nodes:

```
$ kubectl -n m3db port-forward svc/m3dbnode 9003
Forwarding from 127.0.0.1:9003 -> 9003
Forwarding from [::1]:9003 -> 9003
```

```
curl -sSf -X POST localhost:9003/writetagged -d '{
  "namespace": "default",
  "id": "foo",
  "tags": [
    {
      "name": "city",
      "value": "new_york"
    },
    {
      "name": "endpoint",
      "value": "/request"
    }
  ],
  "datapoint": {
    "timestamp": '"$(date "+%s")"',
    "value": 42.123456789
  }
}'
```
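The `'"$(date "+%s")"'` idiom in the request body splices the current epoch timestamp into a single-quoted JSON payload: the single quotes are closed, the command substitution is expanded inside double quotes, then single quoting resumes. A minimal standalone sketch of the same trick:

```shell
# Close the single quotes, expand $(date "+%s") inside double quotes,
# then reopen single quotes — yielding valid JSON with a numeric timestamp.
payload='{"timestamp": '"$(date "+%s")"', "value": 42.123456789}'
echo "$payload"
```

This keeps the rest of the JSON literal (no shell expansion inside single quotes) while still injecting the one dynamic value.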

```
$ curl -sSf -X POST http://localhost:9003/query -d '{
  "namespace": "default",
  "query": {
    "regexp": {
      "field": "city",
      "regexp": ".*"
    }
  },
  "rangeStart": 0,
  "rangeEnd": '"$(date "+%s")"'
}' | jq .

{
  "results": [
    {
      "id": "foo",
      "tags": [
        {
          "name": "city",
          "value": "new_york"
        },
        {
          "name": "endpoint",
          "value": "/request"
        }
      ],
      "datapoints": [
        {
          "timestamp": 1527630053,
          "value": 42.123456789
        }
      ]
    }
  ],
  "exhaustive": true
}
```

### Adding nodes

You can easily scale your M3DB cluster by scaling the StatefulSet and informing the cluster topology of the change:

```
kubectl -n m3db scale --replicas=4 statefulset/m3dbnode
```

Once the pod is ready, you can modify the cluster topology:

```
kubectl -n m3db port-forward svc/m3coordinator 7201
Forwarding from 127.0.0.1:7201 -> 7201
Forwarding from [::1]:7201 -> 7201
```

```
curl -sSf -X POST localhost:7201/api/v1/placement -d '{
  "instances": [
    {
      "id": "m3dbnode-3",
      "isolation_group": "pod3",
      "zone": "embedded",
      "weight": 100,
      "endpoint": "m3dbnode-3.m3dbnode:9000",
      "hostname": "m3dbnode-3.m3dbnode",
      "port": 9000
    }
  ]
}'
```
