Allow configuring the operator via the YAML manifest. (#326)
* Up until now, the operator read its own configuration from the
configmap. That has a number of limitations, e.g. when a
configuration value is not a scalar but a map or a list. We use
custom code based on github.com/kelseyhightower/envconfig to decode
non-scalar values out of plain-text keys, but that breaks when the data
inside the keys contains both YAML-special elements (e.g. commas) and
complex quotes; one good example is search_path inside
`team_api_role_configuration`. In addition, reliance on the configmap
forced a flat structure on the configuration, making it hard to write
and to read (see
zalando/postgres-operator#308 (comment)).

The changes allow supplying the operator configuration in a proper YAML
file. That required registering a custom CRD to support the operator
configuration and providing an example at
manifests/postgresql-operator-default-configuration.yaml. At the moment,
both the old configmap and the new CRD configuration are supported, so
there are no compatibility issues; however, in the future I'd like to
deprecate the configmap-based configuration altogether. Contrary to the
configmap-based configuration, the CRD one doesn't embed defaults into
the operator code; however, one can use
manifests/postgresql-operator-default-configuration.yaml as a starting
point to build a custom configuration.

Previously, the `ReadyWaitInterval` and `ReadyWaitTimeout` parameters
used to create the CRD were taken from the operator configuration. That
is not possible when the configuration itself is stored in a CRD object,
so I've added the ability to specify them via the environment variables
`CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` respectively.

Per review by @zerg-junior  and  @Jan-M.
alexeyklyukin authored May 7, 2020
1 parent aa59276 commit 27a95e3
Showing 14 changed files with 583 additions and 44 deletions.
20 changes: 20 additions & 0 deletions cmd/main.go
@@ -7,6 +7,7 @@ import (
	"os/signal"
	"sync"
	"syscall"
	"time"

	"github.com/zalando-incubator/postgres-operator/pkg/controller"
	"github.com/zalando-incubator/postgres-operator/pkg/spec"
@@ -20,6 +21,14 @@ var (
	config spec.ControllerConfig
)

func mustParseDuration(d string) time.Duration {
	duration, err := time.ParseDuration(d)
	if err != nil {
		panic(err)
	}
	return duration
}

func init() {
	flag.StringVar(&kubeConfigFile, "kubeconfig", "", "Path to kubeconfig file with authorization and master location information.")
	flag.BoolVar(&outOfCluster, "outofcluster", false, "Whether the operator runs in- or outside of the Kubernetes cluster.")
@@ -38,6 +47,17 @@ func init() {
		log.Printf("Fully qualified configmap name: %v", config.ConfigMapName)

	}
	if crd_interval := os.Getenv("CRD_READY_WAIT_INTERVAL"); crd_interval != "" {
		config.CRDReadyWaitInterval = mustParseDuration(crd_interval)
	} else {
		config.CRDReadyWaitInterval = 4 * time.Second
	}

	if crd_timeout := os.Getenv("CRD_READY_WAIT_TIMEOUT"); crd_timeout != "" {
		config.CRDReadyWaitTimeout = mustParseDuration(crd_timeout)
	} else {
		config.CRDReadyWaitTimeout = 30 * time.Second
	}
}

func main() {
8 changes: 8 additions & 0 deletions docs/reference/command_line_and_environment.md
@@ -48,3 +48,11 @@ The following environment variables are accepted by the operator:
* **SCALYR_API_KEY**
the value of the Scalyr API key to supply to the pods. Overrides the
`scalyr_api_key` operator parameter.

* **CRD_READY_WAIT_TIMEOUT**
defines the timeout for the complete postgres CRD creation. When not set,
the default is 30s.

* **CRD_READY_WAIT_INTERVAL**
defines the interval between consecutive attempts waiting for the postgres
CRD to be created. When not set, the default is 4s.
104 changes: 98 additions & 6 deletions docs/reference/operator_parameters.md
@@ -1,9 +1,54 @@

Postgres operator is configured via a ConfigMap defined by the
`CONFIG_MAP_NAME` environment variable. Variable names are underscore-separated
words.
There are two mutually-exclusive methods to set the Postgres Operator
configuration.

* ConfigMaps-based, the legacy one. The configuration is supplied in a
key-value configmap, defined by the `CONFIG_MAP_NAME` environment variable.
Non-scalar values, i.e. lists or maps, are encoded in the value strings using
the comma-based syntax for lists and comma-separated `key:value` syntax for
maps. String values containing ':' should be enclosed in quotes. The
configuration is flat; the parameter group names below are not reflected in
the configuration structure. There is an
[example](https://github.com/zalando-incubator/postgres-operator/blob/master/manifests/configmap.yaml).

* CRD-based configuration. The configuration is stored in a custom YAML
manifest, an instance of the custom resource definition (CRD) called
`postgresql-operator-configuration`. The operator registers this CRD at
startup when the `POSTGRES_OPERATOR_CONFIGURATION_OBJECT` variable is set to
a non-empty value; the variable should contain the name of the
`postgresql-operator-configuration` object in the operator's namespace. The
CRD-based configuration is a regular YAML document; non-scalar keys are
simply represented in the usual YAML way. There are no default values built
into the operator: each parameter that is not supplied in the configuration
receives an empty value. To create your own configuration, copy the [default
one](https://github.com/zalando-incubator/postgres-operator/blob/wip/operator_configuration_via_crd/manifests/postgresql-operator-default-configuration.yaml)
and change it.
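
To see why the comma-separated `key:value` encoding is fragile, consider the following Go sketch. `decodeFlatMap` is a hypothetical, deliberately naive decoder written for this illustration; it is not the envconfig-based decoder the operator uses, but it shows the kind of ambiguity that appears once a value itself contains a comma.

```go
package main

import (
	"fmt"
	"strings"
)

// decodeFlatMap is a hypothetical, deliberately naive decoder for the
// comma-separated `key:value` syntax. It is NOT the operator's decoder.
func decodeFlatMap(s string) (map[string]string, error) {
	result := make(map[string]string)
	for _, item := range strings.Split(s, ",") {
		// Split each item into a key and a value at the first colon.
		kv := strings.SplitN(strings.TrimSpace(item), ":", 2)
		if len(kv) != 2 {
			return nil, fmt.Errorf("malformed item %q", item)
		}
		result[kv[0]] = kv[1]
	}
	return result, nil
}

func main() {
	// Simple values decode as expected.
	labels, err := decodeFlatMap("application:spilo,team:acid")
	fmt.Println(labels, err)

	// A quoted value with a comma is split in the middle, so the
	// second half is no longer a valid key:value item.
	_, err = decodeFlatMap(`log_statement:all,search_path:"public, data"`)
	fmt.Println(err)
}
```

A decoder can be made quote-aware, but every such rule has to be reimplemented by hand, whereas a YAML document expresses maps and lists natively.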

CRD-based configuration is more natural and powerful than the one based on
ConfigMaps and should be used unless there is a compatibility requirement to
use an already existing configuration. Even in that case, it should be rather
straightforward to convert the configmap-based configuration into the CRD-based
one and restart the operator. The ConfigMaps-based configuration will be
deprecated and subsequently removed in future releases.

Note that for the CRD-based configuration, the configuration groups below
correspond to the non-leaf keys in the target YAML (e.g. for the Kubernetes
resources the key is `kubernetes`). The key is mentioned alongside the group
description. The ConfigMap-based configuration is flat and does not allow
non-leaf keys.
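
The mapping from groups to non-leaf keys can be sketched with nested Go structs. Everything below is an assumption made for illustration: the type names are hypothetical, only a handful of parameters are shown, and the sample document is JSON because the Go standard library has no YAML decoder (the nesting is the same in YAML).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical types for illustration only; the operator's real
// configuration structs may look different.
type kubernetesGroup struct {
	PodRoleLabel  string            `json:"pod_role_label"`
	ClusterLabels map[string]string `json:"cluster_labels"`
}

type teamsAPIGroup struct {
	TeamAPIRoleConfiguration map[string]string `json:"team_api_role_configuration"`
	ProtectedRoleNames       []string          `json:"protected_role_names"`
}

type operatorConfiguration struct {
	Workers    int             `json:"workers"`    // leaf key at the top level
	Kubernetes kubernetesGroup `json:"kubernetes"` // group (non-leaf key)
	TeamsAPI   teamsAPIGroup   `json:"teams_api"`  // group (non-leaf key)
}

// decodeConfiguration parses a nested configuration document.
func decodeConfiguration(doc []byte) (operatorConfiguration, error) {
	var cfg operatorConfiguration
	err := json.Unmarshal(doc, &cfg)
	return cfg, err
}

const sampleDocument = `{
  "workers": 4,
  "kubernetes": {
    "pod_role_label": "spilo-role",
    "cluster_labels": {"application": "spilo"}
  },
  "teams_api": {
    "team_api_role_configuration": {"log_statement": "all", "search_path": "data, public"},
    "protected_role_names": ["admin"]
  }
}`

func main() {
	cfg, err := decodeConfiguration([]byte(sampleDocument))
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Kubernetes.ClusterLabels["application"])
	fmt.Println(cfg.TeamsAPI.TeamAPIRoleConfiguration["search_path"])
}
```

Map and list values, including a `search_path` that contains a comma, need no custom escaping in a structured document.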

In the CRD-based case the operator has to create the CRD before it can read
its configuration. CRD creation is otherwise controlled by the
`ready_wait_interval` and `ready_wait_timeout` parameters, so in this mode
those parameters have no effect and are replaced by the
`CRD_READY_WAIT_INTERVAL` and `CRD_READY_WAIT_TIMEOUT` environment variables.
The parameters will be deprecated and removed in the future.

Variable names are underscore-separated words.

## General

Those are top-level keys, containing both leaf keys and groups.

* **etcd_host**
Etcd connection string for Patroni defined as `host:port`. Not required when
Patroni native Kubernetes support is used. The default is empty (use
@@ -38,6 +83,10 @@ words.
period between consecutive sync requests. The default is `5m`.

## Postgres users

Parameters describing Postgres users. In a CRD-based configuration, they are
grouped under the `users` key.

* **super_username**
postgres `superuser` name to be created by `initdb`. The default is
`postgres`.
@@ -47,6 +96,11 @@ words.
`standby`.

## Kubernetes resources

Parameters to configure cluster-related Kubernetes objects created by the
operator, as well as some timeouts associated with them. In a CRD-based
configuration they are grouped under the `kubernetes` key.

* **pod_service_account_name**
service account used by Patroni running on individual Pods to communicate
with the operator. Required even if native Kubernetes support in Patroni is
@@ -127,6 +181,11 @@ words.
operator. The default is empty.

## Kubernetes resource requests

This group allows you to configure resource requests for the Postgres pods.
Those parameters are grouped under the `postgres_pod_resources` key in a
CRD-based configuration.

* **default_cpu_request**
CPU request value for the postgres containers, unless overridden by
cluster-specific settings. The default is `100m`.
@@ -144,6 +203,13 @@ words.
settings. The default is `1Gi`.

## Operator timeouts

This set of parameters defines various timeouts related to some operator
actions, affecting pod operations and CRD creation. In the CRD-based
configuration the parameters are grouped under the `timeouts` key, and
`ready_wait_interval` and `ready_wait_timeout` have no effect.

* **resource_check_interval**
interval to wait between consecutive attempts to check for the presence of
some Kubernetes resource (i.e. `StatefulSet` or `PodDisruptionBudget`). The
@@ -171,6 +237,10 @@ words.
the timeout for the complete postgres CRD creation. The default is `30s`.

## Load balancer related options

Those options affect the behavior of load balancers created by the operator.
In the CRD-based configuration they are grouped under the `load_balancer` key.

* **db_hosted_zone**
DNS zone for the cluster DNS name when the load balancer is configured for
the cluster. Only used when combined with
@@ -202,6 +272,12 @@ words.
No other placeholders are allowed.

## AWS or GCP interaction

The options in this group configure operator interactions with non-Kubernetes
objects from AWS or Google Cloud. They have no effect unless you are using
either. In the CRD-based configuration those options are grouped under the
`aws_or_gcp` key.

* **wal_s3_bucket**
S3 bucket to use for shipping WAL segments with WAL-E. A bucket has to be
present and accessible by Patroni managed pods. At the moment, supported
@@ -218,9 +294,12 @@ words.
[kube2iam](https://github.com/jtblin/kube2iam) project on AWS. The default is empty.

* **aws_region**
AWS region used to store EBS volumes.
AWS region used to store EBS volumes. The default is `eu-central-1`.

## Debugging the operator

Options to aid debugging of the operator itself. Grouped under the `debug` key.

* **debug_logging**
boolean parameter that toggles verbose debug logs from the operator. The
default is `true`.
@@ -230,7 +309,12 @@ words.
access to the postgres database, i.e. creating databases and users. The default
is `true`.

### Automatic creation of human users in the database
## Automatic creation of human users in the database

Options to automate creation of human users with the aid of the teams API
service. In the CRD-based configuration those are grouped under the `teams_api`
key.

* **enable_teams_api**
boolean parameter that toggles usage of the Teams API by the operator.
The default is `true`.
@@ -276,6 +360,9 @@ words.
infrastructure role. The default is `admin`.

## Logging and REST API

Parameters affecting logging and REST API listener. In the CRD-based configuration they are grouped under the `logging_rest_api` key.

* **api_port**
REST API listener listens to this port. The default is `8080`.

@@ -286,6 +373,11 @@ words.
number of entries in the cluster history ring buffer. The default is `1000`.

## Scalyr options

Those parameters define the resource requests/limits and properties of the
Scalyr sidecar. In the CRD-based configuration they are grouped under the
`scalyr` key.

* **scalyr_api_key**
API key for the Scalyr sidecar. The default is empty.

81 changes: 81 additions & 0 deletions manifests/postgresql-operator-default-configuration.yaml
@@ -0,0 +1,81 @@
apiVersion: "acid.zalan.do/v1"
kind: postgresql-operator-configuration
metadata:
  name: postgresql-operator-default-configuration
configuration:
  etcd_host: ""
  docker_image: registry.opensource.zalan.do/acid/spilo-cdp-10:1.4-p8
  workers: 4
  min_instances: -1
  max_instances: -1
  resync_period: 5m
  #sidecar_docker_images:
  #  example: "exampleimage:exampletag"
  users:
    super_username: postgres
    replication_username: standby
  kubernetes:
    pod_service_account_name: operator
    pod_terminate_grace_period: 5m
    pdb_name_format: "postgres-{cluster}-pdb"
    secret_name_template: "{username}.{cluster}.credentials.{tprkind}.{tprgroup}"
    oauth_token_secret_name: postgresql-operator
    pod_role_label: spilo-role
    cluster_labels:
      application: spilo
    cluster_name_label: cluster-name
    # watched_namespace: ""
    # node_readiness_label: ""
    # toleration: {}
    # infrastructure_roles_secret_name: ""
    # pod_environment_configmap: ""
  postgres_pod_resources:
    default_cpu_request: 100m
    default_memory_request: 100Mi
    default_cpu_limit: "3"
    default_memory_limit: 1Gi
  timeouts:
    resource_check_interval: 3s
    resource_check_timeout: 10m
    pod_label_wait_timeout: 10m
    pod_deletion_wait_timeout: 10m
    ready_wait_interval: 4s
    ready_wait_timeout: 30s
  load_balancer:
    enable_master_load_balancer: false
    enable_replica_load_balancer: false
    master_dns_name_format: "{cluster}.{team}.{hostedzone}"
    replica_dns_name_format: "{cluster}-repl.{team}.{hostedzone}"
  aws_or_gcp:
    # db_hosted_zone: ""
    # wal_s3_bucket: ""
    # log_s3_bucket: ""
    # kube_iam_role: ""
    aws_region: eu-central-1
  debug:
    debug_logging: true
    enable_database_access: true
  teams_api:
    enable_teams_api: false
    team_api_role_configuration:
      log_statement: all
    enable_team_superuser: false
    team_admin_role: admin
    pam_role_name: zalandos
    # pam_configuration: ""
    protected_role_names:
      - admin
    # teams_api_url: ""
  logging_rest_api:
    api_port: 8080
    ring_log_lines: 100
    cluster_history_entries: 1000
  scalyr:
    scalyr_cpu_request: 100m
    scalyr_memory_request: 50Mi
    scalyr_cpu_limit: "1"
    scalyr_memory_limit: 1Gi
    # scalyr_api_key: ""
    # scalyr_image: ""
    # scalyr_server_url: ""

2 changes: 1 addition & 1 deletion pkg/cluster/cluster.go
@@ -155,7 +155,7 @@ func (c *Cluster) setStatus(status spec.PostgresStatus) {

	_, err = c.KubeClient.CRDREST.Patch(types.MergePatchType).
		Namespace(c.Namespace).
		Resource(constants.CRDResource).
		Resource(constants.PostgresCRDResource).
		Name(c.Name).
		Body(request).
		DoRaw()
2 changes: 1 addition & 1 deletion pkg/cluster/util.go
@@ -424,7 +424,7 @@ func (c *Cluster) credentialSecretNameForCluster(username string, clusterName st
	return c.OpConfig.SecretNameTemplate.Format(
		"username", strings.Replace(username, "_", "-", -1),
		"cluster", clusterName,
		"tprkind", constants.CRDKind,
		"tprkind", constants.PostgresCRDKind,
		"tprgroup", constants.CRDGroup)
}
