Merged master
ewoutp committed Jun 8, 2018
2 parents 9f8f41a + c68289f commit df09665
Showing 22 changed files with 326 additions and 24 deletions.
45 changes: 34 additions & 11 deletions docs/Manual/Deployment/Kubernetes/ServicesAndLoadBalancer.md
@@ -39,50 +39,53 @@ If you want to create external access services manually, follow the instructions
 ### Single server

 For a single server deployment, the operator creates a single
-`Service` named `<cluster-name>`. This service has a normal cluster IP
+`Service` named `<deployment-name>`. This service has a normal cluster IP
 address.

 ### Full cluster

 For a full cluster deployment, the operator creates two `Services`.

-- `<cluster-name>_servers` a headless `Service` intended to provide
+- `<deployment-name>-int` a headless `Service` intended to provide
 DNS names for all pods created by the operator.
 It selects all ArangoDB & ArangoSync servers in the cluster.

-- `<cluster-name>` a normal `Service` that selects only the coordinators
+- `<deployment-name>` a normal `Service` that selects only the coordinators
 of the cluster. This `Service` is configured with `ClientIP` session
 affinity. This is needed for cursor requests, since they are bound to
 a specific coordinator.

 When the coordinators are asked to provide endpoints of the cluster
 (e.g. when calling `client.SynchronizeEndpoints()` in the go driver)
 the DNS names of the individual `Pods` will be returned
-(`<pod>.<cluster-name>_servers.<namespace>.svc`)
+(`<pod>.<deployment-name>-int.<namespace>.svc`)

 ### Full cluster with DC2DC

 For a full cluster with datacenter replication deployment,
 the same `Services` are created as for a Full cluster, with the following
 additions:

-- `<cluster-name>_sync` a normal `Service` that selects only the syncmasters
+- `<deployment-name>-sync` a normal `Service` that selects only the syncmasters
 of the cluster.

 ## Load balancer

-To reach the ArangoDB servers from outside the Kubernetes cluster, you
-have to deploy additional services.
+If you want full control of the `Services` needed to access the ArangoDB deployment
+from outside your Kubernetes cluster, set `spec.externalAccess.Type` of the `ArangoDeployment` to `None`
+and create a `Service` as specified below.

-You can use `LoadBalancer` or `NodePort` services, depending on your
+Create a `Service` of type `LoadBalancer` or `NodePort`, depending on your
 Kubernetes deployment.

 This service should select:

-- `arangodb_cluster_name: <cluster-name>`
+- `arango_deployment: <deployment-name>`
 - `role: coordinator`

-For example:
+The following example yields a service of type `LoadBalancer` with a specific
+load balancer IP address.
+With this service, the ArangoDB cluster can now be reached on `https://1.2.3.4:8529`.

 ```yaml
 kind: Service
@@ -91,7 +94,27 @@ metadata:
   name: arangodb-cluster-exposed
 spec:
   selector:
-    arangodb_cluster_name: arangodb-cluster
+    arango_deployment: arangodb-cluster
+    role: coordinator
+  type: LoadBalancer
+  loadBalancerIP: 1.2.3.4
+  ports:
+    - protocol: TCP
+      port: 8529
+      targetPort: 8529
+```
+
+The following example yields a service of type `NodePort` with the ArangoDB
+cluster exposed on port 30529 of all nodes of the Kubernetes cluster.
+
+```yaml
+kind: Service
+apiVersion: v1
+metadata:
+  name: arangodb-cluster-exposed
+spec:
+  selector:
+    arango_deployment: arangodb-cluster
     role: coordinator
   type: NodePort
   ports:
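Note on the renamed services: the documentation change above switches the per-pod DNS pattern to `<pod>.<deployment-name>-int.<namespace>.svc` and the selector label to `arango_deployment`. A minimal Go sketch of both, for illustration only. The helper names `podDNSName` and `coordinatorSelector` and the sample pod name are hypothetical and do not appear in kube-arangodb.

```go
package main

import "fmt"

// podDNSName builds a per-pod DNS name following the pattern given in the
// updated documentation: <pod>.<deployment-name>-int.<namespace>.svc
// The function itself is a hypothetical helper, not an operator API.
func podDNSName(pod, deployment, namespace string) string {
	return fmt.Sprintf("%s.%s-int.%s.svc", pod, deployment, namespace)
}

// coordinatorSelector returns the two labels the documentation says a manually
// created external access Service should select on.
func coordinatorSelector(deployment string) map[string]string {
	return map[string]string{
		"arango_deployment": deployment,
		"role":              "coordinator",
	}
}

func main() {
	// "pod-1" and "default" are made-up sample values.
	fmt.Println(podDNSName("pod-1", "arangodb-cluster", "default"))
	fmt.Println(coordinatorSelector("arangodb-cluster"))
}
```

The selector map has the same shape as `spec.selector` in the YAML examples of the diff.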
8 changes: 8 additions & 0 deletions examples/production-cluster.yaml
@@ -0,0 +1,8 @@
+apiVersion: "database.arangodb.com/v1alpha"
+kind: "ArangoDeployment"
+metadata:
+  name: "production-cluster"
+spec:
+  mode: Cluster
+  image: arangodb/arangodb:3.3.10
+  environment: Production
5 changes: 5 additions & 0 deletions pkg/apis/deployment/v1alpha/environment.go
@@ -47,6 +47,11 @@ func (e Environment) Validate() error {
 	}
 }

+// IsProduction returns true when the given environment is a production environment.
+func (e Environment) IsProduction() bool {
+	return e == EnvironmentProduction
+}
+
 // NewEnvironment returns a reference to a string with given value.
 func NewEnvironment(input Environment) *Environment {
 	return &input
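Note on `IsProduction`: a small self-contained sketch of how the new method can gate a setting, in the way `pvcs.go` in this commit derives `enforceAntiAffinity`. The type is re-declared locally so the example compiles on its own. The value `"Production"` matches `examples/production-cluster.yaml`; the `EnvironmentDevelopment` constant and its value are assumptions, since the diff does not show them.

```go
package main

import "fmt"

// Environment is a local stand-in for the string-based type in environment.go.
type Environment string

const (
	// EnvironmentDevelopment and its value are assumed for illustration.
	EnvironmentDevelopment Environment = "Development"
	// EnvironmentProduction uses the value seen in examples/production-cluster.yaml.
	EnvironmentProduction Environment = "Production"
)

// IsProduction returns true when the given environment is a production environment.
func (e Environment) IsProduction() bool {
	return e == EnvironmentProduction
}

func main() {
	for _, env := range []Environment{EnvironmentDevelopment, EnvironmentProduction} {
		// Mirrors the pattern used in pvcs.go: the flag follows the environment.
		enforceAntiAffinity := env.IsProduction()
		fmt.Printf("%s: enforceAntiAffinity=%t\n", env, enforceAntiAffinity)
	}
}
```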
3 changes: 3 additions & 0 deletions pkg/deployment/reconcile/action.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"
 )

 // Action executes a single Plan item.
@@ -35,4 +36,6 @@ type Action interface {
 	// CheckProgress checks the progress of the action.
 	// Returns true if the action is completely finished, false otherwise.
 	CheckProgress(ctx context.Context) (bool, error)
+	// Timeout returns the amount of time after which this action will timeout.
+	Timeout() time.Duration
 }
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_add_member.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1alpha"
 	"github.com/rs/zerolog"
@@ -64,3 +65,8 @@ func (a *actionAddMember) CheckProgress(ctx context.Context) (bool, error) {
 	// Nothing todo
 	return true, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionAddMember) Timeout() time.Duration {
+	return addMemberTimeout
+}
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_cleanout_member.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1alpha"
 	"github.com/rs/zerolog"
@@ -114,3 +115,8 @@ func (a *actionCleanoutMember) CheckProgress(ctx context.Context) (bool, error)
 	// Cleanout completed
 	return true, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionCleanoutMember) Timeout() time.Duration {
+	return cleanoutMemberTimeout
+}
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_remove_member.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	"github.com/pkg/errors"
 	"github.com/rs/zerolog"
@@ -94,3 +95,8 @@ func (a *actionRemoveMember) CheckProgress(ctx context.Context) (bool, error) {
 	// Nothing todo
 	return true, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionRemoveMember) Timeout() time.Duration {
+	return removeMemberTimeout
+}
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_renew_tls_certificate.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1alpha"
 	"github.com/rs/zerolog"
@@ -69,3 +70,8 @@ func (a *renewTLSCertificateAction) Start(ctx context.Context) (bool, error) {
 func (a *renewTLSCertificateAction) CheckProgress(ctx context.Context) (bool, error) {
 	return true, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *renewTLSCertificateAction) Timeout() time.Duration {
+	return renewTLSCertificateTimeout
+}
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_rotate_member.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1alpha"
 	"github.com/rs/zerolog"
@@ -116,3 +117,8 @@ func (a *actionRotateMember) CheckProgress(ctx context.Context) (bool, error) {
 	}
 	return true, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionRotateMember) Timeout() time.Duration {
+	return rotateMemberTimeout
+}
5 changes: 5 additions & 0 deletions pkg/deployment/reconcile/action_shutdown_member.go
@@ -111,3 +111,8 @@ func (a *actionShutdownMember) CheckProgress(ctx context.Context) (bool, error)
 	// Member still not shutdown, retry soon
 	return false, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionShutdownMember) Timeout() time.Duration {
+	return shutdownMemberTimeout
+}
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_upgrade_member.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1alpha"
 	"github.com/rs/zerolog"
@@ -126,3 +127,8 @@ func (a *actionUpgradeMember) CheckProgress(ctx context.Context) (bool, error) {
 	}
 	return isUpgrading, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionUpgradeMember) Timeout() time.Duration {
+	return upgradeMemberTimeout
+}
6 changes: 6 additions & 0 deletions pkg/deployment/reconcile/action_wait_for_member_up.go
@@ -24,6 +24,7 @@ package reconcile

 import (
 	"context"
+	"time"

 	driver "github.com/arangodb/go-driver"
 	"github.com/arangodb/go-driver/agency"
@@ -164,3 +165,8 @@ func (a *actionWaitForMemberUp) checkProgressArangoSync(ctx context.Context) (bo
 	}
 	return true, nil
 }
+
+// Timeout returns the amount of time after which this action will timeout.
+func (a *actionWaitForMemberUp) Timeout() time.Duration {
+	return waitForMemberUpTimeout
+}
3 changes: 3 additions & 0 deletions pkg/deployment/reconcile/context.go
@@ -54,6 +54,9 @@ type Context interface {
 	GetAgencyClients(ctx context.Context, predicate func(id string) bool) ([]driver.Connection, error)
 	// GetSyncServerClient returns a cached client for a specific arangosync server.
 	GetSyncServerClient(ctx context.Context, group api.ServerGroup, id string) (client.API, error)
+	// CreateEvent creates a given event.
+	// On error, the error is logged.
+	CreateEvent(evt *v1.Event)
 	// CreateMember adds a new member to the given group.
 	// If ID is non-empty, it will be used, otherwise a new ID is created.
 	CreateMember(group api.ServerGroup, id string) error
20 changes: 18 additions & 2 deletions pkg/deployment/reconcile/plan_executor.go
@@ -25,11 +25,13 @@ package reconcile
 import (
 	"context"
 	"fmt"
+	"time"

+	"github.com/rs/zerolog"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

 	api "github.com/arangodb/kube-arangodb/pkg/apis/deployment/v1alpha"
-	"github.com/rs/zerolog"
+	"github.com/arangodb/kube-arangodb/pkg/util/k8sutil"
 )

 // ExecutePlan tries to execute the plan as far as possible.
@@ -106,7 +108,21 @@ func (d *Reconciler) ExecutePlan(ctx context.Context) (bool, error) {
 }
 log.Debug().Bool("ready", ready).Msg("Action CheckProgress completed")
 if !ready {
-	// Not ready check, come back soon
+	// Not ready yet, check timeout
+	deadline := planAction.CreationTime.Add(action.Timeout())
+	if time.Now().After(deadline) {
+		// Timeout has expired
+		log.Warn().Msg("Action not finished in time. Removing the entire plan")
+		d.context.CreateEvent(k8sutil.NewPlanTimeoutEvent(d.context.GetAPIObject(), string(planAction.Type), planAction.MemberID, planAction.Group.AsRole()))
+		// Replace plan with empty one and save it.
+		status.Plan = api.Plan{}
+		if err := d.context.UpdateStatus(status); err != nil {
+			log.Debug().Err(err).Msg("Failed to update CR status")
+			return false, maskAny(err)
+		}
+		return true, nil
+	}
+	// Timeout not yet expired, come back soon
 	return true, nil
 }
 // Continue with next action
36 changes: 36 additions & 0 deletions pkg/deployment/reconcile/timeouts.go
@@ -0,0 +1,36 @@
+//
+// DISCLAIMER
+//
+// Copyright 2018 ArangoDB GmbH, Cologne, Germany
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+//
+// Copyright holder is ArangoDB GmbH, Cologne, Germany
+//
+// Author Ewout Prangsma
+//
+
+package reconcile
+
+import "time"
+
+const (
+	addMemberTimeout           = time.Minute * 5
+	cleanoutMemberTimeout      = time.Hour * 12
+	removeMemberTimeout        = time.Minute * 15
+	renewTLSCertificateTimeout = time.Minute * 30
+	rotateMemberTimeout        = time.Minute * 30
+	shutdownMemberTimeout      = time.Minute * 30
+	upgradeMemberTimeout       = time.Hour * 6
+	waitForMemberUpTimeout     = time.Minute * 15
+)
2 changes: 1 addition & 1 deletion pkg/deployment/resources/context.go
@@ -62,7 +62,7 @@ type Context interface {
 	GetLifecycleImage() string
 	// GetNamespace returns the namespace that contains the deployment
 	GetNamespace() string
-	// createEvent creates a given event.
+	// CreateEvent creates a given event.
 	// On error, the error is logged.
 	CreateEvent(evt *v1.Event)
 	// GetOwnedPods returns a list of all pods owned by the deployment.
3 changes: 2 additions & 1 deletion pkg/deployment/resources/pvcs.go
@@ -43,6 +43,7 @@ func (r *Resources) EnsurePVCs() error {
 	owner := apiObject.AsOwner()
 	iterator := r.context.GetServerGroupIterator()
 	status := r.context.GetStatus()
+	enforceAntiAffinity := r.context.GetSpec().GetEnvironment().IsProduction()

 	if err := iterator.ForeachServerGroup(func(group api.ServerGroup, spec api.ServerGroupSpec, status *api.MemberStatusList) error {
 		for _, m := range *status {
@@ -51,7 +52,7 @@ func (r *Resources) EnsurePVCs() error {
 			role := group.AsRole()
 			resources := spec.Resources
 			finalizers := r.createPVCFinalizers(group)
-			if err := k8sutil.CreatePersistentVolumeClaim(kubecli, m.PersistentVolumeClaimName, deploymentName, ns, storageClassName, role, resources, finalizers, owner); err != nil {
+			if err := k8sutil.CreatePersistentVolumeClaim(kubecli, m.PersistentVolumeClaimName, deploymentName, ns, storageClassName, role, enforceAntiAffinity, resources, finalizers, owner); err != nil {
 				return maskAny(err)
 			}
 		}