Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update xDS doc #528

Merged
merged 2 commits into from
May 9, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
113 changes: 105 additions & 8 deletions docs/src/xds.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,9 @@ In addition to static configuration provided upon startup, a Quiklin proxy's con

Communication between the proxy and management server uses the [xDS gRPC protocol][xDS], similar to an [envoy proxy]. xDS is one of the standard configuration mechanisms for software proxies and as a result, Quilkin can be setup to discover configuration resources from any API compatible server. Also, given that the protocol is [well specified][xDS-protocol], it is similarly straight-forward to implement a custom server to suit any deployment's needs.

> The [go-control-plane] project provides production ready implementations of the API on top of which custom servers can be built relatively easily.

As described within the [xDS-api] documentation, the xDS API comprises a set of resource discovery APIs, each serving a specific set of configuration resource types, while the protocol itself comes in several [variants][xds-variants].
Quilkin implements the **Aggregated Discovery Service (ADS)** _State of the World (SotW)_ variant with gRPC.

## Sample Control Plane Implementation

A sample control plane can be found [here][control-plane], with both a demo project that uses a configuration file
as its data source, as well as an opinionated integration with [Agones] game server orchestration framework.

## Supported APIs

Since the range of resources configurable by the xDS API extends that of Quilkin's domain (i.e being UDP based, Quilkin does not have a need for HTTP/TCP resources), only a subset of the API is supported. The following lists these relevant parts and any limitation to the provided support as a result:
Expand All @@ -36,6 +29,70 @@ Since the range of resources configurable by the xDS API extends that of Quilkin
* Only the list of [filters][xds-filters] specified in the [filter chain][xds-filter-chain] is used by the proxy - i.e other fields like `filter_chain_match` are ignored. This list also specifies the order that the corresponding filter chain will be constructed.
* gRPC proto configuration for Quilkin's built-in filters [can be found here][filter-protos]. They are equivalent to the filter's static configuration.

## Available Providers
The server can be run by a quilkin commmand name _manage_.

### Agones

1. Cluster information is retrieved from [Agones] - the server watches for `Allocated`
[Agones GameServers] and exposes their IP address and Port as [upstream endpoints][upstream-endpoint] to
any connected Quilkin proxies.
The set of tokens for the associated endpoint can be set by adding a comma separated standard base64 encoded strings.
This must be added under an annotation `quilkin.dev/tokens` in the [GameServer][Agones GameServers]'s spec.
For example:
```yaml
annotations:
Sets two tokens for the corresponding endpoint with values 1x7ijy6 and 8gj3v2i respectively.
quilkin.dev/tokens: MXg3aWp5Ng==,OGdqM3YyaQ==
```

> Since an Agones GameServer can have multiple ports exposed, if multiple ports are in
> use, the server looks for the port named `default` and picks that as the endpoint's
> port (otherwise it picks the first port in the port list).

2. Filter chain is configurable on a per-proxy basis. By default an empty filter chain is used and from there the filter chain can configured using a configMap named `quilkin-config` on the proxy's pod.

As an example, the following runs the server with subcommnad `manage agones` against a cluster (using default kubeconfig configuration) where Quilkin pods run in the `quilkin` namespace and game-server pods run in the `gameservers` namespace:

```sh
quilkin manage --port 18000 agones --config-namespace quilkin --gameservers-namespace gameservers
```

> A proxy's pod must have a `quilkin.dev/role` key in `quilkin-config` configMap set to the value `proxy` in order for the management server to detect the pod as a proxy and push updates to it.

> Note that currently, the server can only discover resources within a single cluster.

### Filesystem

The filesystem provider watches a configuration file on disk and sends updates to proxies whenever that file changes.

It can be started with using subcommnad `manage file` as the following:
```sh
quilkin manage --port 18000 file --config-file-path config.yaml
```

After running this command, any proxy that connects to port 18000 will receive updates as configured in `config.yaml` file.
You can find the configuration file schema here in [Proxy Configuration][proxy-configuration].

Example:
```yaml
clusters:
- name: cluster-a
endpoints:
- ip: 123.0.0.1
port": 29
metadata:
'quilkin.dev':
tokens:
- "MXg3aWp5Ng=="
filters:
- name: quilkin.filters.debug.v1alpha1.Debug
config:
id: hello
```

To add an HTTP admin server, check out the [Administration][admin] page.

## Metrics

Quilkin exposes the following metrics around the management servers and its resources:
Expand All @@ -61,10 +118,45 @@ Quilkin exposes the following metrics around the management servers and its reso
The total number of [DiscoveryRequest]s made by the proxy to management servers. This tracks messages flowing in the direction from the proxy to the management server.


The following metrics are exposed by the management server.

- `quilkin_management_server_connected_proxies` (Gauge)

The number of proxies currently connected to the server.
- `quilkin_management_server_discovery_requests_total{request_type}` (Counter)

The total number of xDS Discovery requests received across all proxies.
- `request_type` = `type.googleapis.com/envoy.config.cluster.v3.Cluster` | `type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment` | `type.googleapis.com/envoy.config.listener.v3.Listener`
Type URL of the requested resource
- `quilkin_management_server_discovery_responses_total` (Counter)

The total number of xDS Discovery responses sent back across all proxies in response to Discovery Requests.
Each Discovery response sent corresponds to a configuration update for some proxy.
- `request_type` = `type.googleapis.com/envoy.config.cluster.v3.Cluster` | `type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment` | `type.googleapis.com/envoy.config.listener.v3.Listener`
Type URL of the requested resource
- `quilkin_management_server_endpoints_total` (Gauge)

The number of active endpoints discovered by the server. The number of active endpoints
correlates with the size of the cluster configuration update sent to proxies.
- `quilkin_management_server_snapshot_generation_errors_total` (Counter)

The total number of errors encountered while generating a configuration snapshot update for a proxy.
- `quilkin_management_server_snapshots_generated_total` (Counter)

The total number of configuration snapshot generated across all proxies. A snapshot corresponds
to a point in time view of a proxy's configuration. However it does not necessarily correspond
to a proxy update - a proxy only gets the latest snapshot so it might miss intermediate
snapshots if it lags behind.
- `quilkin_management_server_snapshots_cache_size` (Gauge)

The current number of snapshots in the in-memory snapshot cache. This corresponds 1-1 to
proxies that connect to the server. However the number may be slightly higher than the number
of connected proxies since snapshots for disconnected proxies are only periodically cleared
from the cache.

[xDS]: https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#xds-rest-and-grpc-protocol
[envoy proxy]: https://www.envoyproxy.io/docs/envoy/latest/
[xDS-protocol]: https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#the-xds-transport-protocol
[go-control-plane]: https://github.com/envoyproxy/go-control-plane
[xDS-api]: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration
[CDS]: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration#cds
[EDS]: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration#eds
Expand All @@ -86,3 +178,8 @@ Quilkin exposes the following metrics around the management servers and its reso
[endpoint-metadata]: ./proxy.md#endpoint-metadata
[control-plane]: https://github.com/googleforgames/quilkin/tree/main/xds
[Agones]: https://agones.dev
[Kubernetes]: https://kubernetes.io/
[Agones GameServers]: https://agones.dev/site/docs/getting-started/create-gameserver/
[upstream-endpoint]: https://googleforgames.github.io/quilkin/main/book/proxy.html#upstream-endpoint
[proxy-configuration]: https://googleforgames.github.io/quilkin/main/book/proxy-configuration.html
[admin]: https://googleforgames.github.io/quilkin/main/book/admin.html
173 changes: 0 additions & 173 deletions xds/README.md

This file was deleted.