Feature: support endpoint name grouping by OpenAPI definitions. #7130

Merged
merged 21 commits into from
Jun 19, 2021
Changes from 15 commits
Commits
f718f39
Support endpoint name grouping by OpenAPI definitions
wankai123 Jun 18, 2021
e2f4672
Merge branch 'master' of github.com:apache/skywalking into openAPI
wankai123 Jun 18, 2021
4bd111e
polish some docs and configs
wankai123 Jun 18, 2021
83007d8
turn off enableEndpointNameGroupingByOpenapi
wankai123 Jun 18, 2021
a555601
polish docs
wankai123 Jun 18, 2021
73c887f
polish docs
wankai123 Jun 18, 2021
d41c5cc
Apply suggestions from code review
wankai123 Jun 18, 2021
5ac177e
remove swagger dependency
wankai123 Jun 18, 2021
fcae896
Merge branch 'master' into openAPI
wu-sheng Jun 18, 2021
8a4e0a3
polish docs
wankai123 Jun 18, 2021
2794f04
Merge branch 'openAPI' of github.com:wankai123/skywalking into openAPI
wankai123 Jun 18, 2021
2c3fc5f
polish doc use `` instead \ to escape characters
wankai123 Jun 18, 2021
69b37b6
Merge branch 'master' into openAPI
wu-sheng Jun 18, 2021
9912a51
set directory name as the default service name, set the file reader r…
wankai123 Jun 19, 2021
e13d088
Merge branch 'openAPI' of github.com:wankai123/skywalking into openAPI
wankai123 Jun 19, 2021
f2c22d2
Merge branch 'master' into openAPI
wu-sheng Jun 19, 2021
90d7167
set this feature enabled by default, throw exception if openAPI file …
wankai123 Jun 19, 2021
084499b
Merge branch 'openAPI' of github.com:wankai123/skywalking into openAPI
wankai123 Jun 19, 2021
026a6d5
polish doc
wankai123 Jun 19, 2021
814eb19
Update docs/en/setup/backend/endpoint-grouping-rules.md
wu-sheng Jun 19, 2021
e014c05
polish codes
wankai123 Jun 19, 2021
1 change: 1 addition & 0 deletions CHANGES.md
Expand Up @@ -37,6 +37,7 @@ Release Notes.
* Upgrade snake yaml caused by CVE-2017-18640.
* Upgrade embed tomcat caused by CVE-2020-13935.
* Upgrade commons-lang3 to avoid potential NPE in some JDK versions.
* Support endpoint name grouping by OpenAPI definitions.

#### UI

Expand Down
Expand Up @@ -19,6 +19,7 @@
package org.apache.skywalking.apm.util;

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import lombok.Getter;
Expand Down Expand Up @@ -68,6 +69,10 @@ public FormatResult format(String string) {
return new FormatResult(false, string, string);
}

public void sortRules(Comparator<? super PatternRule> comparator) {
rules.sort(comparator);
}

@Getter
@RequiredArgsConstructor
public static class FormatResult {
Expand All @@ -78,7 +83,7 @@ public static class FormatResult {

@Getter
@ToString
private static class PatternRule {
public static class PatternRule {
private final String name;
private final Pattern pattern;

Expand Down
1 change: 1 addition & 0 deletions docs/en/setup/backend/configuration-vocabulary.md
Expand Up @@ -42,6 +42,7 @@ core|default|role|Option values, `Mixed/Receiver/Aggregator`. **Receiver** mode
| - | - | maxSizeOfAnalyzeProfileSnapshot|The max number of snapshots analyzed by OAP| - | 12000 |
| - | - | syncThreads|The number of threads used to synchronously refresh the metrics data to the storage.| SW_CORE_SYNC_THREADS | 2 |
| - | - | maxSyncOperationNum|The maximum number of processes supported for each synchronous storage operation. When the number of the flush data is greater than this value, it will be assigned to multiple cores for execution.| SW_CORE_MAX_SYNC_OPERATION_NUM | 50000 |
| - | - | enableEndpointNameGroupingByOpenapi |When enabled, endpoints are grouped automatically by the given OpenAPI definitions.| SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPAENAPI | false |
|cluster|standalone| - | standalone is not suitable for one node running, no available configuration.| - | - |
| - | zookeeper|nameSpace|The namespace, represented by root path, isolates the configurations in the zookeeper.|SW_NAMESPACE| `/`, root path|
| - | - | hostPort|hosts and ports of Zookeeper Cluster|SW_CLUSTER_ZK_HOST_PORT| localhost:2181|
Expand Down
287 changes: 283 additions & 4 deletions docs/en/setup/backend/endpoint-grouping-rules.md
@@ -1,15 +1,294 @@
# Group Parameterized Endpoints
In most cases, endpoints are detected automatically through the language agents, service mesh observability solutions,
or the configuration of the meter system.

In some special cases, especially when people use REST-style URIs, the application code puts a parameter in the endpoint name,
such as an order ID in the URI, like `/prod/ORDER123` and `/prod/ORDER456`. Logically, people expect a single
endpoint name like `/prod/{order-id}`. This is what the parameterized endpoint grouping feature is designed for.

If an incoming endpoint name matches the rules, SkyWalking groups the endpoint according to those rules.

SkyWalking provides 2 ways to support endpoint grouping:
1. Endpoint name grouping by OpenAPI definitions.
2. Endpoint name grouping by custom configuration.

The 2 grouping features can work together in sequence.
## Endpoint name grouping by OpenAPI definitions
The OpenAPI definitions are documents based on the [OpenAPI Specification (OAS)](https://github.com/OAI/OpenAPI-Specification), which is used to define a standard, language-agnostic interface for HTTP APIs.
SkyWalking now supports `OAS v2.0+`. It can parse the definition documents `(yaml)` and build the grouping rules from them automatically.


### How to use
1. Add some `Specification Extensions` for the SkyWalking config to the OpenAPI definition documents; otherwise, all configs use their default values:<br />
`${METHOD}` is a reserved placeholder that represents the HTTP method, e.g. `POST/GET...`. <br />
`${PATH}` is a reserved placeholder that represents the path, e.g. `/products/{id}`.

| Extension Name | Required | Description | Default Value |
|-----|-----|-----|-----|
| x-sw-service-name | false | The service name to which these endpoints belong. | The name of the directory that contains the definition document. |
| x-sw-endpoint-name-match-rule | false | The rule used to match the endpoint.| `${METHOD}:${PATH}` |
| x-sw-endpoint-name-format | false | The endpoint name after grouping.| `${METHOD}:${PATH}` |

These extensions are under `OpenAPI Object`. For example, the document below has a full custom config:

``` yaml
openapi: 3.0.0
x-sw-service-name: serviceB
x-sw-endpoint-name-match-rule: "<${METHOD}>:${PATH}"
x-sw-endpoint-name-format: "<${METHOD}>:${PATH}"

info:
description: OpenAPI definition for SkyWalking test.
version: v2
title: Product API
...
```

We highly recommend using the default config. The custom config (`x-sw-endpoint-name-match-rule/x-sw-endpoint-name-format`) is treated as part of the match rules (regex pattern).
We provide some cases in `org.apache.skywalking.oap.server.core.config.group.openapi.EndpointGroupingRuleReader4OpenapiTest`, which you can also use to validate your custom config.

2. Put the OpenAPI definition documents into the directory `openapi-definitions`. SkyWalking reads all documents in this directory and in its subdirectories, so you can organize the documents as you like. We recommend using the service name as the subdirectory name, so that you do not need to set `x-sw-service-name`. For example:
```
├── openapi-definitions
│   ├── serviceA
│   │   ├── customerAPI-v1.yaml
│   │   └── productAPI-v1.yaml
│   └── serviceB
│   └── productAPI-v2.yaml
```
3. Turn the feature on by setting the `Core Module` configuration `${SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPAENAPI:true}`
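
The default for `x-sw-service-name` described in the steps above can be sketched as follows. This is only an illustration of the documented behavior, not SkyWalking's actual reader: the class `ServiceNameSketch` and its method are hypothetical names, and the sketch assumes that an explicit `x-sw-service-name` always wins over the directory name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch; the real OAP reader may resolve names differently. */
final class ServiceNameSketch {

    /**
     * Returns the explicit x-sw-service-name if present, otherwise the name of
     * the directory that contains the definition document.
     */
    static String resolveServiceName(Path definitionFile, String xSwServiceName) {
        if (xSwServiceName != null && !xSwServiceName.isEmpty()) {
            return xSwServiceName;
        }
        Path parent = definitionFile.getParent();
        if (parent == null || parent.getFileName() == null) {
            return ""; // No usable directory name (assumption: fall back to empty).
        }
        return parent.getFileName().toString();
    }

    public static void main(String[] args) {
        Path doc = Paths.get("openapi-definitions", "serviceA", "customerAPI-v1.yaml");
        System.out.println(resolveServiceName(doc, null)); // prints "serviceA"
    }
}
```

With the directory layout shown above, `productAPI-v2.yaml` would resolve to `serviceB` unless the document sets `x-sw-service-name` itself.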

### Rules match priority
We recommend designing the API paths as clearly as possible. If the API paths are ambiguous and an endpoint name might match multiple paths, SkyWalking selects one according to the following match priority:
1. The exact path is matched first.
E.g. `/products` or `/products/inventory`
2. The path that has fewer variables.
E.g. for `/products/{var1}/{var2}` and `/products/{var1}/abc`, the endpoint name `/products/123/abc` matches the second one.
3. If the paths have the same number of variables, the longest path is matched, with each variable counted as length `1`.
E.g. for `/products/abc/{var1}` and `/products/{var12345}/ef`, the endpoint name `/products/abc/ef` matches the first one, because `length("abc") = 3` is larger than `length("ef") = 2`.
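
The priority above can be expressed as an ordering over path templates. The sketch below is one possible reading of those three rules, not the comparator used by the OAP server: `RulePrioritySketch` and its methods are hypothetical, and an exact path is simply treated as a path with zero variables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the documented match priority. */
final class RulePrioritySketch {

    /** Counts path segments of the form {name}. */
    static int variableCount(String path) {
        int count = 0;
        for (String segment : path.split("/")) {
            if (segment.startsWith("{") && segment.endsWith("}")) {
                count++;
            }
        }
        return count;
    }

    /** Length of the path with every {variable} counted as one character. */
    static int effectiveLength(String path) {
        return path.replaceAll("\\{[^/}]+\\}", "#").length();
    }

    /** Negative result means that path a should be tried before path b. */
    static int compare(String a, String b) {
        // Rules 1 and 2: fewer variables first (an exact path has none).
        int byVariables = Integer.compare(variableCount(a), variableCount(b));
        if (byVariables != 0) {
            return byVariables;
        }
        // Rule 3: with the same number of variables, the longer path first.
        return Integer.compare(effectiveLength(b), effectiveLength(a));
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>(Arrays.asList(
            "/products/{var1}/{var2}",
            "/products/{var12345}/ef",
            "/products/abc/{var1}",
            "/products"));
        paths.sort(RulePrioritySketch::compare);
        paths.forEach(System.out::println);
    }
}
```

Sorting the rules once and then taking the first match keeps the lookup itself simple; a comparator of this shape could be handed to a method such as the `sortRules(Comparator)` added in this change, although how the server actually wires it up is not shown here.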
### Examples
If we have an OpenAPI definition doc `productAPI-v2.yaml` in directory `serviceB` like this:
```yaml

openapi: 3.0.0

info:
description: OpenAPI definition for SkyWalking test.
version: v2
title: Product API

tags:
- name: product
description: product
- name: relatedProducts
description: Related Products

paths:
/products:
get:
tags:
- product
summary: Get all products list
description: Get all products list.
operationId: getProducts
responses:
"200":
description: Success
content:
application/json:
schema:
type: array
items:
$ref: "#/components/schemas/Product"
/products/{region}/{country}:
get:
tags:
- product
summary: Get products regional
description: Get products regional with the given id.
operationId: getProductRegional
parameters:
- name: region
in: path
description: Products region
required: true
schema:
type: string
- name: country
in: path
description: Products country
required: true
schema:
type: string
responses:
"200":
description: successful operation
content:
application/json:
schema:
$ref: "#/components/schemas/Product"
"400":
description: Invalid parameters supplied
/products/{id}:
get:
tags:
- product
summary: Get product details
description: Get product details with the given id.
operationId: getProduct
parameters:
- name: id
in: path
description: Product id
required: true
schema:
type: integer
format: int64
responses:
"200":
description: successful operation
content:
application/json:
schema:
$ref: "#/components/schemas/ProductDetails"
"400":
description: Invalid product id
post:
tags:
- product
summary: Update product details
description: Update product details with the given id.
operationId: updateProduct
parameters:
- name: id
in: path
description: Product id
required: true
schema:
type: integer
format: int64
- name: name
in: query
description: Product name
required: true
schema:
type: string
responses:
"200":
description: successful operation
delete:
tags:
- product
summary: Delete product details
description: Delete product details with the given id.
operationId: deleteProduct
parameters:
- name: id
in: path
description: Product id
required: true
schema:
type: integer
format: int64
responses:
"200":
description: successful operation
/products/{id}/relatedProducts:
get:
tags:
- relatedProducts
summary: Get related products
description: Get related products with the given product id.
operationId: getRelatedProducts
parameters:
- name: id
in: path
description: Product id
required: true
schema:
type: integer
format: int64
responses:
"200":
description: successful operation
content:
application/json:
schema:
$ref: "#/components/schemas/RelatedProducts"
"400":
description: Invalid product id

components:
schemas:
Product:
type: object
description: Product id and name
properties:
id:
type: integer
format: int64
description: Product id
name:
type: string
description: Product name
required:
- id
- name
ProductDetails:
type: object
description: Product details
properties:
id:
type: integer
format: int64
description: Product id
name:
type: string
description: Product name
description:
type: string
description: Product description
required:
- id
- name
RelatedProducts:
type: object
description: Related Products
properties:
id:
type: integer
format: int32
description: Product id
relatedProducts:
type: array
description: List of related products
items:
$ref: "#/components/schemas/Product"


```

Here are some cases:

| Incoming Endpoint | Incoming Service | x-sw-service-name | x-sw-endpoint-name-match-rule | x-sw-endpoint-name-format | Matched | Grouping Result |
|-----|-----|-----|-----|-----|-----|-----|
| `GET:/products` | serviceB | default | default | default | true | `GET:/products` |
| `GET:/products/123` | serviceB | default | default | default | true | `GET:/products/{id}` |
| `GET:/products/asia/cn` | serviceB | default | default | default | true | `GET:/products/{region}/{country}` |
| `GET:/products/123/abc/efg` | serviceB | default | default | default | false | `GET:/products/123/abc/efg` |
| `<GET>:/products/123` | serviceB | default | default | default | false | `<GET>:/products/123`|
| `GET:/products/123` | serviceC | default | default | default | false | `GET:/products/123` |
| `GET:/products/123` | serviceC | serviceC | default | default | true | `GET:/products/{id}` |
| `<GET>:/products/123` | serviceB | default | `<${METHOD}>:${PATH}` | `<${METHOD}>:${PATH}` | true | `<GET>:/products/{id}` |
| `GET:/products/123` | serviceB | default | default | `${PATH}:<${METHOD}>` | true | `/products/{id}:<GET>` |
| `/products/123:<GET>` | serviceB | default | `${PATH}:<${METHOD}>` | default | true | `GET:/products/{id}` |
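
Several rows of the table can be reproduced with a small sketch of how the `${METHOD}` and `${PATH}` placeholders might be expanded into a match pattern and a grouped name. The class `OpenApiGroupingSketch` is hypothetical and is not SkyWalking's implementation. It ignores the service name check, and it quotes all literal text, whereas the documentation states that a custom config becomes part of the regex pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of placeholder expansion; not SkyWalking's actual reader. */
final class OpenApiGroupingSketch {
    static final String DEFAULT_TEMPLATE = "${METHOD}:${PATH}";

    // An OpenAPI path variable such as {id} or {region}.
    private static final Pattern PATH_VARIABLE = Pattern.compile("\\{[^/}]+\\}");

    /** Replaces the reserved placeholders with a concrete method and path. */
    static String expand(String template, String method, String path) {
        return template.replace("${METHOD}", method).replace("${PATH}", path);
    }

    /** Builds a regex in which each {variable} matches exactly one path segment. */
    static Pattern toPattern(String matchRule, String method, String path) {
        String expanded = expand(matchRule, method, path);
        StringBuilder regex = new StringBuilder();
        Matcher variable = PATH_VARIABLE.matcher(expanded);
        int last = 0;
        while (variable.find()) {
            // Assumption: literal text is quoted, so it is never read as regex syntax.
            regex.append(Pattern.quote(expanded.substring(last, variable.start())));
            regex.append("[^/]+");
            last = variable.end();
        }
        regex.append(Pattern.quote(expanded.substring(last)));
        return Pattern.compile(regex.toString());
    }

    /** Returns the grouped name if the incoming name matches, else the incoming name. */
    static String group(String incoming, String matchRule, String format,
                        String method, String path) {
        boolean matched = toPattern(matchRule, method, path).matcher(incoming).matches();
        return matched ? expand(format, method, path) : incoming;
    }

    public static void main(String[] args) {
        // prints "GET:/products/{id}"
        System.out.println(group("GET:/products/123", DEFAULT_TEMPLATE, DEFAULT_TEMPLATE,
                                 "GET", "/products/{id}"));
    }
}
```

Because a variable is only allowed to match a single path segment in this sketch, `GET:/products/123/abc/efg` stays ungrouped, which agrees with the corresponding row of the table.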


## Endpoint name grouping by custom configuration
Currently, users can set up grouping rules through the static YAML file named `endpoint-name-grouping.yml`,
or use [Dynamic Configuration](dynamic-config.md) to initialize and update the endpoint grouping rules.

### Configuration Format
Both the static local file and the dynamic configuration value share the same YAML format.

```yaml
Expand All @@ -20,4 +299,4 @@ grouping:
# Logic name when the regex expression matched.
- endpoint-name: /prod/{id}
regex: \/prod\/.+
```
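
The effect of such a rule can be sketched in a few lines. This is an illustration only: `CustomGroupingSketch` is a hypothetical class, and it assumes that rules are tried in declaration order, that the regex must match the whole endpoint name, and that an unmatched name is returned unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of regex-based endpoint name grouping. */
final class CustomGroupingSketch {
    // Keeps the rules in the order in which they were added.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    /** Registers a logic endpoint name together with the regex that selects it. */
    void addRule(String endpointName, String regex) {
        rules.put(endpointName, Pattern.compile(regex));
    }

    /** Returns the logic name of the first matching rule, or the incoming name. */
    String group(String incoming) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(incoming).matches()) {
                return rule.getKey();
            }
        }
        return incoming;
    }

    public static void main(String[] args) {
        CustomGroupingSketch grouping = new CustomGroupingSketch();
        // Same rule as in the YAML above; backslashes are doubled for Java.
        grouping.addRule("/prod/{id}", "\\/prod\\/.+");
        System.out.println(grouping.group("/prod/ORDER123")); // prints "/prod/{id}"
    }
}
```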
Expand Up @@ -108,6 +108,8 @@ core:
syncThreads: ${SW_CORE_SYNC_THREADS:2}
# The maximum number of processes supported for each synchronous storage operation. When the number of the flush data is greater than this value, it will be assigned to multiple cores for execution.
maxSyncOperationNum: ${SW_CORE_MAX_SYNC_OPERATION_NUM:50000}
# When enabled, endpoints are grouped automatically by the given OpenAPI definitions.
enableEndpointNameGroupingByOpenapi: ${SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPAENAPI:false}
storage:
selector: ${SW_STORAGE:h2}
elasticsearch:
Expand Down