Simplify status compression (#102)
Resolves #101

Status compression was quite complicated and confusing. It used 4 different
hashes and lots of code to enable compression and detection of lost messages
(during server restarts).

We simplified this significantly by doing this instead:
- Eliminate all "hash" fields.
- Allow omitting the sub-messages of the AgentToServer message. When a sub-message is omitted, it is implied that its previously reported value is still current (unchanged).
- To detect lost messages, add one auto-incremented sequence_num field to the AgentToServer message. The Server can easily detect losses by keeping just the last sequence_num (as opposed to keeping 4 different hashes).

Here is a PR that demonstrates how it works:
open-telemetry/opamp-go#93

It removes more than 100 lines of code.
tigrannajaryan authored Jun 29, 2022
1 parent 060673c commit 1f56f86
Showing 1 changed file with 49 additions and 96 deletions.
145 changes: 49 additions & 96 deletions specification.md
@@ -24,6 +24,7 @@ Note: this document requires a simplification pass to reduce the scope, size and
* [AgentToServer and ServerToAgent Messages](#agenttoserver-and-servertoagent-messages)
+ [AgentToServer Message](#agenttoserver-message)
- [AgentToServer.instance_uid](#agenttoserverinstance_uid)
- [AgentToServer.sequence_num](#agenttoserversequence_num)
- [AgentToServer.agent_description](#agenttoserveragent_description)
- [AgentToServer.capabilities](#agenttoservercapabilities)
- [AgentToServer.effective_config](#agenttoservereffective_config)
@@ -50,19 +51,15 @@ Note: this document requires a simplification pass to reduce the scope, size and
* [Status Reporting](#status-reporting)
+ [Agent Status Compression](#agent-status-compression)
+ [AgentDescription Message](#agentdescription-message)
- [AgentDescription.hash](#agentdescriptionhash)
- [AgentDescription.identifying_attributes](#agentdescriptionidentifying_attributes)
- [AgentDescription.non_identifying_attributes](#agentdescriptionnon_identifying_attributes)
+ [EffectiveConfig Message](#effectiveconfig-message)
- [EffectiveConfig.hash](#effectiveconfighash)
- [EffectiveConfig.config_map](#effectiveconfigconfig_map)
+ [RemoteConfigStatus Message](#remoteconfigstatus-message)
- [RemoteConfigStatus.hash](#remoteconfigstatushash)
- [RemoteConfigStatus.last_remote_config_hash](#remoteconfigstatuslast_remote_config_hash)
- [RemoteConfigStatus.status](#remoteconfigstatusstatus)
- [RemoteConfigStatus.error_message](#remoteconfigstatuserror_message)
+ [PackageStatuses Message](#packagestatuses-message)
- [PackageStatuses.hash](#packagestatuseshash)
- [PackageStatuses.packages](#packagestatusespackages)
- [PackageStatuses.server_provided_all_packages_hash](#packagestatusesserver_provided_all_packages_hash)
- [PackageStatuses.error_message](#packagestatuseserror_message)
@@ -375,13 +372,14 @@ document are specified in
```protobuf
message AgentToServer {
string instance_uid = 1;
AgentDescription agent_description = 2;
AgentCapabilities capabilities = 3;
EffectiveConfig effective_config = 4;
RemoteConfigStatus remote_config_status = 5;
PackageStatuses package_statuses = 6;
AgentDisconnect agent_disconnect = 7;
AgentToServerFlags flags = 8;
uint64 sequence_num = 2;
AgentDescription agent_description = 3;
AgentCapabilities capabilities = 4;
EffectiveConfig effective_config = 5;
RemoteConfigStatus remote_config_status = 6;
PackageStatuses package_statuses = 7;
AgentDisconnect agent_disconnect = 8;
AgentToServerFlags flags = 9;
}
```

@@ -400,6 +398,14 @@ lifetime of the Agent process. The recommended format for the instance_uid is
In case the Agent wants to use an identifier generated by the Server, the field
SHOULD be set with a temporary value and RequestInstanceUid flag MUST be set.

#### AgentToServer.sequence_num

The sequence number is incremented by 1 for every AgentToServer message sent
by the Agent. This allows the Server to detect that it missed a message when
it notices that the sequence_num is not exactly 1 greater than the previously
received one. See [Agent Status Compression](#agent-status-compression) for more
details.
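
The check described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the `SequenceTracker` class and its `observe` method are hypothetical names, and treating a missing previous value (first contact or a restarted Server) as a gap is an assumption of this sketch rather than something this section states. Wrap-around of the `uint64` counter is not handled.

```python
from typing import Dict, Optional


class SequenceTracker:
    """Hypothetical Server-side helper: remembers the last sequence_num per Agent."""

    def __init__(self) -> None:
        # Maps instance_uid to the last sequence_num received from that Agent.
        self._last_seen: Dict[str, int] = {}

    def observe(self, instance_uid: str, sequence_num: int) -> bool:
        """Record a received message and return True if a gap was detected."""
        previous: Optional[int] = self._last_seen.get(instance_uid)
        self._last_seen[instance_uid] = sequence_num
        if previous is None:
            # Assumption of this sketch: with no stored value the Server
            # cannot know whether it holds the Agent's full state.
            return True
        # A gap exists unless the new value is exactly previous + 1.
        return sequence_num != previous + 1
```

Only one integer per Agent is kept, which is the simplification this change is after.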

#### AgentToServer.agent_description

Data that describes the Agent, its type, where it runs, etc. See
@@ -600,23 +606,14 @@ enum Flags {
// Flags is a bit mask. Values below define individual bits.
// The Server asks the Agent to report full AgentDescription.
ReportAgentDescription = 0x00000001;
// The Server asks the Agent to report full EffectiveConfig. This bit MUST NOT be
// set if the Agent indicated it cannot report effective config by setting
// the ReportsEffectiveConfig bit to 0 in AgentToServer.capabilities field.
ReportEffectiveConfig = 0x00000002;
// The Server asks the Agent to report full RemoteConfigStatus. This bit MUST NOT be
// set if the Agent indicated it cannot accept remote config by setting
// the AcceptsRemoteConfig bit to 0 in AgentToServer.capabilities field.
ReportRemoteConfigStatus = 0x00000004;
// The Server asks the Agent to report full PackageStatuses. This bit MUST NOT be
// set if the Agent indicated it cannot report package status by setting
// the ReportsPackageStatuses bit to 0 in AgentToServer.capabilities field.
ReportPackageStatuses = 0x00000008;
// ReportFullState flag can be used by the Server if the Agent did not include
// some sub-message in the last AgentToServer message (which is an allowed
// optimization) but the Server detects that it does not have it (e.g. was
// restarted and lost state). The detection happens using
// AgentToServer.sequence_num values.
// The Server asks the Agent to report the full status again by sending
// a new, full AgentToServer message.
ReportFullState = 0x00000001;
}
```

@@ -854,20 +851,20 @@ The Agent notifies the Server about Agent's status by sending AgentToServer mess
The status for example includes the Agent description, its effective configuration,
the status of the remote configuration it received from the Server and the status
of the packages. The Server tracks the status of the Agent using the data
specified in the messages referenced from AgentToServer message.
specified in the sub-messages referenced from AgentToServer message.

The Agent MAY compress some of these messages by omitting the data that has not changed
since that particular data was reported last time. The following messages can be subject
The Agent MAY compress the AgentToServer message by omitting the sub-messages that have not changed
since that particular data was reported last time. The following sub-messages can be subject
to such compression:
[AgentDescription](#agentdescription-message),
[EffectiveConfig](#effectiveconfig-message),
[RemoteConfigStatus](#remoteconfigstatus-message) and
[PackageStatuses](#packagestatuses-message).

The compression is done by omitting all fields in the message, except
the hash field which MUST always be present (see below for how the hash field is used).
If any of the fields in the message has changed then the compression cannot be used
and all fields MUST be present.
The compression is done by omitting the sub-message in the AgentToServer message.
If any of the fields in the sub-message has changed then the compression cannot be used
for that particular sub-message and the sub-message with all its relevant fields MUST
be present.
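
As a rough sketch of the Agent side, the compression can be modelled as "include a sub-message only if it differs from what was last reported". The `StatusCompressor` class, the dictionary-based message representation and the `full` parameter are illustrative assumptions; a real Agent would work with the generated protobuf types.

```python
import copy
from typing import Any, Dict

# The four sub-messages that are subject to compression.
SUB_MESSAGES = (
    "agent_description",
    "effective_config",
    "remote_config_status",
    "package_statuses",
)


class StatusCompressor:
    """Hypothetical Agent-side helper that omits unchanged sub-messages."""

    def __init__(self, instance_uid: str) -> None:
        self.instance_uid = instance_uid
        self.sequence_num = 0
        self._last_reported: Dict[str, Any] = {}

    def next_message(self, current: Dict[str, Any], full: bool = False) -> Dict[str, Any]:
        """Build the next AgentToServer message as a plain dict."""
        self.sequence_num += 1  # incremented for every message sent
        msg: Dict[str, Any] = {
            "instance_uid": self.instance_uid,
            "sequence_num": self.sequence_num,
        }
        for name in SUB_MESSAGES:
            if name not in current:
                continue
            value = current[name]
            changed = name not in self._last_reported or self._last_reported[name] != value
            if full or changed:
                # A changed sub-message is sent whole, never partially.
                msg[name] = value
                self._last_reported[name] = copy.deepcopy(value)
        return msg
```

Passing `full=True` forces every known sub-message into the message, which is what an Agent would do when asked to report its full state.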

If all AgentToServer messages are reliably delivered to the Server and the Server
correctly processes them then such compression is safe and the Server should always
@@ -878,43 +875,30 @@ believes the Server has the latest data while in reality the Server doesn't. Thi
possible for example if the Server is restarted while the Agent keeps running and sends
AgentToServer messages, which the Server does not receive because it is temporarily down.

In order to detect this situation and recover from it, every compressible message
contains a hash field. The field is the hash of the content of every other field.
The hash is computed on full, uncompressed message (as if no compression is used) and
then unchanged fields may be omitted from the message. Note that either all fields in the
message must be present or all fields (except hash) must be omitted.
In order to detect this situation and recover from it, the AgentToServer message
contains a sequence_num field. The field is an integer number that is incremented
every time the Agent has a new AgentToServer message to send.

The Server SHOULD store the received hash value for each message type. When the Server
receives a message of the same type with a hash value that is different from the last
stored hash and with omitted data then the Server knows it does not have full status
of this particular message.
When the Server receives an AgentToServer message whose sequence_num field value is not
exactly one greater than the previously received sequence_num value then the Server
knows it does not have full status of the AgentToServer message.

When this situation is encountered, to recover the lost status the Server MUST request
the Agent to report the omitted data. To make this request the Server MUST send
a ServerToAgent message to the Agent and set the corresponding `Report*` bit in
a ServerToAgent message to the Agent and set the `ReportFullState` bit in
the [flags](#servertoagentflags) field of [ServerToAgent message](#servertoagent-message).
The flags field contains one `Report*` bit per type of compressible message.

For the details of the flags field see the [descriptions here](#servertoagentflags).
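
The recovery handshake can be sketched as follows. The function names and the dictionary-based messages are hypothetical; the bit value `0x00000001` is the one given for `ReportFullState` in the Flags enum of this document. Treating an unknown previous value as a gap is an assumption of the sketch.

```python
from typing import Any, Dict, Optional

# Bit value of ReportFullState as defined in the ServerToAgent Flags enum.
REPORT_FULL_STATE = 0x00000001


def build_server_reply(previous_seq: Optional[int], received_seq: int) -> Dict[str, Any]:
    """Server side: set ReportFullState when a message appears to be lost."""
    flags = 0
    # Assumption: no stored previous value is handled like a detected gap.
    if previous_seq is None or received_seq != previous_seq + 1:
        flags |= REPORT_FULL_STATE
    return {"flags": flags}


def agent_must_report_full_state(server_to_agent: Dict[str, Any]) -> bool:
    """Agent side: test the ReportFullState bit in the flags bit mask."""
    return bool(server_to_agent.get("flags", 0) & REPORT_FULL_STATE)
```

When `agent_must_report_full_state` returns true, the Agent would send its next AgentToServer message with every sub-message present instead of omitting the unchanged ones.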

### AgentDescription Message

The AgentDescription message has the following structure:

```protobuf
message AgentDescription {
bytes hash = 1;
repeated KeyValue identifying_attributes = 2;
repeated KeyValue non_identifying_attributes = 3;
repeated KeyValue identifying_attributes = 1;
repeated KeyValue non_identifying_attributes = 2;
}
```

#### AgentDescription.hash

The hash of the content of all other fields (even if the other fields are omitted
for compression). See [Agent Status Compression](#agent-status-compression)
for details about hash field usage.

#### AgentDescription.identifying_attributes

Attributes that identify the Agent.
@@ -940,9 +924,6 @@ telemetry. The combination of identifying attributes SHOULD be sufficient to
uniquely identify the Agent's own telemetry in the destination system to which
the Agent sends its own telemetry.

This field MUST be set if the Agent has received the ReportAgentDescription flag in the
ServerToAgent message.

#### AgentDescription.non_identifying_attributes

Attributes that do not necessarily identify the Agent but help describe where it
@@ -958,33 +939,19 @@ The following attributes SHOULD be included:
- any user-defined attributes that the end user would like to associate with
this Agent.

This field MUST be set if the Agent has received the ReportAgentDescription flag in the
ServerToAgent message.

### EffectiveConfig Message

The EffectiveConfig message has the following structure:

```protobuf
message EffectiveConfig {
bytes hash = 1;
AgentConfigMap config_map = 2;
AgentConfigMap config_map = 1;
}
```

#### EffectiveConfig.hash

The hash of the content of all other fields (even if the other fields are omitted
for compression). See [Agent Status Compression](#agent-status-compression)
for details about hash field usage.

#### EffectiveConfig.config_map

The effective config of the Agent. SHOULD be omitted if unchanged since last
reported.

MUST be set if the Agent has received the ReportEffectiveConfig flag in the
ServerToAgent message.
The effective config of the Agent.

See AgentConfigMap message definition in the [Configuration](#configuration)
section.
@@ -995,8 +962,7 @@ The RemoteConfigStatus message has the following structure:

```protobuf
message RemoteConfigStatus {
bytes hash = 1;
bytes last_remote_config_hash = 2;
bytes last_remote_config_hash = 1;
enum Status {
// The value of status field is not set.
UNSET = 0;
@@ -1011,17 +977,11 @@ message RemoteConfigStatus {
// See error_message for more details.
FAILED = 3;
}
Status status = 3;
string error_message = 4;
Status status = 2;
string error_message = 3;
}
```

#### RemoteConfigStatus.hash

The hash of the content of all other fields (even if the other fields are omitted
for compression). See [Agent Status Compression](#agent-status-compression)
for details about hash field usage.

#### RemoteConfigStatus.last_remote_config_hash

The hash of the remote config that was last received by this Agent in the
@@ -1045,19 +1005,12 @@ has or was offered. The message has the following structure:

```protobuf
message PackageStatuses {
bytes hash = 1;
map<string, PackageStatus> packages = 2;
bytes server_provided_all_packages_hash = 3;
string error_message = 4;
map<string, PackageStatus> packages = 1;
bytes server_provided_all_packages_hash = 2;
string error_message = 3;
}
```

#### PackageStatuses.hash

The hash of the content of all other fields (even if the other fields are omitted
for compression). See [Agent Status Compression](#agent-status-compression)
for details about hash field usage.

#### PackageStatuses.packages

A map of PackageStatus messages, where the keys are package names. The key MUST