Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add semantic conventions for redis receiver #2145

Closed
wants to merge 4 commits into from

Conversation

manang-splunk
Copy link

Changes

Added new resource attribute redis.instance for redis receiver.

@manang-splunk manang-splunk requested review from a team November 22, 2021 05:51
@arminru arminru added area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory labels Nov 22, 2021
<!-- semconv redis -->
| Attribute | Type | Description | Examples | Required |
|---|---|---|---|---|
| `redis.instance` | string | Reported name of the redis instance | `localhost:6379` | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @manang-splunk! Could you please elaborate on how this attribute would be used?
As this is a resource attribute, it would have to be populated by the Redis instance itself, i.e., an OpenTelemetry instrumentation built into Redis server directly. If this is your intention, please clarify it accordingly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redis.instance attribute would be used as a resource attribute whose value is configurable by user. This attribute was missing from redis metrics so we added it as resource attribute.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arminru Please share your thoughts on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arminru I am referring to PR #5922 where we have added a resource attribute redis.instance in code and we are simply adding its semantic convention for the same in this PR. Please go through PR 5922 to understand the usage of redis.instance attribute.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reference @manang-splunk, now it makes more sense to me!
I wonder if the service.name and service.instance.id resource attributes defined here and suggested by @bogdandrutu on your PR would be sufficient or not, have you looked into these? They would be used to describe a "logical" name for the service and are expected to be provided by users.
If the value is supposed to describe the host on which the Redis instance is running like you indicate in your example, the host.name attribute defined here could be suitable but having the port in there might be unexpected, even if the definition says "or another name specified by the user".
@bogdandrutu WDYT?

Copy link
Author

@manang-splunk manang-splunk Nov 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @arminru ,
CC: @bogdandrutu

The current task is to add a resource attribute in the Redis Receiver similar to what is existing in the SignalFx agent Collectd/Redis monitor.

The SignalFx agent Collectd/Redis monitor has a plugin_instance attribute which expects a value in the format of {host}:{port} or a normal string. This helps to label the metrics with user-configured values. Similar functionality does not exist today in the Redis receiver and needs to be implemented as a part of the current task.

To implement this functionality, we have the below two possibilities :

  1. Option 1 - We can leverage existing resource attributes in the Redis receiver such that it can accommodate the value of both the formats {host}:{port} or a normal string based on the above requirements. We have three attributes, service.name, service.instance.id and host.name but if {host}:{port} format of value gets populated within these existing attributes, then it would seem unexpected as the data format {host}:{port} is more like an endpoint type of data rather than service-name-related, service-instance-id-related or only host-name-related data. Hence, this option is not suggested.

  2. Option 2 - We can introduce a new resource attribute called redis.instance which can accommodate both these formats {host}:{port} or a normal string. Hence, this option is suggested and this logical workflow is implemented as a part of this task.

Please let us know your inputs or queries.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.
I looked at service.instance.id again and agree that it is not really suitable as it is intended to distinguish multiple instances of the same service, so this would only be applicable for Redis cluster nodes.
Having just the form host:port for service.name is not ideal indeed as this does not correspond well with the definition of a logical service name. If we'd add redis.instance, which service.name would the Redis receiver set?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open for adding redis.instance if we don't see a good fit with the existing attributes but would like to get more opinions from the other @open-telemetry/specs-approvers. This would be the first technology/framework-specific resource attribute to be added and I wonder if we were already able to cover similar cases with the existing, generic attributes.

Copy link
Author

@manang-splunk manang-splunk Dec 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage of service.name attribute is user specific. Based on the observed data of the Redis server, there is no service-name-related data and as per our usecase host:port doesn't fit with service.name. Hence, we are not setting the service.name attribute. I have attached the sample data we observed from the Redis server (data.txt).

@arminru arminru requested a review from a team November 22, 2021 09:51
<!-- semconv redis -->
| Attribute | Type | Description | Examples | Required |
|---|---|---|---|---|
| `redis.instance` | string | Reported name of the redis instance | `localhost:6379` | No |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.
I looked at service.instance.id again and agree that it is not really suitable as it is intended to distinguish multiple instances of the same service, so this would only be applicable for Redis cluster nodes.
Having just the form host:port for service.name is not ideal indeed as this does not correspond well with the definition of a logical service name. If we'd add redis.instance, which service.name would the Redis receiver set?

specification/resource/semantic_conventions/redis.md Outdated Show resolved Hide resolved
semantic_conventions/resource/redis.yaml Outdated Show resolved Hide resolved
- id: instance
type: string
brief: >
Reported name of the redis instance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Reported name of the redis instance
Reported name of the Redis instance. This can be in the form `{host}:{port}` or any other name provided manually when setting up the instrumentation.

Not yet happy with my suggestion but this lacks clarification. I'm open for better suggestions :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added an appropriate description.

@tigrannajaryan
Copy link
Member

We have conventions for databases: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md which uses prefix db.redis.* for Redis attributes. Should this attribute be under db.redis.*?

Side note: I am not sure it is a good idea to have db.<dbname>.* namespace when we also have db.* namespace. This has a lot of potential for name clashes. These conventions may need to be revamped.

@hprajapati-splunk
Copy link

We have conventions for databases: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md which uses prefix db.redis.* for Redis attributes. Should this attribute be under db.redis.*?

Side note: I am not sure it is a good idea to have db.<dbname>.* namespace when we also have db.* namespace. This has a lot of potential for name clashes. These conventions may need to be revamped.

Hello @tigrannajaryan,
We have added redis.instance as a resource attribute. The database semantics you referred are for trace attributes, which are used for database client calls. If we add db.redis.* in the resource attribute then it will have conflicts with db.* namespace as you mentioned earlier, so we have added redis.instance (it can be an endpoint or any string value provided by the user). You can refer this conversation for more details about the field we have added(redis.instance).

Please let us know your thoughts.

@tigrannajaryan
Copy link
Member

The database semantics you referred are for trace attributes, which are used for database client calls. If we add db.redis.* in the resource attribute then it will have conflicts with db.* namespace as you mentioned earlier

I do not see any reason to limit db.* attributes to traces only. They can be also valid for metrics describing databases and I would expect for example db.system or db.name to be recorded in the Resource for metrics. I think we can have db.* namespace in the Resource, which can also contain db.redis.instance attribute.

This raises a good question though: how do we define semantic conventions for attributes that may be recorded both at the Resource level and at the Span or Metric level? I don't think Otel spec has a good answer for this and we have an open issue about that #1367

To make progress on this I suggest the following. Change this PR to use db.redis.instance. I do not think top-level redis.* is acceptable. We can accept this as experimental convention. Be prepared that the convention may be changed when #1367 is resolved.

@manang-splunk
Copy link
Author

The database semantics you referred are for trace attributes, which are used for database client calls. If we add db.redis.* in the resource attribute then it will have conflicts with db.* namespace as you mentioned earlier

I do not see any reason to limit db.* attributes to traces only. They can be also valid for metrics describing databases and I would expect for example db.system or db.name to be recorded in the Resource for metrics. I think we can have db.* namespace in the Resource, which can also contain db.redis.instance attribute.

This raises a good question though: how do we define semantic conventions for attributes that may be recorded both at the Resource level and at the Span or Metric level? I don't think Otel spec has a good answer for this and we have an open issue about that #1367

To make progress on this I suggest the following. Change this PR to use db.redis.instance. I do not think top-level redis.* is acceptable. We can accept this as experimental convention. Be prepared that the convention may be changed when #1367 is resolved.

We tried to add db.redis.instance with namespace as db.* as well as with db.redis.* at the resource level, we got the following error while running make table-generation command as the namespace already exists at the trace level:

Error parsing /source/trace/database.yaml

Semantic convention 'db' is already defined.
Error parsing /source/trace/database.yaml

Semantic convention 'db.redis' is already defined.

Is there something that we are missing from our end?

@tigrannajaryan
Copy link
Member

@manang-splunk I am not sure what the problem is, perhaps the semantic convention generator has limitations that prevent having the same namespace for resources and traces. This may require fixing the generator first. @open-telemetry/build-tools-approvers any thoughts?

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@tigrannajaryan
Copy link
Member

@manang-splunk can you create an issue describing the limitation of the semantic convention generator tool, along with a proposal of how to fix it (if you have one)?

@github-actions github-actions bot removed the Stale label Jan 13, 2022
@manang-splunk manang-splunk force-pushed the main branch 2 times, most recently from 7fd7a95 to 0ad251c Compare January 13, 2022 13:22
@manang-splunk
Copy link
Author

@manang-splunk can you create an issue describing the limitation of the semantic convention generator tool, along with a proposal of how to fix it (if you have one)?

@tigrannajaryan As we were setting group id as “db.redis” and it was already used at trace level we were getting error that “db.redis” is already there. So we used different group id as “redis” for this. Now we are able to add the “db.redis.instance” as a resource semantic convention. We have pushed the changes accordingly can you please review it?

As it is a type of database, we have created a folder “db” (semantic_conventions/resource/db) and kept conventions of Redis under that. Any other database conventions can be added under that in future.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 21, 2022
Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The namespace and the name of the new attribute look good to me.
I do not know much of Redis to tell if the attribute value is what we want to record. I would like others to chime in too.

@tigrannajaryan
Copy link
Member

@open-telemetry/specs-approvers @open-telemetry/instr-wg please review.

Copy link
Member

@joaopgrassi joaopgrassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the Redis receiver but I went through the PRs and I think this makes sense. Just some editorial suggestions and a question about the structure of the convention.

@@ -0,0 +1,14 @@
groups:
- id: redis
prefix: db.redis
Copy link
Member

@joaopgrassi joaopgrassi Jan 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly searched and I didn't find another convention where the id is at a "different level" than the prefix. I'm not sure if this is a problem though, just something that stand up for me.

I guess you will also get a conflict if you have a structure like this: id: db > id: db.redis > attributes > id: instance? Similar to this


?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was getting conflicts while using id as db and db.redis.
As we already have db and db.redis as id value in database attributes, we cannot use the same here. Therefore we have kept id as redis.

- id: instance
type: string
brief: >
Reported name of the Redis instance. This can be in the form
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Reported name of the Redis instance. This can be in the form
The reported name of the Redis instance. This can be in the form of

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed the changes accordingly.

brief: >
Reported name of the Redis instance. This can be in the form
`{host}:{port}` or any other name provided manually while configuring
the instrumentation and defaults to the `endpoint` value provided in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the sentence is a bit too long and hard to follow. A suggestion would be maybe ending the sentence in the instrumentation. and starting a new one with. The default value is the endpoint... or When not provided, the default value is.... What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed the changes accordingly.

@@ -0,0 +1,13 @@
# Redis

**Status**: [Experimental](../../document-status.md)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link is missing one more level I think (that's why the docfx check is failing)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed the changes accordingly.

@joaopgrassi
Copy link
Member

Sorry for the back and forth, but I'm not sure the current solution (due to the conflict in the namespace) is the best. As @tigrannajaryan mentioned, we might need to change the build-tool or how we invoke it.

I tried something quickly and I managed to call the table-generation without the conflict, but it's also a bit "hackish". If you change your redis.yaml to have

id: db.redis
    prefix: db.redis
    ...

Then running the make table-generation will fail with the conflict. As far as I could see, that's because the command invokes the generation tool passing the whole semconv folder which contains traces and resources. If you change it, passing the resource folder alone it works. Like this:

# Generate markdown tables from YAML definitions
.PHONY: table-generation
table-generation:
   docker run --rm -v $(PWD)/semantic_conventions/resource:/source -v $(PWD)/specification/resource:/spec \
   	otel/semconvgen:$(SEMCONVGEN_VERSION) -f /source markdown -md /spec

This will generate the table on db/redis.md correctly.

I'm not sure if this is a feasible solution though. Maybe @Oberon00 can chime in and share what he thinks about it. Maybe we will really need to change the generator.

@github-actions
Copy link

github-actions bot commented Feb 8, 2022

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Feb 8, 2022
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Feb 15, 2022
This was referenced Sep 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants