Add runtime validation in setAttribute #348

Merged
Changes from 5 commits
27 commits
6914ff3
Validate attribute value data types before adding to span
jakemalachowski Dec 27, 2019
59107ad
Change order of checks to remove an else condition
jakemalachowski Dec 27, 2019
8152fdb
Create separate sequence check method, add tests, fix linting issues
jakemalachowski Dec 29, 2019
1d972e8
Fix attribute value typing
jakemalachowski Dec 29, 2019
c535e0c
Apply linting changes
jakemalachowski Dec 29, 2019
59dd5f5
Use is not None, use optional type instead of union with None
jakemalachowski Dec 31, 2019
2a5df2b
Validate attribute value data types before adding to span
jakemalachowski Dec 27, 2019
bacb09b
Change order of checks to remove an else condition
jakemalachowski Dec 27, 2019
3e67a9e
Create separate sequence check method, add tests, fix linting issues
jakemalachowski Dec 29, 2019
ff61b9e
Fix attribute value typing
jakemalachowski Dec 29, 2019
4d74316
Apply linting changes
jakemalachowski Dec 29, 2019
8a7ec1c
Use is not None, use optional type instead of union with None
jakemalachowski Dec 31, 2019
f05956b
Merge branch 'ISSUE-347/attribue-value-type-enforcement' of github.co…
jakemalachowski Jan 3, 2020
e7e976a
Update opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py
jakemalachowski Jan 3, 2020
75261d8
Clarify variable and method names, remove redundant check
jakemalachowski Jan 3, 2020
ca68796
Merge branch 'ISSUE-347/attribue-value-type-enforcement' of github.co…
jakemalachowski Jan 3, 2020
a9c554a
Merge branch 'master' of https://github.com/open-telemetry/openteleme…
jakemalachowski Jan 3, 2020
1321134
Re-add explicit return None
jakemalachowski Jan 3, 2020
269b006
Commit lint changes
jakemalachowski Jan 7, 2020
47a5887
Merge branch 'master' of https://github.com/open-telemetry/openteleme…
jakemalachowski Jan 7, 2020
669b3eb
Lint changes
jakemalachowski Jan 7, 2020
4a5812c
Lint changes
jakemalachowski Jan 7, 2020
6f1dd5d
Use number instead of int, float
jakemalachowski Jan 11, 2020
5225928
Revert AttributeValue typing change
jakemalachowski Jan 16, 2020
b83dc7b
Prevent duplicate isinstance checks, run linter
jakemalachowski Jan 17, 2020
d8e5946
Lint
jakemalachowski Jan 17, 2020
ad58890
Move length check inside validation method
jakemalachowski Jan 18, 2020
4 changes: 3 additions & 1 deletion opentelemetry-api/src/opentelemetry/util/types.py
@@ -15,5 +15,7 @@

import typing

AttributeValue = typing.Union[str, bool, float]
AttributeValue = typing.Union[
str, bool, int, float, typing.Sequence[typing.Union[str, bool, int, float]]
]
Attributes = typing.Optional[typing.Dict[str, AttributeValue]]
33 changes: 32 additions & 1 deletion opentelemetry-sdk/src/opentelemetry/sdk/trace/__init__.py
@@ -18,8 +18,9 @@
import random
import threading
from contextlib import contextmanager
from numbers import Number
from types import TracebackType
from typing import Iterator, Optional, Sequence, Tuple, Type
from typing import Iterator, Optional, Sequence, Tuple, Type, Union

from opentelemetry import trace as trace_api
from opentelemetry.context import Context
@@ -208,8 +209,38 @@ def set_attribute(self, key: str, value: types.AttributeValue) -> None:
if has_ended:
logger.warning("Setting attribute on ended span.")
return

if not isinstance(value, (int, float, bool, str, list, tuple)):
Member

Could this int, float be consolidated into Number here?

Contributor Author

If I use Number rather than int, float here, I will need to change it to:

if not isinstance(value, (bool, str, list, tuple)) and not issubclass(type(value), Number):

I think just using int, float here is more concise, but I can understand the appeal of being consistent with what is done in _check_attribute_value_sequence. What do you think?

Member

@toumorokoshi toumorokoshi Jan 7, 2020

isinstance seems to work fine:

$ python -c "from numbers import Number; print(isinstance(1, (bool, Number)))"
True
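
For illustration only (not part of the PR): a minimal sketch of the point above, that `isinstance(value, Number)` gives the same answer as `issubclass(type(value), Number)` for ordinary values. The helper name `is_number` and the sample values are hypothetical.

```python
from numbers import Number


def is_number(value):
    # Hypothetical helper: isinstance already handles subclasses and
    # ABC registration, so issubclass(type(value), Number) is not needed.
    return isinstance(value, Number)


samples = [1, 3.14, True, "text", [1, 2], None]
results = [is_number(v) for v in samples]

# Compare against the longer issubclass(type(...)) form for each sample.
same_as_issubclass = all(
    is_number(v) == issubclass(type(v), Number) for v in samples
)
```

Note that `True` counts as a number here because `bool` is a subclass of `int`.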

Member

Similarly, please consider using collections.abc.Sequence instead of list, tuple.
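
For illustration only (not part of the PR): one thing to watch when switching to `collections.abc.Sequence` is that `str` and `bytes` are sequences too, so a check would probably need to exclude them. The helper name `is_attribute_sequence` is hypothetical.

```python
from collections.abc import Sequence


def is_attribute_sequence(value):
    # Hypothetical helper: accept any Sequence except str/bytes, which
    # should be treated as scalar attribute values, not as sequences.
    return isinstance(value, Sequence) and not isinstance(value, (str, bytes))


checks = {
    "list": is_attribute_sequence([1, 2]),
    "tuple": is_attribute_sequence((1, 2)),
    "str": is_attribute_sequence("abc"),
    "set": is_attribute_sequence({1, 2}),
    "dict": is_attribute_sequence({"a": 1}),
}
```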

Member

And if using Number here, also use it in the AttributeValue type for consistency (and note that this allows Decimal and Fraction numbers too).
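
For illustration only (not part of the PR): a short sketch confirming which standard-library numeric types an `isinstance(value, Number)` check would let through. The variable names are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction
from numbers import Number

# Each of these is accepted by a Number check, not just int and float.
accepted = {
    "int": isinstance(1, Number),
    "float": isinstance(1.5, Number),
    "Decimal": isinstance(Decimal("1.5"), Number),
    "Fraction": isinstance(Fraction(1, 3), Number),
    "complex": isinstance(1j, Number),
}
```

So a `Number`-based check is broader than `int, float`, which exporters would have to be prepared for.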

Contributor Author

@toumorokoshi I'm not sure why my tests were failing after originally making this change. Just tried this again and it works.

logger.warning("invalid type for attribute value")
return
if isinstance(value, (list, tuple)) and len(value) > 0:
return_code = self._check_sequence(value)
if return_code:
logger.warning("%s in attribute value sequence", return_code)
return

self.attributes[key] = value
Member

There's a potential edge case where AttributeValues that are lists can be mutated afterward, resulting in invalid types that exporters will run into.

I've added a followup ticket on that here: #352

Contributor Author

That's a great catch. Would adding a copy of the list rather than the original list resolve this?

Contributor

@ocelotl ocelotl Jan 3, 2020

We could store the copies of the sequence values in tuples. Now that you mention this, instead of accepting lists or tuples we should accept sequences.

Member

yeah, I think that's a good solution. Would be good to make a separate PR for that.
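
For illustration only (not part of the PR): a minimal sketch of the mutation edge case discussed in this thread, and of how storing a tuple copy would avoid it. A plain `dict` stands in for the span's attributes here; that is an assumption made to keep the example self-contained.

```python
# Plain dict used as a stand-in for the span's attribute storage.
attributes = {}

values = [1, 2, 3]
attributes["by-reference"] = values      # stores the caller's list itself
attributes["as-tuple"] = tuple(values)   # stores an immutable copy

# The caller mutates the list after validation would have passed.
values.append("not a number")

by_reference = attributes["by-reference"]
as_tuple = attributes["as-tuple"]
```

The entry stored by reference now contains a string alongside the numbers, while the tuple copy still holds the values that were originally validated.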


@staticmethod
def _check_sequence(sequence: (list, tuple)) -> Union[str, None]:
"""
Checks if sequence items are valid and identical
"""
first_element_type = type(sequence[0])

if issubclass(type(sequence[0]), Number):
first_element_type = Number

for element in sequence:

if not isinstance(element, (bool, str, Number)):
Member

Should this tuple be moved into a constant in the module? This set should be shared with the tuple on line 213 (line 213 could be modified to be that constant plus list, tuple):

VALID_ATTRIBUTE_TYPES + (list, tuple)

Contributor Author

I agree this should be a constant. But do you think it might be misleading to have the constant VALID_ATTRIBUTE_TYPES without list, tuple included, since they are actually valid?

Maybe we could define VALID_ATTRIBUTE_SEQUENCE_TYPES and VALID_ATTRIBUTE_NON_SEQUENCE_TYPES, then define VALID_ATTRIBUTE_TYPES as the union of the two. What do you think?
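
For illustration only (not part of the PR): a sketch of how the three constants proposed above might look. The constant names come from the comment; their exact contents are an assumption, and `is_valid_attribute_value` is a hypothetical helper.

```python
from numbers import Number

# Assumed contents; the thread does not settle on exact members.
VALID_ATTRIBUTE_NON_SEQUENCE_TYPES = (bool, str, Number)
VALID_ATTRIBUTE_SEQUENCE_TYPES = (list, tuple)
VALID_ATTRIBUTE_TYPES = (
    VALID_ATTRIBUTE_NON_SEQUENCE_TYPES + VALID_ATTRIBUTE_SEQUENCE_TYPES
)


def is_valid_attribute_value(value):
    # Hypothetical helper: top-level type check only; sequence elements
    # would still need their own per-element validation.
    return isinstance(value, VALID_ATTRIBUTE_TYPES)


top_level = [
    is_valid_attribute_value(v) for v in (1, "a", True, [1, 2], (1,), {}, None)
]
```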

return "invalid type"

if not isinstance(element, first_element_type):
return "different type"

return None

def add_event(
self,
name: str,
45 changes: 44 additions & 1 deletion opentelemetry-sdk/tests/trace/test_trace.py
@@ -368,7 +368,11 @@ def test_attributes(self):
root.set_attribute("attr-key", "attr-value1")
root.set_attribute("attr-key", "attr-value2")

self.assertEqual(len(root.attributes), 7)
root.set_attribute("empty-list", [])
root.set_attribute("list-of-bools", [True, True, False])
root.set_attribute("list-of-numerics", [123, 3.14, 0])

self.assertEqual(len(root.attributes), 10)
self.assertEqual(root.attributes["component"], "http")
self.assertEqual(root.attributes["http.method"], "GET")
self.assertEqual(
@@ -379,6 +383,13 @@ def test_attributes(self):
self.assertEqual(root.attributes["http.status_text"], "OK")
self.assertEqual(root.attributes["misc.pi"], 3.14)
self.assertEqual(root.attributes["attr-key"], "attr-value2")
self.assertEqual(root.attributes["empty-list"], [])
self.assertEqual(
root.attributes["list-of-bools"], [True, True, False]
)
self.assertEqual(
root.attributes["list-of-numerics"], [123, 3.14, 0]
)

attributes = {
"attr-key": "val",
@@ -393,6 +404,38 @@ def test_attributes(self):
self.assertEqual(root.attributes["attr-key2"], "val2")
self.assertEqual(root.attributes["attr-in-both"], "span-attr")

def test_invalid_attribute_values(self):
with self.tracer.start_as_current_span("root") as root:
root.set_attribute("non-primitive-data-type", dict())
root.set_attribute(
"list-of-mixed-data-types-numeric-first",
[123, False, "string"],
)
root.set_attribute(
"list-of-mixed-data-types-non-numeric-first",
[False, 123, "string"],
)
root.set_attribute(
"list-with-non-primitive-data-type", [dict(), 123]
)

self.assertEqual(len(root.attributes), 0)

def test_check_sequence_helper(self):
# pylint: disable=protected-access
self.assertEqual(
trace.Span._check_sequence([1, 2, 3.4, "ss", 4]), "different type"
)
self.assertEqual(
trace.Span._check_sequence([1, 2, 3.4, dict(), 4]), "invalid type"
)
self.assertEqual(
trace.Span._check_sequence(["sw", "lf", 3.4, "ss"]),
"different type",
)
self.assertIsNone(trace.Span._check_sequence([1, 2, 3.4, 5]))
self.assertIsNone(trace.Span._check_sequence(["ss", "dw", "fw"]))
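
For illustration only (not part of the PR): a standalone sketch that mirrors the logic of `_check_sequence` as shown in this diff, so its behavior can be tried outside the SDK. The free function `check_sequence` is a hypothetical copy, not the SDK's API.

```python
from numbers import Number
from typing import Optional, Sequence


def check_sequence(sequence: Sequence) -> Optional[str]:
    """Hypothetical standalone copy of the diff's sequence check."""
    first_element_type = type(sequence[0])

    # Any numeric first element means the rest only needs to be numeric,
    # so ints and floats can be mixed.
    if issubclass(first_element_type, Number):
        first_element_type = Number

    for element in sequence:
        if not isinstance(element, (bool, str, Number)):
            return "invalid type"
        if not isinstance(element, first_element_type):
            return "different type"
    return None


mixed_numbers = check_sequence([1, 2, 3.4, 5])
number_then_str = check_sequence([1, 2, 3.4, "ss", 4])
contains_dict = check_sequence([dict(), 123])
bools_and_ints = check_sequence([True, 1, 2.5])
```

One consequence of this logic: because `bool` is a subclass of `int`, a list that mixes booleans with numbers, such as `[True, 1, 2.5]`, passes the check.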

def test_sampling_attributes(self):
decision_attributes = {
"sampler-attr": "sample-val",