-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for http.client_ip
tag for Rack-based frameworks
#2248
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few changes needed but overall LGTM.
It seems like CircleCI ought to catch some linting issues but it did not run, or Rubocop is set to ignore them.
I'd rather have a second review from the Tracing
team as well :)
lib/datadog/tracing/client_ip.rb
Outdated
valid_ipv4?(ip) || valid_ipv6?(ip) | ||
end | ||
|
||
# --- Section vendored from the ipaddress gem --- # |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a rationale for not using IPAddr.new(addr).ipv4?
and IPAddr.new(addr).ipv6?
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we decide to vendor third party code, we should find a way to slip the corresponding license, at least in LICENSE-3rd_party.csv
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I avoided IPAddr
initially because it has the netmask handling which I didn't want but I guess it's solvable with some verification and it spares vendoring from ipaddress
.
I'll give IPAddr
a shot
lib/datadog/tracing/client_ip.rb
Outdated
end | ||
|
||
def self.strip_zone_specifier(ipv6) | ||
if /\A(.*?)%.*/ =~ ipv6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I had to add this as well at Sqreen.
This could be achieved with ipv6.gsub(/%.*$/, '')
instead of relying on intermediate match objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Fixed
o.default { env_to_bool(Tracing::Configuration::Ext::ClientIp::ENV_DISABLED, false) } | ||
o.lazy | ||
end | ||
# An optional name of a custom header to resolve the client IP from. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# An optional name of a custom header to resolve the client IP from. | |
# An optional name of a custom header to resolve the client IP from. |
Linter should have caught this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops.
Rubocop seems to pass but I shouldv'e considered the formatting in the other options.
Fixed
lib/datadog/tracing/client_ip.rb
Outdated
TAG_MULTIPLE_IP_HEADERS = '_dd.multiple-ip-headers'.freeze | ||
|
||
# A header collection implementation that looks up headers in a Hash. | ||
class HashHeaderCollection < HeaderCollection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this could go to the Core
namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, fixed
lib/datadog/tracing/client_ip.rb
Outdated
].freeze | ||
|
||
# A collection of headers. | ||
class HeaderCollection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this could go to the Core
namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
lib/datadog/tracing/client_ip.rb
Outdated
return | ||
end | ||
|
||
unless valid_ip?(ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this nesting of unless
is confusing.
Is this necessary since extract_ip_from_full_address
seems to return the address when its internal criteria don't match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely refactored that part, let me know if it makes more sense now :)
lib/datadog/tracing/client_ip.rb
Outdated
return {} unless headers | ||
|
||
{}.tap do |result| | ||
DEFAULT_IP_HEADERS_NAMES.each do |name| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a job for DEFAULT_IP_HEADERS_NAMES.each_with_object({}) do |name, result|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with reduce
(each_with_object
returns an array)
EDIT: No it does not, I misread the documentation. Changed
lib/datadog/tracing/client_ip.rb
Outdated
return headers.get(configuration.header_name) if configuration.header_name && headers | ||
|
||
ip_values_from_headers = ip_headers(headers).values | ||
case ip_values_from_headers.size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
case ip_values_from_headers.size | |
case ip_values_from_headers.size |
Method return value of this method comes from the case
return value, so I'd suggest to space this to conform with the Ruby style guide where implicit returns live on their own "line".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done
@@ -219,13 +226,13 @@ def configuration | |||
Datadog.configuration.tracing[:rack] | |||
end | |||
|
|||
def parse_request_headers(env) | |||
def parse_request_headers(headers) | |||
{}.tap do |result| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly to above, looks like a job for each_with_object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
lib/datadog/tracing/client_ip.rb
Outdated
return if configuration.disabled | ||
|
||
ip = client_address_from_request(headers, remote_ip) | ||
if !configuration.header_name && ip.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is only done when there's no IP reported? This feels off somehow (like, when would this happen), so I must be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if this is still confusing after the refactor
lib/datadog/tracing/client_ip.rb
Outdated
# @param [Span] span The span that's associated with the request. | ||
# @param [HeaderCollection, #get, nil] headers A collection with the request headers. | ||
# @param [String, nil] remote_ip The remote IP the request associated with the span is sent to. | ||
def self.set_client_ip_tag(span, headers, remote_ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noob question: should this have !
in the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!
is a Ruby convention for hinting side-effect. For example, Array#map
returns a new array while Array#map!
returns a mutated self. Another common pattern is whether a method raise an exception, such as find_by vs find_by!.
I don't think this method deserve !
, but this one seems to be a better candidate.
Since headers
, remote_ip
are optional. It is awkward to see
client_ip.set_client_ip_tag(span, nil, '1.1.1.1')
.
Consider making those keyword argument, so the caller makes the invokation a bit more explicit like,
client_ip.set_client_ip_tag(span, remote_ip: '1.1.1.1')
or
client_ip.set_client_ip_tag(span, headers: headers)
.
Control flow using exception handling is slow in Ruby. This method need to be changed for
- Reconsider whether if the scenario is truly an exception
- Streamline the exceptional handling cases to avoid nested rescue/raise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool thanks!
As for using exceptions, I was indeed wondering whether this would have a performance impact. I'll change it so it doesn't use any exceptions.
# | ||
# @default `DD_TRACE_CLIENT_IP_HEADER_DISABLED` environment variable, otherwise `false`. | ||
# @return [Boolean] | ||
option :disabled do |o| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semantically, prefer enabled
to disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually agree but since the environment variable is DISABLED
I figured I'd keep it disabled
.
Do you still think it's worth the setting name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also rename the environment variable to DD_TRACE_CLIENT_IP_HEADER_ENABLED
to be consistent with other environment variable?
lib/datadog/tracing/client_ip.rb
Outdated
# @param [Span] span The span that's associated with the request. | ||
# @param [HeaderCollection, #get, nil] headers A collection with the request headers. | ||
# @param [String, nil] remote_ip The remote IP the request associated with the span is sent to. | ||
def self.set_client_ip_tag(span, headers, remote_ip) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!
is a Ruby convention for hinting side-effect. For example, Array#map
returns a new array while Array#map!
returns a mutated self. Another common pattern is whether a method raise an exception, such as find_by vs find_by!.
I don't think this method deserve !
, but this one seems to be a better candidate.
Since headers
, remote_ip
are optional. It is awkward to see
client_ip.set_client_ip_tag(span, nil, '1.1.1.1')
.
Consider making those keyword argument, so the caller makes the invokation a bit more explicit like,
client_ip.set_client_ip_tag(span, remote_ip: '1.1.1.1')
or
client_ip.set_client_ip_tag(span, headers: headers)
.
Control flow using exception handling is slow in Ruby. This method need to be changed for
- Reconsider whether if the scenario is truly an exception
- Streamline the exceptional handling cases to avoid nested rescue/raise
|
||
describe 'ip validation' do | ||
context 'when given valid ip addresses' do | ||
subject do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be simplified, without custom matcher
describe ".validate_ip" do
ips = [
...
]
ips.each do |ip|
context 'when given #{ip}' do
let(:normalised) { described_class.strip_decorations(ip) }
subject { described_class.validate_ip(normalised) }
it { is_expected.to be true }
end
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
let(:tagging_enabled) { false } | ||
|
||
it 'does nothing' do | ||
expect(span).to_not receive(:set_tag).with(any_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with(any_args)
is redundant and assertion expect(span).to_not receive(:set_tag)
should be strong enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, fixed
One of the remaining issues in this PR is choosing to use the builtin In order to check whether a string is a valid IP with One of my concerns was whether raising an exceptions will have a significant performance impact, considering that in cases of misconfiguration, an invalid IP can be present in every traced request. After a discussion with @TonyCTHsu, and following his advice, I ran some benchmarks that make it clear there's no significant performance hit:
|
Approved, provided the integration CI failures are investigated if they still fail after a re-run. |
class HashHeaderCollection < HeaderCollection | ||
def initialize(hash) | ||
super() | ||
@hash = hash.transform_keys(&:downcase) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hash#transform_keys
is only available in Ruby 2.5 or later:
puts({}.transform_keys(&:downcase))
# == 2.4.10 ==
# -e:1:in `<main>': undefined method `transform_keys' for {}:Hash (NoMethodError)
# Did you mean? transform_values
#
# == 2.5.6 ==
# {}
Is this method being hit during our test runs? It should show up as a failure in 2.1-2.4 test cases, if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used in spec/datadog/tracing/client_ip_spec
, odd. I'll try to see why it isn't called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason these tests don't fail in the pipeline:
https://app.circleci.com/pipelines/github/DataDog/dd-trace-rb/7179/workflows/73531d46-88fe-4910-af07-bb3d97451c23/jobs/265659/parallel-runs/1?filterBy=ALL
https://app.circleci.com/pipelines/github/DataDog/dd-trace-rb/7179/workflows/73531d46-88fe-4910-af07-bb3d97451c23/jobs/265659/parallel-runs/2?filterBy=ALL
What does this PR do?
Add support for the
http.client_ip
tag for Rack-based API frameworks.Motivation
The
http.client_ip
tag will help distinguish whether the request came from a public or a private address, allowing Datadog to distinguish external from internal services and API endpoints.How to test the change?
The change is validated by RSpec behavior tests.
To test integration:
Set up and start a demo Rails server and enable auto-instrumentation.
Then use the following commands:
Review the logs from the server. The emitted traces should reflect the expected results.