Skip to content

Commit

Permalink
Fix issue where gem is installed inside another gem's path
Browse files Browse the repository at this point in the history
I ran into this issue in the tests being run on GitHub Actions,
since it installs our dependencies inside the dd-trace-rb folder.

It's unclear to me if it can happen in actual customer setups,
but I've decided to fix it anyway.
  • Loading branch information
ivoanjo committed Jan 7, 2022
1 parent 9af7949 commit d3b464b
Show file tree
Hide file tree
Showing 2 changed files with 48 additions and 0 deletions.
20 changes: 20 additions & 0 deletions lib/ddtrace/profiling/collectors/code_provenance.rb
Original file line number Diff line number Diff line change
Expand Up @@ -58,12 +58,32 @@ def record_library(library)
libraries_by_path[library.path] = library
end

# Ruby hash maps are guaranteed to keep the insertion order of keys. Here, we sort @libraries_by_path so
# that the hash can be iterated in reverse order of paths.
#
# Why we do this: We do this to make sure that if there are libraries with paths that are prefixes of other
# libraries, e.g. '/home/foo' and '/home/foo/bar', we match to the longest path first.
# When reverse sorting paths as strings, '/home/foo/bar' will come before '/home/foo'.
#
# This way, when we iterate the @libraries_by_path hash, we know the first hit will also be the longest.
#
# Alternatively/in the future we could instead use a trie to match paths, but I doubt for the data sizes we're
# looking at that a trie is that much faster than using Ruby's built-in native collections.
def sort_libraries_by_longest_path_first
@libraries_by_path = @libraries_by_path.sort.reverse!.to_h
end

def record_loaded_specs(loaded_specs)
recorded_library = false

loaded_specs.each do |spec|
next if libraries_by_name.key?(spec.name)

record_library(Library.new(type: 'library', name: spec.name, version: spec.version, path: spec.gem_dir))
recorded_library = true
end

sort_libraries_by_longest_path_first if recorded_library
end

def record_loaded_files(loaded_files)
Expand Down
28 changes: 28 additions & 0 deletions spec/ddtrace/profiling/collectors/code_provenance_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,34 @@
it 'returns self' do
expect(code_provenance.refresh).to be code_provenance
end

context "when a gem's path is inside another gem's path" do
# I'm not entirely sure if this can happen in end-user apps, but can happen in CI if bundler is configured to
# install dependencies into a subfolder of ddtrace. In particular GitHub Actions does this.

it 'matches the loaded file to the longest matching path' do
code_provenance.refresh(
loaded_files: ['/dd-trace-rb/vendor/bundle/ruby/2.7.0/gems/byebug-11.1.3/lib/byebug.rb'],
loaded_specs: [
instance_double(
Gem::Specification,
name: 'ddtrace',
version: '1.2.3',
gem_dir: '/dd-trace-rb'
),
instance_double(
Gem::Specification,
name: 'byebug',
version: '4.5.6',
gem_dir: '/dd-trace-rb/vendor/bundle/ruby/2.7.0/gems/byebug-11.1.3'
)
],
)

expect(code_provenance.generate).to have(1).item
expect(code_provenance.generate.first).to have_attributes(name: 'byebug')
end
end
end

describe '#generate_json' do
Expand Down

0 comments on commit d3b464b

Please sign in to comment.