GetStatsTest.Parameters Failure #164

Open
asamadiya opened this issue Feb 19, 2023 · 4 comments

Comments

@asamadiya

Background

I want to try TCMalloc's per-CPU caches and HPAA (the huge-page-aware allocator) with TensorFlow. The gperftools tcmalloc .so is distributed as a yum package, so I didn't have to build that one.

Setup

I read through the platform and quick start documentation and tried to build Google's tcmalloc from source. I'm on CentOS 7.4 (kernel 5.4) with devtoolset-9 (gcc 9.3.1), Python 3.7, and Bazel 5.3.0. I can build TensorFlow successfully, so I don't suspect a general problem with my setup.

Issue

The GetStatsTest.Parameters test fails. From spot-checking the other failures, the only kind of failure I see is related to stats, such as GetStatsLowLevel. What does this failure imply?

Build:

Executed 307 out of 310 tests: 290 tests pass and 20 fail locally.
INFO: Build completed, 20 tests FAILED, 2350 total actions

GetStatsTest.Parameters Failure:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //tcmalloc/testing:get_stats_test
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from GetStatsTest
[ RUN      ] GetStatsTest.Pbtxt
[       OK ] GetStatsTest.Pbtxt (22 ms)
[ RUN      ] GetStatsTest.Parameters
tcmalloc/testing/get_stats_test.cc:105: Failure
Value of: buf
Expected: has substring "PARAMETER hpaa_subrelease 0"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:107: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_guarded_sample_parameter -1"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:109: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_per_cpu_caches 0"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:110: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_max_per_cpu_cache_size -1"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:112: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_max_total_thread_cache_bytes -1"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:115: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_skip_subrelease_interval 1s"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:146: Failure
Value of: buf
Expected: has substring "PARAMETER hpaa_subrelease 1"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:148: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_guarded_sample_parameter 50"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:150: Failure
Value of: buf
Expected: has substring "PARAMETER desired_usage_limit_bytes 18446744073709551615"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:154: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_per_cpu_caches 1"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:155: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_max_per_cpu_cache_size 3145728"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:157: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_max_total_thread_cache_bytes 4194304"
  Actual: "" (of type std::string)
tcmalloc/testing/get_stats_test.cc:160: Failure
Value of: buf
Expected: has substring "PARAMETER tcmalloc_skip_subrelease_interval 1m0.125s"
  Actual: "" (of type std::string)
[  FAILED  ] GetStatsTest.Parameters (35 ms)
[----------] 2 tests from GetStatsTest (58 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (58 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] GetStatsTest.Parameters

 1 FAILED TEST

@ckennelly
Collaborator

Given the actual values are all "", it looks like TCMalloc is not being linked in for this particular test, so the human-readable statistics are returning an empty string.
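
To make that diagnosis concrete, here is a minimal Python sketch that tells "the stats buffer is empty" apart from "the buffer has content but no PARAMETER lines". It only assumes the line shape visible in the test expectations above ("PARAMETER <name> <value>"). The helper names `parse_parameters` and `diagnose` are hypothetical and are not part of TCMalloc.

```python
import re
from typing import Dict

# Matches lines such as "PARAMETER tcmalloc_per_cpu_caches 1".
# The format is inferred from the test expectations in this thread.
_PARAMETER_LINE = re.compile(r"^PARAMETER\s+(\S+)\s+(.+?)\s*$")


def parse_parameters(stats_text: str) -> Dict[str, str]:
    """Collect PARAMETER name/value pairs from a human-readable stats dump."""
    params: Dict[str, str] = {}
    for line in stats_text.splitlines():
        match = _PARAMETER_LINE.match(line.strip())
        if match:
            params[match.group(1)] = match.group(2)
    return params


def diagnose(stats_text: str) -> str:
    """Give a rough hint about why expected PARAMETER substrings are missing."""
    if not stats_text.strip():
        # An entirely empty buffer is the situation reported in this issue.
        return "empty stats: the allocator backing GetStats may not be linked in"
    if not parse_parameters(stats_text):
        return "stats present but no PARAMETER lines found"
    return "ok"
```

Running `diagnose` on the buffer the test captured would be one way to confirm which case applies; in the log above every "Actual" value is "", which corresponds to the first branch.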

@asamadiya
Author

@ckennelly
That's worrying. I did not see any linking warnings or errors during the build.
The Pbtxt test passes but the Parameters one fails. I dumped the buf from the Pbtxt test at the link below. For example, I see tcmalloc_per_cpu_caches: true when the stats are read as pbtxt. Is there another way I can check whether tcmalloc is linking properly?

https://sharetext.me/k93k4z7iqy

@ckennelly
Collaborator

If you look at the binary's symbols with nm, is MallocExtension_Internal_GetStats linked in? Based on the other passing test, it seems like MallocExtension_Internal_GetStatsInPbtxt is present.
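
As a rough sketch of that check, the following Python helper scans the text printed by `nm` and reports the type letter for each symbol of interest. It assumes the usual `nm` layout ("<address> <type> <name>" for defined symbols, "<type> <name>" for undefined ones). The function names are hypothetical, and the sample lines in the usage are illustrative rather than taken from a real binary.

```python
from typing import Dict, Iterable


def symbol_types(nm_output: str, names: Iterable[str]) -> Dict[str, str]:
    """Map each requested symbol found in nm output to its type letter."""
    wanted = set(names)
    found: Dict[str, str] = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            # Defined symbol: address, type letter, name.
            _, sym_type, name = fields
        elif len(fields) == 2:
            # Undefined symbol: no address column, only type letter and name.
            sym_type, name = fields
        else:
            continue
        if name in wanted:
            found[name] = sym_type
    return found


def is_defined_in_text(sym_type: str) -> bool:
    """'T' or 't' means the symbol is defined in the text (code) section."""
    return sym_type in ("T", "t")


# Illustrative input; in practice pass the captured output of `nm <binary>`.
sample = (
    "0000000000000000 T MallocExtension_Internal_GetStats\n"
    "                 U malloc\n"
)
result = symbol_types(sample, ["MallocExtension_Internal_GetStats", "malloc"])
print(result)
```

A symbol that is absent from the result, or that shows up as `U`, would point toward the implementation not being linked into that binary, whereas `T` means a definition is present.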

@asamadiya
Author

asamadiya commented Feb 23, 2023

I see both linked in.

[root@3638c1513bd9 tcmalloc]# nm bazel-bin/tcmalloc/testing/get_stats_test | grep MallocExtension_Internal_GetStats
0000000000492369 T MallocExtension_Internal_GetStats
0000000000492106 T MallocExtension_Internal_GetStatsInPbtxt

Does this failure (and a few other similar test failures) imply that I would just not get the stats when I link this into TensorFlow, and that everything else tcmalloc supports should work fine?
