This repository has been archived by the owner on Feb 4, 2021. It is now read-only.

Build cop notes and tracking 2017-05-17 through 2017-05-31ish #21

Closed
nuclearsandwich opened this issue May 17, 2017 · 15 comments

Comments

@nuclearsandwich
Member

Regressions 2017-05-17

Nightly OSX Repeated #688
  • test_tutorial_add_two_ints_client_async_rmw_fastrtps_cpp 🔗 link
  • test_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp_.test_executable🔗 link

The test output shows what appears to be successful communication between the two processes. The failure looks like it might be caused by a mismatch with the expected test output.

Nightly Linux Coverage #383
  • projectroot.test_tutorial_add_two_ints_server_add_two_ints_client__rmw_fastrtps_cpp 🔗 link
  • projectroot.test_tutorial_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp 🔗 link
  • demo_nodes_cpp.test_tutorial_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp.xunit.missing_result 🔗 link
  • demo_nodes_cpp.test_tutorial_add_two_ints_server_add_two_ints_client__rmw_fastrtps_cpp.xunit.missing_result 🔗 link

I can't find anything in the console logs for job #383 demo_nodes_cpp that looks obviously like the reason for the timeouts.

@mikaelarguedas
Member

That comes back to the discussion we had a few weeks ago: "Should we make all tests using launch_testing match regexes instead of full console outputs?" (renaming the .txt files to .regex in this directory, for example). These failures seem to be caused by a false positive fastrtps error message, even though the participant did succeed in matching and communicating after all.
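
A minimal sketch of the regex idea, assuming a hypothetical helper rather than the actual launch_testing API: each expected line is treated as a regular expression that must match some line of the console output, in order, so unexpected extra lines (such as a spurious error message) no longer fail the comparison. The sample output lines are made up for illustration.

```python
import re


def output_matches(expected_patterns, actual_output):
    """Return True if every pattern fully matches, in order, some output line.

    Output lines that match no pattern are ignored.
    """
    lines = iter(actual_output.splitlines())
    for pattern in expected_patterns:
        regex = re.compile(pattern)
        # Consume remaining output lines until this pattern matches one.
        if not any(regex.fullmatch(line) for line in lines):
            return False
    return True


# Illustrative data only: these lines are not copied from a real test run.
expected = [
    r"Incoming request",
    r"a: \d+ b: \d+",
    r"Result of add_two_ints: \d+",
]
actual = "\n".join([
    "Incoming request",
    "[RTPS_EDP Error] hypothetical spurious message",
    "a: 2 b: 3",
    "Result of add_two_ints: 5",
])

matched = output_matches(expected, actual)
```

The trade-off is that a regex comparison like this also tolerates extra lines that might point to a real problem, so the patterns would need to be written with some care.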

@dirk-thomas
Member

@mikaelarguedas I looked at similar failures in the last sprint and the problem is that our rmw specific rmw_output_filter list doesn't support the ascii codes FastRTPS is using to colorize the output.
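
To illustrate the mismatch, here is a rough sketch of a prefix-based filter that strips color escapes before comparing. The function name, the prefix and the message text are hypothetical and not taken from the real rmw_output_filter handling; the point is only that a line starting with an escape sequence never matches a plain-text prefix unless the escape is removed first.

```python
import re

# Matches ANSI SGR (color/style) sequences such as "\x1b[31;1m" and "\x1b[0m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def filter_output(lines, filtered_prefixes):
    """Drop lines that start with a filtered prefix, ignoring color escapes."""
    kept = []
    for line in lines:
        plain = ANSI_SGR.sub("", line)
        if any(plain.startswith(prefix) for prefix in filtered_prefixes):
            continue
        kept.append(line)
    return kept


# Illustrative data only: the message text and prefix are made up.
lines = [
    "\x1b[31;1m[RTPS_EDP Error]\x1b[0m hypothetical message",
    "Result of add_two_ints: 5",
]
remaining = filter_output(lines, ["[RTPS_EDP Error]"])
```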

@nuclearsandwich
Member Author

nuclearsandwich commented May 18, 2017

Regressions 2017-05-18

Nightly Linux aarch64 Repeated #43
  • test_communication.test_requester_replier__Primitives__rclpy__rmw_fastrtps_cpp.xunit.missing_result 🔗
    I'm still not certain how to interpret or resolve these "missing result" tests.

  • test_showimage_cam2image__rmw_fastrtps_cpp_.test_reliable_qos 🔗
    This looks like it could be the same FastRTPS false-positive issue as some of the failures in Nightly Linux Coverage #383 and Nightly OSX Repeated #688 from yesterday.
    @dirk-thomas you mention that we have an output filter which isn't processing ascii color escapes. Is there a way to disable color output from Fast RTPS?

Nightly OSX Repeated #689
  • projectroot.test_get_node_names__rmw_fastrtps_cpp 🔗
  • projectroot.gtest_subscription__rmw_connext_cpp 🔗
  • projectroot.test_composition__rmw_connext_cpp 🔗
  • projectroot.test_tutorial_set_and_get_parameters_async__rmw_fastrtps_cpp 🔗
  • projectroot.test_find_weak_nodes 🔗
  • rclcpp.test_find_weak_nodes.gtest.missing_result 🔗
  • test_subscription__rmw_connext_cpp.subscription_and_spinning 🔗
  • TestGetNodeNames__rmw_fastrtps_cpp.test_rcl_get_node_names 🔗
  • test_requester_replier__Primitives__rclpy__rmw_fastrtps_cpp_.test_requester_replier 🔗
  • composition.test_composition__rmw_connext_cpp.xunit.missing_result 🔗
  • demo_nodes_cpp.test_tutorial_set_and_get_parameters_async__rmw_fastrtps_cpp.xunit.missing_result 🔗
Nightly Windows Release #433
  • gtest_timeout_subscriber__rmw_connext_cpp 🔗
  • test_timeout_subscriber__rmw_connext_cpp.timeout_subscriber 🔗

These two tests have different outputs, but it isn't clear from the test titles what the difference between them is. They ran on icecube, but they have also failed on windshield previously with similar output.

@dirk-thomas
Member

Is there a way to disable color output from Fast RTPS?

The default consumer doesn't allow any customization to the output (see https://github.com/eProsima/Fast-RTPS/blob/6322cb9e875f685e9f68619143ded9374765ecb9/src/cpp/log/StdoutConsumer.cpp#L21). We would need to provide our own consumer implementation and register it as the default on startup.

@nuclearsandwich
Member Author

Regressions 2017-05-19 🎂

Nightly Linux Packaging #434

The Linux packaging jobs have been green for a while, but when they have failed recently the failures have been related to the dynamic_bridge tests.

  • projectroot.test_dynamic_bridge__rmw_fastrtps_cpp 🔗
  • ros1_bridge.test_dynamic_bridge__rmw_fastrtps_cpp.xunit.missing_result 🔗
Nightly OSX Debug
  • projectroot.gtest_executor__rmw_connext_cpp 🔗
  • test_rclcpp.gtest_executor__rmw_connext_cpp.gtest.missing_result 🔗
Nightly OSX Release
  • projectroot.gtest_executor__rmw_connext_cpp 🔗
  • test_rclcpp.gtest_executor__rmw_connext_cpp.gtest.missing_result 🔗
Nightly OSX Repeated
  • test_talker_listener__rmw_fastrtps_cpp_.test_executable 🔗
  • test_rclcpp.gtest_executor__rmw_connext_cpp.gtest.missing_result 🔗

Other changes

We got a 🍏 Nightly Windows Release build. Looking at the recent history, they seem to pop up from time to time.

The frequent timeouts are logged in #11. If the build flakes again, this might be something to revisit now that namespaces have landed.

@nuclearsandwich
Member Author

nuclearsandwich commented May 19, 2017

I looked at similar failures in the last sprint and the problem is that our rmw specific rmw_output_filter list doesn't support the ascii codes FastRTPS is using to colorize the output.

I found some info on rmw_output_filters for connext but nothing about the one for Fast-RTPS. Are output filters still based only on prefix or are regex filters supported somewhere?

@dirk-thomas
Member

Only on prefix atm.

@nuclearsandwich
Member Author

So it seems like we have a few possible ways to resolve these false positives:

  1. Figure out why Fast-RTPS is printing error messages here and resolve that.
  2. Implement a colorless Fast-RTPS logger and switch to it
  3. Implement regexp-capable rmw_output_filters and account for colorized Fast-RTPS output with that.
  4. Brute force rmw_output_filters for every ASCII color escape used by Fast-RTPS
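
For a sense of what option 4 involves, a small sketch that expands each plain prefix into one variant per color escape. The escape list here is an assumption for illustration; the real set would have to be read from the Fast-RTPS sources, and the prefix is made up.

```python
from itertools import product

# Assumed escapes for illustration; not copied from the Fast-RTPS sources.
COLOR_ESCAPES = ["\x1b[31;1m", "\x1b[33;1m", "\x1b[37;1m"]


def expand_prefixes(plain_prefixes, escapes=COLOR_ESCAPES):
    """Return each plain prefix plus one colorized variant per escape."""
    expanded = list(plain_prefixes)
    for escape, prefix in product(escapes, plain_prefixes):
        expanded.append(escape + prefix)
    return expanded


# Hypothetical prefix; one plain entry plus three colorized variants.
filters = expand_prefixes(["[RTPS_EDP Error]"])
```

The list grows as escapes times prefixes, and it only helps when the escape sits at the very start of the line, which is part of why this option is the least attractive of the four.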

I spent about four minutes glancing at what it would take to make the upstream logger use color only conditionally. There's no C++ or cross-platform equivalent of isatty, but we could use isatty from C anyway and just leave color always disabled on Windows. If we're conditionally disabling color, we could also honor an environment variable that disables color even at a tty, for running tests locally.

Color is currently a compile-time setting based on whether _WIN32 is defined. We could propose a FASTRTPS_LOG_NOCOLOR compiler flag that, when defined, uses the empty color definitions on every platform. That would require us to only ever use a Fast-RTPS that we build and package with color off (for testing), which might not be a terrible trade-off but shouldn't pass unconsidered.
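
The runtime decision described above could look roughly like the following. This is a Python sketch of the logic only, not the upstream C++ logger, and the environment variable name is a hypothetical placeholder.

```python
import os
import sys

# Hypothetical variable name, used only for this illustration.
NO_COLOR_VAR = "EXAMPLE_LOG_NOCOLOR"


def use_color(stream=None, environ=None, platform=None):
    """Decide whether log output should include color escapes."""
    stream = sys.stdout if stream is None else stream
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    # Leave color always disabled on Windows.
    if platform.startswith("win"):
        return False
    # Allow color to be switched off explicitly, even at a tty.
    if environ.get(NO_COLOR_VAR):
        return False
    # Otherwise emit color only when writing to a terminal.
    return hasattr(stream, "isatty") and stream.isatty()
```

Under this scheme, output captured by a test harness through a pipe would come out uncolored without any extra configuration.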

@nuclearsandwich
Member Author

Regressions 2017-05-22

Nightly Linux AArch64
  • TestStateMachineInfo.available_transitions 🔗
  • projectroot.test_state_machine_info 🔗
Nightly Linux Repeated
  • projectroot.gtest_executor__rmw_connext_cpp 🔗
  • projectroot.test_composition__rmw_connext_cpp 🔗
  • projectroot.test_demo_cyclic_pipeline__rmw_fastrtps_cpp 🔗
  • projectroot.test_pendulum_teleop__rmw_connext_cpp 🔗
  • projectroot.test_tutorial_list_parameters__rmw_fastrtps_cpp 🔗

In addition to the above, there's also a bunch of Missing Result tests on this job.

Nightly OSX Release
  • projectroot.gtest_executor__rmw_connext_cpp 🔗
  • test_rclcpp.gtest_executor__rmw_connext_cpp.gtest.missing_result 🔗

The OSX Release build failed with the same connext failures as previous OSX builds.

@nuclearsandwich
Member Author

Regressions 2017-05-23

Nightly OS X Debug
  • projectroot.gtest_executor__rmw_connext_cpp 🔗
  • test_rclcpp.gtest_executor__rmw_connext_cpp.gtest.missing_result 🔗
Nightly OS X Repeated
  • TestGetNodeNames__rmw_connext_cpp.test_rcl_get_node_names 🔗
  • test_requester_replier__Primitives__rclpy__rmw_fastrtps_cpp_.test_requester_replier 🔗

@nuclearsandwich
Member Author

Regressions 2017-05-26

Nightly Windows Debug #485

This build had 14 new failures, enough that I actually tried to automate the way I generate these reports. Unfortunately, the Jenkins REST API will give me the test data but not a URL to the test result pages that I've been linking to. Templating the URL from the test className and name requires a guess as to whether the URL should contain a (root) fragment or not. So it looks like automation will have to wait until I'm even lazier or until I understand the test hierarchy well enough to template URLs.
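
One way around the guess would be to generate both candidate URLs and check which one resolves. The sketch below assumes that result pages live under `<build>/testReport/<package>/<class>/<test>/` and that classes without a package are filed under `(root)`; both are assumptions about the Jenkins layout, the function name is hypothetical, and any name sanitizing Jenkins applies is not handled.

```python
from urllib.parse import quote


def candidate_test_urls(build_url, class_name, test_name):
    """Return possible result page URLs for one test case, most specific first."""
    base = build_url.rstrip("/") + "/testReport"
    # Split "package.Class" on the last dot; package is empty if there is no dot.
    package, _, cls = class_name.rpartition(".")
    candidates = []
    if package:
        parts = [base, quote(package), quote(cls), quote(test_name)]
        candidates.append("/".join(parts) + "/")
    # Fallback: treat the whole className as a class in the "(root)" package.
    parts = [base, "(root)", quote(class_name), quote(test_name)]
    candidates.append("/".join(parts) + "/")
    return candidates


# Example with a placeholder host and job name.
urls = candidate_test_urls(
    "https://ci.example.org/job/nightly/1/", "pkg.Cls", "test_a")
```

Each candidate could then be requested in turn, keeping the first one that does not return a 404.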

@nuclearsandwich
Member Author

Regressions 2017-05-30

Nightly Linux Coverage
  • projectroot.test_tutorial_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp 🔗
  • demo_nodes_cpp.test_tutorial_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp.xunit.missing_result 🔗

@nuclearsandwich
Member Author

No major takeaways from this stint. The only identifiable way to reduce the flakes I looked at would be to swap out FastRTPS's logger for one that only outputs ANSI color escapes at terminals or build and provide our own. Both seem too much to worry about before Beta 2 but perhaps after.

@dirk-thomas
Member

The only identifiable way to reduce the flakes I looked at would be to swap out FastRTPS's logger for one that only outputs ANSI color escapes at terminals or build and provide our own. Both seem too much to worry about before Beta 2 but perhaps after.

Have you created a ticket upstream regarding the ansi color?

@dhood
Member

dhood commented Aug 22, 2017

updating this thread for posterity:

So it seems like we have a few possible ways to resolve these false positives:
  1. Figure out why Fast-RTPS is printing error messages here and resolve that.
  2. Implement a colorless Fast-RTPS logger and switch to it
  3. Implement regexp-capable rmw_output_filters and account for colorized Fast-RTPS output with that.
  4. Brute force rmw_output_filters for every ASCII color escape used by Fast-RTPS

I asked eProsima about (1) and they removed the error messages: eProsima/Fast-DDS#128. So for now, this particular error shouldn't cause flaky tests.

For similar issues in the future:
For (3), ros2/launch#59 adds functionality for filtering regexes.
Lately I enabled fastrtps debug logging and the colourisation was colouring our prints as well, making it difficult to brute force rmw_output_filters for coloured prints as suggested in (4). So, one day we might need (2) if we want to be able to run tests with debug output enabled.


4 participants