Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix non-atomic atomic_configure_file #1150

Merged
merged 2 commits into from
Jul 4, 2022
Merged

Conversation

zig-for
Copy link
Contributor

@zig-for zig-for commented Sep 2, 2021

While building very parallel project trees (for example, catkin run_tests on a large project tree that is already built without tests) on a machine with many cores (ie, an xlarge aws compute unit with 48 cores), environment configuration can fail with errors similar to:

SyntaxError: invalid syntax
CMake Error at /opt/ros/kinetic/share/catkin/cmake/safe_execute_process.cmake:11 (message):
  execute_process(/usr/bin/python2
  "/home/embark/catkin_ws/build/remote_operation/catkin_generated/generate_cached_setup.py")
  returned error code 1
Call Stack (most recent call first):
  /opt/ros/kinetic/share/catkin/cmake/all.cmake:208 (safe_execute_process)
  /opt/ros/kinetic/share/catkin/cmake/catkinConfig.cmake:20 (include)
  CMakeLists.txt:4 (find_package)
[3:22 PM] Traceback (most recent call last):
  File "/home/embark/catkin_ws/build/node_framework/catkin_generated/generate_cached_setup.py", line 22, in <module>
    code = generate_environment_script('/home/embark/catkin_ws/devel/env.sh')
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/catkin/environment_cache.py", line 62, in generate_environment_script
    output = subprocess.check_output([env_script, sys.executable, '-c', python_code])
  File "/usr/lib/python2.7/subprocess.py", line 567, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
CMake Error at /opt/ros/kinetic/share/catkin/cmake/safe_execute_process.cmake:11 (message):
  execute_process(/usr/bin/python2
  "/home/embark/catkin_ws/build/node_framework/catkin_generated/generate_cached_setup.py")
  returned error code 1
Call Stack (most recent call first):
  /opt/ros/kinetic/share/catkin/cmake/all.cmake:208 (safe_execute_process)
  /opt/ros/kinetic/share/catkin/cmake/catkinConfig.cmake:20 (include)
  CMakeLists.txt:14 (find_package)

Reading the directory before and after the error will show that the file exists, with proper permissions.

As far as I can tell, there are two defects:

  1. file(LOCK ...) doesn't lock between independent cmake instances, as catkin build uses.
  2. There is no locking when executing scripts, so when the copy happens, there is a short time when another instance is trying to read that the script is missing. See https://gitlab.kitware.com/cmake/cmake/-/blob/master/Source/kwsys/SystemTools.cxx#L2384

I've changed atomic_configure_file to first copy alongside the destination, then do a rename. This should allow atomic configuration while not causing a regression with #422 .

While building very parallel project trees (for example, `catkin run_tests` on a large project tree that is already built without tests) on a machine with many cores (ie, an xlarge aws compute unit with 48 cores), environment configuration can fail with errors similar to:

```
SyntaxError: invalid syntax
CMake Error at /opt/ros/kinetic/share/catkin/cmake/safe_execute_process.cmake:11 (message):
  execute_process(/usr/bin/python2
  "/home/embark/catkin_ws/build/remote_operation/catkin_generated/generate_cached_setup.py")
  returned error code 1
Call Stack (most recent call first):
  /opt/ros/kinetic/share/catkin/cmake/all.cmake:208 (safe_execute_process)
  /opt/ros/kinetic/share/catkin/cmake/catkinConfig.cmake:20 (include)
  CMakeLists.txt:4 (find_package)
[3:22 PM] Traceback (most recent call last):
  File "/home/embark/catkin_ws/build/node_framework/catkin_generated/generate_cached_setup.py", line 22, in <module>
    code = generate_environment_script('/home/embark/catkin_ws/devel/env.sh')
  File "/opt/ros/kinetic/lib/python2.7/dist-packages/catkin/environment_cache.py", line 62, in generate_environment_script
    output = subprocess.check_output([env_script, sys.executable, '-c', python_code])
  File "/usr/lib/python2.7/subprocess.py", line 567, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
CMake Error at /opt/ros/kinetic/share/catkin/cmake/safe_execute_process.cmake:11 (message):
  execute_process(/usr/bin/python2
  "/home/embark/catkin_ws/build/node_framework/catkin_generated/generate_cached_setup.py")
  returned error code 1
Call Stack (most recent call first):
  /opt/ros/kinetic/share/catkin/cmake/all.cmake:208 (safe_execute_process)
  /opt/ros/kinetic/share/catkin/cmake/catkinConfig.cmake:20 (include)
  CMakeLists.txt:14 (find_package)
```
@jaredjxyz
Copy link

I can confirm we have seen this issue as well

@Axel13fr
Copy link

We've had the issue as well on random occurences on a project that runs tests on ~120 packages over 8 cores. Can this get merged ?

@gbiggs
Copy link
Contributor

gbiggs commented Jun 30, 2022

@zig-for There is one style issue that needs to be fixed (the first line isn't indented correctly). If you can fix that, I think we'll be able to merge this.

@zig-for
Copy link
Contributor Author

zig-for commented Jul 1, 2022

@gbiggs fixed

@gbiggs gbiggs merged commit 2612c4c into ros:noetic-devel Jul 4, 2022
@zig-for
Copy link
Contributor Author

zig-for commented Jul 6, 2022

@gbiggs what's the plan for this repo, anyhow? It's great that this is merged in, but users have to be running catkin from source to get the benefit.

@zig-for zig-for deleted the patch-1 branch July 6, 2022 02:23
@gbiggs
Copy link
Contributor

gbiggs commented Jul 6, 2022

I don't know if any further releases of catkin are planned in the near future, but I expect there will be at least one between now and the EOL date for Noetic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants