
build: detect and report on build warnings when not verbose #1

Closed
wjwwood opened this issue Mar 12, 2014 · 24 comments

Comments

@wjwwood
Member

wjwwood commented Mar 12, 2014

I switched the default behavior to be quiet (hiding the output from build commands unless there is an error). The remaining concern is that warnings will go unnoticed by users.

To alleviate this we should detect warnings from cmake and gcc/clang and report them. This shouldn't be too hard: we are already capturing all of the output from commands, and rosmake did the same kind of parsing in order to collect and report warnings in a similar fashion. In fact, I plan on stealing the code from rosmake and refactoring it to work for the build verb.

@wjwwood wjwwood added this to the 0.1.0 milestone Mar 12, 2014
@xqms
Contributor

xqms commented May 28, 2014

I had a look at the rosmake code, and it's just grepping the compiler output for the string warning: and counting the occurrences. While that is helpful to at least notice that there were warnings, I would prefer to actually see the warnings (and the following lines, which usually contain further information).
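The rosmake-style detection described here fits in a few lines. This is a sketch with a hypothetical `count_warnings` helper, not the actual rosmake code; note it only reports *that* warnings occurred, not their text:

```python
def count_warnings(build_output):
    """Count compiler warnings the way rosmake does: scan the captured
    build output for the 'warning:' marker that gcc/clang emit and count
    the occurrences."""
    return sum(line.count("warning:") for line in build_output.splitlines())
```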

I propose collecting stderr separately from stdout, and printing the entire stderr if it is non-empty. In my experiments, only warnings/errors ever end up on stderr. This solution avoids parsing gcc's output and thus extends nicely to other compilers / build systems.

So -v would mean "print both stdout and stderr interleaved", and without -v "print stderr if non-empty".
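A minimal sketch of this proposed behavior, assuming Python 3's subprocess module (the `run_build_command` helper is hypothetical, not catkin_tools code):

```python
import subprocess

def run_build_command(cmd, verbose=False):
    """Run a build command; in quiet mode, print stderr only if non-empty."""
    if verbose:
        # -v: let stdout and stderr interleave on the terminal as usual.
        return subprocess.call(cmd)
    # Quiet mode: capture the two streams separately.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if proc.stderr:
        # Anything on stderr (warnings, errors) is surfaced to the user.
        print(proc.stderr.decode(), end="")
    return proc.returncode
```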

@wjwwood
Member Author

wjwwood commented May 28, 2014

There is a technical difficulty in collecting stdout and stderr separately: you will basically be giving up the interleave ordering, meaning that things on stderr and stdout will be recorded in a different order than they would have appeared on the screen.

This might be ok for the compiler, since errors tend to be output entirely on stderr, including the location of the error, but what you might lose is the position of the error with respect to normal compiler output or the output of make.

I think the grepping will be needed additionally if we want to catch CMake also, because CMake does not output warnings to stderr in my experience.

We should look more closely at the stderr capture trade-offs, but currently I think it is a pretty unattractive option.

@xqms
Contributor

xqms commented May 28, 2014

I'm assuming you are talking about these CMake warnings:

message(WARNING "Some warning")

The warning is sent to stderr. Try cmake . > /dev/null with a CMakeLists.txt containing the above line. Built-in CMake warnings (like missing cmake_minimum_required()) are also sent to stderr. I'm using CMake 2.8.11.2 if that matters.

I still feel parsing the output is the wrong way to go. The output may come from cmake, gcc, clang, msggen, Qt's moc and uic, custom build scripts, etc. So we cannot make many assumptions about the output format. And false negatives in a "warning detector" are bad, as anything printed over stderr is something the user should know about.

I'll research more if stdout/stderr interleaving is possible without losing the information from which channel the data came from. You're right, it's not trivial, but we cannot be the first ones wondering about it. I have a feeling that any solution will be platform-specific, though.

xqms added a commit to xqms/catkin_tools that referenced this issue May 28, 2014
…atkin#1

If something got printed on stderr by a build command, the user will want
to know about it. So gather stderr separately from stdout and check if
it is non-empty when the command finishes.
@xqms
Contributor

xqms commented May 28, 2014

I made a rough draft of what I imagine here: xqms@6bc02c4

It's not in any way ready for close review or pulling as I haven't touched the Windows side yet. But maybe it's enough for evaluating the approach.

I have just used the standard poll() approach for separate stdout/stderr channels and it seems to work fine. It dumps the interleaved stdout/stderr log if stderr is non-empty.

I haven't tested with anything other than gcc yet, but I haven't seen any stdout/stderr reordering problems yet. And in any case, most people are running concurrent make, which will reorder things pretty arbitrarily anyway...
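The separate-channel capture described here might look roughly like the following POSIX-only sketch (using `select` for illustration; this is not the code from the draft commit). Each chunk is tagged with its channel, but cross-stream ordering is only approximate because the kernel buffers the two pipes independently:

```python
import os
import select
import subprocess

def run_tagged(cmd):
    """Run cmd with separate stdout/stderr pipes, returning a list of
    (channel, bytes) chunks in roughly the order they arrived."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    fds = {proc.stdout.fileno(): "stdout", proc.stderr.fileno(): "stderr"}
    chunks = []
    open_fds = set(fds)
    while open_fds:
        # Block until at least one of the pipes has data (or hits EOF).
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            data = os.read(fd, 4096)
            if not data:
                open_fds.discard(fd)  # EOF on this channel
            else:
                chunks.append((fds[fd], data))
    proc.wait()
    return chunks
```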

@wjwwood
Member Author

wjwwood commented May 29, 2014

@xqms it would be useful for you to open a pull request even if you do not consider it ready for merge so that we could discuss it and those discussions could be easily referenced from here.

Thanks for taking some time to dig into it, maybe we can come up with a solution.

@xqms
Contributor

xqms commented Apr 21, 2015

Has there been progress on this point? We've adopted catkin_tools in our working group, with many beneficial effects, but one undesired side effect is that developers often do not fix warnings, because they only see them when building with -v.

Is this affecting other people as well? Are you always using something like catkin build --this --no-deps -v while developing? I'm just curious how other people using catkin_tools in daily work handle this issue...

@jbohren
Contributor

jbohren commented Apr 21, 2015

@xqms There hasn't been any progress on this that I know of. My focus has still been general usability (i.e. #196) but I think gathering warnings is also really important.

@xqms
Contributor

xqms commented Apr 21, 2015

@jbohren Okay, I will try to look at the current state this weekend and will take a look at what @wjwwood outlined in #57. Hopefully I'll be able to contribute ;-)

@jbohren
Contributor

jbohren commented Apr 21, 2015

@jbohren Okay, I will try to look at the current state this weekend and will take a look at what @wjwwood outlined in #57. Hopefully I'll be able to contribute ;-)

Cool. I've had to refactor a bit of the Executor and Job stuff for the catkin clean jobs, but I think a lot of that stuff has ended up more complicated than it needs to be. I haven't taken a look at the asyncio stuff that @wjwwood mentioned in #57, but I suspect that that would enable a lot of it to be cleaned up. If you re-implement the Executor with the stuff from osrf_pycommon.process_utils it shouldn't be hard to re-implement the per-package catkin clean stuff to use that framework.

@wjwwood
Member Author

wjwwood commented Apr 22, 2015

@xqms refactoring catkin build to properly use Python's asyncio is probably too big of a task, but if you wanted to try wrapping the async subprocess calls (http://osrf-pycommon.readthedocs.org/en/latest/process_utils.html#osrf_pycommon.process_utils.async_execute_process) in a synchronous call and using that to replace the subprocess.Popen stuff in catkin build, that would be doable. I'm aware of some changes upstream in asyncio which make my examples in the osrf_pycommon documentation not work correctly, but unfortunately I haven't had a chance to audit the entire documentation for fixes to the examples.

This CI script I recently wrote for ROS2 is very similar to what you'd have to do:

https://github.com/ros2/ros2/blob/ci_scripts/ros2_batch_job/util.py#L168-L216

The benefit of using the async_execute_process function from osrf_pycommon (even synchronously) is that you get good capture behavior in all operating systems.
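A synchronous wrapper in that spirit, sketched here with the standard library's asyncio subprocess API rather than the osrf_pycommon one (the helper names are illustrative; the real osrf_pycommon call signature differs):

```python
import asyncio

async def _run(cmd):
    # Launch the subprocess with separate stdout/stderr pipes.
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = await proc.communicate()
    return proc.returncode, out, err

def execute_process_sync(cmd):
    """Run an async subprocess call behind a synchronous interface,
    blocking until the command finishes."""
    return asyncio.run(_run(cmd))
```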

@wjwwood
Member Author

wjwwood commented Apr 22, 2015

One more thing: I haven't tried the stuff I linked to above in a multithreading situation, so YMMV, and ultimately you may need to use a single asyncio loop.

@jbohren
Contributor

jbohren commented May 4, 2015

@wjwwood How does this look as a starting point for the job executor? It lets you run n jobs asynchronously which can be composed of a mixture of program executions and blocking python callbacks.

https://gist.github.com/jbohren/9cd689c08782d6ffbcdf

@xqms
Contributor

xqms commented May 6, 2015

I've had little time so far to really look at this myself :-(

@jbohren That sounds very promising! If I understand correctly, you propose to completely remove the threading solution currently in place and replace it with your solution, right? It seems to me that there is not much missing...

@wjwwood
Member Author

wjwwood commented May 6, 2015

@jbohren That looks like a good start, but I was wondering, why are you using the multiprocessing module?

@jbohren
Contributor

jbohren commented May 6, 2015

I'm using multiprocessing in order to use a subclass of multiprocessing.Process which will capture stdout and stderr from a job stage which simply calls a blocking python function.

@wjwwood
Member Author

wjwwood commented May 6, 2015

My understanding is that the multiprocessing.Process will fork another instance of the Python interpreter. Why not temporarily capture the stdout and stderr by replacing them with a string buffer? I do this in my tests in bloom, for example:

https://github.com/ros-infrastructure/bloom/blob/master/test/utils/common.py#L141-L151
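The buffer-replacement technique can be sketched with `contextlib` like this (a hypothetical `capture_output` helper in the spirit of the bloom test utility, not a copy of it):

```python
import io
from contextlib import redirect_stdout, redirect_stderr

def capture_output(func, *args, **kwargs):
    """Temporarily replace sys.stdout/sys.stderr with string buffers
    while a blocking Python function runs, then restore them."""
    out, err = io.StringIO(), io.StringIO()
    with redirect_stdout(out), redirect_stderr(err):
        result = func(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()
```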

@jbohren
Contributor

jbohren commented May 6, 2015

My understanding is that the multiprocessing.Process will fork another instance of the Python interpreter. Why not temporarily capture the stdout and stderr by replacing them with a string buffer? I do this in my tests in bloom, for example:

https://github.com/ros-infrastructure/bloom/blob/master/test/utils/common.py#L141-L151

First, by forking a process, it means that they can truly perform asynchronously (no GIL). Second, doing this means it doesn't interfere with the stdout and stderr of the main catkin process. I guess we could store them somewhere when we initialize the log function, but this way everything stays isolated.

@jbohren
Contributor

jbohren commented May 6, 2015

@jbohren That sounds very promising! If I understand correctly, you propose to completely remove the threading solution currently in place and replace it with your solution, right? It seems to me that there is not much missing...

Yeah, basically. As it stands, all it needs is to pump events into the comm_queue so we can display them to the console. Then it just needs to be integrated with the right jobs, dependencies, job failure policies, and the result display.
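The event-pumping step described here could look roughly like the following (the `comm_queue` name and the `(job_id, message)` event shape follow the discussion but are illustrative, not the actual catkin_tools interface):

```python
def pump_events(comm_queue, display=print):
    """Drain job events from comm_queue and display them to the console.
    Executor tasks put (job_id, message) tuples on the queue; a None
    sentinel signals that all jobs are finished."""
    while True:
        event = comm_queue.get()
        if event is None:
            break
        job_id, message = event
        display("[%s] %s" % (job_id, message))
```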

@wjwwood
Member Author

wjwwood commented May 6, 2015

First, by forking a process, it means that they can truly perform asynchronously (no GIL).

The reason I point this out is that forking is expensive, and this is an activity that happens frequently (multiple times per package) and the only time it's advantageous to fork to avoid the GIL is when the resulting code is a long running process with a small amount of data to communicate across the IPC (which in our case is almost never true).

For example, the first version of catkin_tools used multiprocessing instead of threading for the executors, but that ended up being slower (a lot slower actually) than just doing threading. I think it was a combination of the overhead of the multiprocessing IPC for sending log data back and forth and the fact that other CPU cores were already busy with building code most of the time. In this case you've got all of that as well as the constant forking cost. You could mitigate the forking cost by using a multiprocessing.Pool, but at best I think it would be only just as fast as threading, and maybe even a little slower.

I would highly encourage you to try it without multiprocessing and compare the speed. Even if the forking is only 200ms slower per fork, that's probably around 400-600ms per package, and on a 200 package workspace that adds up to roughly a minute and a half.
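The start-up overhead argument can be checked with a tiny micro-benchmark (illustrative names, not catkin_tools code; absolute numbers vary a lot by platform):

```python
import time
import threading
import multiprocessing  # for comparing with multiprocessing.Process

def noop():
    pass

def time_workers(worker_cls, n=20):
    """Measure the wall-clock cost of spawning and joining n workers
    that do nothing, isolating the worker start-up overhead."""
    start = time.perf_counter()
    workers = [worker_cls(target=noop) for _ in range(n)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start
```

Comparing `time_workers(threading.Thread)` with `time_workers(multiprocessing.Process)` typically shows the process version taking noticeably longer per worker, which is the overhead being discussed.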

Second, doing this means it doesn't interfere with the stdout and stderr of the main catkin process. I guess we could store them somewhere when we initialize the log function, but this way everything stays isolated.

You're right that the example I gave above would affect the whole system, but there are other options for controlling logging of the data in these functors. You can capture stdout and stderr in a narrower scope (in the thread or even adjusting the locals of the function) or an even better solution, imo, is to just pass a logging object into the function and use that in the job stages which are actually Python functions.
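The logging-object approach might look like this (the `JobLogger` and `cmake_stage` names are hypothetical, not the actual catkin_tools classes):

```python
import io

class JobLogger:
    """Minimal logger handed to job stages that are plain Python
    functions, so their output is captured per-job without touching
    the global sys.stdout/sys.stderr."""
    def __init__(self, job_id):
        self.job_id = job_id
        self.buffer = io.StringIO()

    def out(self, msg):
        self.buffer.write("[%s] %s\n" % (self.job_id, msg))

def cmake_stage(logger):
    # A job stage implemented as a Python function uses the logger
    # instead of print(), keeping its output isolated.
    logger.out("configuring")
    return 0
```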

@jbohren
Contributor

jbohren commented May 6, 2015

@wjwwood Here's the latest version which just passes a logging object to the blocking python functions:
https://gist.github.com/jbohren/9cd689c08782d6ffbcdf

@wjwwood
Member Author

wjwwood commented May 6, 2015

That all looks like the right direction to me; have you had a chance to look at the interleaved result when you capture stdout and stderr separately? I'd be curious to see if the output ordering is preserved when the system is under load.

@jbohren
Contributor

jbohren commented May 7, 2015

That all looks like the right direction to me; have you had a chance to look at the interleaved result when you capture stdout and stderr separately? I'd be curious to see if the output ordering is preserved when the system is under load.

The interleaved capture definitely works under light load, but I haven't tested it under anything heavy.

@jbohren
Contributor

jbohren commented May 20, 2015

Updated the asynctest now with jobserver integration.

https://gist.github.com/jbohren/9cd689c08782d6ffbcdf

I haven't actually run it with a GNU make job yet, but it's now using the same underlying jobserver stuff to manage the number of async jobs. See here: https://gist.github.com/jbohren/9cd689c08782d6ffbcdf#file-asynctest-py-L216

@jbohren
Contributor

jbohren commented Jan 20, 2016

This is now implemented following the merge of #249.
