Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process IO probe #14

Closed
wants to merge 5 commits into from
Closed

Process IO probe #14

wants to merge 5 commits into from

Conversation

timonv
Copy link
Contributor

@timonv timonv commented Nov 22, 2015

io (in/out in bytes/ops)

@timonv timonv self-assigned this Nov 15, 2015
@thijsc
Copy link
Member Author

thijsc commented Nov 16, 2015

There was a comment here right? :-)

For our use case it's important that the probes don't need root permissions. It's not a design constraint to only use /proc, I'd say we use the most convenient route for every metric we want.

@timonv
Copy link
Contributor

timonv commented Nov 16, 2015

Ah yes, deleted it for a better one and forgot the latter. I'll look into it.

@timonv
Copy link
Contributor

timonv commented Nov 22, 2015

Ok, pidstat version. The integration test still fails even though I'm on stable now. Same issue as with other application, can't read stdout from child in tests.

So I've been playing a little with trying to pretify the code and develop some style. Some things I came up with:

  • Don't do variable assignment if you don't have to
  • Many try!'s and nested try!'s are a code smell, can be fixed by using monads and their methods
  • to_string() is a code smell (it's really a clone). Don't use it unless the new string adds value. Hence the lifetimes.
  • Hardcoded strings are bad, even for error messages
  • From rules can remove map_err lines, however, they are kinda viral in error, so you might end up adding more lines than you remove.

Some stylistic points on the project itself:

  • What's the point of the os module? I think this works just as well, saves a lot of lines and removes a layer of indirection.
  • I think errors should have &'static str instead of String. This allows just using the const and I think is more readable. You can also use &str to match against a pattern.
  • If we want to opensource we should rustdoc :-)

Okthxbye, back to phoenix and elixir :-P

@timonv timonv assigned thijsc and unassigned timonv Nov 22, 2015
@timonv
Copy link
Contributor

timonv commented Nov 22, 2015

Appreciate to do a review first, then I'll clean it up

@timonv
Copy link
Contributor

timonv commented Nov 22, 2015

Hmm, another thing to play with is a ProbeError::Error<T> which just wraps any type of error. To be used for uncommon errors that don't need explicit handling (i.e. parse error on a byteslice that happens once in the app). Thoughts?

@thijsc
Copy link
Member Author

thijsc commented Nov 22, 2015

I like the map_err and and_then based style. 👍 on using that instead of a lot of try's.

Responses to your other points:

What's the point of the os module? I think this works just as well, saves a lot of lines and removes a layer of indirection.

This allows us to put all the implementation code for a certain os in one clear separated scope. This will be handy when we add support for more os'es.

Also this crate will be a dependency for the agent which runs on multiple platforms. So it's vital that the code does compile on other platforms, even though it doesn't offer any functionality. The easiest way to do that is use an os mod so you can disable all functions with one attribute.

I think errors should have &'static str instead of String. This allows just using the const and I think is more readable. You can also use &str to match against a pattern.

I did that before, the disadvantage is that you can never have dynamic information in your error messages. See https://github.com/appsignal/probes/blob/master/src/memory.rs#L87 for an example.

Maybe there's a smarter solution for this possible?

If we want to opensource we should rustdoc :-)

Indeed

Hmm, another thing to play with is a ProbeError::Error which just wraps any type of error. To be used for uncommon errors that don't need explicit handling (i.e. parse error on a byteslice that happens once in the app). Thoughts?

I kind of like the enum based approach because it makes it very clear which types of errors are expected. Also you can use pattern matching to make sure you cover all cases. Of course at some number of expected types it becomes impractical.

I'd say we leave it as is and if it turns out we end up with more than 10ish types we take another look at it?

@timonv
Copy link
Contributor

timonv commented Nov 24, 2015

This allows us to put all the implementation code for a certain os in one clear separated scope. This will be handy when we add support for more os'es.

Also this crate will be a dependency for the agent which runs on multiple platforms. So it's vital that the code does compile on other platforms, even though it doesn't offer any functionality. The easiest way to do that is use an os mod so you can disable all functions with one attribute.

Makes sense!

Maybe there's a smarter solution for this possible?

Can't you do:

const ERROR_MESSAGE: &'static str = "I did a booboo at {}";

...
Error(&format(ERROR_MESSAGE, "this one time at bandcamp"))

I kind of like the enum based approach because it makes it very clear which types of errors are expected. Also you can use pattern matching to make sure you cover all cases. Of course at some number of expected types it becomes impractical.

I'd say we leave it as is and if it turns out we end up with more than 10ish types we take another look at it?

Love the enum approach and absolutely want to keep it. There are some errors (like with the map_err here) that are very spare / happen only at one point. So you could choose to just use the generic wrapper in the enum and not have to use the map_err to get the type chain correctly. I feel that both leave room for playing dirty/lazy, but the first is the better approach as you force preservation of the error. It doesn't feel right to add every little error to the enum, only the ones you want to give more context and/or are common to the application.

@timonv
Copy link
Contributor

timonv commented Nov 24, 2015

Also, pidstat gives different names than /proc/io. Should we rename the values to something more human?

@thijsc
Copy link
Member Author

thijsc commented Nov 25, 2015

Can't you do:

That ends up having the same result as just inlining the string no?

Love the enum approach and absolutely want to keep it. There are some errors (like with the map_err here) that are very spare / happen only at one point. So you could choose to just use the generic wrapper in the enum and not have to use the map_err to get the type chain correctly. I feel that both leave room for playing dirty/lazy, but the first is the better approach as you force preservation of the error. It doesn't feel right to add every little error to the enum, only the ones you want to give more context and/or are common to the application.

Makes sense. Not against this if we find more of the less-important errors. I think at the moment we're not there yet.

@thijsc
Copy link
Member Author

thijsc commented Nov 25, 2015

Also, pidstat gives different names than /proc/io. Should we rename the values to something more human?

👍

@timonv
Copy link
Contributor

timonv commented Nov 27, 2015

👍

Suggestions?

Updated the code. Integration test running pidstat still fails as the child's stdout still isn't captured in tests. Any ideas to fix that? I'd like to have this part under tests as well. I could use manual pipes instead (which I think would work as master Stdout is captured), but it's a lot more code :(

@timonv
Copy link
Contributor

timonv commented Nov 27, 2015

How about

read_kbs
write_kbs
canceled_kbs

iodelay isn't in proc/io btw

Block I/O delay of the task being monitored, measured in clock ticks. This metric includes the delays spent waiting for sync block I/O completion and for swapin block I/O completion.

Bit worried it's a (super small) blocking call. Seems instant though locally (with time)

@thijsc
Copy link
Member Author

thijsc commented Nov 27, 2015

👍 on the names.


pub fn read_process_io(pid: u32) -> Result<ProcessIO> {
let raw = try!(run_pidstat(pid));
println!("RAW: {}", raw);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to delete this :-)

@thijsc
Copy link
Member Author

thijsc commented Nov 27, 2015

Where did you find that information about stdout capturing not working in test @timonv?

@timonv
Copy link
Contributor

timonv commented Nov 27, 2015

In the test suite. It's not working (at least for a child process)

@timonv
Copy link
Contributor

timonv commented Nov 27, 2015

I have the same problem with github.com/timonv/benv, but the binary works fine

@timonv
Copy link
Contributor

timonv commented Nov 27, 2015

Updated. Child process stdout still an issue

@thijsc thijsc mentioned this pull request Nov 29, 2015
.arg("-d")
.arg(format!("-p {}", pid))
.output()
.map_err(|_| ProbeError::UnexpectedContent(PIDSTAT_READ_ERROR.to_string()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you not just pass the underlying error along?

@thijsc
Copy link
Member Author

thijsc commented Nov 29, 2015

A just noticed another small thing: The pid variables should be of type libc::pid_t. That's an alias to whichever integer type the OS happens to use for pids. This ensures maximum compatibility.

@thijsc
Copy link
Member Author

thijsc commented Nov 29, 2015

On my default Ubuntu 14.04 installation pidstat is not installed:

The program 'pidstat' is currently not installed.  You can install it by typing:
sudo apt-get install sysstat

Is there no other way to get at this information?

@timonv
Copy link
Contributor

timonv commented Dec 1, 2015

Is there no other way to get at this information?

Hot damnit it used to be standard. I'll take a look.

@timonv
Copy link
Contributor

timonv commented Dec 1, 2015

All our 14.04 servers have pidstat. Maybe we should make sysstat a dependency? We can't read from file for non user processes, and I think that if we user other tools, it's just waiting for the first distro that doesn't use it.

@thijsc
Copy link
Member Author

thijsc commented Dec 2, 2015

pidstat is missing on the Vagrant image we're using in this project. Is there no alternative that does not require root access?

@timonv
Copy link
Contributor

timonv commented Dec 2, 2015

It has to get the data from somewhere. Need to look into this. How far do we want to go in parsing hard to get linux data in favor of (marginal) performance?

@timonv
Copy link
Contributor

timonv commented Jan 25, 2016

Funny, had some possibly more helpful errors with 1.6 for getting that integration test to work. Will look at it tonight.

@timonv
Copy link
Contributor

timonv commented Jan 25, 2016

Ok, false positive. No child stdout is still an issue. Added early return and a link to the issue in the tests. I think we need to figure something out as most probes call a child process. Apart from that, I think I'm done here.

@timonv
Copy link
Contributor

timonv commented Jan 25, 2016

cc @thijsc

@thijsc
Copy link
Member Author

thijsc commented Jan 28, 2016

We have a very similar test in another project:

    #[test]
    fn test_hostname() {
        let hostname = match Command::new("hostname").output() {
            Ok(p)  => {
                match from_utf8(&p.stdout) {
                    Ok(s)  => Some(s.trim().to_owned()),
                    Err(_) => None
                }
            },
            Err(e) => panic!("Getting hostname failed {}", e)
        }.unwrap();
        assert_eq!(Ok(hostname), super::hostname());
    }

Not sure why this would be failing here. It looks like the issue described in rust-lang/rust#12309 is about reading stdout from the test process itself. Not stdout from Command?

@timonv
Copy link
Contributor

timonv commented Feb 1, 2016

Alright cleaned up. Integration test still failing. The issue mentions tests not capturing stdout of subprocesses.

@thijsc
Copy link
Member Author

thijsc commented Mar 11, 2016

What's the latest status here? Is this one still stuck on the test issue?

@timonv
Copy link
Contributor

timonv commented Mar 14, 2016

Yep, it is

@thijsc
Copy link
Member Author

thijsc commented Apr 25, 2016

@timonv Do you still want to finish this pull or are your priorities elsewhere at the moment? :-)

@timonv
Copy link
Contributor

timonv commented Apr 25, 2016

Elsewhere!

On Mon, 25 Apr 2016 at 16:50, Thijs Cadier [email protected] wrote:

@timonv https://github.com/timonv Do you still want to finish this pull
or are your priorities elsewhere at the moment? :-)


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#14 (comment)

@thijsc
Copy link
Member Author

thijsc commented Apr 25, 2016

In that case I will close this pull, we're not using this feature at the moment.

@thijsc thijsc closed this Apr 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants