Process IO probe #14

timonv · 2015-11-22T09:09:35Z

io (in/out in bytes/ops)

thijsc · 2015-11-16T09:04:59Z

There was a comment here right? :-)

For our use case it's important that the probes don't need root permissions. It's not a design constraint to only use /proc, I'd say we use the most convenient route for every metric we want.

timonv · 2015-11-16T11:10:46Z

Ah yes, deleted it for a better one and forgot the latter. I'll look into it.

timonv · 2015-11-22T09:18:37Z

Ok, pidstat version. The integration test still fails even though I'm on stable now. Same issue as with other application, can't read stdout from child in tests.

So I've been playing a little with trying to pretify the code and develop some style. Some things I came up with:

Don't do variable assignment if you don't have to
Many try!'s and nested try!'s are a code smell, can be fixed by using monads and their methods
to_string() is a code smell (it's really a clone). Don't use it unless the new string adds value. Hence the lifetimes.
Hardcoded strings are bad, even for error messages
From rules can remove map_err lines, however, they are kinda viral in error, so you might end up adding more lines than you remove.

Some stylistic points on the project itself:

What's the point of the os module? I think this works just as well, saves a lot of lines and removes a layer of indirection.
I think errors should have &'static str instead of String. This allows just using the const and I think is more readable. You can also use &str to match against a pattern.
If we want to opensource we should rustdoc :-)

Okthxbye, back to phoenix and elixir :-P

timonv · 2015-11-22T09:20:13Z

Appreciate to do a review first, then I'll clean it up

timonv · 2015-11-22T09:38:25Z

Hmm, another thing to play with is a ProbeError::Error<T> which just wraps any type of error. To be used for uncommon errors that don't need explicit handling (i.e. parse error on a byteslice that happens once in the app). Thoughts?

thijsc · 2015-11-22T19:58:38Z

I like the map_err and and_then based style. 👍 on using that instead of a lot of try's.

Responses to your other points:

What's the point of the os module? I think this works just as well, saves a lot of lines and removes a layer of indirection.

This allows us to put all the implementation code for a certain os in one clear separated scope. This will be handy when we add support for more os'es.

Also this crate will be a dependency for the agent which runs on multiple platforms. So it's vital that the code does compile on other platforms, even though it doesn't offer any functionality. The easiest way to do that is use an os mod so you can disable all functions with one attribute.

I think errors should have &'static str instead of String. This allows just using the const and I think is more readable. You can also use &str to match against a pattern.

I did that before, the disadvantage is that you can never have dynamic information in your error messages. See https://github.com/appsignal/probes/blob/master/src/memory.rs#L87 for an example.

Maybe there's a smarter solution for this possible?

If we want to opensource we should rustdoc :-)

Indeed

Hmm, another thing to play with is a ProbeError::Error which just wraps any type of error. To be used for uncommon errors that don't need explicit handling (i.e. parse error on a byteslice that happens once in the app). Thoughts?

I kind of like the enum based approach because it makes it very clear which types of errors are expected. Also you can use pattern matching to make sure you cover all cases. Of course at some number of expected types it becomes impractical.

I'd say we leave it as is and if it turns out we end up with more than 10ish types we take another look at it?

timonv · 2015-11-24T16:31:13Z

This allows us to put all the implementation code for a certain os in one clear separated scope. This will be handy when we add support for more os'es.

Also this crate will be a dependency for the agent which runs on multiple platforms. So it's vital that the code does compile on other platforms, even though it doesn't offer any functionality. The easiest way to do that is use an os mod so you can disable all functions with one attribute.

Makes sense!

Maybe there's a smarter solution for this possible?

Can't you do:

const ERROR_MESSAGE: &'static str = "I did a booboo at {}";

...
Error(&format(ERROR_MESSAGE, "this one time at bandcamp"))

I kind of like the enum based approach because it makes it very clear which types of errors are expected. Also you can use pattern matching to make sure you cover all cases. Of course at some number of expected types it becomes impractical.

I'd say we leave it as is and if it turns out we end up with more than 10ish types we take another look at it?

Love the enum approach and absolutely want to keep it. There are some errors (like with the map_err here) that are very spare / happen only at one point. So you could choose to just use the generic wrapper in the enum and not have to use the map_err to get the type chain correctly. I feel that both leave room for playing dirty/lazy, but the first is the better approach as you force preservation of the error. It doesn't feel right to add every little error to the enum, only the ones you want to give more context and/or are common to the application.

timonv · 2015-11-24T16:37:07Z

Also, pidstat gives different names than /proc/io. Should we rename the values to something more human?

thijsc · 2015-11-25T14:08:59Z

Can't you do:

That ends up having the same result as just inlining the string no?

Love the enum approach and absolutely want to keep it. There are some errors (like with the map_err here) that are very spare / happen only at one point. So you could choose to just use the generic wrapper in the enum and not have to use the map_err to get the type chain correctly. I feel that both leave room for playing dirty/lazy, but the first is the better approach as you force preservation of the error. It doesn't feel right to add every little error to the enum, only the ones you want to give more context and/or are common to the application.

Makes sense. Not against this if we find more of the less-important errors. I think at the moment we're not there yet.

thijsc · 2015-11-25T14:09:14Z

Also, pidstat gives different names than /proc/io. Should we rename the values to something more human?

👍

timonv · 2015-11-27T07:04:26Z

👍

Suggestions?

Updated the code. Integration test running pidstat still fails as the child's stdout still isn't captured in tests. Any ideas to fix that? I'd like to have this part under tests as well. I could use manual pipes instead (which I think would work as master Stdout is captured), but it's a lot more code :(

timonv · 2015-11-27T07:08:25Z

How about

read_kbs
write_kbs
canceled_kbs

iodelay isn't in proc/io btw

Block I/O delay of the task being monitored, measured in clock ticks. This metric includes the delays spent waiting for sync block I/O completion and for swapin block I/O completion.

Bit worried it's a (super small) blocking call. Seems instant though locally (with time)

thijsc · 2015-11-27T15:25:26Z

👍 on the names.

thijsc · 2015-11-27T15:26:25Z

src/process_io.rs

+
+    pub fn read_process_io(pid: u32) -> Result<ProcessIO> {
+        let raw = try!(run_pidstat(pid));
+        println!("RAW: {}", raw);


You forgot to delete this :-)

thijsc · 2015-11-27T15:53:39Z

Where did you find that information about stdout capturing not working in test @timonv?

timonv · 2015-11-27T17:53:03Z

In the test suite. It's not working (at least for a child process)

timonv · 2015-11-27T17:54:08Z

I have the same problem with github.com/timonv/benv, but the binary works fine

timonv · 2015-11-27T17:58:19Z

Updated. Child process stdout still an issue

thijsc · 2015-11-29T09:23:33Z

src/process_io.rs

+            .arg("-d")
+            .arg(format!("-p {}", pid))
+            .output()
+            .map_err(|_| ProbeError::UnexpectedContent(PIDSTAT_READ_ERROR.to_string()))


Why do you not just pass the underlying error along?

thijsc · 2015-11-29T19:45:34Z

A just noticed another small thing: The pid variables should be of type libc::pid_t. That's an alias to whichever integer type the OS happens to use for pids. This ensures maximum compatibility.

thijsc · 2015-11-29T20:02:54Z

On my default Ubuntu 14.04 installation pidstat is not installed:

The program 'pidstat' is currently not installed.  You can install it by typing:
sudo apt-get install sysstat

Is there no other way to get at this information?

timonv · 2015-12-01T11:29:50Z

Is there no other way to get at this information?

Hot damnit it used to be standard. I'll take a look.

timonv · 2015-12-01T11:36:10Z

All our 14.04 servers have pidstat. Maybe we should make sysstat a dependency? We can't read from file for non user processes, and I think that if we user other tools, it's just waiting for the first distro that doesn't use it.

thijsc · 2015-12-02T15:34:18Z

pidstat is missing on the Vagrant image we're using in this project. Is there no alternative that does not require root access?

timonv · 2015-12-02T19:47:25Z

It has to get the data from somewhere. Need to look into this. How far do we want to go in parsing hard to get linux data in favor of (marginal) performance?

timonv · 2016-01-25T09:50:03Z

Funny, had some possibly more helpful errors with 1.6 for getting that integration test to work. Will look at it tonight.

timonv · 2016-01-25T20:14:19Z

Ok, false positive. No child stdout is still an issue. Added early return and a link to the issue in the tests. I think we need to figure something out as most probes call a child process. Apart from that, I think I'm done here.

timonv · 2016-01-25T20:14:25Z

cc @thijsc

thijsc · 2016-01-28T20:48:02Z

We have a very similar test in another project:

    #[test]
    fn test_hostname() {
        let hostname = match Command::new("hostname").output() {
            Ok(p)  => {
                match from_utf8(&p.stdout) {
                    Ok(s)  => Some(s.trim().to_owned()),
                    Err(_) => None
                }
            },
            Err(e) => panic!("Getting hostname failed {}", e)
        }.unwrap();
        assert_eq!(Ok(hostname), super::hostname());
    }

Not sure why this would be failing here. It looks like the issue described in rust-lang/rust#12309 is about reading stdout from the test process itself. Not stdout from Command?

timonv · 2016-02-01T07:46:28Z

Alright cleaned up. Integration test still failing. The issue mentions tests not capturing stdout of subprocesses.

thijsc · 2016-03-11T16:31:19Z

What's the latest status here? Is this one still stuck on the test issue?

timonv · 2016-03-14T09:13:49Z

Yep, it is

thijsc · 2016-04-25T14:50:09Z

@timonv Do you still want to finish this pull or are your priorities elsewhere at the moment? :-)

timonv · 2016-04-25T17:03:42Z

Elsewhere!

On Mon, 25 Apr 2016 at 16:50, Thijs Cadier [email protected] wrote:

@timonv https://github.com/timonv Do you still want to finish this pull
or are your priorities elsewhere at the moment? :-)

—
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#14 (comment)

thijsc · 2016-04-25T19:35:16Z

In that case I will close this pull, we're not using this feature at the moment.

timonv self-assigned this Nov 15, 2015

timonv assigned thijsc and unassigned timonv Nov 22, 2015

timonv force-pushed the 14-process_io_probe branch from c7de6a0 to 1691967 Compare November 27, 2015 07:02

thijsc reviewed Nov 27, 2015
View reviewed changes

timonv force-pushed the 14-process_io_probe branch from 1691967 to e686e5f Compare November 27, 2015 17:58

thijsc mentioned this pull request Nov 29, 2015

Process memory #16

Merged

thijsc reviewed Nov 29, 2015
View reviewed changes

timonv force-pushed the 14-process_io_probe branch from 99e274b to 2de44c4 Compare January 25, 2016 09:49

timonv added 5 commits February 1, 2016 08:45

Implement process IO

e5f14b8

Switch to libc::pid_t

52d8496

fmt

74f780c

Docs and fmt

7720e73

PR Cleanup

bc6ab82

timonv force-pushed the 14-process_io_probe branch from 1168fbf to bc6ab82 Compare February 1, 2016 07:45

thijsc closed this Apr 25, 2016

Process IO probe #14

Process IO probe #14

Conversation

timonv commented Nov 22, 2015

thijsc commented Nov 16, 2015

timonv commented Nov 16, 2015

timonv commented Nov 22, 2015

timonv commented Nov 22, 2015

timonv commented Nov 22, 2015

thijsc commented Nov 22, 2015

timonv commented Nov 24, 2015

timonv commented Nov 24, 2015

thijsc commented Nov 25, 2015

thijsc commented Nov 25, 2015

timonv commented Nov 27, 2015

timonv commented Nov 27, 2015

thijsc commented Nov 27, 2015

thijsc Nov 27, 2015

Choose a reason for hiding this comment

thijsc commented Nov 27, 2015

timonv commented Nov 27, 2015

timonv commented Nov 27, 2015

timonv commented Nov 27, 2015

thijsc Nov 29, 2015

Choose a reason for hiding this comment

thijsc commented Nov 29, 2015

thijsc commented Nov 29, 2015

timonv commented Dec 1, 2015

timonv commented Dec 1, 2015

thijsc commented Dec 2, 2015

timonv commented Dec 2, 2015

timonv commented Jan 25, 2016

timonv commented Jan 25, 2016

timonv commented Jan 25, 2016

thijsc commented Jan 28, 2016

timonv commented Feb 1, 2016

thijsc commented Mar 11, 2016

timonv commented Mar 14, 2016

thijsc commented Apr 25, 2016

timonv commented Apr 25, 2016

thijsc commented Apr 25, 2016