internal/stack: Use control flow for state #110

Merged
abhinav merged 6 commits into master from parse-flow on Oct 22, 2023
Conversation

@abhinav abhinav (Collaborator) commented Oct 21, 2023

In anticipation of parsing more information from stack traces, make the
stack trace parsing logic more manageable by moving it from a state
machine into a layout closer to a recursive descent parser.

That is, instead of a central loop that reads input line-by-line
and needs to manage its various states:

    current, result := ...
    for {
        input := read()
        if cond(input) {
            result.append(current)
            current = startNew(input)
        } else {
            current = accumulate(input)
        }
    }
    result = flush(current)

Break it down so that parsing of individual results is its own function,
representing the state machine via control flow.

    result := ...
    for {
        input := read()
        if cond(input) {
            result.append(parseOne())
        }
    }

    // where

    func parseOne(input) {
        value := ...
        for ; !cond(input); input = read() {
            value = accumulate(input)
        }
        return value
    }
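
To make that shape concrete, here is a rough Go sketch of what such a control-flow parser might look like for a goroutine dump. The names (parseStacks, parseStack) and the blank-line handling are illustrative assumptions for this example, not the actual code in this PR:

    package stack

    import (
        "bufio"
        "strings"
    )

    // parseStacks reads a goroutine dump line by line and returns one
    // string per stack. All state for a single stack lives inside
    // parseStack's control flow rather than in this outer loop.
    func parseStacks(scan *bufio.Scanner) []string {
        var stacks []string
        for scan.Scan() {
            line := scan.Text()
            if strings.HasPrefix(line, "goroutine ") {
                stacks = append(stacks, parseStack(line, scan))
            }
        }
        return stacks
    }

    // parseStack consumes the lines that belong to one stack: everything
    // from the "goroutine N [state]:" header up to the blank line that
    // separates stacks in a dump.
    func parseStack(header string, scan *bufio.Scanner) string {
        var sb strings.Builder
        sb.WriteString(header)
        for scan.Scan() {
            line := scan.Text()
            if line == "" { // a blank line ends this stack
                break
            }
            sb.WriteString("\n")
            sb.WriteString(line)
        }
        return sb.String()
    }

Note how the outer loop knows nothing about what a stack contains; only parseStack does.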

The net effect of this is to make the parsing logic more maintainable
as it gets more complex and adds more states.

For example, to parse more information for individual stacks
with a state machine, we'd have to make the main loop more complex.
State for an individual stack (e.g. "all the functions in the stack")
will leak into the state management for the whole state machine.
On the other hand, with this method, we'll only modify parseStack,
keeping its responsibility encapsulated to parsing a single stack trace.

This idea was also demonstrated recently in the first section of
[Storing Data in Control Flow by Russ Cox][1].

  [1]: https://research.swtch.com/pcdata#step


To make it easy to write this parser, we switch from bufio.Reader
to bufio.Scanner, and wrap it with the ability to "Unscan":
basically "don't move forward on next Scan()".

Lastly, we need to bump the go directive in go.mod to Go 1.20
to allow use of errors.Join.
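
For reference, errors.Join was added to the standard library in Go 1.20. A small sketch of how accumulated per-trace parse errors could be reported together; the parseAll helper here is hypothetical:

    package stack

    import "errors"

    // parseAll applies parse to every trace and reports all failures
    // at once. errors.Join returns nil when every argument is nil.
    func parseAll(traces []string, parse func(string) error) error {
        var errs []error
        for _, t := range traces {
            if err := parse(t); err != nil {
                errs = append(errs, err)
            }
        }
        return errors.Join(errs...)
    }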

codecov bot commented Oct 21, 2023

Codecov Report

Merging #110 (aee45b6) into master (f995fdb) will increase coverage by 1.60%.
The diff coverage is 93.50%.

    @@            Coverage Diff             @@
    ##           master     #110      +/-   ##
    ==========================================
    + Coverage   96.58%   98.18%   +1.60%
    ==========================================
      Files           5        6       +1
      Lines         234      276      +42
    ==========================================
    + Hits          226      271      +45
    + Misses          5        4       -1
    + Partials        3        1       -2

| Files | Coverage Δ |
| --- | --- |
| internal/stack/scan.go | 100.00% <100.00%> (ø) |
| internal/stack/stacks.go | 95.14% <92.53%> (+6.41%) ⬆️ |


    fullStack: &bytes.Buffer{},
    stack, err := p.parseStack(line)
    if err != nil {
        p.errors = append(p.errors, err)
Collaborator:

does it make sense to continue scanning once we hit an error? especially since any error means we don't use the results

abhinav (Collaborator, Author):

The only difference is that it'll report all the errors in all the traces instead of stopping at the first one.
I don't feel strongly about doing it this way. Do you think it's better not to?

Collaborator:

Only reason not to do this is if a failure here can cause subsequent parses to fail. However, since this method only looks for the goroutine prefix, which wouldn't be affected by where we stopped parsing, I think this is fine.

@abhinav abhinav merged commit 25cbb67 into master Oct 22, 2023
6 of 7 checks passed
@abhinav abhinav deleted the parse-flow branch October 22, 2023 18:15
abhinav added a commit that referenced this pull request Oct 23, 2023
Adds support to the stack parser for reading the full list of functions
for a stack trace.

NOTE:
The function that created the goroutine
is NOT considered part of the stack.

We don't maintain the order of the functions
since that's not something we need at this time.
The functions are all placed in a set.

This unblocks #41 and allows implementing an
IgnoreAnyFunction option (similar to the stalled #80 PR).

Depends on #110
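
As a rough sketch of what "functions placed in a set" could look like; the Stack type, field, and method names below are assumptions for illustration, not the actual goleak API:

    package stack

    // Stack is a simplified stand-in for a parsed stack trace.
    // allFunctions holds every function that appears in the trace;
    // it is a set, so the order of functions is not preserved.
    type Stack struct {
        allFunctions map[string]struct{}
    }

    // addFunction records one function name parsed from the trace.
    func (s *Stack) addFunction(name string) {
        if s.allFunctions == nil {
            s.allFunctions = make(map[string]struct{})
        }
        s.allFunctions[name] = struct{}{}
    }

    // HasFunction reports whether the named function appears anywhere
    // in the stack, which is the check an IgnoreAnyFunction-style
    // option would need.
    func (s Stack) HasFunction(name string) bool {
        _, ok := s.allFunctions[name]
        return ok
    }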
@mway mway mentioned this pull request Oct 24, 2023