Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactores Error Translation, resolves #9 #80

Merged
merged 1 commit into from
May 8, 2017
Merged

Conversation

daniel-j-h
Copy link
Collaborator

I won't be near a device until the weekend - maybe you can already give it a spin and / or review it already. Should compile in release and debug for dummy lib and driver.

See

return sweep_device_construct(port, 115200, error);
} catch (const std::exception& e) {
*error = sweep_error_construct(e.what());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that isn't legal: e.what() returns a pointer to a string that doesn't exist after the destruction of the exception (at the end of the catch clause). sweep_error_construct() only copies that pointer, but doesn't copy the string, so when you try to access it later you read from freed memory.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right - I'm not 100% sure about the lifetime for exceptions here, but your explanation makes sense. Fix is to copy the string into the error and let the error_destruct function do the proper cleanup.

Thanks for catching these sneaky issues, you're a great help here! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Un) Luckily I ran into the same problem during my initial attempt to introduce exceptions

@MikeGitb MikeGitb mentioned this pull request May 5, 2017
@@ -288,111 +235,72 @@ void sweep_device_accumulate_scans(sweep_device_s device) {
received = 1;
}
}
} catch (const std::exception& e) {
// worker thread is dead at this point
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an error is thrown inside of sweep_device_accumulate_scans() and then caught by this catch block, how does this propagate to the user? I'm guessing that it wouldn't, but that get_scan from the main thread would error shortly after anyways? Any reason why this wouldn't be sufficient? If accumulate_scans() errors, we are pretty much guaranteed to see get_scan error right after, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accumulate_scans does not have an out error argument:

// Accumulates scans in the queue (method to be used by background thread)
void sweep_device_accumulate_scans(sweep_device_s device);

So it's not possible to propagate it to the user. Maybe this was an oversight?

Copy link
Contributor

@MikeGitb MikeGitb May 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the user-visible result of the worker thread ending unexpectedly is that sweep_device_get_scan will deadlock. Imho that function should signal an error in the worker thread. Also, I think it should be possible to cancel a call to sweep_device_get_scan (and/or use a timeout)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, I think this can b addressed in a separate PR, as this isn't something newly introduced by this PR

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

accumulate_scans does not have an out error argument:

Given that this method is implementation ONLY and not part of the API, we could easily change its return type. But, given that accumulate_scans is only called from start_scanning, i'm unclear how we'd use the return type to propagate the error.

that function should signal an error in the worker thread

Could you elaborate on what you meant here. The get_scan method is running in the main thread, not the worker thread. Worker thread just accumulate scans using accumulate_scans.

it should be possible to cancel a call to sweep_device_get_scan (and/or use a timeout)

I completely agree that we should add a timeout to get_scan... the specifics of which we can flesh out later. I think it might be useful to do a PR that introduces timeouts for a few different methods (and the serial interactions) in a uniform manner.

Also agree that this PR doesn't introduce any new behavior. If there is not an obvious way to handle the exception in this method currently, then we should push this PR 👍 and address improvements in later PRs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so what I meant was that errors in the worker thread (I.e. Inside accumulate_scans) could be stored in the queue (e.g. via an additional data member) and then be presented to the user at the next call to get_scan

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh that makes more sense. Would it be better to keep the queue abstract and put the data member in the device structure? Or does it seem logical to put it in the queue given this is all related to thread communication... and the queue is designed as a thread safe queue.

@daniel-j-h daniel-j-h force-pushed the exceptions-in-impl branch 2 times, most recently from c72450a to 7b7371b Compare May 6, 2017 09:29
@daniel-j-h
Copy link
Collaborator Author

Given that this method is implementation ONLY and not part of the API, we could easily change its return type.

Hum can someone then tell me what's going on here in the lib's interface:

https://github.com/scanse/sweep-sdk/blob/master/libsweep/include/sweep/sweep.h#L60

1/ Why is
void sweep_device_wait_until_motor_ready(sweep_device_s device, sweep_error_s* error);
not exposed (i.e. missing the SWEEP_API visibility macro)?

2/ Why is
void sweep_device_accumulate_scans(sweep_device_s device); in the sweep.h interface header when it's only used internally? Internal functions have no place in this header.

3/ Why are

void sweep_device_attempt_start_scanning(sweep_device_s device, sweep_error_s* error);
sweep_scan_s sweep_device_get_scan_direct(sweep_device_s device, sweep_error_s* error);
void sweep_device_attempt_set_motor_speed(sweep_device_s device, int32_t hz, sweep_error_s* error);

not exposed (SWEEP_API) either? Are they internal functions, too?

@dcyoung
Copy link

dcyoung commented May 8, 2017

The firmware + protocol changeups required that the device be ready before starting data acquisition or before changing motor speed... otherwise the response indicates a failure. PR #70 introduced the sweep_device_wait_until_motor_ready() which was used internally by start_scanning() and set_motor_speed() in order to wait until the device is ready, effectively avoiding the need to parse a failure response.

Once the device was deemed ready, the actual process of starting data acquisition or setting the motor speed is accomplished via direct methods (the methods that were in place before PR #70 ).

In PR #70 and the discussion spread out through various issues, there was talk of additions to the API that offered users access to the original direct (non-blocking) versions of these methods. These non-blocking versions would error if the device wasn't ready, instead of block. This was mainly intended to increase compatibility with platforms such as micro-controllers which really struggle with long blocks. Because the required functionality is the same as the use case within the new start_scanning and set_motor_speed methods, we extracted the common functionality into sweep_device_attempt_start_scanning and sweep_device_attempt_set_motor_speed.

ie:

// will block until device is ready... then start scanning
sweep_device_start_scanning(...) {
    ...
    sweep_device_attempt_start_scanning(...)
}
// will error if device is not ready
sweep_device_attempt_start_scanning(...) {
    ...
}

However, as discussed at the end of PR #70... we decided to postpone the exposure of these direct methods until a later PR. Hence the SWEEP_API visibility macro was removed so they would not be exported. If the function definition should have been removed from the header as well, that was an oversight and a result of the hasty merge that had to happen. People were receiving devices with new firmware and libsweep was outdated.

As for void sweep_device_wait_until_motor_ready(sweep_device_s device, sweep_error_s* error);. It was originally exposed, but was causing errors when being called from one of the bindings. I believe it was sweeppy. Given that the start_scanning and set_motor_speed already use it internally, and there are no other methods that require waiting for the device to be ready... it was removed from the public API and both bindings. This was mentioned at the end of PR #70 and in the issues.

Internal functions have no place in this header.

Duly noted. I was only leaving the SWEEP_API macro off to indicate internal only. We can correct this with another PR.

Again, NONE of this is new to this PR, so we should be able to merge this and deal with those improvements later.

@daniel-j-h
Copy link
Collaborator Author

Again, NONE of this is new to this PR, so we should be able to merge this and deal with those improvements later.

Yes, the discussion above just just brought these things up for me. Sorry for derailing the pull request here a bit.

I will merge this in - what needs to be done then is:

  • API header fixes for internal functions
  • Provide an out error for accumulate_scans and handle possible deadlock

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants