Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Worker Thread Discussion #31

Closed
dcyoung opened this issue Jan 25, 2017 · 18 comments
Closed

Worker Thread Discussion #31

dcyoung opened this issue Jan 25, 2017 · 18 comments

Comments

@dcyoung
Copy link

dcyoung commented Jan 25, 2017

We've discussed the need to introduce a worker thread for IO. However, the implementation must consider cross-platform compatibility. The debate is whether to...

  1. Shift the libsweep library's implementation to C++, while still maintaining the C interface. The C++ implementation would allow for all threading to handled by C++11 standard thread library, and avoid the need for thread abstraction. This would both consolidate the code. @daniel-j-h You mentioned that this requires us to link in libstdc++/libc++ which is a pain for abi stability and shipping portable binaries. Could you elaborate on those two points a bit? Isn't shipping platform specific binaries pretty standard?

  2. Abstract the threading into a shared interface that is platform agnostic, and then provide platform specific implementations.

For option 2 (abstraction), we could do this in a similar fashion to serial.h + (serial_unix.c or serial_win.c). In this case, we'd have something like sweep_threading.h + (sweep_threading_unix.c or sweep_threading_win.c). This interface would define an abstracted thread, mutex, conditional var, and single producer single consumer queue. Then the methods from the serial and sweep files would make use of the threads and queues via this abstracted interface. The unix implementation could use pthreads. I'm still open to suggestions for the win implementation but it could use WINAPI. Using WINAPI would require the inclusion of <windows.h>. Any complications with the build process or bindings there? @daniel-j-h

What do we think about the pros/cons of both?

Given that the current style of sweepSDK is to avoid global state...

No global state: _s context objects (_s since POSIX reserves _t) hold state and have to be passed explicitly.

how should we store and reference the working queue? I spent a while today trying to implement a queue and a worker thread that conformed to the limitations of no global state and the Pimpl idiom, but I didn't get very far.... I'm still getting the hang of the Pimpl idom :/ @daniel-j-h if you've got a plan for how to accomplish this I'm happy to collaborate however I can.

@daniel-j-h
Copy link
Collaborator

The pain point with libstdc++ / libc++ is as follows:

if we ever want to ship pre-built binaries (for example to let users do npm install sweepjs and not have a compiler around for it to work) practically speaking what we have to do is dig out a old Linux distribution and link against its C++ stdlib.

libstdc++ is forward compatible using versioned symbols: we can build against older versions and still safely run against a newer version. This also means we have to find a libstdc++ version which already is C++11 feature complete (for threads) and at the same time old enough for us to support newer versions.

This is for example what the Python community is doing for building native wheels: they spin up Docker container with an old CentOS and compile+link their native code their.

If we decide this is worth the hassle and makes our life easier then I'm all up for changing the implementation to be in C++ linking in libstdc++ / libc++ and keeping a C interface to the outside.

But we should be aware of its downsides.

For reference: this is what I'm doing in https://github.com/daniel-j-h/libosrmc.

@daniel-j-h
Copy link
Collaborator

If we want to go the abstraction route:

the C interfaces resemble the C++ RAII pattern (constructor / destructor pair) in that you always have

  • ctx* thing_construct(args);
  • void thing_destruct(ctx*);

and ctx* being a opaque pointer to the object (think of it as a typed void* - this guarantees for ABI compatibility). Functions then work on the object as in void thing_push(ctx*, int); and int thing_pop(ctx*);.

This is equivalent to

class thing {
  thing();
  ~thing();
  void push(int);
  int pop();
};

works beautifully with FFI bindings, hides implementation details behind a void* for ABI compatibility, is easy to wrap in bindings and makes ownership semantics clear.

For a thread abstraction you would have an interface maybe something like

typedef struct sweep_thread* sweep_thread_s; // opaque pointer

sweep_thread_s sweep_thread_construct(FnPtr, Args); // constructor
void sweep_thread_destruct(sweep_thread_s); // destructor
void sweep_thread_join(sweep_thread_s); // arbitrary functionality

and then in the POSIX implementation file you fill in the blanks with the pthread API, e.g. storing the pthread_t thread id inside the context struct, creating the thread and setting thread attributes in the constructor. For the Windows implementation file you use the platform APIs there and do the same.

The interface (the .h file) stays the same across platforms. Only the implementation conforming to that interface changes. The switch happens in the build system: there we detect the OS and either compile thread_unix.c or thread_windows.c.

We're already using this technique throughout the libsweep library.

Your question about the queue and where to hold onto it: the queue's lifetime depends on the device, so I would store a queue object in the sweep_device struct, creating the queue in the sweep_device_construct constructor and destroying it in the destructor again.

@dcyoung
Copy link
Author

dcyoung commented Jan 27, 2017

@daniel-j-h thanks for the detailed response. That was extremely helpful. I took another stab at it today, which resulted in some success.
Obviously not ready to make a pull request for it yet... but how is this looking:
https://github.com/dcyoung/sweep-sdk/commit/b12637d108f8c412ab79c2fc4b98784513e3ba4a

@dcyoung
Copy link
Author

dcyoung commented Jan 27, 2017

Saw the pull request for the C++11 implementation. Are you just experimenting or advocating that we move forward with the C++ route?

@daniel-j-h
Copy link
Collaborator

daniel-j-h commented Jan 27, 2017 via email

@MikeGitb
Copy link
Contributor

MikeGitb commented Feb 1, 2017

While providing an internal io thread is probably a good default, I think there should be enough flexibility in the design of the library such that a user can provide its own thread or work completely single threaded.

@daniel-j-h
Copy link
Collaborator

Hm I understand how dependency injection could help there - not sure how to do this in an elegant way though. We would have to provide ways in the C API to set threading functionality for the C++ implementation to use. Which means we have to provide a thread abstraction. The reason we want to switch the implementation to C++ is exactly because we don't want to provide this abstraction ourselves.

@MikeGitb do you have ideas here?

@MikeGitb
Copy link
Contributor

MikeGitb commented Feb 2, 2017

Sorry, I expressed myself poorly. All I meant was that there should be some function that can be called by the user and e.g. returns after one scan (e.g. just like sweep_serial_device_read does now). The user then can call this function from whatever thread he want's - continuously in a loop or just once or interwoven with other function. The same function could then probably be called inside whatever worker thread the library provides. So what I'd expect the sweep class interface to look like would be something like this:

class sweep {
public:
     run(); //continuously produces new scans an puts them in some buffer. Only exits after `stop` is called
     run_async(){ _wth = std::thread(&sweep::run,this);  }; //  spins up internal worker thread that calls run
     void stop();
     scan getScan(); // returns scans produced by run()
     scan make_scan() // triggers and return a single scan synchronously - works without `run()`
     ...
private:
   std::thread _wth;
};

This is really just a sketch, but I would imagine, that people on less powerful embedded systems would appreciate it, if you give them full control over what threads run in their application. Wether run() should be public or just makes the interface to complicated I don't know.
This is btw. similar to the interface ASIO provides (although it doesn't give you a run_async() method).

@daniel-j-h
Copy link
Collaborator

Update: we switched the implementation to C++11. This should unblock further progress here.

I'm still thinking about the semantics here: the worker thread should probably put scans into a bounded (how large?) first-in last-out queue. This would give users a chance to poll slowly for scans without the issue of getting out of sync. But it would also mean the users seeing a drop in scans in case the user is not able to poll for scans fast enough.

Ideas on how to design this wrt. simplicity, usability and making it easy for users to understand the semantics?

@MikeGitb
Copy link
Contributor

As I said, I think there should remain a low level interface that pretty much works like the current and doesn't spin up a thread or anything like that. The way I see it, the serial interface on most operating systems will already provide some degree of buffering, so that might actually be enough in some cases and if not the user can use whatever threading strategy is native to his application. However, I realize now that this isn't that important as this is an open source-library anyway and if a user really needs full control he can just hack the lib.

That being said, I think your suggested design sounds pretty reasonable for the default interface (although I beleive you mean first-in-first-out) e.g.:

  • start_scanning starts the sensor and also spins up the internal worker thread
  • stop_scanning stops both (sensor and thread)
  • get_scan is a blocking call (maybe with timeout) that returns the next available 360 scan (returning a scan of size 0 indicates an error or a timeout)
  • For internally implementation use a bounded, threadsafe queue (delete oldest entries when maximum number of elements is reached). A simple implementation can just wrap a std::queue with a mutex and a condition variable. providing a send and a try_receive function.

@daniel-j-h
Copy link
Collaborator

Ah, yes @MikeGitb good catch - you are right. I agree with you on all points.

The problem I see here is with slow user polling. But dropping scans seems to be a trade-off we have to make here. Re. the bounded queue's size: with the device's max. motor speed of 10 Hz I would say the queue should be of length somewhere between 10--20 max. to buffer 1--2s worth of scans. Users should be able to keep up with that or we start dropping scans.

@MikeGitb
Copy link
Contributor

Well, if the user doesn't keep up you are going to loos scans one way or another I don't think there is a way around it. I would however make the buffer size configurable by the user.

@MikeGitb
Copy link
Contributor

In case you are looking for inspiration: Here is an untested version of libsweep that uses a thread internally but doesn't change the interface: https://github.com/MikeGitb/sweep-sdk/tree/worker_thread/libsweep.

@dcyoung
Copy link
Author

dcyoung commented Apr 6, 2017

@MikeGitb Thanks for the inspiration! The process of creating and then joining a thread wasn't quite as simple without the other changes you made. Instead I opted to implement a detached background thread that is created when scanning starts and cleans itself up when scanning ends. With a thread safe queue, we shouldn't have to deal with waiting or thread management ourselves.

The process works like this:

  • sweep_device struct
    • contains thread safe ScanQueue
    • contains bool stop_thread to indicate that an active thread should stop
  • sweep_device_start_scanning(device, ...)
    • various checks
    • starts scanning via sweep_device_attempt_start_scanning
    • flushes device->queue
    • creates background thread
      • sets device->stop_thread bool to false
      • creates a new thread to run sweep_device_accumulate_scans on the passed in device
      • detaches this thread so it can run in the background
  • sweep_device_stop_scanning(device, ...)
    • sets device->stop_thread bool to true
    • stops the scanning
    • sets device->is_scanning to false on response
  • sweep_device_accumulate_scans(device)
    • checks device->stop_thread at the granularity of reading individual sensor readings (ie: won't ever try to read a DX response)
    • gathers readings into properly formatted scans (now actually 0deg->360deg, instead of including sync reading of subsequent scans)
    • queues completed scans
  • sweep_device_get_scan(device, ...)
    • retrieves (dequeues) a scan from the device->scan_queue
  • sweep_device_get_scan_direct(device, ...)
    • functions like the old version of sweep_device_get_scan which gathered scans directly from the 2nd reading of the scan to the 1st (sync) reading of the subsequent scan).
  • sweep_device_attempt_start_scanning(device, ...)
    • same as old version of sweep_device_start_scanning, without the motor speed or stability checks etc.
    • sets device->is_scanning to true on response

High level (simple) use case is then unchanged:

sweep_device_start_scanning(sweep, &error);
...
for (int32_t num_scans = 0; num_scans < 10; ++num_scans) {
    sweep_scan_s scan = sweep_device_get_scan(sweep, &error);
    ...
sweep_device_stop_scanning(sweep, &error);
...

Low level use case is then:

sweep_device_attempt_start_scanning(sweep, &error);
...
for (int32_t num_scans = 0; num_scans < 10; ++num_scans) {
    sweep_scan_s scan = sweep_device_get_scan_direct(sweep, &error);
    ...
sweep_device_stop_scanning(sweep, &error);
...

I've implemented all of this in the scan-worker branch
I'm very open to feedback. After a little clean up I'll submit a PR.

@MikeGitb
Copy link
Contributor

MikeGitb commented Apr 6, 2017

Looking forward to it. Don't forget to make stop_thread an atomic variable.

@dcyoung
Copy link
Author

dcyoung commented Apr 6, 2017

Updated. Good catch.

Edit: looks like the travis build failed after changing the bool to an atomic variable. I'll have to investigate this.

@dcyoung
Copy link
Author

dcyoung commented Apr 11, 2017

Updated. Seems to work on Travis builds now. PR is up #70

@dcyoung
Copy link
Author

dcyoung commented Apr 21, 2017

Resolved in b867c37

@dcyoung dcyoung closed this as completed Apr 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants