
take advantage of libfringe for context switch #4

Closed
Xudong-Huang opened this issue Jan 27, 2018 · 6 comments

Comments

@Xudong-Huang
Owner

Xudong-Huang commented Jan 27, 2018

Take advantage of the libfringe context switch.

Please see Xudong-Huang/may#31.

Xudong-Huang changed the title from "take advantage of libfringe context switch" to "take advantage of libfringe for context switch" on Jan 27, 2018
@alkis

alkis commented Jan 28, 2018

There are other ideas worth taking from libfringe:

  • the generator constructor does not allocate
  • resume/suspend take arguments, which is nicer than set_co_para/get_co_para (see the sketch below)
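
A rough sketch of that style of API, recalled from libfringe's README; the exact names and signatures are an assumption on my part, not checked against a specific release:

extern crate fringe;

use fringe::{Generator, OsStack};

fn main() {
    // the constructor only takes a pre-made stack, so it does no allocation of its own
    let stack = OsStack::new(1 << 16).unwrap();

    let mut adder = Generator::new(stack, move |yielder, first: u32| {
        // the argument of the first resume() call arrives as `first`
        let second = yielder.suspend(first + 1);
        // suspend() hands a value to the caller and returns the argument of the
        // next resume() call, so no set_co_para/get_co_para side channel is needed
        yielder.suspend(second + 10);
    });

    assert_eq!(adder.resume(1), Some(2));   // pass 1 in, get 1 + 1 back
    assert_eq!(adder.resume(5), Some(15));  // pass 5 in, get 5 + 10 back
    assert_eq!(adder.resume(0), None);      // closure returned, generator is done
}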

@Xudong-Huang
Owner Author

Currently libfringe can't handle panics properly (edef1c/libfringe#75), and I don't know why. It also seems the crate hasn't been updated for a long time.

I will investigate more to see whether we can safely use the new context switch method based on libfringe. There is no clue yet how to handle returning from the generator closure in libfringe; if the user wants to return a useful result, it doesn't seem to be supported.

Something like this should work:

use generator::Gn;

fn test_return() {
    // the closure's return value should come out as the final yielded item
    let mut g = Gn::new_scoped(|_s| {
        return 42;
    });
    // the last resume yields the returned value, then the generator is done
    assert_eq!(g.next(), Some(42));
    assert!(g.is_done());
}
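
If the backend ends up being libfringe-like, one possible way to make the above work is for the wrapper layer to push the closure's return value out through one final suspend before the raw coroutine exits. This is purely a sketch with invented names, not actual generator-rs or libfringe code:

// hypothetical trait standing in for whatever yield primitive the backend exposes
trait Suspend<T> {
    fn suspend(&self, val: T);
}

// wrap the user closure so its return value becomes the last yielded item;
// the caller then sees Some(ret) from the final resume and is_done() afterwards
fn wrap_user_closure<T, F, S>(user: F) -> impl FnOnce(&S)
where
    F: FnOnce() -> T,
    S: Suspend<T>,
{
    move |suspender| {
        let ret = user();
        suspender.suspend(ret);
    }
}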

@Xudong-Huang
Owner Author

Xudong-Huang commented Feb 1, 2018

I benchmarked both swap implementations on my computer. At first glance the context switch doesn't look 10x faster, but libfringe really is: its bench performs 10 context switches per bench iteration, so per switch it is more than 10 times faster than generator-rs (the arithmetic is spelled out after the numbers below). This is really amazing.

generator-rs

test scoped_yield_bench      ... bench:          36 ns/iter (+/- 0)
test single_yield_bench      ... bench:          39 ns/iter (+/- 0)
test single_yield_with_bench ... bench:          36 ns/iter (+/- 0)

libfringe (Amanieu/unwind branch)

test arch::tests::swap ... bench:          25 ns/iter (+/- 0)
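
To spell out the per-switch arithmetic behind the claim above, here is a small, self-contained calculation using the figures from the bench output:

fn main() {
    // figures taken from the bench output above
    let libfringe_ns_per_iter = 25.0;
    let swaps_per_iter = 10.0; // libfringe's bench swaps 10 times per iteration
    let generator_rs_ns_per_yield = 36.0;

    let libfringe_ns_per_swap = libfringe_ns_per_iter / swaps_per_iter; // 2.5 ns
    let ratio = generator_rs_ns_per_yield / libfringe_ns_per_swap; // roughly 14x

    println!(
        "libfringe ~{:.1} ns/swap vs generator-rs ~{:.0} ns/yield => ~{:.0}x",
        libfringe_ns_per_swap, generator_rs_ns_per_yield, ratio
    );
}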

@Xudong-Huang
Owner Author

$ cargo benchcmp unix_master.txt unix_libfringe.txt 
  name                     unix_master.txt ns/iter  unix_libfringe.txt ns/iter  diff ns/iter   diff %  speedup 
- create_gen               103                      199                                   96   93.20%   x 0.52 
- fnbox_bench              40                       48                                     8   20.00%   x 0.83 
- init_gen                 25                       125                                  100  400.00%   x 0.20 
+ scoped_yield_bench       30                       15                                   -15  -50.00%   x 2.00 
+ single_yield_bench       33                       16                                   -17  -51.52%   x 2.06 
+ single_yield_with_bench  30                       14                                   -16  -53.33%   x 2.14 

Init currently takes an extra yield context switch that isn't necessary, but removing it needs a way to save the passed-in closure; a sketch of the idea follows.
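
A minimal sketch of that idea, with all names invented for illustration (a real backend would move the closure onto the coroutine stack on the first resume instead of calling it directly):

// hypothetical: remember the user closure at init time instead of switching
// into the coroutine just to park it there
struct LazyGen<F, T>
where
    F: FnOnce() -> T,
{
    func: Option<F>,   // saved here at init, no context switch needed
    result: Option<T>,
}

impl<F, T> LazyGen<F, T>
where
    F: FnOnce() -> T,
{
    fn new(f: F) -> Self {
        LazyGen { func: Some(f), result: None }
    }

    fn resume(&mut self) -> Option<&T> {
        if let Some(f) = self.func.take() {
            // first resume: this is where a real backend would switch stacks
            // and start running `f` on the coroutine stack
            self.result = Some(f());
        }
        self.result.as_ref()
    }
}

fn main() {
    let mut g = LazyGen::new(|| 42);
    assert_eq!(g.resume(), Some(&42));
}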

Also, the libfringe branch only supports nightly builds because we use special asm clobbers.

Context switch performance improved by about 2x, but the coroutine context switch usually isn't the bottleneck of a system anyway.

@Xudong-Huang
Owner Author

Removed the unnecessary init context switch:

$ cargo benchcmp unix_master.txt unix_libfringe_1.txt
 name                     unix_master.txt ns/iter  unix_libfringe_1.txt ns/iter  diff ns/iter   diff %  speedup 
+ create_gen               103                      94                                      -9   -8.74%   x 1.10 
- fnbox_bench              40                       42                                       2    5.00%   x 0.95 
+ init_gen                 25                       23                                      -2   -8.00%   x 1.09 
+ scoped_yield_bench       30                       17                                     -13  -43.33%   x 1.76 
+ single_yield_bench       33                       18                                     -15  -45.45%   x 1.83 
+ single_yield_with_bench  30                       15                                     -15  -50.00%   x 2.00

@Xudong-Huang
Owner Author

The technique used in this branch is hard to port to stable Rust because it depends on the inline assembly feature, which is not likely to be stabilized any time soon.
