This repository has been archived by the owner on Jul 16, 2021. It is now read-only.

Adding a Shuffler Transformer #127

Closed
AtheMathmo opened this issue Sep 10, 2016 · 1 comment · Fixed by #135

Comments

@AtheMathmo
Owner

Note that this is blocked by #117.

With the addition of swap_rows and swap_cols it would be nice if we added a Transformer to shuffle the rows of input data. We should use an algorithm similar to the in_place_fisher_yates in learning/toolkit/rand_utils to do the actual shuffle.
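As a rough illustration of the row shuffle described above, here is a minimal, dependency-free sketch. It assumes a row-major `Vec<f64>` in place of the real matrix type; the `swap_rows` helper, the `shuffle_rows` function and the tiny `XorShift64` generator are all hypothetical stand-ins written for this example, not the rulinalg or rand_utils APIs.

```rust
/// Tiny xorshift64 generator, used only to keep this sketch dependency-free.
/// A real implementation would presumably use the `rand` crate instead.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Index in `0..bound`. The modulo introduces a slight bias,
    /// which is acceptable for a sketch.
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Hypothetical stand-in for a matrix `swap_rows`:
/// swaps rows `a` and `b` of a row-major buffer with `cols` columns.
fn swap_rows(data: &mut [f64], cols: usize, a: usize, b: usize) {
    if a == b {
        return;
    }
    for c in 0..cols {
        data.swap(a * cols + c, b * cols + c);
    }
}

/// In-place Fisher-Yates shuffle over the rows of a row-major matrix.
fn shuffle_rows(data: &mut [f64], rows: usize, cols: usize, rng: &mut XorShift64) {
    assert_eq!(data.len(), rows * cols);
    // Walk from the last row down, swapping each row with a random
    // row at or before it.
    for i in (1..rows).rev() {
        let j = rng.below(i + 1);
        swap_rows(data, cols, i, j);
    }
}
```

Each row moves as a unit, so the result is a permutation of the original rows with their contents left intact.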

There are a couple of ways that we can handle the inv_transform function.

  • Don't implement it at all and just return an Err. This is really ugly.
  • Implement it by keeping track of the swaps and undoing them in reverse order. This has some memory overhead for a feature that will rarely be needed.
  • Separate out the traits: pub trait Invertible: Transformer, and don't implement Invertible for Shuffler.
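To make the third option more concrete, here is a sketch of what the trait split could look like. The signatures are simplified assumptions (plain `Vec<Vec<f64>>` rows and `String` errors rather than the library's own types), and `Scaler`, `Shuffler` and `round_trip` are hypothetical names used only for illustration.

```rust
type Rows = Vec<Vec<f64>>;

/// Simplified stand-in for the Transformer trait (assumed signature).
trait Transformer {
    fn transform(&mut self, inputs: Rows) -> Result<Rows, String>;
}

/// Only transformers that can undo their work implement this.
trait Invertible: Transformer {
    fn inv_transform(&self, inputs: Rows) -> Result<Rows, String>;
}

/// Hypothetical invertible transformer: multiplies every entry by `factor`.
struct Scaler {
    factor: f64,
}

impl Transformer for Scaler {
    fn transform(&mut self, mut inputs: Rows) -> Result<Rows, String> {
        for row in inputs.iter_mut() {
            for x in row.iter_mut() {
                *x *= self.factor;
            }
        }
        Ok(inputs)
    }
}

impl Invertible for Scaler {
    fn inv_transform(&self, mut inputs: Rows) -> Result<Rows, String> {
        if self.factor == 0.0 {
            return Err("cannot invert a scale by zero".to_string());
        }
        for row in inputs.iter_mut() {
            for x in row.iter_mut() {
                *x /= self.factor;
            }
        }
        Ok(inputs)
    }
}

/// Shuffler implements Transformer only, so it has no inv_transform at all.
struct Shuffler {
    state: u64,
}

impl Transformer for Shuffler {
    fn transform(&mut self, mut inputs: Rows) -> Result<Rows, String> {
        // Fisher-Yates over the rows, driven by a small LCG (illustrative only).
        for i in (1..inputs.len()).rev() {
            self.state = self
                .state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let j = ((self.state >> 33) as usize) % (i + 1);
            inputs.swap(i, j);
        }
        Ok(inputs)
    }
}

/// Code that needs the inverse asks for `Invertible` in its bounds, so
/// passing a Shuffler here is rejected at compile time.
fn round_trip<T: Invertible>(t: &mut T, inputs: Rows) -> Result<Rows, String> {
    let out = t.transform(inputs)?;
    t.inv_transform(out)
}
```

With this layout, calling `round_trip` with a `Shuffler` fails to compile instead of returning an Err at runtime. The cost is that existing code written against a single trait would need updating.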

None of these solutions feels all that clean. We could adopt the same approach as DBSCAN and hide the memory overhead behind a flag on the Shuffler struct, so that the swaps are only recorded when the user asks for it. This is really ugly but does allow the user to pick the best option for them.
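The flag-based variant might look roughly like the sketch below. Everything here is an assumption made for illustration: the `record_swaps` field name, the `Vec<Vec<f64>>` row representation, the `String` error type and the small LCG used for randomness are not taken from the shuffler branch.

```rust
/// Sketch of a Shuffler that only pays for swap bookkeeping when asked to.
struct Shuffler {
    state: u64,                  // LCG state (illustrative randomness only)
    record_swaps: bool,          // hypothetical opt-in flag
    swaps: Vec<(usize, usize)>,  // swaps from the most recent transform
    len: usize,                  // row count seen by the most recent transform
}

impl Shuffler {
    fn new(seed: u64, record_swaps: bool) -> Self {
        Shuffler { state: seed, record_swaps, swaps: Vec::new(), len: 0 }
    }

    /// Pseudo-random index in `0..bound`.
    fn next_index(&mut self, bound: usize) -> usize {
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.state >> 33) as usize) % bound
    }

    /// Fisher-Yates shuffle of the rows, optionally recording each swap.
    fn transform(&mut self, rows: &mut Vec<Vec<f64>>) {
        self.swaps.clear();
        self.len = rows.len();
        for i in (1..rows.len()).rev() {
            let j = self.next_index(i + 1);
            rows.swap(i, j);
            if self.record_swaps {
                self.swaps.push((i, j));
            }
        }
    }

    /// Undo the most recent shuffle by replaying its swaps in reverse order.
    fn inv_transform(&self, rows: &mut Vec<Vec<f64>>) -> Result<(), String> {
        if !self.record_swaps {
            return Err("swaps were not recorded; cannot invert".to_string());
        }
        if rows.len() != self.len {
            return Err("row count differs from the last transform".to_string());
        }
        for &(i, j) in self.swaps.iter().rev() {
            rows.swap(i, j);
        }
        Ok(())
    }
}
```

Recording costs one `(usize, usize)` pair per row, which is the memory overhead mentioned above. With the flag off, `inv_transform` falls back to the first option and returns an Err.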

Would be happy to get feedback on these options.

@AtheMathmo
Owner Author

There is a POC on branch shuffler - it needs the rulinalg 0.3 update so that we can use swap_rows.
