Drop rechunking of cdist's result #78

Merged
merged 1 commit into from
Oct 7, 2017

jakirkham
Owner

There should be no need to rechunk `cdist`'s result to match the input arrays. After all, this chunking should simply fall out of the operations performed in `cdist`. The only way the chunks could fail to match is if some rechunking happened behind the scenes. If it did, there was probably a reason for it, so we should leave the result's chunks alone.
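To illustrate how the result's chunks can fall out of the operations alone, here is a minimal sketch using plain Dask Array broadcasting. It is a stand-in for a pairwise distance computation and is not `cdist`'s actual implementation; the array names, shapes, and chunk sizes are made up for the example.

```python
import dask.array as da

# Hypothetical inputs: 6 and 9 points in 4 dimensions, chunked along rows only.
a = da.ones((6, 4), chunks=(2, 4))
b = da.ones((9, 4), chunks=(3, 4))

# Pairwise Euclidean distance via broadcasting (an assumed stand-in,
# not the code used by cdist). No explicit rechunk call appears anywhere.
diff = a[:, None, :] - b[None, :, :]
dist = da.sqrt((diff ** 2).sum(axis=-1))

# The result's row chunks follow `a` and its column chunks follow `b`.
row_chunks_match = dist.chunks[0] == a.chunks[0]
col_chunks_match = dist.chunks[1] == b.chunks[0]
```

In this sketch the output chunking is inherited from the inputs without any final `rechunk`, which is the behavior the change relies on.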

This rechunking was likely added to guarantee that `pdist`'s `triu`-based optimization strategy would work (as `triu` requires square chunks). However, we have dropped the `triu`-based optimization from `pdist` in favor of slicing out the pieces we want to keep ourselves. So `pdist` no longer requires `cdist` to rechunk the result. Further, if we did re-add some `triu`-based optimization to `pdist`, it would be up to `pdist` to guarantee the chunking was appropriate. So `pdist`'s former requirement should not constrain the chunks of `cdist`'s result.
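For readers unfamiliar with the slicing approach, the following is a rough sketch of the idea using NumPy as a stand-in for Dask. The helper name `condensed_by_slicing` and the sample data are hypothetical, and this is not the code in `pdist`; it only shows that the upper triangle can be gathered row by row without calling `triu`.

```python
import numpy as np


def condensed_by_slicing(square):
    """Collect the strict upper triangle of a square matrix row by row.

    Hypothetical helper for illustration; assumes at least two rows.
    """
    n = square.shape[0]
    # For row i, keep only the entries to the right of the diagonal.
    pieces = [square[i, i + 1:] for i in range(n - 1)]
    return np.concatenate(pieces)


# Made-up sample data: 4 points in 3 dimensions.
x = np.arange(12, dtype=float).reshape(4, 3)

# Square matrix of pairwise Euclidean distances of x with itself.
diff = x[:, None, :] - x[None, :, :]
square = np.sqrt((diff ** 2).sum(axis=-1))

condensed = condensed_by_slicing(square)
```

Because each piece is an ordinary slice, nothing here depends on the square matrix having square chunks.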

@jakirkham jakirkham merged commit ca9970d into master Oct 7, 2017
@jakirkham jakirkham deleted the drop_cdist_fin_rchk branch October 7, 2017 19:13
@jakirkham jakirkham changed the title Drop rechunking of cdist's result Drop rechunking of cdist's result Oct 9, 2017