Fix IntersectMapKeys first-wins bug #339
There was a bug in the old version. The intended behavior was for the map value to come from the *first* map in the list. However, the actual behavior was that the map value came from either the first map or the *smallest* map, depending on the order of the input maps.

Alternative considered: a more efficient algorithm. This alternative wasn't chosen because it would add a lot of complexity, the improvement would only matter for large datasets, and it would probably be slower for small datasets.

The most efficient algorithm I can think of would be a hash join:

- Find the intersection of the keys:
  - Sort the input maps in increasing order of size
  - Iterate through the input maps and keep only the keys that appear in every map
- Iterate again through each map
  - For each key that's already known to be in every map:
    - If this key doesn't already appear in the output map:
      - Add this key's value to the output map

I didn't choose this option because its complexity probably isn't worth it.
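For illustration, the hash-join alternative described above could look roughly like the following Go sketch. This is not the code from this PR: the function name `intersectFirstWins` and its generic signature are assumptions made for the example, and the real `IntersectMapKeys` may differ in signature and details.

```go
package main

import "sort"

// intersectFirstWins is a hypothetical helper, not the library's API.
// It returns the keys present in every input map, taking each value
// from the first map (in argument order) that contains the key.
func intersectFirstWins[K comparable, V any](maps ...map[K]V) map[K]V {
	out := make(map[K]V)
	if len(maps) == 0 {
		return out
	}

	// Sort a copy by size so the caller's order survives for the value pass.
	bySize := make([]map[K]V, len(maps))
	copy(bySize, maps)
	sort.Slice(bySize, func(i, j int) bool {
		return len(bySize[i]) < len(bySize[j])
	})

	// Key pass: start from the smallest map's keys and drop every key
	// that is missing from any other map.
	common := make(map[K]struct{}, len(bySize[0]))
	for k := range bySize[0] {
		common[k] = struct{}{}
	}
	for _, m := range bySize[1:] {
		for k := range common {
			if _, ok := m[k]; !ok {
				delete(common, k) // deleting during range is safe in Go
			}
		}
	}

	// Value pass: walk the maps in their original order; the first map
	// to supply a key wins, later maps never overwrite it.
	for _, m := range maps {
		for k := range common {
			if _, done := out[k]; done {
				continue
			}
			out[k] = m[k]
		}
	}
	return out
}
```

Because every surviving key is present in every map, the value pass ends up taking all values from the first argument, regardless of which map is smallest.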
Converted to draft while I rethink this algorithm
You could still have the smallest map optimization.
What's the bug here?
From the first revision of dave's description: There was a bug in the old version. The intended behavior was for the map value to come from the first map in the list. However, the actual behavior was that the map value came from either the first map or the smallest map, depending on the size order of the input maps.
Right. You can cherry-pick just the test changes to reproduce the bug.
I'm pretty sure the fix can be much simpler at the cost of one more iteration. Given the number of maps we're intersecting is usually the lower bound, this seems negligible.
From my point of view, the fix may be simpler in #340 (add a for loop), but the resulting algorithm is simpler in this PR.
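As a rough illustration of the suggestions in this thread (keep the smallest-map optimization, but always read the value from the first map), here is a minimal Go sketch. It is not the code from #339 or #340; the name `intersectSmallestFirstWins` and its signature are assumptions made for the example.

```go
package main

// intersectSmallestFirstWins is a hypothetical helper, not the library's API.
// It iterates over the smallest map to find candidate keys, but always
// takes the value from maps[0], so the result does not depend on map sizes.
func intersectSmallestFirstWins[K comparable, V any](maps ...map[K]V) map[K]V {
	out := make(map[K]V)
	if len(maps) == 0 {
		return out
	}

	// Pick the smallest map: its keys bound the size of the intersection.
	smallest := maps[0]
	for _, m := range maps[1:] {
		if len(m) < len(smallest) {
			smallest = m
		}
	}

	for k := range smallest {
		inAll := true
		for _, m := range maps {
			if _, ok := m[k]; !ok {
				inAll = false
				break
			}
		}
		if inAll {
			// The key exists in every map, so maps[0][k] is always present.
			out[k] = maps[0][k]
		}
	}
	return out
}
```

The smallest map only decides which keys are checked; it never decides which value is returned.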
## What's Changed

* checkout config from default branch in load-workflow-variables action by @gjonathanhong in #337
* Fix IntersectMapKeys first-wins bug by @drevell in #339
* Fix docstring for logging method by @pjh in #341

## New Contributors

* @pjh made their first contribution in #341

**Full Changelog**: v1.1.1...v1.1.2

Co-authored-by: token-minter-prod[bot] <125072751+token-minter-prod[bot]@users.noreply.github.com>