Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing (with del) multiple items from an array can result in indexing problems #2027

Closed
nicktimko opened this issue May 2, 2024 · 3 comments · Fixed by #2210
Closed

Removing (with del) multiple items from an array can result in indexing problems #2027

nicktimko opened this issue May 2, 2024 · 3 comments · Fixed by #2210
Labels

Comments

@nicktimko
Copy link

Describe the bug
If removing multiple elements from an array, it seems like yq is computing where they are before doing any of the modifications, then removing those indicies, but as they get removed, the previously-calculated indicies are wrong.

Note that any how to questions should be posted in the discussion board and not raised as an issue.

Version of yq: 4.43.1
Operating system: macOS 14.4.1
Installed via: Homebrew

Input Yaml
names.yml:

{"ppl":[{"name":"alice"},{"name":"bob"},{"name":"charlie"},{"name":"deb"}]}

Script

#!/bin/bash

input='{"ppl":[{"name":"alice"},{"name":"bob"},{"name":"charlie"},{"name":"deb"}]}'
tmp_names=/tmp/names
jq <<<$input >$tmp_names.json

python3 -c "import platform;print(platform.platform())"
jq --version
yq --version

jq 'del(.ppl[] | select(.name == "alice")) | del(.ppl[] | select(.name == "bob"))' $tmp_names.json
yq 'del(.ppl[] | select(.name == "alice")) | del(.ppl[] | select(.name == "bob"))' $tmp_names.json
yq 'del(.ppl[] | select(.name == "bob")) | del(.ppl[] | select(.name == "alice"))' $tmp_names.json

Output

macOS-14.4.1-arm64-arm-64bit
jq-1.7.1
yq (https://github.com/mikefarah/yq/) version v4.43.1
{
  "ppl": [
    {
      "name": "charlie"
    },
    {
      "name": "deb"
    }
  ]
}
{
  "ppl": [
    {
      "name": "bob"  # [Ed: added comment] !!!
    },
    {
      "name": "deb"
    }
  ]
}
{
  "ppl": [
    {
      "name": "charlie"
    },
    {
      "name": "deb"
    }
  ]
}

Additional context
Feels kinda like this toy Python example, though here it will blow up if it tries to remove an out-of-bounds index, while if you change "bob" to "deb" in the yq example, then you wind up with it outputting bob/charlie/deb.

a = ['a', 'b', 'c', 'd']
to_remove = ['a', 'b']
for idx in [a.index(x) for x in to_remove]:
    a.pop(idx)
print(a)  #>>> ['b', 'd']
@jandubois
Copy link
Contributor

jandubois commented Dec 1, 2024

I ran into this issue today as well. Here is an even simpler example:

$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0])'
a: [1, 2, 3, 4, 5, 6]
$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0]) | del(.a[0])'
a: [1, 3, 4, 5, 6]
$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0]) | del(.a[0]) | del(.a[0])'
a: [1, 4, 5, 6]
$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0]) | del(.a[0]) | del(.a[0]) | del(.a[0])'
a: [1, 5, 6]

I expected that repeatedly executing del(.a[0]) would always remove the first element of the list, and not something in the middle. Every result except the first seems wrong. This is what I expected for the remaining commands:

$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0]) | del(.a[0])'
a: [2, 3, 4, 5, 6]
$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0]) | del(.a[0]) | del(.a[0])'
a: [3, 4, 5, 6]
$ echo 'a: [0,1,2,3,4,5,6]' | yq 'del(.a[0]) | del(.a[0]) | del(.a[0]) | del(.a[0])'
a: [4, 5, 6]

jandubois added a commit to jandubois/lima that referenced this issue Dec 1, 2024
This doesn't work because of mikefarah/yq#2027

Signed-off-by: Jan Dubois <[email protected]>
@jandubois
Copy link
Contributor

I tried to look into this a little bit. There is a hint in the logging output:

$ echo 'a: [0,1,2,3]' | yq -v 'del(.a[0]) | del(.a[0]) | del(.a[0])' 2>&1 | grep deleteChildOperator
[DEBUG] 23:32:01 operator_delete.go:22             deleteChildOperator       processing deletion of candidate D0, Pa[0], ScalarNode (!!int)::0
[DEBUG] 23:32:01 operator_delete.go:22             deleteChildOperator       processing deletion of candidate D0, Pa[1], ScalarNode (!!int)::1
[DEBUG] 23:32:01 operator_delete.go:22             deleteChildOperator       processing deletion of candidate D0, Pa[1], ScalarNode (!!int)::1

I suspect that the path should always be Pa[0] for successive del(.a[0]) calls, but it is Pa[1] for everything but the first one.

I don't really understand the code, but I wonder that if you delete an element, if the content[index].Key values need to be updated for all elements that got moved?

It looks like element 1 got moved to the 0 spot, but when it is to be deleted, it seems to still have a Key of 1, so this time the second element gets deleted instead (its original location).

And since element 0 survives the deletion, any successive del(.a[0]) statement will continue to delete a[1] instead.

@mikefarah Probably unrelated, but I also saw this comment in deleteChildOperator:

//need to iterate backwards to ensure correct indices when deleting multiple

I wonder what that relates too; how can you be sure that the matching nodes are ordered by increasing indices?

As I said, I don' really understand the code, but the comment is triggering my spidey-sense..

@mikefarah
Copy link
Owner

Fixed in 4.4.6

jandubois added a commit to jandubois/lima that referenced this issue Dec 8, 2024
This doesn't work because of mikefarah/yq#2027

Signed-off-by: Jan Dubois <[email protected]>
tmeijn pushed a commit to tmeijn/dotfiles that referenced this issue Dec 9, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [mikefarah/yq](https://github.com/mikefarah/yq) | patch | `v4.44.5` -> `v4.44.6` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>mikefarah/yq (mikefarah/yq)</summary>

### [`v4.44.6`](https://github.com/mikefarah/yq/releases/tag/v4.44.6)

[Compare Source](mikefarah/yq@v4.44.5...v4.44.6)

-   Fixed deleting items in array bug [#&#8203;2027](mikefarah/yq#2027), [#&#8203;2172](mikefarah/yq#2172); Thanks [@&#8203;jandubois](https://github.com/jandubois)
    -   Docker image for armv7 / raspberry pi3, Thanks [@&#8203;brianegge](https://github.com/brianegge)
    -   Fixed no-colors regression [#&#8203;2218](mikefarah/yq#2218)
    -   Fixed various panic scenarios [#&#8203;2211](mikefarah/yq#2211)
    -   Bumped dependencies

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy40NDAuNyIsInVwZGF0ZWRJblZlciI6IjM3LjQ0MC43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants