Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtering GET request via With{Min,Max}{Create,Mod}Rev has wrong result Count #18267

Open
4 tasks done
jcferretti opened this issue Jul 2, 2024 · 7 comments
Open
4 tasks done

Comments

@jcferretti
Copy link
Contributor

Bug report criteria

What happened?

root@cfs-campaign-1 18:58:58 ~
# etcdctl --write-out=fields get --prefix /k     # Show no keys with a "/k" prefix exist
"ClusterID" : 6736027128771979048
"MemberID" : 5334263768432527757
"Revision" : 1453
"RaftTerm" : 4
"More" : false
"Count" : 0

root@cfs-campaign-1 18:59:08 ~
# etcdctl --write-out=fields put /k1 v1        # Add /k1=v1 at revision 1454
"ClusterID" : 6736027128771979048
"MemberID" : 5334263768432527757
"Revision" : 1454
"RaftTerm" : 4

root@cfs-campaign-1 18:59:14 ~
# etcdctl --write-out=fields put /k2 v2        # Add /k2=v2 at revision 1455
"ClusterID" : 6736027128771979048
"MemberID" : 5334263768432527757
"Revision" : 1455
"RaftTerm" : 4

root@cfs-campaign-1 18:59:16 ~
# etcdctl --write-out=fields put /k3 v3        # Add /k3=v3 at revision 1456
"ClusterID" : 6736027128771979048
"MemberID" : 5334263768432527757
"Revision" : 1456
"RaftTerm" : 4

root@cfs-campaign-1 18:59:19 ~
# etcdctl --write-out=fields get --prefix --max-create-rev 1454 /k           # The count is wrong, should be 1.
"ClusterID" : 6736027128771979048
"MemberID" : 5334263768432527757
"Revision" : 1456
"RaftTerm" : 4
"Key" : "/k1"
"CreateRevision" : 1454
"ModRevision" : 1454
"Version" : 1
"Value" : "v1"
"Lease" : 0
"More" : false
"Count" : 3

What did you expect to happen?

Count in the GET request's result should be consistent with the size of the kvs array returned.

How can we reproduce it (as minimally and precisely as possible)?

See "what happened" above.

Anything else we need to know?

The bug seems to be in etc/server/etcdserver/txn/txn.go. When the filtering is applied:

if r.MaxModRevision != 0 {

the implementation of pruneKvs does update the resulting array of kvs

func pruneKVs(rr *mvcc.RangeResult, isPrunable func(*mvccpb.KeyValue) bool) {

but the count is set without taking into account the actual returned kvs that may have been filtered:

resp.Count = int64(rr.Count)

Etcd version (please run commands below)

# etcd --version
etcd Version: 3.5.12
Git SHA: e7b3bb6cc
Go Version: go1.20.13
Go OS/Arch: linux/amd64

# etcdctl version
etcdctl version: 3.6.0-alpha.0
API version: 3.6

Etcd configuration (command line flags or environment variables)

name: etcd-1

data-dir: /var/lib/etcd/dh/cbd0f3c32

listen-client-urls: https://10.128.1.160:2379,https://localhost:2379
listen-peer-urls: https://10.128.1.160:2380,https://localhost:2380

advertise-client-urls: https://10.128.1.160:2379
initial-advertise-peer-urls: https://10.128.1.160:2380

initial-cluster-state: new
initial-cluster-token: cbd0f3c32
initial-cluster: etcd-1=https://10.128.1.160:2380

strict-reconfig-check: true
enable-v2: false

# * DHE worker processes start on demand and sometimes en-masse.
# * A newly elected etcd leader after failover needs to validate lots
#   of credentials for all the new clients showing up.
# In either of these circumstances, too high a cost to bcrypt
# (used during role based authentication) can starve a machine of CPU.
# Reduce the default of 10 to the min of 4.
bcrypt-cost: 4

peer-transport-security:
  # the peer certs need to have CN=peer.etcd.deephaven.local
  cert-allowed-cn: peer.etcd.deephaven.local
  client-cert-auth: true
  trusted-ca-file: /etc/etcd/dh/cbd0f3c32/ssl/peer/ca.crt
  cert-file: /etc/etcd/dh/cbd0f3c32/ssl/peer/etcd-1.public.crt
  key-file: /etc/etcd/dh/cbd0f3c32/ssl/peer/etcd-1.private.key

client-transport-security:
  # clients are encrypted, but not authenticated - we rely on username / password for auth
  client-cert-auth: false
  # don't need to set CA since we aren't authenticating clients
  # trusted-ca-file: /etc/etcd/dh/cbd0f3c32/ssl/server/ca.crt
  cert-file: /etc/etcd/dh/cbd0f3c32/ssl/server/etcd-1.public.crt
  key-file: /etc/etcd/dh/cbd0f3c32/ssl/server/etcd-1.private.key

auto-compaction-mode: periodic
auto-compaction-retention: "168"

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

# etcdctl member list -w table
+------------------+---------+--------+---------------------------+---------------------------+------------+
|        ID        | STATUS  |  NAME  |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+--------+---------------------------+---------------------------+------------+
| 4a071ca29fa8498d | started | etcd-1 | https://10.128.1.160:2380 | https://10.128.1.160:2379 |      false |
+------------------+---------+--------+---------------------------+---------------------------+------------+

# etcdctl endpoint status -w table
+---------------------------+------------------+---------+-----------------+---------+--------+-----------------------+-------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | STORAGE VERSION | DB SIZE | IN USE | PERCENTAGE NOT IN USE | QUOTA | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+-----------------+---------+--------+-----------------------+-------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.128.1.160:2379 | 4a071ca29fa8498d |  3.5.12 |                 |  6.1 MB | 6.0 MB |                    1% |   0 B |      true |      false |         4 |       2325 |               2325 |        |
+---------------------------+------------------+---------+-----------------+---------+--------+-----------------------+-------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output

No response

@jcferretti jcferretti changed the title Filtering a GET request via With{Min,Max}{Create,Mod}Rev results in wrong result Count Filtering GET request via With{Min,Max}{Create,Mod}Rev has wrong result Count Jul 2, 2024
@jberkus
Copy link

jberkus commented Jul 18, 2024

Hey, we looked at this issue during today's Triage meeting. One of the reviewers needs to work on reproducing it, which isn't assigned yet. Will keep you posted!

@jcferretti
Copy link
Contributor Author

Thanks for the update. My team has a workaround in place in our code that uses etcd, so no rush from our side.
Main motivation for getting this in is other people not being surprised/bitten by it.

@ahrtr
Copy link
Member

ahrtr commented Jul 19, 2024

Thanks for reporting this issue.

It's valid bug. The RangeResponse.Count should be the count of keys which fall into the range. But please note that it may be > len(RangeResponse.Kvs) in pagination case, and RangeResponse.More should be true as well in such csae.

Count int64 `protobuf:"varint,4,opt,name=count,proto3" json:"count,omitempty"`

@jcferretti
Copy link
Contributor Author

jcferretti commented Jul 20, 2024

But please note that it may be > len(RangeResponse.Kvs) in pagination case, and RangeResponse.More should be true as well in such csae.

Thanks, I had missed that.
I've updated the PR #18268. All tests are passing (including new ones).

@ahrtr
Copy link
Member

ahrtr commented Sep 29, 2024

See #18268 (comment)

@jmhbnz
Copy link
Member

jmhbnz commented Oct 24, 2024

Discussed during sig-etcd triage meeting - Let's reframe this issue as a docs update and progress that in the short term. @ahrtr which docs page do you think this best fits under?

@ahrtr
Copy link
Member

ahrtr commented Oct 24, 2024

@ahrtr which docs page do you think this best fits under?

It should be https://etcd.io/docs/v3.5/dev-guide/api_reference_v3/.

We should update the comment for the field count firstly. It seems that the doc will be synced with the comment,

// count is set to the number of keys within the range when requested.
int64 count = 4;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

4 participants