Rally times out when deleting the tsdb index without tls #1457

Closed
nik9000 opened this issue Mar 14, 2022 · 2 comments
Labels
bug Something's wrong

@nik9000
Member

nik9000 commented Mar 14, 2022

Rally version (get with esrally --version): master

Invoked command: esrally race --track tsdb --pipeline benchmark-only --client-options="basic_auth_user:'elastic',basic_auth_password:'password'" --test-mode --enable-assertions --kill-running-processes

Configuration file (located in ~/.rally/rally.ini): some debug logs

JVM version: bundled

OS version:

$ uname -a
Linux porco 5.16.10-arch1-1 #1 SMP PREEMPT Wed, 16 Feb 2022 19:35:18 +0000 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
The delete-index operation times out when the tsdb index exists.

Steps to reproduce:

  1. Run ES without TLS enabled. The easiest way is to clone the Elasticsearch repository and run ./gradlew run in its root.
  2. Run rally once. It'll work: esrally race --track tsdb --pipeline benchmark-only --client-options="basic_auth_user:'elastic',basic_auth_password:'password'" --test-mode --enable-assertions --kill-running-processes
  3. Run it again. It'll hang on delete: esrally race --track tsdb --pipeline benchmark-only --client-options="basic_auth_user:'elastic',basic_auth_password:'password'" --test-mode --enable-assertions --kill-running-processes

Provide logs (if relevant):
We captured the packets. Rally sends the DELETE and ES replies with a 200:

DELETE /tsdb HTTP/1.1
Host: 127.0.0.1:9200
content-type: application/json
user-agent: elasticsearch-py/7.14.0 (Python 3.8.10)
authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==
Connection: keep-alive
x-elastic-client-meta: es=7.14.0,py=3.8.10,t=7.14.0,ai=3.8.1
Accept: */*
Accept-Encoding: gzip, deflate
Content-Length: 0

HTTP/1.1 200 OK
X-elastic-product: Elasticsearch
content-type: application/json
content-encoding: gzip
content-length: 47

{"acknowledged":true}

Wireshark shows that sixty seconds later the client side sends a RST, ACK on the TCP connection. That's when we declare a timeout.

Rally reports this timeout:

2022-03-14 14:59:32,20 -not-actor-/PID:775851 elasticsearch WARNING DELETE http://127.0.0.1:9200/foo [status:N/A request:60.217s]
Traceback (most recent call last):
  File "/home/nik9000/Code/Elastic/rally/.venv/lib/python3.8/site-packages/elasticsearch/_async/http_aiohttp.py", line 291, in perform_request
    async with self.session.request(
  File "/home/nik9000/Code/Elastic/rally/.venv/lib/python3.8/site-packages/aiohttp/client.py", line 1138, in __aenter__
    self._resp = await self._coro
  File "/home/nik9000/Code/Elastic/rally/.venv/lib/python3.8/site-packages/aiohttp/client.py", line 559, in _request
    await resp.start(conn)
  File "/home/nik9000/Code/Elastic/rally/.venv/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 913, in start
    self._continue = None
  File "/home/nik9000/Code/Elastic/rally/.venv/lib/python3.8/site-packages/aiohttp/helpers.py", line 721, in __exit__
    raise asyncio.TimeoutError from None
asyncio.exceptions.TimeoutError

But ES thinks the index is deleted.
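
A minimal sketch, using only asyncio from the standard library, of how a client can report a timeout for a request whose side effect has already happened on the server. The toy server, the 0.2 s timeout, and the `state` dict are illustrative stand-ins, not Rally or Elasticsearch code, and the artificial stall only substitutes for whatever kept Rally from seeing the reply:

```python
import asyncio


async def main(client_timeout=0.2):
    # Hypothetical server-side state standing in for "the index is gone".
    state = {"deleted": False}

    async def handle(reader, writer):
        # Read the request line, apply the side effect, then stall so that
        # the client never sees a response within its timeout.
        await reader.readline()
        state["deleted"] = True
        await asyncio.sleep(client_timeout * 5)
        writer.close()

    # Port 0 lets the OS pick a free port for the toy server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"DELETE /foo HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\n")
    await writer.drain()

    timed_out = False
    try:
        # The client gives up after client_timeout seconds without a reply.
        await asyncio.wait_for(reader.readline(), timeout=client_timeout)
    except asyncio.TimeoutError:
        timed_out = True
    finally:
        writer.close()
        server.close()

    return timed_out, state["deleted"]


result = asyncio.run(main())
print(result)
```

The client ends up with a timeout while the server-side flag is already set, which is the same mismatch as in the report: Rally logs a timeout, yet the index no longer exists.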

You can delete just fine over curl:

$ curl -v -uelastic:password -XDELETE 127.0.0.1:9200/foo
*   Trying 127.0.0.1:9200...
* Connected to 127.0.0.1 (127.0.0.1) port 9200 (#0)
* Server auth using Basic with user 'elastic'
> DELETE /foo HTTP/1.1
> Host: 127.0.0.1:9200
> Authorization: Basic ZWxhc3RpYzpwYXNzd29yZA==
> User-Agent: curl/7.81.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< X-elastic-product: Elasticsearch
< content-type: application/json
< content-length: 21
< 
* Connection #0 to host 127.0.0.1 left intact
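
One visible difference between the two captures: the elasticsearch-py request advertises Accept-Encoding: gzip, deflate and gets a gzip-encoded reply with content-length: 47, while curl sends no Accept-Encoding and gets content-length: 21. The sketch below only checks that those numbers are consistent with the same body and the same credentials. It does not establish that compression is what makes Rally hang.

```python
import base64
import gzip

# Body shown in both captures.
body = b'{"acknowledged":true}'

# Uncompressed length, matching the content-length curl received.
plain_length = len(body)

# gzip framing adds a header and a trailer, so the compressed form of such a
# short body is larger than the original. The exact size depends on the
# compressor, so this will not necessarily reproduce the 47 bytes ES sent.
gzip_length = len(gzip.compress(body))

# Both requests carry the same Basic credentials.
credentials = base64.b64decode("ZWxhc3RpYzpwYXNzd29yZA==").decode()

print(plain_length, gzip_length > plain_length, credentials)
```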
pquentin added the bug label on Jun 29, 2022
@inqueue
Member

inqueue commented Aug 1, 2022

I also ran into the issue while testing elastic/rally-tracks#291 using --pipeline=benchmark-only.

  • tsdb exists from a previous race
  • delete-index waits for 60s before moving to the next operation

Watching the cluster during the "wait" period, the index is deleted immediately, as it should be. It seems like Rally just isn't receiving or correctly parsing the response, so it doesn't move on to the next operation until the timeout expires.

@DJRickyB
Contributor

fixed with love in #1580
