Exit code behavior pip install vs. docker/k8s #1726

Open
boernd opened this issue Sep 11, 2024 · 4 comments

@boernd

boernd commented Sep 11, 2024

Curator version: 8.0.16

We run curator as a CronJob in Kubernetes. If, for instance, the pod cannot reach Elasticsearch during client creation, curator logs errors, but the job ends with status Completed rather than Error.

I compared a pip install with the k8s image and got different exit codes.

pip install:

> curator --config ./config.yml ./action_file.yml
2024-09-11 10:27:51,147 INFO      Preparing Action ID: 1, "delete_indices"
2024-09-11 10:27:51,147 INFO      Creating client object and testing connection
2024-09-11 10:27:51,211 CRITICAL  Unable to establish client connection to Elasticsearch!
2024-09-11 10:27:51,212 CRITICAL  Exception encountered: Connection error caused by: ConnectionError(Connection error caused by: NameResolutionError(<urllib3.connection.HTTPConnection object at 0x7f718f4241d0>: Failed to resolve 'xyz' ([Errno -2] Name or service not known)))
Traceback (most recent call last):
  File "/home/bernd/.local/bin/curator", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/curator/cli.py", line 299, in cli
    run(ctx)
  File "/home/bernd/.local/curator/lib/python3.11/site-packages/curator/cli.py", line 223, in run
    if ilm_action_skip(client, action_def):
                       ^^^^^^
UnboundLocalError: cannot access local variable 'client' where it is not associated with a value

> echo $?
1

Triggering the command within a k8s pod (official docker image):

> k exec -ti curator-elasticsearch-curator-onetime-klrml sh                                                     

/ $ /curator/curator --config /etc/es-curator/config.yml /etc/es-curator/action_file.yml
2024-09-11 08:35:27,652 INFO      Preparing Action ID: 1, "delete_indices"
2024-09-11 08:35:27,743 INFO      Creating client object and testing connection
2024-09-11 08:35:27,826 CRITICAL  Unable to establish client connection to Elasticsearch!
2024-09-11 08:35:27,826 CRITICAL  Exception encountered: Connection error caused by: ConnectionError(Connection error caused by: NameResolutionError(<urllib3.connection.HTTPConnection object at 0x7f55e814dad0>: Failed to resolve 'xyz' ([Errno -2] Name does not resolve)))

/ $ echo $?
0

The Dockerfile builds an executable. Maybe that causes the difference in behavior?
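
Until the executable returns a non-zero code on its own, one possible stopgap is a small wrapper that treats any CRITICAL log line as a failure. This is only a sketch: `exit_code_for` and `main` are hypothetical names, it assumes a Python interpreter is available wherever the wrapper runs (which may not be true inside the official image), and it assumes the log layout shown in the output above (date, time, level, message).

```python
import subprocess
import sys


def exit_code_for(returncode: int, output: str) -> int:
    """Return a non-zero code if the process failed or logged a CRITICAL line."""
    if returncode != 0:
        return returncode
    for line in output.splitlines():
        # Assumed layout: "<date> <time> <LEVEL>  <message>"
        fields = line.split(maxsplit=3)
        if len(fields) >= 3 and fields[2] == "CRITICAL":
            return 1
    return 0


def main(argv: list[str]) -> int:
    """Run the given command, echo its output, and compute the exit code."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so log lines are seen either way
        text=True,
    )
    sys.stdout.write(proc.stdout)
    return exit_code_for(proc.returncode, proc.stdout)
```

A CronJob command could then call something like `sys.exit(main(["/curator/curator", "--config", "/etc/es-curator/config.yml", "/etc/es-curator/action_file.yml"]))`, so the pod ends in Error whenever a CRITICAL line shows up.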

I also had a look at the code. If I read it correctly, get_client in the es_client library raises an ESClientException, but curator only catches a ClientException.
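
To illustrate how an UnboundLocalError like the one in the pip traceback can come about, here is a minimal sketch. It is not curator's actual code: `get_client`, `ESClientException`, `run_buggy`, and `run_fixed` are stand-ins defined here, and the assumption is that some handler logs the connection failure without stopping, so `client` is never assigned before it is used.

```python
import logging
import sys

logger = logging.getLogger("sketch")


class ESClientException(Exception):
    """Hypothetical stand-in for the exception raised on connection failure."""


def get_client():
    # Hypothetical stand-in: always fails, like an unresolvable host.
    raise ESClientException("Failed to resolve 'xyz'")


def run_buggy():
    try:
        client = get_client()
    except Exception as exc:
        # Logs the failure but does not exit, so execution falls through.
        logger.critical("Unable to establish client connection: %s", exc)
    # 'client' was never bound, so this line raises UnboundLocalError.
    return client


def run_fixed():
    try:
        client = get_client()
    except Exception as exc:
        logger.critical("Unable to establish client connection: %s", exc)
        sys.exit(1)  # stop here with an explicit non-zero exit code
    return client
```

In this sketch the buggy variant only exits non-zero by accident, because the uncaught UnboundLocalError propagates to the interpreter. The fixed variant makes the exit code explicit, so it does not depend on how the program is packaged or launched.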

@untergeek
Member

Oh, this is fascinating. Thank you for raising this issue. I will definitely see if I can make the frozen binary exit with a 1 error code when it should.

@archon810

@untergeek We just upgraded openSUSE and lost Python 3.9. With Python 3.11, I get the same UnboundLocalError as in the traceback above.

Any ideas how to make curator work again?

/usr/local/bin/curator --config /etc/filebeat/ap_filebeat/curator/curator.yml /etc/filebeat/ap_filebeat/curator/actions/ap.yml                                                                                                                                               
2024-11-27 04:03:40,399 WARNING   Permitting operation on indices with an ILM policy
2024-11-27 04:03:40,399 INFO      Preparing Action ID: 1, "delete_indices"
2024-11-27 04:03:40,399 INFO      Creating client object and testing connection
Traceback (most recent call last):
  File "/usr/local/bin/curator", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/curator/cli.py", line 299, in cli
    run(ctx)
  File "/usr/local/lib/python3.11/site-packages/curator/cli.py", line 223, in run
    if ilm_action_skip(client, action_def):
                       ^^^^^^
UnboundLocalError: cannot access local variable 'client' where it is not associated with a value

@archon810

I get a similar error with Python 3.10.

bin/curator --dry-run --config /etc/filebeat/ap_filebeat/curator/curator.yml /etc/filebeat/ap_filebeat/curator/actions/ap.yml                                                                                                                                               
2024-11-27 04:15:35,887 WARNING   Permitting operation on indices with an ILM policy
2024-11-27 04:15:35,887 INFO      Preparing Action ID: 1, "delete_indices"
2024-11-27 04:15:35,887 INFO      Creating client object and testing connection
Traceback (most recent call last):
  File "/root/es_curator/curator-env/bin/curator", line 8, in <module>
    sys.exit(cli())
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/curator/cli.py", line 299, in cli
    run(ctx)
  File "/root/es_curator/curator-env/lib64/python3.10/site-packages/curator/cli.py", line 223, in run
    if ilm_action_skip(client, action_def):
UnboundLocalError: local variable 'client' referenced before assignment

@archon810

Sorry, disregard this. I was running individual commands from a script that sets export ES_PWD=FOOBARETC, but I ran them without that export, which apparently makes curator fail. Now that I have run the script with the export, it's working again.

I'm leaving this here in case someone runs into this bug again (chances are it'll be future me).
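
For anyone hitting the same thing, a pre-flight check along these lines could surface the missing variable before curator even starts. It is a sketch only: `missing_env` is a hypothetical helper, ES_PWD is simply the variable name from my script, and how curator consumes it depends on your own config.

```python
import os


def missing_env(names, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]
```

Calling `missing_env(["ES_PWD"])` at the top of a wrapper script and exiting with a clear message when the list is non-empty would have pointed at the real cause instead of the UnboundLocalError traceback.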
