Yet another "Failed to renew SSL" issue #1482

Closed
dl-lim opened this issue Oct 13, 2021 · 5 comments

@dl-lim

dl-lim commented Oct 13, 2021

Hi there, I'm using the official docker image, version 2.9.7

I've looked for many other similar instances of SSL certbot not working correctly, but haven't come across a solution for my case yet. Please do point out the right issue # if I've missed them.

So, I've been using this for a while now, so am not new to the project, and I've always had problems with certbot renewing.

Ports are correctly forwarded externally so no issue there. I've also been using Cloudflare as an additional layer to anonymise my IP + DDoS protection. I use Cloudflare's DNS service and also DDNS (where I use another service to automatically update my external IP address to Cloudflare)

I thought Cloudflare might be the problem when renewing certs, so I disabled its proxy temporarily to test renewing, but it still fails with the following logs.

The only approach I haven't tested is deleting all the certs and recreating them, because I have so many! A feature to do this in bulk would be much appreciated, but that's beside the point :)

So, back to the logs, what is immediately obvious?

I'd thought that Temporary failure in name resolution meant it's not reaching acme-v02.api.letsencrypt.org but this is not true, since I am able to ping it successfully from the machine hosting Nginx Proxy Manager.

Any thoughts?

[10/13/2021] [11:04:27 AM] [Migrate  ] › ℹ  info      Current database version: 20210210154703
[10/13/2021] [11:04:27 AM] [Setup    ] › ℹ  info      Logrotate Timer initialized
[10/13/2021] [11:04:27 AM] [Setup    ] › ℹ  info      Logrotate completed.
[10/13/2021] [11:04:27 AM] [IP Ranges] › ℹ  info      Fetching IP Ranges from online services...
[10/13/2021] [11:04:27 AM] [IP Ranges] › ℹ  info      Fetching https://ip-ranges.amazonaws.com/ip-ranges.json
[10/13/2021] [11:04:37 AM] [IP Ranges] › ✖  error     getaddrinfo EAI_AGAIN ip-ranges.amazonaws.com
[10/13/2021] [11:04:37 AM] [SSL      ] › ℹ  info      Let's Encrypt Renewal Timer initialized
[10/13/2021] [11:04:37 AM] [SSL      ] › ℹ  info      Renewing SSL certs close to expiry...
[10/13/2021] [11:04:37 AM] [IP Ranges] › ℹ  info      IP Ranges Renewal Timer initialized
[10/13/2021] [11:04:37 AM] [Global   ] › ℹ  info      Backend PID 239 listening on port 3000 ...
`QueryBuilder#allowEager` method is deprecated. You should use `allowGraph` instead. `allowEager` method will be removed in 3.0
`QueryBuilder#eager` method is deprecated. You should use the `withGraphFetched` method instead. `eager` method will be removed in 3.0
QueryBuilder#omit is deprecated. This method will be removed in version 3.0
[10/13/2021] [11:08:08 AM] [SSL      ] › ℹ  info      Renewing Let'sEncrypt certificates for Cert #5: domain.com
[10/13/2021] [11:08:08 AM] [SSL      ] › ℹ  info      Command: certbot renew --force-renewal --non-interactive --config "/etc/letsencrypt.ini" --cert-name "npm-5" --preferred-challenges "dns,http" --disable-hook-validation 
[10/13/2021] [11:08:09 AM] [Express  ] › ⚠  warning   Command failed: certbot renew --force-renewal --non-interactive --config "/etc/letsencrypt.ini" --cert-name "npm-5" --preferred-challenges "dns,http" --disable-hook-validation 
Another instance of Certbot is already running.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /tmp/tmpa03h0p60/log or re-run Certbot with -v for more details.
[10/13/2021] [11:17:45 AM] [SSL      ] › ✖  error     Error: Command failed: certbot renew --non-interactive --quiet --config "/etc/letsencrypt.ini" --preferred-challenges "dns,http" --disable-hook-validation  
Failed to renew certificate npm-11 with error: Some challenges have failed.
Failed to renew certificate npm-12 with error: Some challenges have failed.
Failed to renew certificate npm-13 with error: Some challenges have failed.
Failed to renew certificate npm-14 with error: Some challenges have failed.
Failed to renew certificate npm-15 with error: Some challenges have failed.
Failed to renew certificate npm-16 with error: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fc37e4a6e10>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
Failed to renew certificate npm-17 with error: Some challenges have failed.
Failed to renew certificate npm-18 with error: Some challenges have failed.
Failed to renew certificate npm-26 with error: urn:ietf:params:acme:error:serverInternal :: The server experienced an internal error :: Error creating new order
Failed to renew certificate npm-36 with error: Some challenges have failed.
Failed to renew certificate npm-38 with error: Some challenges have failed.
Failed to renew certificate npm-40 with error: Some challenges have failed.
Failed to renew certificate npm-41 with error: Some challenges have failed.
Failed to renew certificate npm-43 with error: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fc37e4807f0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
Failed to renew certificate npm-44 with error: Some challenges have failed.
Failed to renew certificate npm-46 with error: Some challenges have failed.
Failed to renew certificate npm-47 with error: Some challenges have failed.
Failed to renew certificate npm-48 with error: Some challenges have failed.
Failed to renew certificate npm-50 with error: Some challenges have failed.
Failed to renew certificate npm-54 with error: Some challenges have failed.
Failed to renew certificate npm-8 with error: Some challenges have failed.
All renewals failed. The following certificates could not be renewed:
  /etc/letsencrypt/live/npm-11/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-12/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-13/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-14/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-15/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-16/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-17/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-18/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-26/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-36/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-38/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-40/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-41/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-43/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-44/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-46/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-47/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-48/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-50/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-54/fullchain.pem (failure)
  /etc/letsencrypt/live/npm-8/fullchain.pem (failure)
21 renew failure(s), 0 parse failure(s)
    at ChildProcess.exithandler (node:child_process:326:12)
    at ChildProcess.emit (node:events:369:20)
    at maybeClose (node:internal/child_process:1067:16)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:301:5)

On other occasions, I'd get internal errors or timeouts too. When that happens I can't make any further changes, so I run docker-compose down and docker-compose up -d again to "reset" the app.

Generally, can't renew cert without error, and it is only by luck that it gets renewed successfully.

Most of the certs are shown as expired on the web UI, but https still works on the underlying reverse-proxied application

What can I do about this and what is a good practice to prevent this from happening?

@dl-lim dl-lim added the bug label Oct 13, 2021
@chaptergy
Collaborator

Yeah, unfortunately certbot does have a number of issues which we can't do anything about, except to switch to a different tool to generate certificates, which is planned for v3.

Concerning some of the questions you asked:

  • One of your current error messages is "Another instance of Certbot is already running." See Another instance of Certbot is already running #918 (comment) on how to fix this.
  • When certbot is renewing the certificates, and even one certificate fails to renew, the expiry time might not be updated for the other successfully renewed certificates in the database, which means the cert is shown as expired in the web ui, when it isn't actually expired. See SSL Certs Expiry Date Does Not Update with Each Renewal. #792.
  • Temporary failure in name resolution could mean either that certbot is not able to connect to the Let's Encrypt ACME servers, or that they were not able to connect back to the web host on port 80 to confirm you have control over the domain (depending on the context). Cloudflare is also often a source of issues. If you disable Cloudflare, are you able to access your proxied apps from the outside on port 80?
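Since the renewal output in the first post mixes several different errors, it can help to group them before digging further. Below is a minimal sketch, assuming the "Failed to renew certificate ... with error: ..." line format shown in that log. The category labels and the function names `classify` and `summarize` are made up for illustration and are not part of certbot or Nginx Proxy Manager; the sample text is abbreviated from the log above.

```python
import re
from collections import Counter

# Matches lines such as:
#   Failed to renew certificate npm-16 with error: <message>
FAILURE_RE = re.compile(r"^Failed to renew certificate (\S+) with error: (.*)$")


def classify(message: str) -> str:
    """Map a certbot error message to a coarse category (labels are illustrative)."""
    if "Temporary failure in name resolution" in message:
        return "dns-lookup-failed"    # certbot could not resolve the ACME host
    if "Some challenges have failed" in message:
        return "challenge-failed"     # the ACME server could not validate the domain
    if "serverInternal" in message:
        return "acme-server-error"    # error reported by the ACME server itself
    return "other"


def summarize(log_text: str) -> Counter:
    """Count renewal failures per category across all lines of a log excerpt."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILURE_RE.match(line.strip())
        if match:
            counts[classify(match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Abbreviated sample lines taken from the log in the first post.
    sample = (
        "Failed to renew certificate npm-15 with error: Some challenges have failed.\n"
        "Failed to renew certificate npm-16 with error: ... [Errno -3] "
        "Temporary failure in name resolution\n"
        "Failed to renew certificate npm-26 with error: "
        "urn:ietf:params:acme:error:serverInternal :: The server experienced "
        "an internal error :: Error creating new order\n"
    )
    for category, count in summarize(sample).most_common():
        print(category, count)
```

Splitting the failures this way separates the ones where certbot never reached the ACME server (name resolution) from the ones where the server was reached but validation failed.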

The certbot output in the normal docker log is usually very limited and does not provide much useful information. The letsencrypt logs contain much more information, which would be useful to find the source of the problem. See #1271 (comment) on how to access these logs.

@dl-lim
Author

dl-lim commented Oct 14, 2021

@chaptergy firstly, thanks a bunch for your detailed writeup, really appreciated.

> planned for v3.

How close are we to this? We're at 2.9.9, so must be close? :D


> Another instance of Certbot is already running

Thanks. I noted this one. Fixing it usually still leads to timeout.


> the expiry time might not be updated for the other successfully renewed certificates in the database, which means the cert is shown as expired in the web ui, when it isn't

Yeah, this makes sense. The https still works without error on browser side, but it's just all wrong in the Web UI.
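To check this independently of the Web UI, one option is to read the expiry date straight from the certificate the server presents. This is a minimal sketch using only Python's standard library. The function names are hypothetical, and the hostname you pass in would be one of your own proxied domains. With the Cloudflare proxy turned on, the certificate returned would be Cloudflare's edge certificate rather than the one managed by certbot, so this only tells you something useful with the proxy off.

```python
import socket
import ssl
from datetime import datetime, timezone


def parse_not_after(not_after: str) -> datetime:
    """Convert the 'notAfter' string from getpeercert() into an aware UTC datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)


def served_cert_expiry(host: str, port: int = 443, timeout: float = 10.0) -> datetime:
    """Return the expiry time of the certificate that `host` presents.

    The default SSL context verifies the certificate, so an already expired
    certificate raises ssl.SSLCertVerificationError instead of returning a date.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return parse_not_after(cert["notAfter"])
```

Calling `served_cert_expiry("your.domain")` and comparing the result with the date shown in the Web UI would show whether the certificate itself is stale or only the database entry.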


> Temporary failure in name resolution

I know this is a DNS issue, sometimes even on the client side, but I've triple-checked that the internet connection is up. Without the Cloudflare proxy, my external IP ports are all forwarded correctly and the associated domains can be reached thanks to Nginx Proxy Manager. I've had the Cloudflare proxy turned on in production and it didn't have issues with DNS previously. I turned it off to troubleshoot this.


> The letsencrypt logs contain much more information, which would be useful to find the source of the problem.

Here they are:

letsencrypt.log
2021-10-14 08:51:58,068:DEBUG:certbot._internal.main:certbot version: 1.17.0
2021-10-14 08:51:58,068:DEBUG:certbot._internal.main:Location of certbot entry point: /opt/certbot/bin/certbot
2021-10-14 08:51:58,068:DEBUG:certbot._internal.main:Arguments: ['--force-renewal', '--non-interactive', '--config', '/etc/letsencrypt.ini', '--cert-name', 'npm-5', '--preferred-challenges', 'dns,http', '--disable-hook-validation']
2021-10-14 08:51:58,068:DEBUG:certbot._internal.main:Discovered plugins: PluginsRegistry(PluginEntryPoint#manual,PluginEntryPoint#null,PluginEntryPoint#standalone,PluginEntryPoint#webroot)
2021-10-14 08:51:58,076:DEBUG:certbot._internal.log:Root logging level set at 30
2021-10-14 08:51:58,428:DEBUG:certbot.display.util:Notifying user: Processing /etc/letsencrypt/renewal/npm-5.conf
2021-10-14 08:51:58,509:DEBUG:certbot._internal.plugins.selection:Requested authenticator <certbot._internal.cli.cli_utils._Default object at 0x7f4794936588> and installer <certbot._internal.cli.cli_utils._Default object at 0x7f4794936588>
2021-10-14 08:51:58,509:DEBUG:certbot._internal.cli:Var pref_challs=dns,http (set by user).
2021-10-14 08:51:58,509:DEBUG:certbot._internal.cli:Var key_type=ecdsa (set by user).
2021-10-14 08:51:58,509:DEBUG:certbot._internal.cli:Var elliptic_curve=secp384r1 (set by user).
2021-10-14 08:51:58,509:DEBUG:certbot._internal.cli:Var webroot_path=/data/letsencrypt-acme-challenge (set by user).
2021-10-14 08:51:58,509:DEBUG:certbot._internal.cli:Var webroot_map={'webroot_path'} (set by user).
2021-10-14 08:51:58,509:DEBUG:certbot._internal.cli:Var webroot_path=/data/letsencrypt-acme-challenge (set by user).
2021-10-14 08:51:58,542:DEBUG:certbot._internal.renewal:Auto-renewal forced with --force-renewal...
2021-10-14 08:51:58,542:INFO:certbot._internal.renewal:Non-interactive renewal: random delay of 74.31209800189319 seconds
2021-10-14 08:53:12,891:DEBUG:certbot._internal.plugins.selection:Requested authenticator webroot and installer None
2021-10-14 08:53:12,893:DEBUG:certbot._internal.plugins.selection:Single candidate plugin: * webroot
Description: Place files in webroot directory
Interfaces: IAuthenticator, IPlugin
Entry point: webroot = certbot._internal.plugins.webroot:Authenticator
Initialized: <certbot._internal.plugins.webroot.Authenticator object at 0x7f479493ad30>
Prep: True
2021-10-14 08:53:12,893:DEBUG:certbot._internal.plugins.selection:Selected authenticator <certbot._internal.plugins.webroot.Authenticator object at 0x7f479493ad30> and installer None
2021-10-14 08:53:12,893:INFO:certbot._internal.plugins.selection:Plugins selected: Authenticator webroot, Installer None
2021-10-14 08:53:12,910:DEBUG:certbot._internal.main:Picked account: <Account(RegistrationResource(body=Registration(key=None, contact=(), agreement=None, status=None, terms_of_service_agreed=None, only_return_existing=None, external_account_binding=None), uri='https://acme-v02.api.letsencrypt.org/acme/acct/106773887', new_authzr_uri=None, terms_of_service=None), f66c4973732295aa241e48610465d5f0, Meta(creation_dt=datetime.datetime(2020, 12, 18, 19, 44, 55, tzinfo=<UTC>), creation_host='c876fb42bf22', register_to_eff=None))>
2021-10-14 08:53:12,910:DEBUG:acme.client:Sending GET request to https://acme-v02.api.letsencrypt.org/directory.
2021-10-14 08:53:12,911:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org:443
2021-10-14 08:53:22,923:ERROR:certbot._internal.renewal:Failed to renew certificate npm-5 with error: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f4794931710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
2021-10-14 08:53:22,925:DEBUG:certbot._internal.renewal:Traceback was:
Traceback (most recent call last):
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connection.py", line 170, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/util/connection.py", line 73, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/usr/lib/python3.7/socket.py", line 748, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connection.py", line 353, in connect
    conn = self._new_conn()
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f4794931710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/certbot/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/opt/certbot/lib/python3.7/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f4794931710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/renewal.py", line 474, in handle_renewal_request
    main.renew_cert(lineage_config, plugins, renewal_candidate)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/main.py", line 1385, in renew_cert
    le_client = _init_le_client(config, auth, installer)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/main.py", line 770, in _init_le_client
    return client.Client(config, acc, authenticator, installer, acme=acme)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/client.py", line 253, in __init__
    acme = acme_from_config_key(config, self.account.key, self.account.regr)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/client.py", line 41, in acme_from_config_key
    return acme_client.BackwardsCompatibleClientV2(net, key, config.server)
  File "/opt/certbot/lib/python3.7/site-packages/acme/client.py", line 824, in __init__
    directory = messages.Directory.from_json(net.get(server).json())
  File "/opt/certbot/lib/python3.7/site-packages/acme/client.py", line 1168, in get
    self._send_request('GET', url, **kwargs), content_type=content_type)
  File "/opt/certbot/lib/python3.7/site-packages/acme/client.py", line 1117, in _send_request
    response = self.session.request(method, url, *args, **kwargs)
  File "/opt/certbot/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/certbot/lib/python3.7/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/opt/certbot/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f4794931710>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))

2021-10-14 08:53:22,925:DEBUG:certbot.display.util:Notifying user: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2021-10-14 08:53:22,925:ERROR:certbot._internal.renewal:All renewals failed. The following certificates could not be renewed:
2021-10-14 08:53:22,925:ERROR:certbot._internal.renewal:  /etc/letsencrypt/live/npm-5/fullchain.pem (failure)
2021-10-14 08:53:22,925:DEBUG:certbot.display.util:Notifying user: - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2021-10-14 08:53:22,926:DEBUG:certbot._internal.log:Exiting abnormally:
Traceback (most recent call last):
  File "/opt/certbot/bin/certbot", line 8, in <module>
    sys.exit(main())
  File "/opt/certbot/lib/python3.7/site-packages/certbot/main.py", line 15, in main
    return internal_main.main(cli_args)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/main.py", line 1574, in main
    return config.func(config, plugins)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/main.py", line 1461, in renew
    renewal.handle_renewal_request(config)
  File "/opt/certbot/lib/python3.7/site-packages/certbot/_internal/renewal.py", line 500, in handle_renewal_request
    len(renew_failures), len(parse_failures)))
certbot.errors.Error: 1 renew failure(s), 0 parse failure(s)
2021-10-14 08:53:22,926:ERROR:certbot._internal.log:1 renew failure(s), 0 parse failure(s)

@chaptergy
Collaborator

chaptergy commented Oct 14, 2021

Unfortunately v3 is still a while away I think, there is no official timeline yet.

Hm, but the logs also only contain the error that it fails to connect to acme-v02.api.letsencrypt.org, or rather that it fails to resolve the domain. But you said you are able to ping it? So you have installed ping within the npm container and are able to ping the domain from there? Are you also able to run nslookup acme-v02.api.letsencrypt.org (you'll need to install the dnsutils package for nslookup)?
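If installing extra packages in the container is not an option, the same lookup can be tried from Python, since the traceback above shows a Python 3.7 interpreter inside the container and the failure coming from `socket.getaddrinfo`. This is a minimal sketch and the helper name `can_resolve` is made up for illustration. It only tests name resolution, not whether port 443 is actually reachable.

```python
import socket


def can_resolve(host: str, port: int = 443) -> tuple:
    """Return (True, sorted addresses) if `host` resolves, else (False, error text).

    Uses socket.getaddrinfo, the call that raises in the certbot traceback.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return True, sorted({info[4][0] for info in infos})
```

Running `can_resolve("acme-v02.api.letsencrypt.org")` once on the host and once inside the container should make it clear whether the two see different DNS behaviour.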

@dl-lim
Author

dl-lim commented Oct 20, 2021

Hrmm, now that you mention doing this, I'm not even able to apt update inside the docker container since it doesn't even reach the Internet, looks like...

image

Because of that, couldn't even install ping

image

Any ideas? I could ping just fine from the host machine.

I've also recreated this container several times. Only thing that persisted was the volumes.
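When the host resolves names but the container does not, one thing worth comparing is the resolver configuration each of them sees. This is a minimal sketch that lists the `nameserver` entries from resolv.conf-style text. The function name is hypothetical and the path `/etc/resolv.conf` is the conventional location, assumed here rather than confirmed for this image. On user-defined Docker networks the container normally shows Docker's embedded DNS at 127.0.0.11, which forwards queries on behalf of the container.

```python
from pathlib import Path


def parse_nameservers(resolv_conf_text: str) -> list:
    """Extract the nameserver addresses from resolv.conf-style text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comment lines
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


if __name__ == "__main__":
    path = Path("/etc/resolv.conf")  # conventional location, assumed here
    if path.exists():
        print(parse_nameservers(path.read_text()))
```

If the list printed inside the container points at a resolver the container cannot reach, that would be consistent with the "Temporary failure in name resolution" errors, though it does not by itself say why.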

@chaptergy
Collaborator

It seems this is actually an issue with your installation of docker. As this could be caused by many things and has nothing to do with npm, I will close this issue.

Depending on what OS you used, how you installed docker, etc, you should be able to find articles and questions with similar issues which will help you with your issue.

Some links to get started:
