Minion returns "No response" for most commands in 2019.2.2 with "RNG" related error #55116

Closed
Paulo-Nunes opened this issue Oct 24, 2019 · 20 comments · Fixed by #55635
Labels
fixed-pls-verify (fix is linked, bug author to confirm fix), P4 (Priority 4), severity-high (2nd top severity, seen by most users, causes major problems), ZRELEASED - Neon (retired label)

@Paulo-Nunes
Contributor

Paulo-Nunes commented Oct 24, 2019

Description of Issue

After upgrading from 2019.2.0 to 2019.2.2 some salt commands fail with [No response].
pillar.items works, but:

  • state.apply
  • test.ping
  • cmd.run
  • grains.items

Do not work.
I only tested these 5 commands.
state.apply test=True does work though, as does running salt-call state.apply from the minion.

I also made a typo while testing: grais.items gets [No response] too, instead of an error saying that the function does not exist.

Steps to Reproduce Issue

Run any of the commands from the master noted in the Issue Description to reproduce the issue.

Minion logs show errors:

2019-10-24 13:23:55,967 [salt.utils.process                                    :754 ][ERROR   ][12525] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:25' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
	...
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()
2019-10-24 13:24:01,072 [salt.utils.process                                    :754 ][ERROR   ][12526] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:26' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
	...
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()

Versions Report

Salt Version:
           Salt: 2019.2.2
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.8.1
        libgit2: 0.20.0
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: 1.2.5
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.20.3
         Python: 2.7.14 (default, Jan 31 2018, 02:12:13)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.5.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
 
System Versions:
           dist: redhat 6.10 Santiago
         locale: UTF-8
        machine: x86_64
        release: 2.6.32-754.18.2.el6.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 6.10 Santiago
@Paulo-Nunes
Contributor Author

Master/Minion on 2019.2.2.
Works with Minion on 2019.2.0.

Master and both Minions are on RHEL6.10

@arizvisa
Contributor

@Paulo-Nunes, I think the cause of your error is that your instance of RHEL is running Python 2.6, which was deprecated by 2017.7.0. Nonetheless, if you can't upgrade to Python 2.7, does installing pycryptodome help as a workaround?

@Paulo-Nunes
Contributor Author

Installing pycryptodome did not resolve my issue.
I installed it on both the Minion and Master and restarted both, but even a simple test.ping still results in [No response].

Are there any other logs or reports that would help find a solution? Unfortunately, the version of Python installed is out of my control.

I also tested upgrading a RHEL 7 Minion on Python 2.7.5. In this case it is working without needing pycryptodome on the Minion.

@arizvisa
Contributor

@Paulo-Nunes, I don't have any more answers, but maybe a maintainer will be able to help you further.

@xeacott
Contributor

xeacott commented Nov 1, 2019

I'll ask someone with more experience in rhel and see what they think.

@xeacott xeacott added this to the Blocked milestone Nov 1, 2019
@xeacott xeacott added the Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged label Nov 1, 2019
@xeacott
Contributor

xeacott commented Nov 1, 2019

Do you think you could run a salt-call --local grains.items and see what you get back?

@xeacott
Contributor

xeacott commented Nov 1, 2019

Afterwards, send a salt-call grains.items. If the above call works, that is a minion-only call, no master involved. If you run a salt-call without --local, it will go to the master. Check if the ports are open that we use, too. Otherwise we just booted up a rhel 6.10, made the upgrade, and everything seemed fine here.
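
For the port check, a minimal sketch of a reachability test that could be run from the minion. It assumes the default Salt master ports 4505 (publish) and 4506 (return); the helper name port_open is made up for illustration and is not part of Salt.

```python
import socket

# Default Salt master ports: 4505 (publish) and 4506 (return).
# Adjust if the master config overrides them.
SALT_MASTER_PORTS = (4505, 4506)


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Refused, timed out, or name resolution failed.
        return False
    conn.close()
    return True
```

Calling port_open(master_host, port) for each entry in SALT_MASTER_PORTS, with master_host set to whatever address the minion uses for its master, would show whether either port is blocked. A successful TCP connect only proves reachability, not that the Salt handshake works.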

@Paulo-Nunes
Contributor Author

Paulo-Nunes commented Nov 1, 2019

Both salt-call grains.items and salt-call --local grains.items work as I am expecting on the minion.
salt minion grains.items does not work if run from master.

Ports are as they have been for the past 2 years or so. Only an issue after upgrading the minion/master.
And it happened for 2 separate minions of the 2 that I tested the upgrade on.

Correction: I upgraded salt-minion on 3 minions, and 2 of the 3 have this issue. The 2 affected minions are on RHEL 6; the 1 that appears to be unaffected is on RHEL 7.

@lukasraska
Contributor

We encountered this issue on CentOS 7 (CentOS Linux release 7.7.1908 (Core)) as well

2019-11-11 11:46:40,162 [salt.utils.process:754 ][ERROR   ][23049] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:88' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1594, in _target
    run_func(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1589, in run_func
    return Minion._thread_return(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1776, in _thread_return
    timeout=minion_instance._return_retry_timer()
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1997, in _return_pub
    ret_val = self._send_req_sync(load, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1406, in _send_req_sync
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 233, in sign_message
    return signer.sign(SHA.new(salt.utils.stringutils.to_bytes(message)))
  File "/usr/lib64/python2.7/site-packages/Crypto/Signature/PKCS1_v1_5.py", line 112, in sign
    m = self._key.decrypt(em)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
    return pubkey.pubkey.decrypt(self, ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
    plaintext=self._decrypt(ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 235, in _decrypt
    r = getRandomRange(1, self.key.n-1, randfunc=self._randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 123, in getRandomRange
    value = getRandomInteger(bits, randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 104, in getRandomInteger
    S = randfunc(N>>3)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 202, in read
    return self._singleton.read(bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 178, in read
    return _UserFriendlyRNG.read(self, bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 137, in read
    self._check_pid()
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()

The minion is still able to return grains, for example, but job events seem to have a problem.
The minion runs with the minion_sign_messages: True option and the TCP transport. Running salt-call works just fine, which prompted me to disable minion_sign_messages, and it started working again.

So probably this option is now broken in 2019.2.2.

Salt Version:
           Salt: 2019.2.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: unknown
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug  7 2019, 00:51:29)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.7.1908 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-1062.4.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.7.1908 Core

Both master and minion are 2019.2.2 and PY2 versions.

@lukasraska
Contributor

I can confirm this is an issue on RHEL 6 as well (in particular with minion_sign_messages) on 2019.2.2, so it does not really depend on the transport.

2019-12-12 13:51:11,020 [salt.utils.process                                    :754 ][ERROR   ][7728] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:3' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1594, in _target
    run_func(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1589, in run_func
    return Minion._thread_return(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1776, in _thread_return
    timeout=minion_instance._return_retry_timer()
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1997, in _return_pub
    ret_val = self._send_req_sync(load, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1406, in _send_req_sync
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 233, in sign_message
    return signer.sign(SHA.new(salt.utils.stringutils.to_bytes(message)))
  File "/usr/lib64/python2.7/site-packages/Crypto/Signature/PKCS1_v1_5.py", line 112, in sign
    m = self._key.decrypt(em)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
    return pubkey.pubkey.decrypt(self, ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
    plaintext=self._decrypt(ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 235, in _decrypt
    r = getRandomRange(1, self.key.n-1, randfunc=self._randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 123, in getRandomRange
    value = getRandomInteger(bits, randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 104, in getRandomInteger
    S = randfunc(N>>3)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 202, in read
    return self._singleton.read(bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 178, in read
    return _UserFriendlyRNG.read(self, bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 137, in read
    self._check_pid()
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()

Once the minion process is forked, it should call Random.atfork() at some point if it uses crypto. This already happens in the AsyncAuth class, for example, via salt.utils.crypt.reinit_crypto(). The problem is that it is called only when an AsyncAuth object is created, which might be after the code that needs it has already run.

For me, calling reinit_crypto before the message is signed solved it (basically right when the minion process is forked, to avoid any other potential issues). If @Paulo-Nunes can provide a full stack trace, I should be able to determine where that is called in his case (and whether, for example, minion_sign_messages is used).

I will prepare a PR for this, although reproducing it in tests might be difficult.
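
To make the failure mode concrete, here is a toy sketch of the ordering problem described above. PidCheckedRNG is a made-up stand-in that only imitates the PID check seen in the traceback; it is not pycrypto's code, and reinit() merely plays the role that Random.atfork() / reinit_crypto() play in the real stack. It relies on os.fork(), so it assumes a POSIX system.

```python
import os


class PidCheckedRNG(object):
    """Toy RNG that refuses to run in a process other than the one that set it up."""

    def __init__(self):
        self._pid = os.getpid()

    def reinit(self):
        # Stand-in for the re-initialization step: adopt the current PID.
        self._pid = os.getpid()

    def read(self, n):
        if os.getpid() != self._pid:
            raise AssertionError("PID check failed. RNG must be re-initialized after fork().")
        return os.urandom(n)


def child_can_read(rng, reinit_first):
    """Fork, try rng.read() in the child, and report success via the exit status."""
    pid = os.fork()
    if pid == 0:  # child process
        status = 1
        try:
            if reinit_first:
                rng.reinit()
            rng.read(16)
            status = 0
        except AssertionError:
            status = 1
        finally:
            os._exit(status)  # never fall through into the parent's code
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


rng = PidCheckedRNG()
print(child_can_read(rng, reinit_first=False))  # prints False: child hits the PID check
print(child_can_read(rng, reinit_first=True))   # prints True: reinit ran before first use
```

The point of the sketch is only the ordering: the re-initialization has to run in the forked child before the first call that touches the RNG, which is why calling it right after the fork is the safer place.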

@Paulo-Nunes
Contributor Author

Paulo-Nunes commented Dec 16, 2019

The full stack trace on one of the problem minions:

AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()
2019-12-16 10:32:17,557 [salt.utils.process                                    :754 ][ERROR   ][4023] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:4' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1594, in _target
    run_func(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1589, in run_func
    return Minion._thread_return(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1776, in _thread_return
    timeout=minion_instance._return_retry_timer()
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1997, in _return_pub
    ret_val = self._send_req_sync(load, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1406, in _send_req_sync
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 233, in sign_message
    return signer.sign(SHA.new(salt.utils.stringutils.to_bytes(message)))
  File "/usr/lib64/python2.7/site-packages/Crypto/Signature/PKCS1_v1_5.py", line 112, in sign
    m = self._key.decrypt(em)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
    return pubkey.pubkey.decrypt(self, ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
    plaintext=self._decrypt(ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 235, in _decrypt
    r = getRandomRange(1, self.key.n-1, randfunc=self._randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 123, in getRandomRange
    value = getRandomInteger(bits, randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 104, in getRandomInteger
    S = randfunc(N>>3)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 202, in read
    return self._singleton.read(bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 178, in read
    return _UserFriendlyRNG.read(self, bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 137, in read
    self._check_pid()
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()

Yes, Minion has:
minion_sign_messages: True

Update: I tried setting minion_sign_messages to false, but this did not change anything.

@lukasraska
Contributor

Thanks. Your stack trace also fails inside the salt.crypt.sign_message call, so PR #55635 should help with that.

Is the stacktrace any different when you set minion_sign_messages: False?

@Paulo-Nunes
Contributor Author

Seems I didn't copy everything last time. It prints 2 errors every time I do a test.ping:

2019-12-17 09:53:40,968 [salt.utils.process                                    :754 ][ERROR   ][523] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:1468' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1594, in _target
    run_func(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1589, in run_func
    return Minion._thread_return(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1776, in _thread_return
    timeout=minion_instance._return_retry_timer()
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1997, in _return_pub
    ret_val = self._send_req_sync(load, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1406, in _send_req_sync
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 233, in sign_message
    return signer.sign(SHA.new(salt.utils.stringutils.to_bytes(message)))
  File "/usr/lib64/python2.7/site-packages/Crypto/Signature/PKCS1_v1_5.py", line 112, in sign
    m = self._key.decrypt(em)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
    return pubkey.pubkey.decrypt(self, ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
    plaintext=self._decrypt(ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 235, in _decrypt
    r = getRandomRange(1, self.key.n-1, randfunc=self._randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 123, in getRandomRange
    value = getRandomInteger(bits, randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 104, in getRandomInteger
    S = randfunc(N>>3)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 202, in read
    return self._singleton.read(bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 178, in read
    return _UserFriendlyRNG.read(self, bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 137, in read
    self._check_pid()
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()
2019-12-17 09:53:45,810 [salt.utils.process                                    :754 ][ERROR   ][527] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:1469' was caught:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/process.py", line 747, in run
    return super(MultiprocessingProcess, self).run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1594, in _target
    run_func(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1589, in run_func
    return Minion._thread_return(minion_instance, opts, data)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1776, in _thread_return
    timeout=minion_instance._return_retry_timer()
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1997, in _return_pub
    ret_val = self._send_req_sync(load, timeout=timeout)
  File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1406, in _send_req_sync
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 233, in sign_message
    return signer.sign(SHA.new(salt.utils.stringutils.to_bytes(message)))
  File "/usr/lib64/python2.7/site-packages/Crypto/Signature/PKCS1_v1_5.py", line 112, in sign
    m = self._key.decrypt(em)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 174, in decrypt
    return pubkey.pubkey.decrypt(self, ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/pubkey.py", line 93, in decrypt
    plaintext=self._decrypt(ciphertext)
  File "/usr/lib64/python2.7/site-packages/Crypto/PublicKey/RSA.py", line 235, in _decrypt
    r = getRandomRange(1, self.key.n-1, randfunc=self._randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 123, in getRandomRange
    value = getRandomInteger(bits, randfunc)
  File "/usr/lib64/python2.7/site-packages/Crypto/Util/number.py", line 104, in getRandomInteger
    S = randfunc(N>>3)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 202, in read
    return self._singleton.read(bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 178, in read
    return _UserFriendlyRNG.read(self, bytes)
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 137, in read
    self._check_pid()
  File "/usr/lib64/python2.7/site-packages/Crypto/Random/_UserFriendlyRNG.py", line 153, in _check_pid
    raise AssertionError("PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()")
AssertionError: PID check failed. RNG must be re-initialized after fork(). Hint: Try Random.atfork()

The only lines that are different between True and False are these:
True:

2019-12-17 09:53:40,968 [salt.utils.process                                    :754 ][ERROR   ][523] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:1468' was caught:

2019-12-17 09:53:45,810 [salt.utils.process                                    :754 ][ERROR   ][527] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:1469' was caught:

False:

2019-12-17 09:52:03,158 [salt.utils.process                                    :754 ][ERROR   ][30765] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:1466' was caught:

2019-12-17 09:52:08,186 [salt.utils.process                                    :754 ][ERROR   ][30770] An un-handled exception from the multiprocessing process 'SignalHandlingMultiprocessingProcess-1:1467' was caught:

Doesn't look too significant to me, but I don't know enough to judge that.

@lukasraska
Contributor

Yeah, the name of the process is not relevant here. What is interesting is that it always tries to sign the message:

File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1406, in _send_req_sync
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))

But in the actual version it's this:

salt/salt/minion.py

Lines 1403 to 1407 in ca2afa5

if self.opts['minion_sign_messages']:
    log.trace('Signing event to be published onto the bus.')
    minion_privkey_path = os.path.join(self.opts['pki_dir'], 'minion.pem')
    sig = salt.crypt.sign_message(minion_privkey_path, salt.serializers.msgpack.serialize(load))
    load['sig'] = sig

So it really shouldn't™ do that. Are you sure it's really disabled? There might be more config files that override those settings (note that if you enforce sign validations on master, it might reject minion events when you disable the setting on minion only). But more or less the problem should be solved by the PR above.
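
As an illustration of how an override could keep the signing branch active, here is a minimal sketch. It assumes that config sources are layered and that a later source simply replaces earlier values; merge_opts, build_load and the fake signer are hypothetical helpers, not Salt code. Only the option name minion_sign_messages comes from the snippet above.

```python
def merge_opts(*layers):
    """Combine config layers; later layers override earlier ones (illustration only)."""
    opts = {}
    for layer in layers:
        opts.update(layer)
    return opts


def build_load(load, opts, sign):
    """Attach a signature only when the effective option is truthy."""
    load = dict(load)
    if opts.get('minion_sign_messages'):
        load['sig'] = sign(load)
    return load


def fake_sign(load):
    # Placeholder signer so the sketch needs no crypto library.
    return 'signed'


main_file = {'minion_sign_messages': False}   # what was edited
override = {'minion_sign_messages': True}     # a forgotten second source
opts = merge_opts(main_file, override)
print('sig' in build_load({'fun': 'test.ping'}, opts, fake_sign))  # prints True
```

In this sketch the signing path still runs even though the main file says False, which would match a traceback that keeps ending in sign_message after the option was supposedly disabled.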

@Ch3LL Ch3LL added severity-high 2nd top severity, seen by most users, causes major problems ZRELEASED - Neon retired label P4 Priority 4 and removed Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged needs-triage labels Dec 17, 2019
@Ch3LL Ch3LL modified the milestones: Blocked, Approved Dec 17, 2019
@Ch3LL Ch3LL added team-core fixed-pls-verify fix is linked, bug author to confirm fix labels Dec 17, 2019
@Ch3LL
Contributor

Ch3LL commented Dec 17, 2019

i verified the fix in #55635 does indeed fix this issue. anyone else want to verify here?

@Paulo-Nunes
Contributor Author

It is possible that minion_sign_messages is getting somehow applied somewhere else.
I have this:

Minion:

minion_sign_messages: True
verify_master_pubkey_sign: True

Master:

sign_pub_messages: True
require_minion_sign_messages: True
drop_messages_signature_fail: False
master_sign_pubkey: True

I can do more testing later today where I play with some of these settings to see if they make a difference.
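
When playing with these settings, a small consistency check along these lines might help. It is a sketch based only on the earlier remark in this thread that a master enforcing signatures may reject events from a minion that no longer signs; signing_mismatch and its message text are hypothetical, and it looks at just two of the options listed above.

```python
def signing_mismatch(minion_opts, master_opts):
    """Return a warning string if the master requires signatures the minion won't send."""
    minion_signs = bool(minion_opts.get('minion_sign_messages', False))
    master_requires = bool(master_opts.get('require_minion_sign_messages', False))
    if master_requires and not minion_signs:
        return 'master requires signed messages but minion does not sign'
    return None


# The pairing posted above is consistent:
posted_minion = {'minion_sign_messages': True, 'verify_master_pubkey_sign': True}
posted_master = {'sign_pub_messages': True, 'require_minion_sign_messages': True}
print(signing_mismatch(posted_minion, posted_master))  # prints None
```

Flipping minion_sign_messages to False on the minion alone makes the function return the warning, which is the combination to avoid while testing.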

@Paulo-Nunes
Contributor Author

After applying the changes from #55635 the minion returns when master issues test.ping and no error is logged in minion log.

I have not tested how those changes affect other functions or minions.

Tested on Minion:

Salt Version:
           Salt: 2019.2.2
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.8.1
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.14 (default, Jan 31 2018, 02:12:13)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.5.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
 
System Versions:
           dist: redhat 6.10 Santiago
         locale: UTF-8
        machine: x86_64
        release: 2.6.32-754.18.2.el6.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 6.10 Santiago

@Ch3LL
Contributor

Ch3LL commented Dec 19, 2019

okay to close this?

@Paulo-Nunes
Contributor Author

On my end it looks like #55635 resolves the issue.
I personally would not consider this closed until #55635 has been closed.

As a workaround this does seem to work, but I wouldn't apply this fix to hundreds of minions. I will wait to upgrade until the fix is released.

If it is sufficient to show that a workaround exists, then yes the issue can be closed.

@Ch3LL
Contributor

Ch3LL commented Jan 7, 2020

thanks for verifying the fix. we can close when the PR is merged
