Skip to content

Commit

Permalink
Adding retry functionality when parallel is set to True (#61631)
Browse files Browse the repository at this point in the history
* Making the retry state system feature available when parallel is set to True.

* swapping out using time.sleep Salt function for Python time.sleep.

* Skipping parallel tests on Windows.

* Fixing items, values, and keys functions in the data module.  Moving integration to functional tests.  Converting unit test over to pytest.

* adding changleog.

* fixes #62044 add ignore_missing to file.comment state

* fixes #61662 fix file.comment reports changes in test mode

* add 61662.fixed changelog

* add final test mode run after comment is successful

* fix uncomment_regex

* fix uncomment_regex

* Try not to trigger the GLIBC race condition.

See https://sourceware.org/bugzilla/show_bug.cgi?id=19329

Fixes #62071

Signed-off-by: Pedro Algarvio <[email protected]>

* Add missing CLI example to ``network.fqdns``

Signed-off-by: Pedro Algarvio <[email protected]>

* Sleep between submitting new jobs to thread pool

Signed-off-by: Pedro Algarvio <[email protected]>

* Sleep before trying to resolve instead

Signed-off-by: Pedro Algarvio <[email protected]>

* Shorter sleeps

* fixes #61944 fixed backslash literal bytestring

* When states are running in parallel, ensure that the total run time produced by the highstate outputter takes that into account.

* removing unrelated change

* Adding changelog.

* Skip parallel test on Windows.

* skip pdbedit unit tests on ubuntu 2204 because md4 is not supported as a hash type

* refresh db at the beginning for destructive aptpkg functional tests

* fix failing test_pkgrepo_managed_absent and actually capture and keep signed-by information in salt/modules/aptpkg.py custom implementation of apt repository management

* Update tests/unit/modules/test_pdbedit.py

Co-authored-by: Pedro Algarvio <[email protected]>

* Disable hack to force allowing forking on macOS

* Use the newer nox platforms file which installs Py3.9

Signed-off-by: Pedro Algarvio <[email protected]>

* Support macOS defaulting to spawn as the default multiprocessing method

* Remove unused attribute

* Switch to ``salt.utils.platform.spawning_platform()``

Signed-off-by: Pedro Algarvio <[email protected]>

* Reconstruct client instances on spawning platforms

* Also skip this test for now on macOS

Signed-off-by: Pedro Algarvio <[email protected]>

* Reconstruct the ``State`` class on spawned processes.

Signed-off-by: Pedro Algarvio <[email protected]>

* Force a reference to ``salt.ext.tornado.iostream.StreamClosedError`` to avoid ``NameError`` issues

Signed-off-by: Pedro Algarvio <[email protected]>

* Skip, for now, problematic tests on spawning platforms

Signed-off-by: Pedro Algarvio <[email protected]>

* DeltaProxy minions do not work on spawning platforms

Signed-off-by: Pedro Algarvio <[email protected]>

* Avoid a ``NameError`` when failing to initialize the process list

Signed-off-by: Pedro Algarvio <[email protected]>

* Use Python 3.9 on macOS

Signed-off-by: Pedro Algarvio <[email protected]>

* Drop requirements on Darwin for Py<3.9

Signed-off-by: Pedro Algarvio <[email protected]>

* Same test run timeout as windows

Signed-off-by: Pedro Algarvio <[email protected]>

* Don't shell out! Access the database directly and properly escape values.

Signed-off-by: Pedro Algarvio <[email protected]>

* Fixes for the new macOS Catalina and BigSur CICD images

Signed-off-by: Pedro Algarvio <[email protected]>

* Migrate ``integration.modules.test_mac_pkgutil`` to PyTest and functional tests.

Fixes #60819

Signed-off-by: Pedro Algarvio <[email protected]>

* Add changelog for #57742

Fixes #57742

Signed-off-by: Pedro Algarvio <[email protected]>

* Add changelog entry.

Fixes #55847

Signed-off-by: Pedro Algarvio <[email protected]>

* initial Proxy Module

* inital pass of restconf states

* adding codeauthor

* adding restconf module

* doc update

* doc update

* doc update

* add requirements chefk to states module

* doc update

* doc update

* doc update

* add changelog entry

* Update salt/proxy/restconf.py

Co-authored-by: Wayne Werner <[email protected]>

* Update salt/proxy/restconf.py

Co-authored-by: Wayne Werner <[email protected]>

* remove noqa statements

* docs update & opts defaults

* docs update

* error checks + https transport method

* docs update

* docs update

* doing a fix for commit check

* adding test and updating modules

* removing old code

* DRY

* docs update

* output updates

* prepping for deepdiff switch

* states testing

* more tests

* changed changelog from fixed to added

* force pre-commit checks

* new style of diff output that is readable

* updating tests and output

* updating output style to make YAML diff more readable

* add test for restconf module

* adjusting based on pr review

* adjusting based on pr review

* adjusting true/false verify based on pr review

* change uri to path and clean args list

Co-authored-by: Wayne Werner <[email protected]>

* adjusting uri to path based on pr review

* adjusting capabilities path based on pr review

* Update salt/modules/restconf.py

Co-authored-by: Wayne Werner <[email protected]>

* adjusting val based on pr review

* caps to confirm to RFC 8040

* WIP: refactor based on feedback

* WIP: changed results to be any kind of falsey val

* linting - pre-commit

* updating docs

* Update salt/states/restconf.py

Co-authored-by: Wayne Werner <[email protected]>

* updating style

* fixes for lint and nox pass

* fixing

* blacken

* Update restconf.py

* add cli examples

* change logging type

* add pytest skip for ordereddict issue with legacy python

* add sys library for pytest mark skipif check

* pre-commit fixes

* Update salt/proxy/restconf.py

Co-authored-by: Pedro Algarvio <[email protected]>

* Update salt/states/restconf.py

Co-authored-by: Pedro Algarvio <[email protected]>

* Update tests/pytests/unit/modules/test_restconf.py

Co-authored-by: Pedro Algarvio <[email protected]>

* Update tests/pytests/unit/states/test_restconf.py

Co-authored-by: Pedro Algarvio <[email protected]>

* Update tests/pytests/unit/proxy/test_restconf.py

Co-authored-by: Pedro Algarvio <[email protected]>

* pre-commit fix

* update logging

* Update restconf.py

* some rest apis dont follow the standard correctly (cisco) and need another header

* fixes #61946 sync_after_install immutabledict error

* add configurable sync sleep for create

* add testing around the particular sync command called

* Adds the ability to get file version information on Windows

- Adds file.version that will get just the version
- Adds file.version_details to get additional information
- Adds tests
- Adds changelog

@amalaguti originally submitted a PR for this: #59770, but he has
since deleted his branch. Recognizing his contribution here.

* Update states.chef for version 16.x and 17.x Chef Infra Client output

* Support previous Chef version plus version 16.x and 17.x output.

* Remove legacy chef unit test

* Add unit tests for chef state

These tests cover the entire state, but are only concerned with the
ret["result"] of the outputs. It would be a good idea to test other
outputs as well - these tests could be modified to include assertions
about the other parts of ret. (Obviously they would need to be slightly
renamed. Maybe to `..._return_expected_ret`

* Update salt-bootstrap to 2022.03.15 release

* Update state.py

Fixing lint.

* Running pre-commit manually.

* swap out salt.utils.platform.is_windows for salt.utils.platform.spawning_platform.  Bump up duration for test_retry_option_success_parallel to 30 seconds to account for test running on Mac OS.

Co-authored-by: Megan Wilhite <[email protected]>
Co-authored-by: nicholasmhughes <[email protected]>
Co-authored-by: Pedro Algarvio <[email protected]>
Co-authored-by: emmadionne1 <[email protected]>
Co-authored-by: MKLeb <[email protected]>
Co-authored-by: Caleb Beard <[email protected]>
Co-authored-by: Pedro Algarvio <[email protected]>
Co-authored-by: Jamie Murphy <[email protected]>
Co-authored-by: Jamie (Bear) Murphy <[email protected]>
Co-authored-by: Jamie Murphy <[email protected]>
Co-authored-by: Wayne Werner <[email protected]>
Co-authored-by: Cassandra Durnford <[email protected]>
Co-authored-by: Jamie Murphy <[email protected]>
Co-authored-by: Jamie (Bear) Murphy <[email protected]>
Co-authored-by: Twangboy <[email protected]>
Co-authored-by: Eric Ham <[email protected]>
Co-authored-by: Wayne Werner <[email protected]>
Co-authored-by: krionbsd <[email protected]>
  • Loading branch information
19 people authored May 26, 2022
1 parent de03178 commit 190ab7c
Show file tree
Hide file tree
Showing 3 changed files with 158 additions and 2 deletions.
1 change: 1 addition & 0 deletions changelog/61630.fixed
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Making the retry state system feature available when parallel is set to True.
67 changes: 66 additions & 1 deletion salt/state.py
Original file line number Diff line number Diff line change
Expand Up @@ -2030,12 +2030,77 @@ def _call_parallel_target(cls, instance, init_kwargs, name, cdata, low):
}

utc_finish_time = datetime.datetime.utcnow()
timezone_delta = datetime.datetime.utcnow() - datetime.datetime.now()
local_finish_time = utc_finish_time - timezone_delta
local_start_time = utc_start_time - timezone_delta
ret["start_time"] = local_start_time.time().isoformat()
delta = utc_finish_time - utc_start_time
# duration in milliseconds.microseconds
duration = (delta.seconds * 1000000 + delta.microseconds) / 1000.0
ret["duration"] = duration
ret["__parallel__"] = True

if "retry" in low:
retries = 1
low["retry"] = instance.verify_retry_data(low["retry"])
if not sys.modules[instance.states[cdata["full"]].__module__].__opts__[
"test"
]:
while low["retry"]["attempts"] >= retries:

if low["retry"]["until"] == ret["result"]:
break

interval = low["retry"]["interval"]
if low["retry"]["splay"] != 0:
interval = interval + random.randint(0, low["retry"]["splay"])
log.info(
"State result does not match retry until value, "
"state will be re-run in %s seconds",
interval,
)
time.sleep(interval)
retry_ret = instance.states[cdata["full"]](
*cdata["args"], **cdata["kwargs"]
)

utc_start_time = datetime.datetime.utcnow()
utc_finish_time = datetime.datetime.utcnow()
delta = utc_finish_time - utc_start_time
duration = (delta.seconds * 1000000 + delta.microseconds) / 1000.0
retry_ret["duration"] = duration

orig_ret = ret
ret = retry_ret
if not ret["comment"]:
_comment = ""
else:
_comment = (
'Attempt {}: Returned a result of "{}", '
'with the following comment: "{}"'.format(
retries, ret["result"], ret["comment"]
)
)

ret["comment"] = "\n".join([orig_ret["comment"], _comment])
ret["duration"] = (
ret["duration"] + orig_ret["duration"] + (interval * 1000)
)
if retries == 1:
ret["start_time"] = orig_ret["start_time"]
retries = retries + 1

else:
ret["comment"] = " ".join(
[
"" if not ret["comment"] else str(ret["comment"]),
"The state would be retried every {interval} seconds "
"(with a splay of up to {splay} seconds) a maximum of "
"{attempts} times or until a result of {until} "
"is returned".format(**low["retry"]),
]
)

troot = os.path.join(instance.opts["cachedir"], instance.jid)
tfile = os.path.join(troot, salt.utils.hashutils.sha1_digest(tag))
if not os.path.isdir(troot):
Expand Down Expand Up @@ -2283,7 +2348,7 @@ def call(self, low, chunks=None, running=None, retries=1):
local_finish_time.time().isoformat(),
duration,
)
if "retry" in low:
if "retry" in low and "parallel" not in low:
low["retry"] = self.verify_retry_data(low["retry"])
if not sys.modules[self.states[cdata["full"]].__module__].__opts__["test"]:
if low["retry"]["until"] != ret["result"]:
Expand Down
92 changes: 91 additions & 1 deletion tests/pytests/functional/modules/state/test_state.py
Original file line number Diff line number Diff line change
Expand Up @@ -690,7 +690,7 @@ def test_retry_option_success(state, state_tree, tmp_path):
testfile
)
duration = 4
if salt.utils.platform.is_windows():
if salt.utils.platform.spawning_platform():
duration = 16

with pytest.helpers.temp_file("retry.sls", sls_contents, state_tree):
Expand All @@ -702,6 +702,44 @@ def test_retry_option_success(state, state_tree, tmp_path):
assert "Attempt 2" not in state_return.comment


@pytest.mark.skip_on_windows(
reason="Skipped until parallel states can be fixed on Windows"
)
def test_retry_option_success_parallel(state, state_tree, tmp_path):
"""
test a state with the retry option that should return True immediately (i.e. no retries)
"""
testfile = tmp_path / "testfile"
testfile.touch()
sls_contents = """
file_test:
file.exists:
- name: {}
- parallel: True
- retry:
until: True
attempts: 5
interval: 2
splay: 0
""".format(
testfile
)
duration = 4
if salt.utils.platform.spawning_platform():
duration = 30

with pytest.helpers.temp_file("retry.sls", sls_contents, state_tree):
ret = state.sls(
"retry", __pub_jid="1"
) # Because these run in parallel we need a fake JID

for state_return in ret:
assert state_return.result is True
assert state_return.full_return["duration"] < duration
# It should not take 2 attempts
assert "Attempt 2" not in state_return.comment


@pytest.mark.slow_test
def test_retry_option_eventual_success(state, state_tree, tmp_path):
"""
Expand Down Expand Up @@ -747,6 +785,58 @@ def create_testfile(testfile1, testfile2):
assert "Attempt 5" not in state_return.comment


@pytest.mark.skip_on_windows(
reason="Skipped until parallel states can be fixed on Windows"
)
@pytest.mark.slow_test
def test_retry_option_eventual_success_parallel(state, state_tree, tmp_path):
"""
test a state with the retry option that should return True, eventually
"""
testfile1 = tmp_path / "testfile-1"
testfile2 = tmp_path / "testfile-2"

def create_testfile(testfile1, testfile2):
while True:
if testfile1.exists():
break
time.sleep(2)
testfile2.touch()

thread = threading.Thread(target=create_testfile, args=(testfile1, testfile2))
sls_contents = """
file_test_a:
file.managed:
- name: {}
- content: 'a'
file_test:
file.exists:
- name: {}
- retry:
until: True
attempts: 5
interval: 2
splay: 0
- parallel: True
- require:
- file_test_a
""".format(
testfile1, testfile2
)
with pytest.helpers.temp_file("retry.sls", sls_contents, state_tree):
thread.start()
ret = state.sls(
"retry", __pub_jid="1"
) # Because these run in parallel we need a fake JID
for state_return in ret:
log.debug("=== state_return %s ===", state_return)
assert state_return.result is True
assert state_return.full_return["duration"] > 4
# It should not take 5 attempts
assert "Attempt 5" not in state_return.comment


@pytest.mark.slow_test
def test_state_non_base_environment(state, state_tree_prod, tmp_path):
"""
Expand Down

0 comments on commit 190ab7c

Please sign in to comment.