Experimental changes for Crosshair support #4164

Zac-HD · 2024-11-10T02:38:51Z

- Following "Various small fixes" #4165, the Tracer no longer branches (in a way visible to Crosshair) on whether sys.settrace() is active.
- New span_start() and span_end() methods, to make recognising recursive strategies easier.
  - Not yet tested, but that's why the PR is marked experimental.
- Experimental engine change which I think should let us raise Unsatisfiable after an alternative backend gives up.
  - I'm not sure of this though, and would appreciate a pointer to the affected tests so that we can get this behavior deliberately checked by our test suite.

tybug · 2024-11-10T02:48:57Z

hypothesis-python/src/hypothesis/internal/conjecture/data.py

@@ -1334,6 +1334,29 @@ def draw_bytes(
    ) -> bytes:
        raise NotImplementedError

+    def span_start(self, label: int, /) -> None:


Big fan of the "span" terminology, over "examples".

In the future we should probably provide a more meaningful representation of the strategy than an integer label, so providers can do particular things for e.g. lists. Pass through the strategy instance wholesale, for examples with a corresponding strategy? I'd guess providers are unlikely to check anything beyond type(strategy) is ListStrategy, but could also check attributes like strategy.min_size in principle.

(maybe we should rename the existing code to match? big diff but it'd make it easier to work with the internals...)

The PrimitiveProvider interface is explicitly unstable for now, so we can change the label type when we get around to that ourselves. Nonetheless IMO passing the strategy instance exposes too much accidental complexity and implementation detail; I do want to at least try evolving towards a clean well-abstracted interface.

Pass the strategy type + relevant type-specific kwargs as a dict?

def span_start(self, label: int, strategy: type[SearchStrategy] | None, strategy_kwargs: dict[str, Any]) -> None: ...

Agree we can just change it later though, not something we have to decide now. And the label will likely have to stay forever because not all spans are associated with a strategy (eg stateful rules).

-0 on a patch in the near future for selfish reasons; I have some local work towards excising all buffer usages, which is very large and would require conflict resolution. But I wouldn't stop such a pull.
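
For readers following the span discussion, here is a minimal sketch of how a backend might use paired span_start() / span_end() calls to notice recursive strategies. It is a standalone toy and does not subclass the real PrimitiveProvider; the SpanRecorder and Span names, the recursive_labels() helper, and the discard argument on span_end() are assumptions made for illustration (only the span_start(self, label: int, /) signature appears in the diff above).

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    label: int
    depth: int
    children: list["Span"] = field(default_factory=list)
    discarded: bool = False


class SpanRecorder:
    """Hypothetical stand-in for a backend; not a real PrimitiveProvider."""

    def __init__(self) -> None:
        self.roots: list[Span] = []
        self._stack: list[Span] = []

    def span_start(self, label: int, /) -> None:
        # Open a span nested inside whichever span is currently open.
        span = Span(label=label, depth=len(self._stack))
        siblings = self._stack[-1].children if self._stack else self.roots
        siblings.append(span)
        self._stack.append(span)

    def span_end(self, discard: bool = False, /) -> None:
        # Assumption: `discard` is an illustrative flag, not taken from the PR.
        span = self._stack.pop()
        span.discarded = discard

    def recursive_labels(self) -> set[int]:
        """Labels that occur nested inside a span carrying the same label."""
        found: set[int] = set()

        def walk(span: Span, ancestors: frozenset) -> None:
            if span.label in ancestors:
                found.add(span.label)
            for child in span.children:
                walk(child, ancestors | {span.label})

        for root in self.roots:
            walk(root, frozenset())
        return found


# Label 1 is opened again inside itself, as a recursive strategy would do.
rec = SpanRecorder()
rec.span_start(1)
rec.span_start(2)
rec.span_start(1)
rec.span_end()
rec.span_end()
rec.span_end()
recursive = rec.recursive_labels()
```

Because the label is just an integer, the recorder can only say that the same span kind recurs, not which strategy produced it; that is the limitation the strategy-type-plus-kwargs proposal above is aimed at.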

pschanely · 2024-11-11T13:33:42Z

> experimental engine change which I think should let us raise Unsatisfiable after an alternative backend gives up.
>
> I'm not sure of this though, and would appreciate a pointer to the affected tests so that we can get this behavior deliberately checked by our test suite.

Thanks for this! Yeah, this sort of change looks like the kind of thing that would help, but my example cases aren't yet passing. They are test_erroring_rewrite_unsatisfiable_filter, test_unsat_filtered_sampling, and test_unsat_filtered_sampling_in_rejection_stage. It's also possible that my diagnosis is just wrong. I'll double-check today.

pschanely · 2024-11-12T01:15:24Z

> > experimental engine change which I think should let us raise Unsatisfiable after an alternative backend gives up.
> >
> > I'm not sure of this though, and would appreciate a pointer to the affected tests so that we can get this behavior deliberately checked by our test suite.
>
> Thanks for this! Yeah, this sort of change looks like the kind of thing that would help, but my example cases aren't yet passing. They are test_erroring_rewrite_unsatisfiable_filter, test_unsat_filtered_sampling, and test_unsat_filtered_sampling_in_rejection_stage. It's also possible that my diagnosis is just wrong. I'll double-check today.

Ok! What you have is almost enough; however I also need BackendCannotProceed exception cases to count as INVALID; something like this?:

diff --git a/hypothesis-python/src/hypothesis/internal/conjecture/engine.py b/hypothesis-python/src/hypothesis/internal/conjecture/engine.py
index d12ad4b5b..180ac3f81 100644
--- a/hypothesis-python/src/hypothesis/internal/conjecture/engine.py
+++ b/hypothesis-python/src/hypothesis/internal/conjecture/engine.py
@@ -444,6 +444,7 @@ class ConjectureRunner:
             interrupted = True
             raise
         except BackendCannotProceed as exc:
+            data.status = Status.INVALID
             if exc.scope in ("verified", "exhausted"):
                 self._switch_to_hypothesis_provider = True
                 if exc.scope == "verified":

This solution would imply that we're declaring that exhausted/verified should be raised outside of a normal execution. But I think we're ok with that?

Zac-HD · 2024-11-18T05:48:20Z

> Ok! What you have is almost enough; however I also need BackendCannotProceed exception cases to count as INVALID

Hmm. It's not really Status.INVALID either; that would imply that we've had a filter/assume predicate reject this input - instead I've tried using the interrupted = True code path that we have for KeyboardInterrupt to indicate that we should just throw out that execution entirely; that should work but I'm honestly not sure whether it does.

pschanely · 2024-11-18T19:18:46Z

> > Ok! What you have is almost enough; however I also need BackendCannotProceed exception cases to count as INVALID
>
> Hmm. It's not really Status.INVALID either; that would imply that we've had a filter/assume predicate reject this input - instead I've tried using the interrupted = True code path that we have for KeyboardInterrupt to indicate that we should just throw out that execution entirely; that should work but I'm honestly not sure whether it does.

Sounds good. I will franken-merge the things together, apply electricity, and report back.

pschanely · 2024-11-20T00:36:44Z

> Hmm. It's not really Status.INVALID either; that would imply that we've had a filter/assume predicate reject this input - instead I've tried using the interrupted = True code path that we have for KeyboardInterrupt to indicate that we should just throw out that execution entirely; that should work but I'm honestly not sure whether it does.

Makes sense. However, this tweak is not solving the issue for me right now. Setting interrupted = True does skip some logic, but because we aren't actually raising an exception, much of ConjectureRunner.test_function still runs, including (critically) the line that increments self.valid_examples. Adding an explicit return statement at the bottom of the except BackendCannotProceed block seems to make it work, but I have very low confidence in my ability to suggest the correct fix in this situation!
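
To illustrate the control-flow issue described above, here is a toy model rather than the actual ConjectureRunner code. ToyRunner, its early_return flag, switch_to_default_provider, and backend_gives_up are hypothetical names, and this BackendCannotProceed is a local stand-in for the real exception. The sketch only shows why handling the exception without returning still lets the execution be counted as a valid example.

```python
class BackendCannotProceed(Exception):
    """Local stand-in for the real exception; carries a `scope` string."""

    def __init__(self, scope: str = "other") -> None:
        super().__init__(scope)
        self.scope = scope


class ToyRunner:
    """Hypothetical, heavily simplified model of a test-running loop."""

    def __init__(self, early_return: bool) -> None:
        self.early_return = early_return
        self.valid_examples = 0
        self.switch_to_default_provider = False

    def test_function(self, run) -> None:
        try:
            run()
        except BackendCannotProceed as exc:
            if exc.scope in ("verified", "exhausted"):
                self.switch_to_default_provider = True
            if self.early_return:
                # Throw this execution away: skip all bookkeeping below.
                return
        # Bookkeeping reached by every execution that does not return early.
        self.valid_examples += 1


def backend_gives_up() -> None:
    raise BackendCannotProceed("exhausted")


without_fix = ToyRunner(early_return=False)
without_fix.test_function(backend_gives_up)

with_fix = ToyRunner(early_return=True)
with_fix.test_function(backend_gives_up)
```

In this model the runner without the early return ends with valid_examples == 1 even though the backend gave up, while the runner with the early return stays at 0. If a runner decides whether to raise Unsatisfiable based on how few valid examples it has seen (an assumption about the real engine), that spurious count would be enough to hide the error.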

Zac-HD · 2024-11-28T08:38:08Z

Looks like the early return was indeed the key; with some minor tweaks I've confirmed that this fixes the can't-raise-Unsatisfiable problem we've had 🎉

Once you confirm, I'll merge!

pschanely · 2024-11-29T21:12:46Z

> Looks like the early return was indeed the key; with some minor tweaks I've confirmed that this fixes the can't-raise-Unsatisfiable problem we've had 🎉
>
> Once you confirm, I'll merge!

Yes! These work for me too!

Zac-HD added the "interop" label (how to play nicely with other packages) Nov 10, 2024

tybug reviewed Nov 10, 2024

This was referenced Nov 10, 2024

Various small fixes #4165

Merged

Rename some internals for clarity #4166

Open

Zac-HD force-pushed the report-spans branch from 4fe44cc to 5d768c2 Compare November 10, 2024 05:19

Zac-HD force-pushed the report-spans branch from 5d768c2 to cc6faf0 Compare November 18, 2024 05:46

Zac-HD added 2 commits November 28, 2024 00:24

experiment: track spans

eed8693

raise Unsatisfiable despite alt backend

8ebe5b8

Zac-HD force-pushed the report-spans branch from cc6faf0 to 8ebe5b8 Compare November 28, 2024 08:24

Zac-HD marked this pull request as ready for review November 29, 2024 22:57

Zac-HD merged commit 1e91394 into HypothesisWorks:master Nov 29, 2024
49 checks passed

Zac-HD deleted the report-spans branch November 29, 2024 22:58
