Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for match ... case #60

Merged
merged 30 commits into from
May 30, 2024
Merged

Add support for match ... case #60

merged 30 commits into from
May 30, 2024

Conversation

prsabahrami
Copy link
Contributor

Adding support for match ... case in polatify by converting match ... case into a nested ast.If.

@pavelzw
Copy link
Member

pavelzw commented Mar 20, 2024

@prsabahrami could you please take care of the merge conflicts?

@prsabahrami
Copy link
Contributor Author

@pavelzw It should be fixed now.

Copy link

codecov bot commented Mar 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.10%. Comparing base (97fb1da) to head (701b0d0).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
+ Coverage   95.27%   98.10%   +2.83%     
==========================================
  Files           2        2              
  Lines         148      211      +63     
==========================================
+ Hits          141      207      +66     
+ Misses          7        4       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pavelzw
Copy link
Member

pavelzw commented Mar 20, 2024

Thanks for your effort! A couple of notes:


The test failures are because of a new pixi version, #61 fixed them 😅 could you merge main again please? i saw you already handled that 😄


match ... case is not 100% covered yet, for example this function doesn't work:

def match_with_or(x):
    match x:
        case 0 | 1:
            return 0
        case 2:
            return 2 * x
        case 3:
            return 3 * x
    return x

Also, I personally would prefer match case being translated into consecutively chained pl.when(...).then(...).when(...).then(...)...otherwise(...) as this matches the original AST with match ... case better.

For example:

def match_signum(x):
    s = 0
    match x:
        case 0:
            s = 1
        case 2:
            s = -1
        case 3:
            s = 0
    return s

# gets transformed right now to
def match_signum_polarified(x):
    import polars as pl
    return pl.when(x == 0).then(1).otherwise(pl.when(x == 2).then(-1).otherwise(pl.when(x == 3).then(0).otherwise(0)))

# imo this would be better
def match_signum_polarified(x):
    import polars as pl
    return pl.when(x == 0).then(1)\
        .when(x == 2).then(-1)\
        .when(x == 3).then(0)\
        .otherwise(0) # implicit otherwise case because no `case _` is given

It would also be nice to have a test case for the match _ case.


Additionally we either need to implement some logic to differentiate between python versions and only include this test for python >=3.10 or just drop python 3.9 support altogether. Thoughts @0xbe7a?

@pavelzw pavelzw added the enhancement New feature or request label Mar 20, 2024
@0xbe7a
Copy link
Contributor

0xbe7a commented Mar 20, 2024

I think we should still support py3.9

@prsabahrami
Copy link
Contributor Author

Thank you very much for the feedback!

I did merge the main and it seems like we still have some errors.

Also, good idea to tranform it into consecutively chained pl.when(...).then(...).when(...).then(...)...otherwise(...) I'll try tp implement match _ case as well as the consecutively chained translation today!

@pavelzw
Copy link
Member

pavelzw commented Mar 20, 2024

I did merge the main and it seems like we still have some errors.

This is because of python3.9 not supporting match ... case. You can ignore that for now and focus on implementing the missing features. Afterwards, I can assist you with adding py39 support.

The py312 test should work, though: pixi run -e py312 test

@prsabahrami
Copy link
Contributor Author

I have added both functinalities both match _ case and support for below example:

def match_with_or(x):
    match x:
        case 0 | 1:
            return 0
        case 2:
            return 2 * x
        case 3:
            return 3 * x
    return x

Note: I have changed the Conditional State s.t. it contains a list of "test" and a list of "then". This will affect If's functionality as well. I could have created an additional State. I wasn't sure which one would be more suitable to your future needs and requirements but it can easily be updated.
Also, the example given above will be translated to:

def match_with_or_polarified(x):
    import polars as pl
    return pl.when((x == 0) |  (x == 1)).then(0).when(x == 2).then(2 * x).when(x == 3).then(3 * x).otherwise(x)

I believe the extra parentheses can be removed via unparsing and parsing and unparsing the result. However, let me know if you can think of any better solution!

@0xbe7a
Copy link
Contributor

0xbe7a commented Mar 21, 2024

Looks great, little nit-pick: Instead of test: Sequence[ast.expr], then: Sequence[State] we can use Sequence[tuple[ast.Expr, State]] instead, as this would provide a little more type safety.

@pavelzw
Copy link
Member

pavelzw commented Mar 21, 2024

To handle the python 3.9 case, you can do something like this:


In the code:

import sys
PY_39 = (sys.version_info <= 3.9)

# every time you reference ast.Match / ast.MatchOr...
if not PY_39:
    # do something

I think type hints like def translate_match(self, subj, stmt: ast.MatchValue | ast.MatchOr) are not evaluated at runtime and thus okay.


In the tests:

Create a separate file functions_310.py

# functions_310.py
def match_bla(...):
    # ...

functions_310 = [
    match_bla,
    match_bla2,
]
# functions.py
import sys
if sys.version_info >= (3, 10):
    from .functions_310 import functions_310
else:
    functions_310 = []


# ...
functions = [
    signum,
    # ...
    *functions_310
]

The relevant part is that python doesn't interpret the 3.10 style functions while importing. Thus, they need to lie in a separate file that is not scanned by pytest (i.e. doesn't start with test).


Also, it would be nice if you switched the default python version for all polars version tests to py310 s.t. we can include all tests for all polars versions:

# pixi.toml
[environments]
default = ["test"]
-pl014 = ["pl014", "py39", "test"]
-pl015 = ["pl015", "py39", "test"]
-pl016 = ["pl016", "py39", "test"]
-pl017 = ["pl017", "py39", "test"]
-pl018 = ["pl018", "py39", "test"]
-pl019 = ["pl019", "py39", "test"]
-pl020 = ["pl020", "py39", "test"]
+pl014 = ["pl014", "py310", "test"]
+pl015 = ["pl015", "py310", "test"]
+pl016 = ["pl016", "py310", "test"]
+pl017 = ["pl017", "py310", "test"]
+pl018 = ["pl018", "py310", "test"]
+pl019 = ["pl019", "py310", "test"]
+pl020 = ["pl020", "py310", "test"]

change this in pixi.toml and do pixi install afterwards.

@pavelzw
Copy link
Member

pavelzw commented Mar 21, 2024

I found another non-covered case that throws a bad error message 😅

def match_a(x):
    y = 0
    match x, y:
        case 1, 0:
            return 4
        case _:
            return 5
E               AttributeError: 'Tuple' object has no attribute 'id'

polarify/main.py:198: AttributeError

Either we also support this case or we raise a more helpful warning.

@prsabahrami
Copy link
Contributor Author

I haven’t yet added the support for all ast.MatchSequence() or ast.MatchMapping(). I will try to add support for all cases and the change you suggested by today!

@prsabahrami
Copy link
Contributor Author

I have made the change from test: Sequence[ast.expr], then: Sequence[State] to Sequence[tuple[ast.Expr, State]]. Also, I have added support and tests for py 3.9.

Few notes: I have added support for below examples:

def match_a(x):
    match x:
        case 1, 0:
            return 4
        case _:
            return 5

and

def match_a(x):
    match x:
        case [1, 0]:
            return 4
        case _:
            return 5

and

def match_a(x):
   match x:
       case [1, 0, *other]:
           return other
       case _:
           return 5

Regarding the above example: for now, it raises "starred patterns are not supported".
But to see its functionality comment the line 180 in main.py

 180        raise ValueError("starred patterns are not supported")

and uncomment the two below lines

 181        self.node.assignments[stmt.patterns[-1].name] = ast.Subscript(value=ast.Name(id=subj, ctx=ast.Load()), slice=ast.Slice(lower=ast.Constant(value=len(stmt.patterns) - 1)))
 182        left = ast.Subscript(value=ast.Name(id=subj, ctx=ast.Load()), slice=ast.Slice(upper=ast.Constant(value=len(stmt.patterns) - 1)))

Consider the below example:

def match_a_polarified(x):
    import polars as pl
    return pl.when(x[:2] == [1, 0]).then(x[2:]).otherwise(5)

where this gives error since this is not how polars slices lists. I will be updating the above example to return something as below:

def match_a_polarified(x):
    import polars as pl
    return pl.when(x.lower(2) == [1, 0]).then(x.slice(2, -1)).otherwise(5)

And I am adding the support for the below examples as well:

def match_a(x):
    y = 0
    match x, y:
        case 1, 0:
            return 4
        case _:
            return 5
def match_a(x):
    y = 3
    match x:
        case y if y > 5:
            return 1
        case _:
            return 5

@prsabahrami
Copy link
Contributor Author

I have added the below functionalities as well:

def match_with_guard(x):
    match x:
        case y if y > 5:
            return 1
        case _:
            return 5

will be translated to:

def match_with_guard_polarified(x):
   import polars as pl
   return pl.when(x > 5).then(1).otherwise(5)

However it still does not support lists yet and only supports single variable guard. (Warnings should be added!)
Also, it supports multi-variable match as below as well:

def match_a(x):
    y = 0
    match x, y:
        case 1, 0:
            return 4
        case _:
            return 5

will be translated to

def match_a_polarified(x):
   import polars as pl
   return pl.when((x == 1) & (0 == 0)).then(4).otherwise(5)

Please let me know if you find any bugs or cases missing!
Given I have added numerous cases, there are warnings associated with each of these cases which should be added.

@pavelzw
Copy link
Member

pavelzw commented Mar 27, 2024

Sorry for the delays, i'll get back to you soon, i'm just a bit busy right now 😓

@prsabahrami
Copy link
Contributor Author

Hello,

Was wondering if you could review the latest commit and approve?

Thanks

@prsabahrami
Copy link
Contributor Author

Increased the code coverage to 0.9575
Also, added some tests both for >= PY310 and PY39.
Everything should be fine now.

@prsabahrami
Copy link
Contributor Author

Added the below test for coverage of L329 - L333:

def multiple_match(x):
    match x:
        case 0:
            return 1
        case 1:
            return 2
    match x:
        case 0:
            return 3
        case 1:
            return 4
    return 5

Copy link
Contributor

@0xbe7a 0xbe7a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks good, i found some failing test-cases

polarify/main.py Show resolved Hide resolved
polarify/main.py Show resolved Hide resolved
polarify/main.py Show resolved Hide resolved
@prsabahrami
Copy link
Contributor Author

prsabahrami commented May 18, 2024

Correct me if I'm wrong: Based on my understanding python ignores cases where the number of subjects is not equal to the number of matching values.

def match_sequence_padded_length(x):
    y = 2
    z = None

    match x, y, z:
        case 1, 2:
            return 1
        case _:
            return x

essentially returns x.

I have fixed the failing cases and for the below tests I am raising ValueError given that testing does not support literal returns:

def match_sequence_padded_length_no_case(x):
   y = 2
   z = None

   match x, y, z:
       case 1, 2:
           return 1
       case _:
           return 2

@prsabahrami
Copy link
Contributor Author

I was wondering if you had the time to go over the changes? Any new suggestions or failing test cases?

@0xbe7a
Copy link
Contributor

0xbe7a commented May 23, 2024

No not yet, i hope we can take a look at it tomorrow @pavelzw

Copy link
Member

@pavelzw pavelzw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good, after the minor comments i think this is ready to merge 🥳

polarify/main.py Outdated
When a match statement has no valid cases (i.e., all cases except catch-all pattern are ignored),
we return the orelse expression but the test setup does not work with literal expressions.
"""
raise ValueError("No valid cases provided.")
Copy link
Member

@pavelzw pavelzw May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added support for these cases in 10c9753 (#60) in transform_tree_into_expr

polarify/main.py Outdated Show resolved Hide resolved
polarify/main.py Outdated Show resolved Hide resolved
Copy link
Member

@pavelzw pavelzw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good, after the minor comments i think this is ready to merge 🥳

@prsabahrami
Copy link
Contributor Author

I added the explanation comment in translate_match.
I believe we should be able to merge now.

@pavelzw
Copy link
Member

pavelzw commented May 30, 2024

Thanks for enduring this @prsabahrami 😅 match statements in python are a lot more involved than I initially thought 😄

@pavelzw pavelzw merged commit 9aa94fa into Quantco:main May 30, 2024
17 checks passed
@prsabahrami
Copy link
Contributor Author

Of course!
It was such a great experience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants