refactor(disordered_tracing): Work with skan-0.12.2 #1052

ns-rse · 2024-12-16T13:43:16Z

Closes #986

With the release of skan-0.12.2 we can remove the code which converts - in the variable names of Pandas DataFrames to instead use _ which allows the use of "dot-notation" when referring to the variables of data frames.

We therefore unpin the skan version in pyproject.toml, explicitly state the separator="_" argument to skan.summarize() and adapt the hard-coded variable names appropriately. As noted in the release notes the default will be _ from v0.13.0 and the flag allows backwards compatibility but since we have gone with the use of _ in variable names it will do no harm leaving these in place (although they could be removed when skan-v0.13.0 is released).

In addressing #986 I tested in a fresh virtual environment. This pulled in matplotlib-3.10.0 which was released 2024-12-14 (two days ago!). Our dependency on topoly doesn't work with this new version because of the removal of inset_location.InsetPosition in favour of inset_axes.

This has been reported to the Topoly developers, see ImportError: cannot import name 'InsetPosition' from
'mpl_toolkits.axes_grid1.inset_locator' · Issue #5 · ilbsm/topoly_tutorial and hopefully won't take long to resolve. For now though we pin our dependency matplotlib~=3.9.4 and all tests pass.

Before submitting a Pull Request please check the following.

Existing tests pass.
Pre-commit checks pass.

Closes #986 With the release of `skan-0.12.2` we can remove the code which converts `-` in the variable names of Pandas DataFrames to instead use `_` which allows the use of "dot-notation" when referring to the variables of data frames. We therefore unpin the `skan` version in `pyproject.toml`, explicitly state the `separator="_"` argument to `skan.summarize()` and adapt the hard-coded variable names appropriately. As noted in the [release notes](https://github.com/jni/skan/releases/tag/v0.12.2) the default will be `_` from `v0.13.0` and the flag allows backwards compatibility but since we have gone with the use of `_` in variable names it will do no harm leaving these in place (although they could be removed when `skan-v0.13.0` is released).

In addressing #986 I tested in a fresh virtual environment. This pulled in [matplotlib-3.10.0](https://github.com/matplotlib/matplotlib/releases/tag/v3.10.0) which was released 2024-12-14 (two days ago!). Our dependency on [topoly](topoly.cent.uw.edu.pl/documentation.html) doesn't work with this new version because of the removal of `inset_location.InsetPosition` in favour of `inset_axes`. This has been reported to the Topoly developers, see [ImportError: cannot import name 'InsetPosition' from 'mpl_toolkits.axes_grid1.inset_locator' · Issue #5 · ilbsm/topoly_tutorial](ilbsm/topoly_tutorial#5) and hopefully won't take long to resolve. For now though we pin our dependency `matplotlib~=3.9.4` and all tests pass.

MaxGamill-Sheffield · 2024-12-16T15:17:37Z

Hey Neil, thanks for following up on this issue with the skan and topoly people!

I've not had time to test but does this then change any of the disordered_stats column names?
This might be an issue for the paper as we have a bunch of analysis scripts relying on certain names and it might be more of an overhead to change this in the v2.3.0 release along with all the analysis scripts than to cite the version once released.

ns-rse · 2024-12-16T15:33:15Z

I've not had time to test but does this then change any of the disordered_stats column names?

Too Long; Did not Read

No, the PR doesn't include any changes to test files where changes in column names would have an impact, ergo nothing in the scripts needs changing because none of the column names have actually changed.

Long; Did Read

Skan originally used - in column names to the dataframe that skan.summarize() produces.

This means that you can't use the notation df.<colname> when referring to columns that are produced (its a useful convenience to be able to do so).

In discussion with the skan developer it was agreed to switch this out and I made a PR to skan to use _ and this was incorporated into release skan-0.12.0, but there were some issues with data types so we had to pin skan~=0.11.0. I helped upstream fix these and they are in skan-0.12.2 release that came out a few days ago.

When disordered_tracing() was being refactored and the results of skan.summarize() were used I mentioned that there was this pending change coming and that we should use _ rather than - in column names.

This PR removes the line that made that change to the returned data frame (see line itself and associated comment) and also explicitly calls skan.summarize() with the separator="_" so that the returned data frame uses _ in the column names and we don't need to manually make the switch. A few other changes were required where comparisons to keys or arguments are hard coded.

Thus the output doesn't actually change at all, which is why all the tests pass without any need to modify any of the test files.

It also threw up the mentioned problem with Matplotlib 3.10.0 causing problems with Topoly which was pure chance but I've addressed it as otherwise the tests in Continuous Integration would pull in the new version and throw errors.

MaxGamill-Sheffield

Ace! In that case this looks clean and good to go, thanks Neil!

ns-rse added 2 commits December 16, 2024 13:32

ns-rse added the v2.3.0 label Dec 16, 2024

ns-rse added this to the v2.3.0 milestone Dec 16, 2024

ns-rse assigned llwiggins, MaxGamill-Sheffield and SylviaWhittle Dec 16, 2024

ns-rse added DNATracing Issues pertaining to the DNATracing class v2.3.0 and removed v2.3.0 labels Dec 16, 2024

Merge branch 'main' into ns-rse/986-update-skan-separator

4d235a5

MaxGamill-Sheffield approved these changes Dec 16, 2024

View reviewed changes

ns-rse added this pull request to the merge queue Dec 16, 2024

Merged via the queue into main with commit 5ec56ec Dec 16, 2024
11 checks passed

ns-rse deleted the ns-rse/986-update-skan-separator branch December 16, 2024 16:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(disordered_tracing): Work with skan-0.12.2 #1052

refactor(disordered_tracing): Work with skan-0.12.2 #1052

ns-rse commented Dec 16, 2024 •

edited

Loading

MaxGamill-Sheffield commented Dec 16, 2024

ns-rse commented Dec 16, 2024 •

edited

Loading

MaxGamill-Sheffield left a comment

refactor(disordered_tracing): Work with skan-0.12.2 #1052

refactor(disordered_tracing): Work with skan-0.12.2 #1052

Conversation

ns-rse commented Dec 16, 2024 • edited Loading

MaxGamill-Sheffield commented Dec 16, 2024

ns-rse commented Dec 16, 2024 • edited Loading

Too Long; Did not Read

Long; Did Read

MaxGamill-Sheffield left a comment

Choose a reason for hiding this comment

ns-rse commented Dec 16, 2024 •

edited

Loading

ns-rse commented Dec 16, 2024 •

edited

Loading