-
-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Fix] Fixed memory baviour in hrv calc #837
Conversation
Fixed function arguments for lower level functions in ppg_intervalrelated.py, such that no weird memory behaviour occurs.
Thanks for opening this pull request! We'll make sure it's perfect before merging 🤗 |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## dev #837 +/- ##
==========================================
+ Coverage 53.88% 54.06% +0.18%
==========================================
Files 295 295
Lines 13811 13809 -2
==========================================
+ Hits 7442 7466 +24
+ Misses 6369 6343 -26
☔ View full report in Codecov by Sentry. |
I am expecting this to fail with the suggested change because when a dictionary is passed as input, the "output" argument is used
Hi @ThomasGebhardt , thank you for this! I added a test showing how the suggested changes are incompatible with having a dictionary as input. Would you be able to add modifications so that the test passes? |
Fixed the error message, that too many arguments are passed.
Ran into other problems - changed quite a few things as a consequence. - Updated internals' checks to more "pythonic" way - Updated intervals for dicts such that they don't overwrite each other anymore (add label info and rate calculation that is) - Updated such that the weird memory behaviour is gone, when gating hrv parameter calculation for performance improvement - Updated type casting in ppg_intervalrelated internals to utilize pandas native functions for performance increase (and easier to read code) - Added type hints to internal functions such that IDEs actually know what the function gets as input, enabling suggestions and documentation
Previous version created a bug, in which the calculated hrv parameters are entered as dicts into the overall ppg_intervalls df - which miraculously did not conflict with any tests.
Hey @danibene, should be good to go now - hit some road bumps on the way, changed a bit more than intended, but also works now properly, as far as I can see it (at least I didn't find anything why it shouldn't and the tests check out). |
neurokit2/ppg/ppg_intervalrelated.py
Outdated
@@ -97,27 +95,28 @@ def ppg_intervalrelated(data, sampling_rate=1000): | |||
# ============================================================================= | |||
|
|||
|
|||
def _ppg_intervalrelated_formatinput(data, output={}): | |||
|
|||
def _ppg_intervalrelated_formatinput(data: pd.DataFrame): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason you added a type annotation here? I do think that type annotations are generally good to have but it's inconsistent with the rest of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but curious about that annotation as well.
In general yes, I also think type annotations are good, although NK might not be a priority repo to add them, as in general we are pretty flexible by design regarding the type of input (i.e., we try to accommodate potential user "mistakes")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is indeed more of a convenience for the "deep-diving" user - such that the employed IDE gets what the incoming data type is and thus enables the suggestion features for the variable. Regarding flexibility, the functions were before and still are only written to handle data as a pandas DataFrame as that is what is created from ppg_process beforehand.
Me personally, I love using type hints as it makes using functions clearer and easier - and they also help with getting back into code that I did not work on for some time a lot faster - so maybe I simply added it from a force of habit as well.
If pure functionality and consistency with the rest of the code are the goals, the type hint can be removed without consequence. Let me know, if you want me to drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a big experience with type hints, but I can see their value.
I guess either:
- we remove it here for consistency for now (and eventually someday working on an update by type hitting the whole codebase)
- We leave it and we implicitly start slowly type-hitting here and there whenever we do some code updates (i.e., the progress will be slow and random).
@danibene what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a big experience with type hints, but I can see their value.
I guess either:
- we remove it here for consistency for now (and eventually someday working on an update by type hitting the whole codebase)
- We leave it and we implicitly start slowly type-hitting here and there whenever we do some code updates (i.e., the progress will be slow and random).
@danibene what do you think?
I also don't have much experience with type hints, but I'd err on the side of removing it for now.
One option to start with, to make sure we're not biting off more than we can chew, could be to create separate stub files like pandas https://github.com/pandas-dev/pandas-stubs
I recently started this on my own so I could use mypy in repositories using certain neurokit functions, if you'd like we could add this to the neuropsychology org and request that contributors of new functions contribute there too:
https://github.com/danibene/neurokit2-stubs/tree/dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't have much experience with type hints, but I'd err on the side of removing it for now.
Let's do that then
could be to create separate stub files
Oh boy it's the first time I hear about stubs 😵💫 Let me do some reading first and then we can see what to do regarding the infrastructure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ThomasGebhardt ! I just added a small comment on style
@DominiqueMakowski could you also please take a look before merging?
Hello,
while writing some custom code for nk2, I tried gating the calculation of the time-based, frequency-based and non-linear PPG HRV Parameters. While doing that I noticed that, when running the code with all three types of HRV parameters it worked flawlessly - as it is intended using ppg_analyze.
But setting any of the gates to false, did not work, if they have been true before without restarting the kernel. Looking into the issue by extensive testing, I found the following:
In both functions (
_ppg_intervalrelated_hrv(...)
and_ppg_intervalrelated_formatinput(...)
), the return variable is also in the arguments list for the functions (i.e.output
):This leads to some really weird memory behaviour, as it seems the output variable within these functions is not properly destroyed (or at least reset), which is not noticeable if the content of the variable is always the same. It is most noticeable, when the amount of columns / keys within
output
changes between runs.The fix is as trivial: just move the initialization of the output dictionary out of the arguments into the actual function body - done.
Cheers