Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS Reviewer Issue #12

Closed
13 tasks done
behinger opened this issue Apr 30, 2024 · 9 comments
Closed
13 tasks done

JOSS Reviewer Issue #12

behinger opened this issue Apr 30, 2024 · 9 comments

Comments

@behinger
Copy link

behinger commented Apr 30, 2024

Hi!
nice software you have there. I will use this issue to keep track of problems I found on the way to use it.

  • the differences to pengouin are mentioned in the paper, but I would also add them to the readme under "related software" with a brief discussion of strengths of each
  • a compatible python version should be provided. I'm running my tests with python 3.12 current version
  • clarification: "with df that have a structure of WIDE-RANGE. " => is WIDE-RANGE the same as a tidy dataframe in R?
  • while the errormessage is generally good (it tells me what I need to do), I don't understand why I need to specify an index for the DF.
  • Search sm.howtouse("fgiure") for the function to draw pictures and graphs! => typo in figure
  • in the howtouse() output it would be nice to get an example how to actual use a method, e.g. something like sm.progress(method = 'ttest_ind', vars = 'age', group_vars = 'sex').figure() - (not for every method, just one general example to get an idea how to use it)
  • if I define an id="species", I cannot use it anymore as grouping variable
  • if I specify more than 2 groups in ttest_ind, I get a very cryptic error AxisError: axismust be an integer, a tuple of integers, orNone.
  • typo: Indenpendent in howtouse
  • if I run ttest_rel I get a KeyError: 's' if I use the wrong call syntax (which is impossible to find out from the REPL/jupyter notebook, you have to look into the actual manual).
  • automated testing: While in the docs in notion there are some printed comparisons against e.g. scipy, it is not clear whether these are run automatically (I didnt find the code for the documentation), or whether the author checked each for equivalence
  • in a spot-check I saw that the cohen's d calculation was implemented by the author, but I cannot confirm that there exists a test for these functions.
  • The JOSS Part " Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support" is not completed as of now (or I missed it)

I will continue later, but the unittesting is really important, especially in a stats-package

Linking back to (openjournals/joss-reviews/issues/6642)

@ckdckd145
Copy link
Owner

ckdckd145 commented May 19, 2024

@behinger

Hi! Thank you again for reviewing statmanager-kr !
Now I'm working on fixing the code, reivising the readme, and documentation based on your valuable feedback.
In particular, I'm working on the automated testing you pointed out (I'm not a developer by nature, so I wasn't aware of these procedures).

By the way, I was wondering if you could tell me directly the code where you got the error you mentioned in ttest_ind and ttest_rel? I've tried several times, but I'm having a hard time reproducing the error you said.

Please have a look if you have a chance and get in touch with me, thanks very much!

@ckdckd145
Copy link
Owner

@behinger

Hi!
I just want to say thank you again for your review.

As I said before, I'm still working on fixes to improve my software related to the feedback you mentioned.
You can always find my working in the dev branch of my repository.
Additionally, the official documentation and readme are being revised, which I hope you'll check out.
To follow up on the work myself, I'll be utilizing the checkboxes in the feedback you leave in this issue.
Only when everything is complete will I comment again so that you can see each fix and the intent behind it.

I apologize for the significant delay. I'm guessing you've been busy as well and haven't been able to give me the code I need to reproduce the error you comment, so I'm looking to fix the code to raise an error on unintended user errors across the board.
I am also continuing to write the code needed for automated testing with pytest.

Thanks!

@teonbrooks
Copy link
Contributor

thanks for the review @behinger. could you highlight where you saw the error for the ttest_ind and ttest_rel when you have a chance?

@behinger
Copy link
Author

behinger commented Jun 10, 2024

hey sorry - I missed the first bump. I think my issue was that running something like this:

sm.progress(method = 'ttest_ind', vars = 'age', group_vars = 'sex') # works
sm.progress(method = 'ttest_rel', vars = 'age', group_vars = 'sex') # doesnt work

results in a not-userfriendly error (now Keyerror 'a' instead of keyerror 's') (e.g. replacing the ttest_ind with ttest_rel), because it expects vars = and not group_vars

I was simply thinking everything is in tidy-data format (one row per observation), but o.c. one can also employ a wider format like here.

@ckdckd145
Copy link
Owner

@behinger

Hi! You don't have to be sorry. :)
Anyway, I finally understand what the not-user-friendly error you're talking about is, and I'll work on fixing the code right away.
Also, I'm working on a fix to prevent errors across the board and provide more user-friendly error messages based on the examples you mentioned.

Also, while the documentation explains why the index column should be set for the user (you can see the related paragraph in the documentation here), I will explain to you separately why the index column should be set in Pandas.DataFrame to use statmanager-kr.

Actually, many of the features of this software owe their origin to Scipy and Statsmodels, as you may know. The problem is that while Scipy is working in a way that is compatible with what I call a wide-frame dataframe (where the information obtained from each subject is arranged horizontally), Statsmodels is not. As far as I know, in order to use the functionality of Statsmodels, in many cases it is necessary to change the Pandas.DataFrame from 'wide-range' to 'tidy-data', which is usually referred to in database languages like SQL. In Statmanager-kr, this is done via methods like melt() in Pandas, but the problem is that pd.melt() always requires an index parameter, which means that in order to use the functionality of Statsmodels (because the dataframe must be reframed via melt() to compatible with Statsmodels), I need to obtain information about the column that serves as the index.

I'm not sure if I've explained this correctly, but I hope this helps address some of the comments you've made.
I'll keep you posted on my work

Thank you!

@behinger
Copy link
Author

ah, I must have missed the explanation, very reasonable, thanks for repeating for me

@ckdckd145
Copy link
Owner

@behinger

Hi!

Letting you know how your work is progressing.
I've almost finished creating the Python files for the automatic tests. The only thing left to do is to update and deploy statmanager-kr to ensure this process goes smoothly, and set up the automatic tests to run on the Github repository. Anyway, my plan is to have the process automatically check for compatible Python versions as well. I'm struggling a bit with this, as I've never done this on Github before. The python files I wrote for the automatic tests can be found in the dev branch.

By the way, I will add things related to the "Community guidelines" in readme file and documentation.

Thank you!

@ckdckd145
Copy link
Owner

@behinger

Hi!

I deeply appreciate your patience and wait.
Finally, I finalized the code for automatic testing, created a github action to run it, and saw all the tests pass. (I'm very excited)
Also, I add the available python versions validated via testing in readme.md.

As if that wasn't enough, I also added a contribute guideline to the documentation. I really hope these fixes will help you finalize your review.

Have a great day!
Thank you.

ref @teonbrooks

@behinger
Copy link
Author

behinger commented Jul 8, 2024

nice! I answered the JOSS issue. My points were addressed.

Maybe a small note: The "how to contribute" is one step dettached from the github readme in the notion file, maybe worth linking to that section from the readme immediately.

@behinger behinger closed this as completed Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants