
Allow users to run one tailed experiments #137

Merged — 7 commits merged into main from tailed-experiments on Jan 12, 2024
Conversation

@Gabrielcidral1 (Collaborator) commented Dec 20, 2023

This PR allows users to run one-tailed power analyses. Currently, only two-tailed analyses are supported.

This is the rationale for the adaptation:

If the actual difference (effect) went in the predicted direction:

  • The one-tailed p-value is half the two-tailed p-value. So if the two-tailed p-value is 0.1, the one-tailed p-value is 0.05.
  • Equivalently, the two-tailed p-value is twice the one-tailed p-value.

If the actual difference (effect) went opposite to the predicted direction:

  • The one-tailed p-value is one minus half the two-tailed p-value. So if the two-tailed p-value is 0.1, the one-tailed p-value is 0.95.
  • Equivalently, the two-tailed p-value is twice one minus the one-tailed p-value.

From my perspective, the best approach is for the user to input the desired direction of one-tailed experiments (left or right, instead of just 'one-tailed'). If we wanted to accept a plain 'one-sided' option, we could guess the side from the sign of the average_effect input; however, that would be a challenge, because with the current design the analysis class doesn't have the effect information. A minimal sketch of the transformation is shown below.
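A minimal sketch of that rationale as a standalone helper (the function name, signature, and predicted_sign convention are illustrative, not this PR's actual API):

```python
def one_tailed_pvalue(two_tailed_pvalue: float, effect: float,
                      predicted_sign: int) -> float:
    """Convert a two-tailed p-value into a one-tailed one.

    predicted_sign is +1 if a positive effect was predicted, -1 if negative.
    """
    if effect * predicted_sign >= 0:
        # Effect went in the predicted direction: halve the p-value.
        return two_tailed_pvalue / 2
    # Effect went opposite to the prediction: one minus half the p-value.
    return 1 - two_tailed_pvalue / 2
```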

@Gabrielcidral1 changed the title from "Allow users to run one tailed experiments" to "WIP - Allow users to run one tailed experiments" on Dec 20, 2023
@Gabrielcidral1 marked this pull request as ready for review December 20, 2023 17:27
@Gabrielcidral1 (Author):
Why can't I see the tests? Do I need to deploy the branch?

@Gabrielcidral1 marked this pull request as draft December 20, 2023 17:31
```diff
@@ -14,5 +14,5 @@ def test_binary_treatment():

 def test_get_pvalue():
     analysis_df_full = pd.concat([analysis_df for _ in range(100)])
-    analyser = OLSAnalysis()
+    analyser = OLSAnalysis(hypothesis="left_tailed")
```
@david26694 (Owner):
I'd add a separate test, and I'd test the functionality of the p-value transformer itself

@david26694 (Owner):
as in, check that it divides by 2, or applies the opposite-direction transformation, when necessary

```diff
@@ -12,7 +12,7 @@ repos:
     rev: 22.12.0
     hooks:
       - id: black
-        language_version: python3.8
+        language_version: python3.9
```
@david26694 (Owner):
Is this for your local development, or is it because of GitHub Actions?

@david26694 (Owner) commented Dec 21, 2023

Another comment: it's a bit hard for me to remember what left- and right-sided tests mean. I'd document very clearly which sign of the effect each side corresponds to, perhaps even adding a notebook to the docs that shows the power of the 3 sides for a similar setting (a rough sketch of such a comparison is below).
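A rough sketch of the kind of comparison such a notebook could show, via Monte Carlo simulation of a t-test's power under each alternative (the effect size, sample sizes, and helper name are all illustrative; scipy.stats.ttest_ind has supported the alternative argument since SciPy 1.6):

```python
import numpy as np
from scipy import stats

def simulated_power(alternative: str, effect: float = 0.2, n: int = 200,
                    n_sims: int = 2000, alpha: float = 0.05) -> float:
    """Fraction of simulated A/B tests where the t-test rejects at level alpha."""
    rng = np.random.default_rng(0)
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n)
        treatment = rng.normal(effect, 1.0, n)
        _, p_value = stats.ttest_ind(treatment, control, alternative=alternative)
        rejections += p_value < alpha
    return rejections / n_sims

# With a positive true effect, 'greater' should be the most powerful,
# 'two-sided' slightly less so, and 'less' should almost never reject.
for alt in ("two-sided", "less", "greater"):
    print(alt, simulated_power(alt))
```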

@Gabrielcidral1 (Author):

> Another comment: it's a bit hard for me to remember what left- and right-sided tests mean. I'd document very clearly which sign of the effect each side corresponds to, perhaps even adding a notebook to the docs that shows the power of the 3 sides for a similar setting.

I found a better way to do this. We can replicate scipy's approach:

alternative : {'two-sided', 'less', 'greater'}, optional
    Defines the alternative hypothesis.
    The following options are available (default is 'two-sided'):

    * 'two-sided': the means of the distributions underlying the samples
      are unequal.
    * 'less': the mean of the distribution underlying the first sample
      is less than the mean of the distribution underlying the second
      sample.
    * 'greater': the mean of the distribution underlying the first
      sample is greater than the mean of the distribution underlying
      the second sample.
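For reference, a small usage sketch of SciPy's alternative argument (the simulated data here is made up; ttest_ind is a real SciPy function, and alternative is available from SciPy 1.6 onward):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(0.0, 1.0, size=500)
treatment = rng.normal(0.2, 1.0, size=500)

# 'two-sided': are the means unequal?
_, p_two_sided = stats.ttest_ind(treatment, control, alternative="two-sided")
# 'greater': is the mean of the first sample greater than the second's?
_, p_greater = stats.ttest_ind(treatment, control, alternative="greater")

# The observed effect is positive here, so p_greater is half of p_two_sided.
print(p_two_sided, p_greater)
```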

```python
elif self.hypothesis == "greater":
    p_value = p_value_half if treatment_effect >= 0 else 1 - p_value_half
elif self.hypothesis == "two-sided":
    p_value = model_result.pvalues[self.treatment_col]
```
@david26694 (Owner):
If we do this, I understand we are not using the enum anymore, right? Then I would remove the Enum code.

@david26694 (Owner):

also, an else clause is missing here that raises an error
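A self-contained sketch of that branch with the missing else clause added (select_pvalue and its arguments are illustrative stand-ins for the internals of pvalue_based_on_hypothesis):

```python
def select_pvalue(hypothesis: str, p_value_half: float,
                  treatment_effect: float, two_sided_pvalue: float) -> float:
    # Mirrors the branching above, plus the suggested else clause.
    if hypothesis == "less":
        return p_value_half if treatment_effect <= 0 else 1 - p_value_half
    elif hypothesis == "greater":
        return p_value_half if treatment_effect >= 0 else 1 - p_value_half
    elif hypothesis == "two-sided":
        return two_sided_pvalue
    else:
        raise ValueError(
            f"{hypothesis} is not a valid hypothesis; "
            "use 'less', 'greater' or 'two-sided'."
        )
```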

```diff
@@ -612,7 +647,9 @@ def analysis_pvalue(self, df: pd.DataFrame, verbose: bool = False) -> float:
         if verbose:
             print(results_mlm.summary())

-        return results_mlm.pvalues[self.treatment_col]
+        p_value = self.pvalue_based_on_hypothesis(results_mlm)
```
@david26694 (Owner):
Suggested change

for consistency



```python
@pytest.mark.parametrize("hypothesis", ["one_sided", "two_sided"])
def test_get_pvalue_hypothesis(hypothesis):
```
@david26694 (Owner):
I would add tests for the pvalue_based_on_hypothesis method too, since it has some logic
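For instance, tests along these lines, reusing the illustrative select_pvalue helper sketched earlier as a stand-in for pvalue_based_on_hypothesis:

```python
import pytest

def test_greater_with_positive_effect_halves_pvalue():
    # Effect in the hypothesised direction: one-tailed p is half the two-tailed p.
    assert select_pvalue("greater", p_value_half=0.05, treatment_effect=1.0,
                         two_sided_pvalue=0.1) == pytest.approx(0.05)

def test_greater_with_negative_effect_flips_pvalue():
    # Effect opposite to the hypothesised direction: one-tailed p is 1 - p/2.
    assert select_pvalue("greater", p_value_half=0.05, treatment_effect=-1.0,
                         two_sided_pvalue=0.1) == pytest.approx(0.95)

def test_unknown_hypothesis_raises():
    with pytest.raises(ValueError):
        select_pvalue("sideways", p_value_half=0.05, treatment_effect=1.0,
                      two_sided_pvalue=0.1)
```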

@Gabrielcidral1 changed the title from "WIP - Allow users to run one tailed experiments" to "Allow users to run one tailed experiments" on Dec 27, 2023
@codecov-commenter commented Dec 27, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison: base (094cf8c) 96.99% vs. head (2a85493) 96.96%.

Files                                        Patch %   Lines
cluster_experiments/experiment_analysis.py   96.00%    1 Missing ⚠️


Additional details and impacted files
@@            Coverage Diff             @@
##             main     #137      +/-   ##
==========================================
- Coverage   96.99%   96.96%   -0.03%     
==========================================
  Files           9        9              
  Lines         864      890      +26     
==========================================
+ Hits          838      863      +25     
- Misses         26       27       +1     


@Gabrielcidral1 marked this pull request as ready for review December 28, 2023 09:25
@david26694 (Owner) left a comment:
Great job! Some things missing:

  • There's some commented code in the notebook, I'd remove it
  • You need to add the notebook in mkdocs.yml
  • You need to increase the library version (I'd change from 0.10.2 to 0.11.0 or something like this)

@david26694 merged commit 2024738 into main on Jan 12, 2024
4 checks passed
@david26694 deleted the tailed-experiments branch January 12, 2024 17:37