Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

revpairwise = pairwise in R #100

Closed
3 tasks
vincentarelbundock opened this issue May 7, 2024 · 7 comments
Closed
3 tasks

revpairwise = pairwise in R #100

vincentarelbundock opened this issue May 7, 2024 · 7 comments

Comments

@vincentarelbundock
Copy link
Owner

vincentarelbundock commented May 7, 2024

@artiom-matvei could you check if hypothesis="revpairwise" and hypothesis="pairwise" give the same results in R and Python? If they don't, please:

  • Bring Python in conformity with R
  • Add a simple test to check row order
  • Write a line about breaking change in the NEWS.md

Simple example like:

avg_predictions(m, by = "three_category_variable", hypothesis = "revpairwise")
@artiom-matvei
Copy link
Contributor

artiom-matvei commented Nov 1, 2024

R code and output:

> library(marginaleffects)
> dat = read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")
> mod = glm(body_mass_g ~ flipper_length_mm * species * bill_length_mm + island, data = dat)
> avg_predictions(m, by = "species", hypothesis = "revpairwise")
Erreur : objet 'm' introuvable
> avg_predictions(mod, by = "species", hypothesis = "revpairwise")

               Term Estimate Std. Error       z Pr(>|z|)     S   2.5 % 97.5 %
 Gentoo - Adelie      1375.4       40.6  33.898   <0.001 834.3  1295.8   1455
 Chinstrap - Adelie     32.4       48.8   0.665    0.506   1.0   -63.2    128
 Chinstrap - Gentoo  -1342.9       50.5 -26.603   <0.001 515.6 -1441.9  -1244

Type:  response 
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 

> avg_predictions(mod, by = "species", hypothesis = "pairwise")

               Term Estimate Std. Error       z Pr(>|z|)     S 2.5 %  97.5 %
 Adelie - Gentoo     -1375.4       40.6 -33.898   <0.001 834.3 -1455 -1295.8
 Adelie - Chinstrap    -32.4       48.8  -0.665    0.506   1.0  -128    63.2
 Gentoo - Chinstrap   1342.9       50.5  26.603   <0.001 515.6  1244  1441.9

Type:  response 
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 

Python code and output:

import polars as pl
import statsmodels.formula.api as smf
from marginaleffects import *
penguins = pl.read_csv(
    "../tests/data/penguins.csv",
    null_values="NA",
).drop_nulls()
mod = smf.ols(
  "body_mass_g ~ flipper_length_mm * species * bill_length_mm + island",
  penguins.to_pandas(),
).fit()
print(avg_predictions(mod, by = "species", hypothesis = "revpairwise"))
print(avg_predictions(mod, by = "species", hypothesis = "pairwise"))

output

shape: (3, 8)
┌─────────┬──────────────┬───────────┬───────────┬─────────┬──────────┬──────────────┬─────────────┐
│ Term    ┆ Estimate     ┆ Std.Error ┆ z         ┆ P(>|z|) ┆ S        ┆ 2.5%         ┆ 97.5%       │
│ ---     ┆ ---          ┆ ---       ┆ ---       ┆ ---     ┆ ---      ┆ ---          ┆ ---         │
│ str     ┆ f64          ┆ f64       ┆ f64       ┆ f64     ┆ f64      ┆ f64          ┆ f64         │
╞═════════╪══════════════╪═══════════╪═══════════╪═════════╪══════════╪══════════════╪═════════════╡
│ Row 2 - ┆ 26.923852    ┆ 48.908509 ┆ 0.550494  ┆ 0.58198 ┆ 0.780957 ┆ -68.935065   ┆ 122.782769  │
│ Row 1   ┆              ┆           ┆           ┆         ┆          ┆              ┆             │
│ Row 3 - ┆ 1386.272591  ┆ 41.141626 ┆ 33.695134 ┆ 0.0     ┆ inf      ┆ 1305.636486  ┆ 1466.908697 │
│ Row 1   ┆              ┆           ┆           ┆         ┆          ┆              ┆             │
│ Row 3 - ┆ 1359.34874   ┆ 50.640886 ┆ 26.84291  ┆ 0.0     ┆ inf      ┆ 1260.094427  ┆ 1458.603052 │
│ Row 2   ┆              ┆           ┆           ┆         ┆          ┆              ┆             │
└─────────┴──────────────┴───────────┴───────────┴─────────┴──────────┴──────────────┴─────────────┘

Columns: term, estimate, std_error, statistic, p_value, s_value, conf_low, conf_high

shape: (3, 8)
┌─────────┬──────────────┬───────────┬────────────┬─────────┬──────────┬─────────────┬─────────────┐
│ Term    ┆ Estimate     ┆ Std.Error ┆ z          ┆ P(>|z|) ┆ S        ┆ 2.5%        ┆ 97.5%       │
│ ---     ┆ ---          ┆ ---       ┆ ---        ┆ ---     ┆ ---      ┆ ---         ┆ ---         │
│ str     ┆ f64          ┆ f64       ┆ f64        ┆ f64     ┆ f64      ┆ f64         ┆ f64         │
╞═════════╪══════════════╪═══════════╪════════════╪═════════╪══════════╪═════════════╪═════════════╡
│ Row 1 - ┆ -26.923852   ┆ 48.908509 ┆ -0.550494  ┆ 0.58198 ┆ 0.780957 ┆ -122.782769 ┆ 68.935065   │
│ Row 2   ┆              ┆           ┆            ┆         ┆          ┆             ┆             │
│ Row 1 - ┆ -1386.272591 ┆ 41.141626 ┆ -33.695134 ┆ 0.0     ┆ inf      ┆ -1466.90869 ┆ -1305.63648 │
│ Row 3   ┆              ┆           ┆            ┆         ┆          ┆ 7           ┆ 6           │
│ Row 2 - ┆ -1359.34874  ┆ 50.640886 ┆ -26.84291  ┆ 0.0     ┆ inf      ┆ -1458.60305 ┆ -1260.09442 │
│ Row 3   ┆              ┆           ┆            ┆         ┆          ┆ 2           ┆ 7           │
└─────────┴──────────────┴───────────┴────────────┴─────────┴──────────┴─────────────┴─────────────┘

Columns: term, estimate, std_error, statistic, p_value, s_value, conf_low, conf_high

@artiom-matvei
Copy link
Contributor

artiom-matvei commented Nov 2, 2024

After correction I get the following in Python. The order is different because it seems like python is ordering the rows of the dataframe that is used to create labels alphabetically whereas R does not order them. This causes the difference in the order of the labels.

shape: (3, 8)
┌────────────────────┬─────────────┬───────────┬───────────┬─────────┬──────────┬─────────────┬─────────────┐
│ Term               ┆ Estimate    ┆ Std.Error ┆ z         ┆ P(>|z|) ┆ S        ┆ 2.5%        ┆ 97.5%       │
│ ---                ┆ ---         ┆ ---       ┆ ---       ┆ ---     ┆ ---      ┆ ---         ┆ ---         │
│ str                ┆ f64         ┆ f64       ┆ f64       ┆ f64     ┆ f64      ┆ f64         ┆ f64         │
╞════════════════════╪═════════════╪═══════════╪═══════════╪═════════╪══════════╪═════════════╪═════════════╡
│ Chinstrap - Adelie ┆ 26.923852   ┆ 48.908509 ┆ 0.550494  ┆ 0.58198 ┆ 0.780957 ┆ -68.935065  ┆ 122.782769  │
│ Gentoo - Adelie    ┆ 1386.272591 ┆ 41.141626 ┆ 33.695134 ┆ 0.0     ┆ inf      ┆ 1305.636486 ┆ 1466.908697 │
│ Gentoo - Chinstrap ┆ 1359.34874  ┆ 50.640886 ┆ 26.84291  ┆ 0.0     ┆ inf      ┆ 1260.094427 ┆ 1458.603052 │
└────────────────────┴─────────────┴───────────┴───────────┴─────────┴──────────┴─────────────┴─────────────┘

@artiom-matvei
Copy link
Contributor

artiom-matvei commented Nov 2, 2024

@vincentarelbundock just a quick question? you mentionned writing a simple test to check row order. do you want to have the exact same terms in Python and R and the exact same order?
For example, if the output in Python and R are like below, it would not be acceptable?

Python

shape: (3, 8)
┌────────────────────┬─────────────┬───────────┬───────────┬─────────┬──────────┬─────────────┬─────────────┐
│ Term               ┆ Estimate    ┆ Std.Error ┆ z         ┆ P(>|z|) ┆ S        ┆ 2.5%        ┆ 97.5%       │
│ ---                ┆ ---         ┆ ---       ┆ ---       ┆ ---     ┆ ---      ┆ ---         ┆ ---         │
│ str                ┆ f64         ┆ f64       ┆ f64       ┆ f64     ┆ f64      ┆ f64         ┆ f64         │
╞════════════════════╪═════════════╪═══════════╪═══════════╪═════════╪══════════╪═════════════╪═════════════╡
│ Chinstrap - Adelie ┆ 26.923852   ┆ 48.908509 ┆ 0.550494  ┆ 0.58198 ┆ 0.780957 ┆ -68.935065  ┆ 122.782769  │
│ Gentoo - Adelie    ┆ 1386.272591 ┆ 41.141626 ┆ 33.695134 ┆ 0.0     ┆ inf      ┆ 1305.636486 ┆ 1466.908697 │
│ Gentoo - Chinstrap ┆ 1359.34874  ┆ 50.640886 ┆ 26.84291  ┆ 0.0     ┆ inf      ┆ 1260.094427 ┆ 1458.603052 │
└────────────────────┴─────────────┴───────────┴───────────┴─────────┴──────────┴─────────────┴─────────────┘

R ouput


               Term Estimate Std. Error       z Pr(>|z|)     S   2.5 % 97.5 %
 Gentoo - Adelie      1375.4       40.6  33.898   <0.001 834.3  1295.8   1455
 Chinstrap - Adelie     32.4       48.8   0.665    0.506   1.0   -63.2    128
 Chinstrap - Gentoo  -1342.9       50.5 -26.603   <0.001 515.6 -1441.9  -1244

Type:  response 
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 

@vincentarelbundock
Copy link
Owner Author

No the row orders need not be the same.

But the statistical contrast needs to be the same, ex: Blue-Red vs. Red-Blue.

@artiom-matvei
Copy link
Contributor

artiom-matvei commented Nov 2, 2024

What happens is that the order of the statistical contrast depends on the row order of the underlying dataframe.
For example

Python dataframe order

Adelie
Chinstrap
Gentoo

Python contrasts:

Chinstrap - Adelie
Gentoo - Adelie
Gentoo - Chinstrap

R dataframe order

Adelie
Gentoo
Chinstrap

R contrasts:

Gentoo - Adelie
Chinstrap - Adelie
Chinstrap - Gentoo

So if we want to maintain the same order between R and Python, we need to make sure that we use the same ordering of the original data dataframe.
Compare these two examples for seeing better the difference:

R output

  Term Estimate
 1 - 4     9.55
 2 - 4     6.61
 3 - 4     0.51
 6 - 4     3.91
 8 - 4    -0.79
 2 - 1    -2.94
 3 - 1    -9.04
 6 - 1    -5.64
 8 - 1   -10.34
 3 - 2    -6.10
 6 - 2    -2.70
 8 - 2    -7.40
 6 - 3     3.40
 8 - 3    -1.30
 8 - 6    -4.70

Python output

┌───────┬────────────┬───────────┬───────────┬─────────┬─────────┬──────────┬───────────┐
│ term  ┆ estimate   ┆ std_error ┆ statistic ┆ p_value ┆ s_value ┆ conf_low ┆ conf_high │
│ ---   ┆ ---        ┆ ---       ┆ ---       ┆ ---     ┆ ---     ┆ ---      ┆ ---       │
│ str   ┆ f64        ┆ f64       ┆ f64       ┆ f64     ┆ f64     ┆ f64      ┆ f64       │
╞═══════╪════════════╪═══════════╪═══════════╪═════════╪═════════╪══════════╪═══════════╡
│ 2 - 1 ┆ -2.942858  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 3 - 1 ┆ -9.042864  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 4 - 1 ┆ -9.552866  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 6 - 1 ┆ -5.642857  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 8 - 1 ┆ -10.342857 ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 3 - 2 ┆ -6.100006  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 4 - 2 ┆ -6.610009  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 6 - 2 ┆ -2.699999  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 8 - 2 ┆ -7.399999  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 4 - 3 ┆ -0.510002  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 6 - 3 ┆ 3.400007   ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 8 - 3 ┆ -1.299993  ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 6 - 4 ┆ 3.91001    ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 8 - 4 ┆ -0.78999   ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
│ 8 - 6 ┆ -4.7       ┆ NaN       ┆ NaN       ┆ NaN     ┆ NaN     ┆ NaN      ┆ NaN       │
└───────┴────────────┴───────────┴───────────┴─────────┴─────────┴──────────┴───────────┘

Columns: term, estimate, std_error, statistic, p_value, s_value, conf_low, conf_high

R and Python code for generating the above tables

library(marginaleffects)
dat = read.csv("https://github.com/vincentarelbundock/Rdatasets/raw/refs/heads/master/csv/datasets/mtcars.csv")
mod = glm(mpg ~ hp * wt * disp * cyl * qsec, data = dat)
avg_predictions(mod, by = "carb", hypothesis = "revpairwise")
import polars as pl
import statsmodels.formula.api as smf
from marginaleffects import *
mtcars_df = pl.read_csv("https://github.com/vincentarelbundock/Rdatasets/raw/refs/heads/master/csv/datasets/mtcars.csv")
mtcars_mod = smf.ols("mpg ~ hp * wt * disp * cyl * qsec", data=mtcars_df).fit()
avg_predictions(mtcars_mod, by = "carb", hypothesis='revpairwise')

@vincentarelbundock
Copy link
Owner Author

If "pairwise" produces the same results in R and Python under the same row ordering, then we can close this issue.

Thanks

vincentarelbundock pushed a commit that referenced this issue Nov 5, 2024
…irwise (#139)

* fixed row labels names for pairwise hypothesis

* improved hypothesis rowlabels and tested

* removed mean from labels and added test
@artiom-matvei
Copy link
Contributor

Since we merged #139, this issue is supposed to be solved with that merge.

vincentarelbundock added a commit that referenced this issue Nov 5, 2024
…, `pairwise`, `revreference`, `reference`, `revsequential`, `sequential` (#137)

* fixed row labels names for pairwise hypothesis

* improved hypothesis rowlabels and tested

---------

Co-authored-by: Vincent Arel-Bundock <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants