`revpairwise` = `pairwise` in R #100
Comments
R code and output:
> library(marginaleffects)
> dat = read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")
> mod = glm(body_mass_g ~ flipper_length_mm * species * bill_length_mm + island, data = dat)
> avg_predictions(m, by = "species", hypothesis = "revpairwise")
Error: object 'm' not found
> avg_predictions(mod, by = "species", hypothesis = "revpairwise")
Term               Estimate Std. Error       z Pr(>|z|)     S   2.5 %  97.5 %
Gentoo - Adelie      1375.4       40.6  33.898   <0.001 834.3  1295.8    1455
Chinstrap - Adelie     32.4       48.8   0.665    0.506   1.0   -63.2     128
Chinstrap - Gentoo  -1342.9       50.5 -26.603   <0.001 515.6 -1441.9   -1244
Type: response
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
> avg_predictions(mod, by = "species", hypothesis = "pairwise")
Term               Estimate Std. Error       z Pr(>|z|)     S   2.5 %  97.5 %
Adelie - Gentoo     -1375.4       40.6 -33.898   <0.001 834.3   -1455 -1295.8
Adelie - Chinstrap    -32.4       48.8  -0.665    0.506   1.0    -128    63.2
Gentoo - Chinstrap   1342.9       50.5  26.603   <0.001 515.6    1244  1441.9
Type: response
Columns: term, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high
Python code and output:
import polars as pl
import statsmodels.formula.api as smf
from marginaleffects import *
penguins = pl.read_csv(
    "../tests/data/penguins.csv",
    null_values="NA",
).drop_nulls()
mod = smf.ols(
    "body_mass_g ~ flipper_length_mm * species * bill_length_mm + island",
    penguins.to_pandas(),
).fit()
print(avg_predictions(mod, by = "species", hypothesis = "revpairwise"))
print(avg_predictions(mod, by = "species", hypothesis = "pairwise"))
output
|
After the correction, I get the following in Python. The row order differs because Python appears to sort the rows of the dataframe used to build the labels alphabetically, whereas R leaves them unsorted. That is what changes the order of the labels.
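To illustrate why the row order matters, here is a minimal sketch in plain Python. It assumes that a `pairwise` label is built as "earlier row - later row" and a `revpairwise` label as "later row - earlier row". That rule is inferred from the R output quoted above, not taken from the marginaleffects source, and `pairwise_labels` is a hypothetical helper rather than a library function.

```python
from itertools import combinations


def pairwise_labels(groups, reverse=False):
    """Build contrast labels from groups in their given row order.

    Assumption: "pairwise" is "earlier - later" and "revpairwise" is
    "later - earlier". This mirrors the R output in this thread; it is
    not copied from the marginaleffects source.
    """
    labels = []
    for first, second in combinations(groups, 2):
        if reverse:
            first, second = second, first
        labels.append(f"{first} - {second}")
    return labels


# Row order implied by the R labels above (rows are not sorted).
unsorted_rows = ["Adelie", "Gentoo", "Chinstrap"]
# Row order after an alphabetical sort, as described for Python.
sorted_rows = sorted(unsorted_rows)

print(pairwise_labels(unsorted_rows))
print(pairwise_labels(sorted_rows))
```

With the sorted rows, the last label becomes `Chinstrap - Gentoo` instead of `Gentoo - Chinstrap`, so the sign of that estimate flips even though the model is unchanged.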
|
@vincentarelbundock just a quick question: you mentioned writing a simple test to check row order. Do you want exactly the same terms in Python and R, in exactly the same order?
Python
R output
|
No, the row order need not be the same. But the direction of each statistical contrast must be the same, e.g. Blue - Red vs. Red - Blue. |
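One way to express this requirement in a test is to compare results by label rather than by position. The sketch below is illustrative only: `same_contrasts` is a hypothetical helper, the tolerance is an assumption, and the numbers are the R `pairwise` estimates quoted earlier in this thread.

```python
import math


def same_contrasts(left, right, abs_tol=0.05):
    """Return True if both results hold the same contrasts.

    `left` and `right` are lists of (label, estimate) pairs. Row order
    is ignored, but each label (and so each contrast direction) must
    appear in both with a matching estimate.
    """
    left_map, right_map = dict(left), dict(right)
    if left_map.keys() != right_map.keys():
        return False
    return all(
        math.isclose(left_map[k], right_map[k], abs_tol=abs_tol)
        for k in left_map
    )


r_pairwise = [
    ("Adelie - Gentoo", -1375.4),
    ("Adelie - Chinstrap", -32.4),
    ("Gentoo - Chinstrap", 1342.9),
]
# Same contrasts, different row order: acceptable.
reordered = [r_pairwise[2], r_pairwise[0], r_pairwise[1]]
# One contrast flipped (label and sign): not acceptable.
flipped = [
    ("Adelie - Gentoo", -1375.4),
    ("Adelie - Chinstrap", -32.4),
    ("Chinstrap - Gentoo", -1342.9),
]

print(same_contrasts(r_pairwise, reordered))  # True
print(same_contrasts(r_pairwise, flipped))    # False
```

Keying on the label means a flipped contrast fails the check even when its estimate is numerically just the negative of the expected one.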
The direction of each statistical contrast depends on the row order of the underlying dataframe.
Python dataframe order
Python contrasts:
R dataframe order
R contrasts:
So, to get the same contrasts in R and Python, we need to make sure both use the same row ordering of the original dataframe.
R output
Python output
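A minimal sketch of that idea, again assuming that each contrast is "earlier row minus later row": if both sides first arrange the group rows in one agreed order, the contrasts come out identical. The group means below are made-up numbers for illustration, and `pairwise_contrasts` and `reorder` are hypothetical helpers, not marginaleffects functions.

```python
from itertools import combinations


def pairwise_contrasts(rows):
    """Compute "earlier - later" differences for (group, mean) rows."""
    return [
        (f"{a} - {b}", round(mean_a - mean_b, 1))
        for (a, mean_a), (b, mean_b) in combinations(rows, 2)
    ]


def reorder(rows, group_order):
    """Arrange rows to follow an explicit, shared group order."""
    position = {group: i for i, group in enumerate(group_order)}
    return sorted(rows, key=lambda row: position[row[0]])


# Hypothetical group means; not taken from any real model.
rows_a = [("Red", 10.0), ("Blue", 7.5), ("Green", 4.0)]
rows_b = [("Blue", 7.5), ("Green", 4.0), ("Red", 10.0)]

shared_order = ["Red", "Blue", "Green"]  # agreed in advance

# Different row orders give different contrasts.
print(pairwise_contrasts(rows_a) == pairwise_contrasts(rows_b))
# After reordering to the shared order, the contrasts match.
print(
    pairwise_contrasts(reorder(rows_a, shared_order))
    == pairwise_contrasts(reorder(rows_b, shared_order))
)
```

The first comparison is false and the second is true, which is the behaviour described above: the contrasts only agree once the row order does.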
R and Python code for generating the above tables:
library(marginaleffects)
dat = read.csv("https://github.com/vincentarelbundock/Rdatasets/raw/refs/heads/master/csv/datasets/mtcars.csv")
mod = glm(mpg ~ hp * wt * disp * cyl * qsec, data = dat)
avg_predictions(mod, by = "carb", hypothesis = "revpairwise")
import polars as pl
import statsmodels.formula.api as smf
from marginaleffects import *
mtcars_df = pl.read_csv("https://github.com/vincentarelbundock/Rdatasets/raw/refs/heads/master/csv/datasets/mtcars.csv")
mtcars_mod = smf.ols("mpg ~ hp * wt * disp * cyl * qsec", data=mtcars_df).fit()
avg_predictions(mtcars_mod, by = "carb", hypothesis='revpairwise')
|
If "pairwise" produces the same results in R and Python under the same row ordering, then we can close this issue. Thanks |
Now that #139 has been merged, this issue should be resolved. |
…, `pairwise`, `revreference`, `reference`, `revsequential`, `sequential` (#137)
* fixed row labels names for pairwise hypothesis
* improved hypothesis rowlabels and tested
Co-authored-by: Vincent Arel-Bundock
@artiom-matvei could you check if `hypothesis="revpairwise"` and `hypothesis="pairwise"` give the same results in R and Python? If they don't, please:
Simple example like: