rowidcf correction #46

Merged on Nov 23, 2023 (2 commits)

Conversation

LamAdr
Contributor

@LamAdr LamAdr commented Nov 22, 2023

I believe this was a bug. In R, rowidcf is constant for each individual across counterfactuals. This was not the case in Python.

R :

mod <- glm(vs ~ hp + am, data = mtcars, family = binomial)
dg <- datagridcf(model=mod, am = 0:1, hp = c(100, 110, 120))
print(dg)

output :

    rowidcf vs am  hp
1         1  0  0 100
2         2  0  0 100
3         3  1  0 100
4         4  1  0 100
5         5  0  0 100
6         6  1  0 100
7         7  0  0 100
8         8  1  0 100
9         9  1  0 100
10       10  1  0 100
11       11  1  0 100
12       12  0  0 100
13       13  0  0 100
14       14  0  0 100
15       15  0  0 100
16       16  0  0 100
17       17  0  0 100
18       18  1  0 100
19       19  1  0 100
20       20  1  0 100
21       21  1  0 100
22       22  0  0 100
23       23  0  0 100
24       24  0  0 100
25       25  0  0 100
26       26  1  0 100
27       27  0  0 100
28       28  1  0 100
29       29  0  0 100
30       30  0  0 100
31       31  0  0 100
32       32  1  0 100
33        1  0  0 110
	...
160      32  1  1 110
161       1  0  1 120
162       2  0  1 120
163       3  1  1 120
164       4  1  1 120
165       5  0  1 120
166       6  1  1 120
167       7  0  1 120
168       8  1  1 120
169       9  1  1 120
170      10  1  1 120
171      11  1  1 120
172      12  0  1 120
173      13  0  1 120
174      14  0  1 120
175      15  0  1 120
176      16  0  1 120
177      17  0  1 120
178      18  1  1 120
179      19  1  1 120
180      20  1  1 120
181      21  1  1 120
182      22  0  1 120
183      23  0  1 120
184      24  0  1 120
185      25  0  1 120
186      26  1  1 120
187      27  0  1 120
188      28  1  1 120
189      29  0  1 120
190      30  0  1 120
191      31  0  1 120
192      32  1  1 120

python :

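import polars as pl
import statsmodels.api as sm
import statsmodels.formula.api as smf
from marginaleffects import datagridcf  # assumed import path
dat = dat.with_columns(
	pl.col("am").cast(pl.Boolean)
)
mod = smf.glm("vs ~ hp + am", data = dat, family = sm.families.Binomial()).fit()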

dg = datagridcf(mod, am = [0, 1], hp = [100, 110, 120])
print(dg.sort(['am_right', 'hp_right'])[['rownames', 'vs', 'am_right', 'hp_right', 'rowidcf']])

output (previously) :

┌───────────────────┬─────┬──────────┬──────────┬─────────┐
│ rownames          ┆ vs  ┆ am_right ┆ hp_right ┆ rowidcf │
│ ---               ┆ --- ┆ ---      ┆ ---      ┆ ---     │
│ str               ┆ i64 ┆ i64      ┆ i64      ┆ i64     │
╞═══════════════════╪═════╪══════════╪══════════╪═════════╡
│ Mazda RX4         ┆ 0   ┆ 0        ┆ 100      ┆ 0       │
│ Mazda RX4 Wag     ┆ 0   ┆ 0        ┆ 100      ┆ 6       │
│ Datsun 710        ┆ 1   ┆ 0        ┆ 100      ┆ 12      │
│ Hornet 4 Drive    ┆ 1   ┆ 0        ┆ 100      ┆ 18      │
│ Hornet Sportabout ┆ 0   ┆ 0        ┆ 100      ┆ 24      │
│ Valiant           ┆ 1   ┆ 0        ┆ 100      ┆ 30      │
│ Duster 360        ┆ 0   ┆ 0        ┆ 100      ┆ 36      │
│ Merc 240D         ┆ 1   ┆ 0        ┆ 100      ┆ 42      │
│ Merc 230          ┆ 1   ┆ 0        ┆ 100      ┆ 48      │
│ Merc 280          ┆ 1   ┆ 0        ┆ 100      ┆ 54      │
│ Merc 280C         ┆ 1   ┆ 0        ┆ 100      ┆ 60      │
│ Merc 450SE        ┆ 0   ┆ 0        ┆ 100      ┆ 66      │
│ …                 ┆ …   ┆ …        ┆ …        ┆ …       │
│ Toyota Corona     ┆ 1   ┆ 1        ┆ 120      ┆ 125     │
│ Dodge Challenger  ┆ 0   ┆ 1        ┆ 120      ┆ 131     │
│ AMC Javelin       ┆ 0   ┆ 1        ┆ 120      ┆ 137     │
│ Camaro Z28        ┆ 0   ┆ 1        ┆ 120      ┆ 143     │
│ Pontiac Firebird  ┆ 0   ┆ 1        ┆ 120      ┆ 149     │
│ Fiat X1-9         ┆ 1   ┆ 1        ┆ 120      ┆ 155     │
│ Porsche 914-2     ┆ 0   ┆ 1        ┆ 120      ┆ 161     │
│ Lotus Europa      ┆ 1   ┆ 1        ┆ 120      ┆ 167     │
│ Ford Pantera L    ┆ 0   ┆ 1        ┆ 120      ┆ 173     │
│ Ferrari Dino      ┆ 0   ┆ 1        ┆ 120      ┆ 179     │
│ Maserati Bora     ┆ 0   ┆ 1        ┆ 120      ┆ 185     │
│ Volvo 142E        ┆ 1   ┆ 1        ┆ 120      ┆ 191     │
└───────────────────┴─────┴──────────┴──────────┴─────────┘

output (new) :

shape: (192, 5)
┌───────────────────┬─────┬──────────┬──────────┬─────────┐
│ rownames          ┆ vs  ┆ am_right ┆ hp_right ┆ rowidcf │
│ ---               ┆ --- ┆ ---      ┆ ---      ┆ ---     │
│ str               ┆ i64 ┆ i64      ┆ i64      ┆ i64     │
╞═══════════════════╪═════╪══════════╪══════════╪═════════╡
│ Mazda RX4         ┆ 0   ┆ 0        ┆ 100      ┆ 0       │
│ Mazda RX4 Wag     ┆ 0   ┆ 0        ┆ 100      ┆ 1       │
│ Datsun 710        ┆ 1   ┆ 0        ┆ 100      ┆ 2       │
│ Hornet 4 Drive    ┆ 1   ┆ 0        ┆ 100      ┆ 3       │
│ Hornet Sportabout ┆ 0   ┆ 0        ┆ 100      ┆ 4       │
│ Valiant           ┆ 1   ┆ 0        ┆ 100      ┆ 5       │
│ Duster 360        ┆ 0   ┆ 0        ┆ 100      ┆ 6       │
│ Merc 240D         ┆ 1   ┆ 0        ┆ 100      ┆ 7       │
│ Merc 230          ┆ 1   ┆ 0        ┆ 100      ┆ 8       │
│ Merc 280          ┆ 1   ┆ 0        ┆ 100      ┆ 9       │
│ Merc 280C         ┆ 1   ┆ 0        ┆ 100      ┆ 10      │
│ Merc 450SE        ┆ 0   ┆ 0        ┆ 100      ┆ 11      │
│ …                 ┆ …   ┆ …        ┆ …        ┆ …       │
│ Toyota Corona     ┆ 1   ┆ 1        ┆ 120      ┆ 20      │
│ Dodge Challenger  ┆ 0   ┆ 1        ┆ 120      ┆ 21      │
│ AMC Javelin       ┆ 0   ┆ 1        ┆ 120      ┆ 22      │
│ Camaro Z28        ┆ 0   ┆ 1        ┆ 120      ┆ 23      │
│ Pontiac Firebird  ┆ 0   ┆ 1        ┆ 120      ┆ 24      │
│ Fiat X1-9         ┆ 1   ┆ 1        ┆ 120      ┆ 25      │
│ Porsche 914-2     ┆ 0   ┆ 1        ┆ 120      ┆ 26      │
│ Lotus Europa      ┆ 1   ┆ 1        ┆ 120      ┆ 27      │
│ Ford Pantera L    ┆ 0   ┆ 1        ┆ 120      ┆ 28      │
│ Ferrari Dino      ┆ 0   ┆ 1        ┆ 120      ┆ 29      │
│ Maserati Bora     ┆ 0   ┆ 1        ┆ 120      ┆ 30      │
│ Volvo 142E        ┆ 1   ┆ 1        ┆ 120      ┆ 31      │
└───────────────────┴─────┴──────────┴──────────┴─────────┘

Alternatively, without sorting:
print(dg[['rownames', 'vs', 'am_right', 'hp_right', 'rowidcf']])

output (new) :

shape: (192, 5)
┌───────────────┬─────┬──────────┬──────────┬─────────┐
│ rownames      ┆ vs  ┆ am_right ┆ hp_right ┆ rowidcf │
│ ---           ┆ --- ┆ ---      ┆ ---      ┆ ---     │
│ str           ┆ i64 ┆ i64      ┆ i64      ┆ i64     │
╞═══════════════╪═════╪══════════╪══════════╪═════════╡
│ Mazda RX4     ┆ 0   ┆ 0        ┆ 100      ┆ 0       │
│ Mazda RX4     ┆ 0   ┆ 0        ┆ 110      ┆ 0       │
│ Mazda RX4     ┆ 0   ┆ 0        ┆ 120      ┆ 0       │
│ Mazda RX4     ┆ 0   ┆ 1        ┆ 100      ┆ 0       │
│ Mazda RX4     ┆ 0   ┆ 1        ┆ 110      ┆ 0       │
│ Mazda RX4     ┆ 0   ┆ 1        ┆ 120      ┆ 0       │
│ Mazda RX4 Wag ┆ 0   ┆ 0        ┆ 100      ┆ 1       │
│ Mazda RX4 Wag ┆ 0   ┆ 0        ┆ 110      ┆ 1       │
│ Mazda RX4 Wag ┆ 0   ┆ 0        ┆ 120      ┆ 1       │
│ Mazda RX4 Wag ┆ 0   ┆ 1        ┆ 100      ┆ 1       │
│ Mazda RX4 Wag ┆ 0   ┆ 1        ┆ 110      ┆ 1       │
│ Mazda RX4 Wag ┆ 0   ┆ 1        ┆ 120      ┆ 1       │
│ …             ┆ …   ┆ …        ┆ …        ┆ …       │
│ Maserati Bora ┆ 0   ┆ 0        ┆ 100      ┆ 30      │
│ Maserati Bora ┆ 0   ┆ 0        ┆ 110      ┆ 30      │
│ Maserati Bora ┆ 0   ┆ 0        ┆ 120      ┆ 30      │
│ Maserati Bora ┆ 0   ┆ 1        ┆ 100      ┆ 30      │
│ Maserati Bora ┆ 0   ┆ 1        ┆ 110      ┆ 30      │
│ Maserati Bora ┆ 0   ┆ 1        ┆ 120      ┆ 30      │
│ Volvo 142E    ┆ 1   ┆ 0        ┆ 100      ┆ 31      │
│ Volvo 142E    ┆ 1   ┆ 0        ┆ 110      ┆ 31      │
│ Volvo 142E    ┆ 1   ┆ 0        ┆ 120      ┆ 31      │
│ Volvo 142E    ┆ 1   ┆ 1        ┆ 100      ┆ 31      │
│ Volvo 142E    ┆ 1   ┆ 1        ┆ 110      ┆ 31      │
│ Volvo 142E    ┆ 1   ┆ 1        ┆ 120      ┆ 31      │
└───────────────┴─────┴──────────┴──────────┴─────────┘
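The behaviour shown in the new output can be sketched in a few lines of plain Python. This is not the library's implementation: `counterfactual_grid` and its `start` parameter are hypothetical names used for illustration only, and the three rows are a small excerpt of mtcars. The point is that the id is assigned to each observed row before the rows are expanded, so every counterfactual copy of an individual shares the same rowidcf.

```python
from itertools import product


def counterfactual_grid(rows, start=0, **values):
    """Repeat every row once per combination of the supplied values.

    Hypothetical helper: `rowidcf` is assigned per observed row *before*
    the expansion, so it stays constant across all copies of that row.
    """
    names = list(values)
    grid = []
    for rowidcf, row in enumerate(rows, start=start):
        for combo in product(*(values[name] for name in names)):
            new_row = dict(row)                # copy the observed row
            new_row.update(zip(names, combo))  # overwrite with counterfactual values
            new_row["rowidcf"] = rowidcf       # same id for every copy
            grid.append(new_row)
    return grid


# Illustrative excerpt of mtcars (three rows only).
cars = [
    {"rownames": "Mazda RX4", "vs": 0, "am": 1, "hp": 110},
    {"rownames": "Mazda RX4 Wag", "vs": 0, "am": 1, "hp": 110},
    {"rownames": "Datsun 710", "vs": 1, "am": 1, "hp": 93},
]
grid = counterfactual_grid(cars, am=[0, 1], hp=[100, 110, 120])
```

Passing `start=1` to this sketch would give ids that begin at 1, as in the R output.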

Is there a reason why rowidcf starts at 1 in R? Do we want to keep that?

@vincentarelbundock
Owner

Thanks!

Can you add a test for the number of rows?

@LamAdr
Contributor Author

LamAdr commented Nov 23, 2023

@vincentarelbundock something like that?
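The test that was actually added is not reproduced in this thread. As a rough sketch of what a row-count check could look like, assuming only the list of rowidcf values from the grid is available, the hypothetical helper below verifies that the grid has one row per individual per combination of values, and that each id occurs exactly once per combination.

```python
from collections import Counter
from math import prod


def check_counterfactual_grid(grid_ids, n_rows, **values):
    """Hypothetical check on the rowidcf column of a counterfactual grid."""
    expected_copies = prod(len(v) for v in values.values())
    # Total number of rows: individuals times combinations of values.
    assert len(grid_ids) == n_rows * expected_copies
    counts = Counter(grid_ids)
    # One distinct id per individual, each repeated once per combination.
    assert len(counts) == n_rows
    assert set(counts.values()) == {expected_copies}
    return expected_copies


# Ids laid out as in the new output: 0 repeated six times, then 1, and so on.
fixed_ids = [i for i in range(32) for _ in range(6)]
copies = check_counterfactual_grid(fixed_ids, 32, am=[0, 1], hp=[100, 110, 120])
```

The previous output, where rowidcf ran from 0 to 191, has the right number of rows but fails the second assertion because it contains 192 distinct ids.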

@vincentarelbundock
Owner

Looks great, thanks!

@vincentarelbundock vincentarelbundock merged commit ad449a6 into vincentarelbundock:main Nov 23, 2023
1 of 2 checks passed
@LamAdr LamAdr deleted the rowidcf branch November 23, 2023 15:09