Add support for randomization inference #431

s3alfisc · 2024-05-05T18:25:35Z

This PR adds support for randomization inference via a ritest method for Feols.

apoorvalal · 2024-05-06T02:17:02Z

Hey Alex,

(possibly unsolicited) metrics advice on this PR: I think using the studentized statistic (where you calculate the t-stat as $\hat{\tau}/\sqrt{\hat{V}}$ for each draw from the permutation distribution) has better properties [in both the Fisherian and Neymanian sense] than the simple approach of constructing a randomization distribution from the point estimate alone. It shouldn't be a major change; one would presumably change _get_ritest_coefs to _get_ritest_studentized (or simply add that as an option where, instead of returning the point estimate, you return the t-stat).
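A minimal sketch of the difference between the two statistics, assuming a simple two-arm experiment analysed with a difference in means rather than pyfixest's regression machinery. The helpers `diff_in_means` and `ri_pvalues` are hypothetical names used for illustration only; they are not part of pyfixest.

```python
import numpy as np


def diff_in_means(y, d):
    """Difference in means and its unequal-variance (Neyman) standard error."""
    y1, y0 = y[d == 1], y[d == 0]
    tau = y1.mean() - y0.mean()
    se = np.sqrt(y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size)
    return tau, se


def ri_pvalues(y, d, reps=2000, seed=0):
    """Two-sided randomization p-values for the coefficient and the t-statistic."""
    rng = np.random.default_rng(seed)
    tau_obs, se_obs = diff_in_means(y, d)
    t_obs = tau_obs / se_obs
    taus = np.empty(reps)
    ts = np.empty(reps)
    for b in range(reps):
        d_perm = rng.permutation(d)  # re-draw the assignment under the sharp null
        tau_b, se_b = diff_in_means(y, d_perm)
        taus[b] = tau_b          # point estimate only
        ts[b] = tau_b / se_b     # studentized: re-estimate the variance per draw
    p_coef = np.mean(np.abs(taus) >= np.abs(tau_obs))
    p_tstat = np.mean(np.abs(ts) >= np.abs(t_obs))
    return p_coef, p_tstat


# Simulated data with heteroskedastic noise (assumed design, for illustration)
rng = np.random.default_rng(42)
d = rng.permutation(np.repeat([0, 1], 100))
y = 0.5 * d + rng.normal(scale=1 + d, size=200)
p_coef, p_tstat = ri_pvalues(y, d)
print(f"coefficient-based p-value: {p_coef:.3f}, t-based p-value: {p_tstat:.3f}")
```

The only difference between the two variants is whether the variance estimate is recomputed inside the loop, which is why the change to the PR should be small.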

Ref: chap 7 of Peng Ding's book [based on this 2021 paper]

s3alfisc · 2024-05-06T16:18:11Z

It's very much appreciated! The more feedback the better =) I actually started out with a t-percentile implementation, but then checked out Grant's version, which worked on the fitted betas. I'll switch back to the t-stats as the default and will make it possible to run tests on both t-stats and coefficients. I vaguely recall that Alwyn Young recommends using the betas and not the t-stats for the IV wild bootstrap - you haven't happened to see a similar result for RI? Also, great that you pointed me to Ding's book, I've been looking for a good write-up on RI =) Thanks!

s3alfisc · 2024-05-11T09:26:14Z

I have now implemented two algos - one is "fast" and the other is "slow". Both so far only work for iid sampling.

The "slow" one simply loops over calls of "feols" or "fepois" and hence works for OLS, IV and Poisson regression. You can choose to run different variants, the randomization-c and randomization-t following naming conventions introduced by Young.

The "fast" algorithm only works for OLS and the "randomization-c" at the moment. It's vectorized and employs the FWL theorem; going forward, some speed-ups should be possible by JIT compiling it via numba. Users can choose "how much" they want to vectorize (as creating an N x reps matrix can be costly if either N or reps is large). To support the "randomization-t", I will have to slightly rework the functions implemented in the vcov method / make them more "generically available".
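The idea behind the "fast" algorithm can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: it uses a dense control matrix `X` instead of pyfixest's fixed-effects demeaning, and `partial_out`, `ri_coefs_fwl` and `chunk_size` are hypothetical names.

```python
import numpy as np


def partial_out(X, A):
    """Residuals from regressing A (a vector or the columns of a matrix) on X."""
    return A - X @ np.linalg.lstsq(X, A, rcond=None)[0]


def ri_coefs_fwl(y, d, X, reps=1000, chunk_size=250, seed=0):
    """Randomization-c coefficients on d, computed in vectorized chunks via FWL."""
    rng = np.random.default_rng(seed)
    y_res = partial_out(X, y)  # the outcome is residualized only once
    coefs = np.empty(reps)
    for start in range(0, reps, chunk_size):
        n_draws = min(chunk_size, reps - start)
        # N x n_draws matrix; every column is an independent permutation of d
        D = rng.permuted(np.tile(d[:, None], (1, n_draws)), axis=0)
        D_res = partial_out(X, D)
        # FWL: each coefficient is <D_res, y_res> / <D_res, D_res>, column by column
        coefs[start:start + n_draws] = (D_res * y_res[:, None]).sum(axis=0) / (
            D_res**2
        ).sum(axis=0)
    return coefs


# Small simulated example (assumed data-generating process)
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
d = rng.integers(0, 2, size=n).astype(float)
y = 1.0 + 0.3 * d + X[:, 1] + rng.normal(size=n)

beta_obs = (partial_out(X, d) @ partial_out(X, y)) / (
    partial_out(X, d) @ partial_out(X, d)
)
ri_coefs = ri_coefs_fwl(y, d, X, reps=2000, chunk_size=500)
p_value = np.mean(np.abs(ri_coefs) >= np.abs(beta_obs))
print(f"observed beta: {beta_obs:.3f}, randomization-c p-value: {p_value:.3f}")
```

A smaller `chunk_size` lowers peak memory at the cost of more Python-level loop iterations, which is the trade-off described above.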

Here's a code example:

%load_ext autoreload
%autoreload 2

import pyfixest as pf
import numpy as np
data = pf.get_data(N = 10_000)

fml = "Y ~ X1*X2*f2 |f1 + f3"

fit = pf.feols(fml, data=data)
fit.tidy().head()

rng = np.random.default_rng(1234)
fit.ritest(
    resampvar="X2",
    reps = 10_000,
    rng = rng,
    type = "randomization-c",
    choose_algorithm = "fast",
    algo_iterations = 1000,  # number of for loops; each loop draws reps / algo_iterations permutations
    include_plot= True
)

To Do's:

defensive programming, type checks, assertions, etc
tests
cluster sampling
stratified sampling
CIs by test inversion
non-standard null hypotheses, one sided testing, etc

Overall, more work than I expected!

s3alfisc · 2024-05-11T09:35:10Z

Hi @apoorvalal - one question on the "randomization-t" variant: Which defaults should I set for the computation of the vcov?

Should I default to the vcov type set in the "feols" call? Then it could in principle happen that ritest computes iid inference even under cluster random assignment - which would not be in the spirit of Athey et al.

Here's an example:

fit = pf.feols("Y ~ X1 | f1", data = data, vcov = "iid")   # iid ses
fit.ritest(resampvar = "X1", cluster = "f1")               # cluster random assignment; ses should be CRV1-f1

Under the proposed solution, the vcov matrix in each RI-iteration would be computed as iid, despite the cluster random assignment. With this solution, I should at least add a warning message?

Alternatively, I could default to computing CRV variance matrices on the level of cluster assignment and overwrite the vcov type of the feols call.

Do you have any thoughts on this? I hope you could follow =D
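To make "cluster random assignment" concrete, here is a minimal sketch of the resampling step it implies: the treatment is shuffled across clusters, and every unit inherits its cluster's draw. It assumes the treatment is constant within clusters; `permute_cluster_assignment` is a hypothetical helper, not the PR's implementation.

```python
import numpy as np


def permute_cluster_assignment(d, cluster, rng):
    """Shuffle a cluster-constant treatment across clusters, then expand to units."""
    _, first_idx, inverse = np.unique(cluster, return_index=True, return_inverse=True)
    d_cluster = d[first_idx]  # one treatment value per cluster
    if not np.array_equal(d_cluster[inverse], d):
        raise ValueError("treatment must be constant within clusters")
    # Permute at the cluster level and map the result back to the unit level
    return rng.permutation(d_cluster)[inverse]


rng = np.random.default_rng(1234)
cluster = np.repeat(np.arange(6), 4)      # 6 clusters with 4 units each
d = np.repeat([1, 0, 1, 0, 0, 1], 4)      # treatment assigned by cluster
d_perm = permute_cluster_assignment(d, cluster, rng)
print(d_perm.reshape(6, 4))               # each row (cluster) is all 0 or all 1
```

Because the permuted treatment varies only at the cluster level, a studentized statistic computed from iid standard errors would ignore the assignment mechanism, which is the concern raised above.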

s3alfisc · 2024-05-11T20:45:31Z

TODO:

stratified sampling
CIs by test inversion
non-standard null hypotheses, one sided testing, etc
type hints, docstring everywhere

apoorvalal · 2024-05-11T22:16:09Z

Hi Alex,
That's an excellent question; I'm not sure I know the answer off the top of my head. I understand the behaviour of randomization-t in the pure randomized trial setting with no noncompliance, but RI is much less clear to me in settings with noncompliance [Young's paper doesn't motivate it from potential outcomes, so I don't really know how to reconcile it with Abadie et al and/or Ding's papers/book].

Aronow, Chang, and Lopatto put out an interesting looking paper a couple of weeks ago that might be worth looking at as well.

s3alfisc · 2024-05-12T19:49:40Z

Thanks Apoorva - for now, I have deleted the vcov arg to ritest() and now compute the vcov by default as iid if there is individual-level sampling and no controls, HC1 if there is individual-level sampling and controls, and CRV1 for cluster sampling. I think that's a sensible choice, hopefully you agree? 😅
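One reading of this default rule, written as a small helper. The function name `default_ritest_vcov` and its arguments are hypothetical, and the return values simply follow the string / `{"CRV1": ...}` conventions that pyfixest's `vcov` argument accepts; the merged code may be organised differently.

```python
from typing import Optional, Union


def default_ritest_vcov(
    cluster: Optional[str], has_controls: bool
) -> Union[str, dict]:
    """Pick the vcov type used inside each randomization-t iteration."""
    if cluster is not None:
        # Cluster random assignment: cluster-robust variance at the assignment level
        return {"CRV1": cluster}
    # Individual-level assignment: heteroskedasticity-robust only when controls are present
    return "HC1" if has_controls else "iid"


print(default_ritest_vcov(cluster=None, has_controls=False))
print(default_ritest_vcov(cluster=None, has_controls=True))
print(default_ritest_vcov(cluster="f1", has_controls=True))
```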

s3alfisc · 2024-05-20T19:26:41Z

Open to-do's:

stratified sampling
support for IV
SEs, confidence intervals
tests for hypotheses other than "beta = 0" vs "beta <> 0"
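For the "SEs, confidence intervals" item, one common approach is to treat the simulated p-value as a binomial proportion over the `reps` draws. The sketch below uses a normal approximation and also shows how one-sided alternatives change the comparison; `ri_pvalue_summary` is a hypothetical helper and not necessarily what ritest() ends up reporting.

```python
import numpy as np


def ri_pvalue_summary(stats, observed, alternative="two-sided", z=1.96):
    """Randomization p-value with a Monte Carlo standard error and normal-approx CI."""
    stats = np.asarray(stats, dtype=float)
    if alternative == "two-sided":
        exceed = np.abs(stats) >= np.abs(observed)
    elif alternative == "greater":
        exceed = stats >= observed
    elif alternative == "less":
        exceed = stats <= observed
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    p = exceed.mean()
    se = np.sqrt(p * (1 - p) / stats.size)  # binomial standard error of the proportion
    ci = (max(p - z * se, 0.0), min(p + z * se, 1.0))
    return p, se, ci


rng = np.random.default_rng(0)
draws = rng.normal(size=5000)  # stand-in for simulated randomization statistics
p, se, ci = ri_pvalue_summary(draws, observed=1.5)
print(f"p = {p:.4f}, se = {se:.4f}, ci = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The standard error shrinks at rate 1/sqrt(reps), which gives a rough guide for how many repetitions are needed near a decision threshold.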

s3alfisc added 6 commits May 1, 2024 20:00

delete deprecated files

95a2c96

Merge branch 'master' of https://github.com/s3alfisc/pyfixest into master

0179985

Merge branch 'master' of https://github.com/s3alfisc/pyfixest into master

d148db8

Merge branch 'master' of https://github.com/s3alfisc/pyfixest into master

76e66bd

Merge branch 'master' of https://github.com/s3alfisc/pyfixest into master

6a454bf

initial commit

bff2ed6

s3alfisc linked an issue May 5, 2024 that may be closed by this pull request

Support for Randomization Inference #427

Closed

s3alfisc added 2 commits May 11, 2024 10:50

implement two algos for ri for iid sampling

df47cb0

add plotting function

0785b6a

s3alfisc added 6 commits May 11, 2024 15:52

add internal tests

97747f3

add support for cluster random sampling

e871e90

delete confint function

92bbd08

update pvalues

f2818d8

some reorg, NA in cluster tests

c36f197

randomization-c as default for now

6f41b6d

s3alfisc added 2 commits May 12, 2024 11:33

change defaults for randomization inference

bad646c

add slow tests against r-ritest and move fast algo to numba

14b468f

s3alfisc added 6 commits May 12, 2024 21:54

skip long test against ritest on CI

9e42149

tweak

8ffda46

update test setup for ritest - run r code locally, not on ci

05d07c5

pass ritest checks

414c7d9

Merge branch 'master' into ritest

b125a47

update tests

a3864d6

s3alfisc added 4 commits May 20, 2024 14:25

update lock file

58d3a61

Merge branch 'ritest' of https://github.com/s3alfisc/pyfixest into ritest

95c664d

no ritest for IV for now

0529dd3

fix np.linalg.lstsq future proof warning

b8836d4

s3alfisc added 7 commits May 20, 2024 21:36

add test for fepois

a966f92

fix small test bug

48ce7bf

cleanup ritest tests

0c07467

report se, ci of the pvalue

2b5ae41

update readme

af2bf4e

add one sided tests

1632260

block randomizaton-t, pass all tests

732162b

s3alfisc mentioned this pull request May 22, 2024

Support for Randomization Inference #427

Closed

s3alfisc added 2 commits May 22, 2024 22:58

update readme

ead91b9

update feols docs

f54ae2c

s3alfisc added 9 commits May 23, 2024 22:30

some small code reorg

0a37b5a

fix bug in slow algo

3483aed

add plot_ritest method and matplotlib support

77018f5

Merge branch 'master' into ritest

32362d4

quickstart tweaks2

91f357c

unlock randomization t plus some tests'

97e845e

fix documentation bug

e0eb7a2

update to quickstart about joint cis

10cba8d

fix one more documentation bug

4f86ce6

s3alfisc merged commit 2e02e0b into master May 25, 2024
7 checks passed

s3alfisc deleted the ritest branch May 25, 2024 15:08
