Call for notebooks demonstrating how to handle missing data #461

drbenvincent · 2022-11-12T10:02:08Z

While we do have a lot of example notebooks, we have a distinct lack of examples covering how to deal with missing data. There is also no missing data tag.

The only ones I can think of are the notebooks on censored and truncated data, which are a form of missing data.

So this is a kind of meta-issue. We would be very grateful to anyone who would like to contribute notebooks demonstrating how to handle missing data. Feel free to create specific notebook proposal issues, referencing this issue.

NathanielF · 2022-11-21T19:56:16Z

I think this is an interesting issue, but not one I know a tonne about... I've heard good things about "Applied Missing Data Analysis" by Craig Enders though. Might be able to look into this a bit more after I finish out the Bayesian VAR model thing.

NathanielF · 2022-12-14T11:14:54Z

Ok, I've ordered the Enders book - arriving on Friday. I will look into this topic in a bit more detail over Christmas and report back in January if I think I can add anything of interest.

NathanielF · 2023-01-08T13:13:23Z

Ok, I think this is definitely something I want to pursue. Think there is a really nice example of workplace empowerment estimation I want to work through.... Will outline a full proposal after I've finished the reliability and prediction pull request if that's alright?

NathanielF · 2023-01-16T10:57:39Z

Started some work on this and was able to get FIML (full information maximum likelihood) and Bayesian imputation working for the multivariate normal. For the Bayesian MV imputation, though, I had to use a Potential rather than a likelihood, as per the discussion here: https://discourse.pymc.io/t/automatic-imputation-of-multivariate-models/11029/3
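For anyone following along, here is a minimal sketch of the idea behind FIML for a multivariate normal. It is not the notebook's code, just an illustration in plain NumPy: each row contributes the log-density of the marginal normal over whichever entries it actually has. The function name `fiml_loglik`, the NaN coding of gaps, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def fiml_loglik(data, mu, cov):
    """Observed-data log-likelihood of a multivariate normal.

    Missing entries are assumed to be coded as NaN (an assumption of this sketch).
    """
    data = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)
        k = int(obs.sum())
        if k == 0:
            continue  # a fully missing row carries no information
        resid = row[obs] - mu[obs]
        sub_cov = cov[np.ix_(obs, obs)]  # marginal covariance of the observed entries
        _, logdet = np.linalg.slogdet(sub_cov)
        maha = resid @ np.linalg.solve(sub_cov, resid)
        total += -0.5 * (k * np.log(2 * np.pi) + logdet + maha)
    return total


# Simulated example: knock out roughly 20% of the entries at random.
rng = np.random.default_rng(0)
true_mu = np.array([0.0, 1.0, -1.0])
true_cov = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]])
complete = rng.multivariate_normal(true_mu, true_cov, size=200)
mask = rng.random(complete.shape) < 0.2
incomplete = np.where(mask, np.nan, complete)
ll = fiml_loglik(incomplete, true_mu, true_cov)
```

Maximising this quantity over `mu` and `cov` (for example with a generic optimiser) would give the FIML estimates; the sketch only evaluates the objective.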

I'm also going to try the chained equation imputation approach, which shouldn't need this workaround.
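As a rough picture of what chained equation imputation does, here is a simplified sketch. It assumes plain least-squares regressions and NaN-coded gaps, and it is not the approach implemented in the notebook. Each incomplete column is regressed in turn on the other columns, and its gaps are refilled with the prediction plus noise. The function name `chained_imputation` and the toy data are hypothetical.

```python
import numpy as np


def chained_imputation(data, n_iter=10, seed=0):
    """Simplified chained-equation imputation with linear regressions.

    Assumes NaN marks a missing entry and every column has at least one observed value.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    # Start from column means, then refine one column at a time.
    filled = np.where(missing, np.nanmean(data, axis=0), data)
    n_rows, n_cols = data.shape
    for _ in range(n_iter):
        for j in range(n_cols):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(n_rows), others])
            # Fit on rows where column j was observed.
            coef, *_ = np.linalg.lstsq(design[~miss_j], filled[~miss_j, j], rcond=None)
            resid = filled[~miss_j, j] - design[~miss_j] @ coef
            sigma = resid.std()
            # Refill the gaps with the prediction plus residual-scale noise.
            noise = rng.normal(0.0, sigma, miss_j.sum())
            filled[miss_j, j] = design[miss_j] @ coef + noise
    return filled


# Toy data: y depends linearly on x, and the first ten y values are missing.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2 * x + rng.normal(scale=0.1, size=100)
data = np.column_stack([x, y])
data[:10, 1] = np.nan
completed = chained_imputation(data)
```

Running this several times with different seeds would give multiple imputed datasets, which is how the uncertainty in the imputed values is usually carried forward.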

juanitorduz · 2023-01-17T09:34:58Z

Cool! By the way, have you seen this video https://www.youtube.com/watch?v=nJ3XefApED0 ?

NathanielF · 2023-01-17T12:40:10Z

About 1/3 of the way through that video.

reshamas · 2023-01-17T14:34:34Z

@NathanielF

That video (https://www.youtube.com/watch?v=nJ3XefApED0) needs timestamps, in case you are interested. More info here:
pymc-devs/video-timestamps#11

NathanielF · 2023-01-20T23:13:35Z

Thanks @reshamas, will have a look tomorrow.

NathanielF · 2023-01-25T17:39:14Z

I think this is close to done. Really impressed by those jax samplers!! The speed is so much better!

NathanielF · 2023-02-03T09:42:46Z

Woop!! Thanks so much @drbenvincent. This one was a real fun one!!

drbenvincent added the labels help wanted (Extra attention is needed) and proposal (New notebook proposal still up for discussion) Nov 12, 2022

drbenvincent changed the title ~~Call notebooks demonstrating how to handle missing data~~ Call for notebooks demonstrating how to handle missing data Nov 12, 2022

NathanielF added commits to NathanielF/pymc-examples that referenced this issue (each signed off by Nathaniel):

Jan 16, 2023 · c477052 · [Missing Data pymc-devs#461] First commit for missing data working out FIML
Jan 16, 2023 · b2f829d · [Missing Data pymc-devs#461] Added Sensitivity Plots
Jan 16, 2023 · 0de8f71 · [Missing Data pymc-devs#461] Added Bayesian model fit
Jan 16, 2023 · c8fb977 · [Missing Data pymc-devs#461] more testing
Jan 20, 2023 · 5f45c5c · [Missing Data pymc-devs#461] added chained equation example
Jan 22, 2023 · 8f8e8d5 · [Missing Data pymc-devs#461] added myst and updated write up
Jan 22, 2023 · 230bcb5 · [Missing Data pymc-devs#461] fixed some typos
Jan 23, 2023 · d23c106 · [Missing Data pymc-devs#461] added hierarchical imputation
Jan 23, 2023 · 14b25a5 · [Missing Data pymc-devs#461] used blackjax sampler and converged
Jan 24, 2023 · a582711 · [Missing Data pymc-devs#461] updated with feedback
Jan 24, 2023 · f19a62e · [Missing Data pymc-devs#461] nicer team impact plot with title
Feb 1, 2023 · 6754181 · [Missing Data pymc-devs#461] updated with Ben's comments
Feb 1, 2023 · f12fed3 · [Missing Data pymc-devs#461] trying to fix sphinx cross ref
Feb 1, 2023 · 8ab5b4c · [Missing Data pymc-devs#461] updated to link to truncated and censored regression notebook
Feb 1, 2023 · 832a1e6 · [Missing Data pymc-devs#461] fixed minor typo
Feb 1, 2023 · 81d64c9 · [Missing Data pymc-devs#461] changed authored by date
Feb 1, 2023 · 58feba8 · [Missing Data pymc-devs#461] added more explanatory text on why we plot by team
Feb 2, 2023 · 6eea76e · [Missing Data pymc-devs#461] updated data load method and added some text
Feb 3, 2023 · 411b2be · [Missing Data pymc-devs#461] removed extra # comments
Feb 3, 2023 · 8bf64f2 · [Missing Data pymc-devs#461] removed another ## comment
Feb 3, 2023 · ed2d11a · [Missing Data pymc-devs#461] removed final ## comment

NathanielF mentioned this issue Jan 16, 2023 in Missing data and Bayesian Imputation #500 (merged, 3 tasks)

NathanielF closed this as completed Mar 23, 2023
