
Model specification #78

Closed
adibender opened this issue Dec 18, 2018 · 3 comments

Comments

@adibender
Owner

@fabian-s I think it's time to think more generally about how we call the functions for data transformation.

  • Right now we use Surv(time, status) ~ ... + ... | cumulative(...),
    although the Surv(time, status) part is only a mirage: we don't do anything Surv-specific with it except extract the event time and status variables. The usual functionality of Surv() is not supported, e.g.

      • Surv(time, status == 2) ~ ... (see also "More robust split_data function", #31), or

      • Surv(time1, time2, status) ~ ... e.g. for left-truncated data,

      • etc.

This will also be relevant when/if we extend the functionality to competing risks/multistate models, in which case we need to support calls like

as_ped(Surv(time, event1) | Surv(time, event2) ~ lin_pred_event1 | lin_pred_event2)

This could be done nicely using the Formula functionality, but we already use | on the RHS to differentiate between cumulative effects and "normal" effects ~ ... + ... | cumulative().
The latter may not be necessary, as we could simply pick out the cumulative(...) terms via the specials argument of terms()?
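
A minimal base R sketch of the specials idea, to make the suggestion concrete. It assumes the cumulative terms are written as plain cumulative(...) calls on the RHS; the covariate names (x1, x2, z, tz_var) are made up for illustration and this is not how the package currently parses its formulas. terms() does not evaluate the calls, so neither Surv() nor cumulative() needs to exist here.

```r
# Illustrative formula: two ordinary terms and one cumulative term,
# all on a single RHS (no `|` needed to mark the cumulative part).
f <- Surv(time, status) ~ x1 + s(x2) + cumulative(z, tz_var = "tz")

# Ask terms() to flag every call to `cumulative` as a "special".
tt <- terms(f, specials = "cumulative")

# Position(s) of the special(s) among the model variables
# (the response counts as variable 1).
idx <- attr(tt, "specials")$cumulative

# The factors matrix has one row per variable and one column per term;
# a term is cumulative if it involves one of the flagged variables.
fac     <- attr(tt, "factors")
is_cumu <- colSums(fac[idx, , drop = FALSE]) > 0

labels         <- attr(tt, "term.labels")
cumu_terms     <- labels[is_cumu]
ordinary_terms <- labels[!is_cumu]

print(cumu_terms)
print(ordinary_terms)
```

With this, the `|` on the RHS would be free to separate per-event linear predictors instead.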

@fabian-s
Collaborator

hey, you're on a roll today..... 😮

agree, using specials and having multiple RHS via Formula seems like the way to go, let's try to have as few idiosyncrasies as possible.
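
To illustrate what multiple LHS/RHS parts would mean in practice, here is a rough base R sketch that splits a formula on `|` and pairs the parts up per event type. The Formula package provides this kind of multi-part parsing itself; the helper split_bar() below is a hypothetical stand-in written only for this example, and x1, x2, x3 are placeholder covariates.

```r
# Hypothetical helper: recursively split an expression on top-level `|`.
split_bar <- function(expr) {
  if (is.call(expr) && identical(expr[[1]], as.name("|"))) {
    c(split_bar(expr[[2]]), split_bar(expr[[3]]))
  } else {
    list(expr)
  }
}

# `~` binds more loosely than `|`, so each side parses as one `|` call.
f <- Surv(time, event1) | Surv(time, event2) ~ x1 + x2 | x1 + s(x3)

lhs_parts <- split_bar(f[[2]])  # one response per event type
rhs_parts <- split_bar(f[[3]])  # one linear predictor per event type
stopifnot(length(lhs_parts) == length(rhs_parts))

# Pair the i-th LHS with the i-th RHS to get one formula per event type.
per_event <- Map(
  function(l, r) eval(call("~", l, r), envir = environment(f)),
  lhs_parts, rhs_parts
)

for (fm in per_event) print(fm)
```

Each element of per_event is an ordinary two-sided formula, so cumulative terms inside any part could still be picked out with specials.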

need to think more about possible/allowed input data formats as well -- are different event types going to be recorded in different columns (one time column per event type) or in a pair of columns (time, event) with potentially multiple entries per subject? for the first, we'd need multiple RHS, for the second, we could use the event column as "status".
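
For concreteness, a toy base R sketch of the two layouts and a conversion between them. The data, column names and event labels (relapse, death) are invented for illustration; nothing here reflects a format the package actually prescribes.

```r
# Layout 1: one time column per event type, one row per subject
# (NA means that event was not observed for the subject).
wide <- data.frame(
  id           = 1:2,
  time_relapse = c(5, NA),
  time_death   = c(9, 7)
)

# Layout 2: a (time, event) pair, possibly several rows per subject.
long <- data.frame(
  id    = rep(wide$id, times = 2),
  time  = c(wide$time_relapse, wide$time_death),
  event = rep(c("relapse", "death"), each = nrow(wide)),
  stringsAsFactors = FALSE
)
long <- long[!is.na(long$time), ]          # drop unobserved events
long <- long[order(long$id, long$time), ]  # sort within subject
rownames(long) <- NULL

print(long)
```

In the first layout each time column would need its own LHS part; in the second, the event column can play the role of "status".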

@adibender
Owner Author

We also have to consider different types of censoring/truncation on the LHS.

@adibender
Owner Author

I'll close this general issue. Please open individual issues for specific cases.
