Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nessie #47

Merged
merged 30 commits into from
May 23, 2024
Merged

Nessie #47

merged 30 commits into from
May 23, 2024

Conversation

rimhajal
Copy link
Member

interface needs fixing

the results aren't exactly what i'm looking for but i have been stuck on this ... is there anything that sticks out to you on quick glance?

Copy link
Member

@lrnv lrnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me take a look

Copy link
Member

@lrnv lrnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide an example ?

@rimhajal
Copy link
Member Author

rimhajal commented May 14, 2024

julia> Nessie(@formula(Surv(time,status)~sex), colrec, colrec.sex, slopop)
([3288.648962464143 2145.609944014756 … 55.901666088940075 15.974814139780039; 2681.6275632951492 1772.6068547689429 … 71.82278361787326 13.957914916445606], [13.004323842641433, 25.295201986907866])

nessie(Surv(time, stat) ~ sex, data = colrec, ratetable = slopop, rmap = list(age = age, sex = sex, year = diag))
sex=1 3289 3134.0 2983.4 2836.5 2693.6 2555.6 2421.9 2291.2 2162.7 2038.0 1920.2 1808.5 1702.9 1603.4 1510.1 1422.9 1339.3 1258.1 1179.1 1102.9 1028.4 954.6 881.8 15.2
sex=2 2682 2579.5 2479.2 2380.6 2283.4 2188.0 2093.2 1998.0 1902.0 1806.7 1714.7 1625.9 1540.8 1460.0 1383.7 1312.1 1243.2 1175.6 1109.6 1045.3 981.9 918.6 855.9 17.0

from the first year, there's a big difference between the two. Maybe the time loops are wrong?

@lrnv
Copy link
Member

lrnv commented May 14, 2024

So why is rate_pred passed as an argument? It should not be necessary right ? The other fit function extracts it from the rate table directly IIRC

@lrnv
Copy link
Member

lrnv commented May 14, 2024

Haaaa i think i got the solution: we let the run last until the last possible moment in the grid, that is until the last death, but we should let the loop run until 120 years, that is the end of the rate table. I think we should try that and if it works write the tests correctly. Then, our loop looks really inneficient but maybe it's RateTables.jl's job to provide the expected remaining life time for a given person directly...

Copy link
Member

@lrnv lrnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Edited]

New version of the code depend on JuliaSurv/RateTables.jl#8 , which you can try by using :

 ]dev Ratetables

and then in the RateTables package you have to switch to this branch.

I expect the difference in the mean expected values is realted to the fact that people go "outside" of the table, as noter in the linked PR... We have to investigate a bti more...

src/Nessie.jl Outdated Show resolved Hide resolved
@lrnv
Copy link
Member

lrnv commented May 15, 2024

@rimhajal With the last push to the PR on RateTables.jl, it seems ot work and to produce the right numbers:

julia> x = nessie(@formula(Surv(time,status)~sex), colrec, slopop)
(2×2 DataFrame
 Row │ sex     expected_life_time 
     │ Symbol  Float64
─────┼────────────────────────────
   1 │ male               15.2094
   2 │ female             16.9782, 2×3 DataFrame
 Row │ sex     grid                               expected_sample_size
     │ Symbol  Array                             Array
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │ male    [0.0, 365.241, 730.482, 1095.72,  [3289.0, 3133.96, 2983.45, 2836.
   2 │ female  [0.0, 365.241, 730.482, 1095.72,  [2682.0, 2579.52, 2479.22, 2380.)

julia> x[2][1,3]
23-element Vector{Float64}:
 3289.0
 3133.9552625257443
 2983.4502704649744
 2836.515553931062
 2693.622660886542
 2555.616023838256
 2421.939620989653
 2291.2277732240186
 2162.694193625734
 2037.9850260901428
 1920.250036204659
 1808.576030194914
 1702.9475638533381
 1603.425672216488
 1510.101536779181
 1422.9472561463979
 1339.3066605391687
 1258.1118851357537
 1179.1749681302872
 1102.907180537947
 1028.4768917062268
  954.4257626902065
  880.9459393878692

julia> x[2][2,3]
23-element Vector{Float64}:
 2682.0
 2579.5174220050385
 2479.221340174898
 2380.6440942810373
 2283.3771131983344
 2188.0359575554344
 2093.185295200068
 1997.9696204292184
 1901.9741246085762
 1806.7192434040187
 1714.7134710849537
 1625.9460915890554
 1540.8006350160022
 1460.0773507047522
 1383.750974088214
 1312.176526701044
 1243.2187676161443
 1175.6407898073405
 1109.623404759267
 1045.3104332404512
  981.9703553907678
  918.4269851710649
  855.1298088300433

julia> 

The issue was that, for a person aged 20 in 2010, we considered that its life ended in 2020 (at age 30) since the life table did not go further...

Next steps:

Can you do it ?

@lrnv
Copy link
Member

lrnv commented May 16, 2024

@rimhajal Well done. So after this : JuliaRegistries/General#106960 gets merged into the general registry, in this branch we need to up the compat to of RateTables to 0.1.1 (in project.toml) and then re-run the tests to see if it works.

The docs you wrote are OK.

But this lacks a test to compare to R still, right ?

@rimhajal
Copy link
Member Author

Yes there is still the test

@lrnv
Copy link
Member

lrnv commented May 16, 2024

Ok I let you bump the compat and write the test then.

Edit: Maybe we could merge #44 and rebase this PR on the main branch before writing the test, to write it coherently with the new tests.

@lrnv lrnv linked an issue May 16, 2024 that may be closed by this pull request
@rimhajal
Copy link
Member Author

Is it possible to rebase with the conflicts or will it make it worse?

@lrnv
Copy link
Member

lrnv commented May 17, 2024

@rimhajal just sent you a zoom link by mail let's do it together

@rimhajal
Copy link
Member Author

i don't understand why it's a dataframe and the test is erroring because of it

@lrnv
Copy link
Member

lrnv commented May 17, 2024

@rimhajal extracting everything from the object is a bit more easy to avoid the dataframes. But I am still not convinced that this output dataframes is the best thing we can provide, maybe we just need to provide a list.... Not sure

@rimhajal rimhajal merged commit 9c06966 into main May 23, 2024
3 checks passed
@rimhajal rimhajal deleted the nessie branch May 23, 2024 08:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nessie function
2 participants