add sample_ages to wrappers to simulate ageing a subset of lengthed fish #42

Open
sgaichas opened this issue Oct 29, 2020 · 5 comments

Comments

@sgaichas
Contributor

The way sampling is set up, effN for each survey is the number of fish measured for length, age, and average weight at age. We could introduce further realism by using the sample_ages() function to take a subsample for age composition and optionally apply an ageing error matrix (specified in the survey and fishery config files). The age comp based on a larger sample size and without error is still used to generate the length composition as input to calc_age2length, so we would have to run sample_ages after we generate lengths and weight at age.

This would require changing the om_comps() wrapper:

  1. move #save age comps lines to after the length comps and weight at age are saved
  2. add a step modifying the age_comp_data[[i]] object before the #save age comps lines by running sample_ages
  3. same two steps for fishery age comps, unless we age all the samples from fisheries
  4. add a step modifying the annage_comp_data[[i]] by running sample_ages (no need to shuffle because this is not an input to the length function)

saved age comp objects remain the same
this still means weight at age is from an unrealistically large age sample, could fix later
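
As a rough illustration of the subsampling idea above, here is a minimal Python sketch. It is not the atlantisom R code: `subsample_ages`, `apply_ageing_error`, and the dictionary layouts are hypothetical names and structures chosen only to show the two operations (drawing an age subsample from the fish already sampled, then optionally passing it through an ageing error matrix whose rows are true ages and columns are read ages).

```python
import random

def subsample_ages(age_counts, n_aged, rng):
    """Draw n_aged fish without replacement from {age: count} of sampled fish."""
    pool = [age for age, n in age_counts.items() for _ in range(n)]
    n_aged = min(n_aged, len(pool))  # cannot age more fish than were sampled
    out = {age: 0 for age in age_counts}
    for age in rng.sample(pool, n_aged):
        out[age] += 1
    return out

def apply_ageing_error(age_counts, error_matrix, rng):
    """Reassign each fish to a read age; error_matrix[true][read] holds probabilities."""
    ages = sorted(age_counts)
    out = {age: 0 for age in ages}
    for true_age in ages:
        weights = [error_matrix[true_age][read] for read in ages]
        for read in rng.choices(ages, weights=weights, k=age_counts[true_age]):
            out[read] += 1
    return out

# Hypothetical numbers: 1000 fish measured for length, 100 of them aged.
rng = random.Random(42)
full_sample = {1: 500, 2: 300, 3: 200}
error_matrix = {  # assumed values, each row sums to 1
    1: {1: 0.9, 2: 0.1, 3: 0.0},
    2: {1: 0.1, 2: 0.8, 3: 0.1},
    3: {1: 0.0, 2: 0.1, 3: 0.9},
}
aged = subsample_ages(full_sample, 100, rng)
aged_with_error = apply_ageing_error(aged, error_matrix, rng)
```

The full-sample age comp (`full_sample` here) stays untouched so it can still feed the length calculation; only the saved age comp would be replaced by the subsample.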

@kellijohnson-NOAA
Contributor

Sorry if I am missing the mark here, I did not review all of the code to see exactly what is going on, but does this assume that weight-at-age samples are independent of age-composition samples? I cannot think of any situation where you would age fish and weigh them and not use those ages in the model as marginal age compositions.

@sgaichas
Contributor Author

You are absolutely correct @kellijohnson-NOAA that we wouldn't leave out age samples if we had them. And thank you for helping me think this all the way through. So here is what I think we can do:

Atlantis outputs n at age and weight at age. We run the initial create_survey and sample_fish with an effN equal to the number of lengths measured; the lengths themselves are estimated with calc_age2length because length is not tracked by Atlantis itself. Testing to date has kept all lengths, ages, and mean weight at age output from this step, which I think gives us an unrealistically large age sample, and also a mean weight at age based on that same oversized sample. (This assumes most surveys measure 10-100x more lengths than ages. If a survey measures as many lengths as ages, then we can stop here.)

So I think we can keep the length output of calc_age2length, but the next step would be to run the n at age output of sample_fish through sample_ages to represent the age subsample (still a subset of fish originally collected on the survey and measured for length). Then we can re-run calc_age2length with the subsample of ages output from sample_ages (optionally with ageing error included) to get both a length composition of the subsampled aged fish and a mean weight at age based only on the subsampled aged fish.
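
A minimal Python sketch of that two-stage flow, under loud assumptions: `simulate_lengths` is only a stand-in for the age-to-length step (it is not how calc_age2length works internally), the growth and length-weight parameters are invented, and all names are hypothetical. It shows the length composition coming from every sampled fish while mean weight at age comes only from the aged subsample.

```python
import random
from collections import defaultdict

def simulate_lengths(age_counts, mean_len, cv, rng):
    """Stand-in for the age-to-length step: one (age, length) record per fish."""
    fish = []
    for age, n in age_counts.items():
        for _ in range(n):
            length = max(1.0, rng.gauss(mean_len[age], cv * mean_len[age]))
            fish.append((age, length))
    return fish

def mean_weight_at_age(fish, a, b):
    """Mean weight at age from (age, length) records, assuming W = a * L**b."""
    total = defaultdict(float)
    count = defaultdict(int)
    for age, length in fish:
        total[age] += a * length ** b
        count[age] += 1
    return {age: total[age] / count[age] for age in count}

rng = random.Random(1)
sampled = {1: 400, 2: 350, 3: 250}        # effN = 1000 fish measured for length
mean_len = {1: 12.0, 2: 18.0, 3: 22.0}    # assumed mean length at age (cm)

lengthed = simulate_lengths(sampled, mean_len, cv=0.1, rng=rng)  # length comp source
aged = rng.sample(lengthed, 100)          # age subsample drawn from the lengthed fish
wtage = mean_weight_at_age(aged, a=0.01, b=3.0)  # based on aged fish only
```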

An extra step, but much more representative of how (at least US) surveys work.

Does this make more sense?

@kellijohnson-NOAA
Contributor

In fisheries data, we don't always get both a length and an age for a given fish. Sometimes there will be age data with no length information. So, would you always want to sample ages only from those that are lengthed? In ss3sim we allow the sampling to be separate: we sample from the truth two times, (1) for ages and (2) for lengths. The exception is conditional age-at-length data, where we would sample for length and then take a total number of ages from those lengthed based on the distribution of lengthed fish, i.e., more ages from the most abundant length bins and fewer ages from the bins near the tails. In contrast, many sampling protocols are length stratified and take an equal number per bin if available, but we don't allow for this latter kind of sampling in ss3sim.
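
To make the difference between the two allocation schemes concrete, here is a small Python sketch. It is not ss3sim code; the function names and bin counts are hypothetical. The first function spreads a fixed total of ages across length bins in proportion to the number of lengthed fish (using largest-remainder rounding); the second mimics a length-stratified protocol that takes up to a fixed number per bin.

```python
def allocate_proportional(bin_counts, n_ages):
    """Allocate n_ages across length bins in proportion to lengthed fish per bin."""
    total = sum(bin_counts.values())
    if total == 0:
        return {b: 0 for b in bin_counts}
    n_ages = min(n_ages, total)  # cannot age more fish than were lengthed
    alloc = {b: c * n_ages // total for b, c in bin_counts.items()}
    leftover = n_ages - sum(alloc.values())
    # Hand the remaining fish to the bins with the largest rounding remainders.
    by_remainder = sorted(bin_counts,
                          key=lambda b: (bin_counts[b] * n_ages) % total,
                          reverse=True)
    for b in by_remainder[:leftover]:
        alloc[b] += 1
    return alloc

def allocate_fixed_per_bin(bin_counts, n_per_bin):
    """Length-stratified protocol: up to n_per_bin ages from every bin, if available."""
    return {b: min(c, n_per_bin) for b, c in bin_counts.items()}

# Hypothetical lengthed fish per length bin (bin lower edge in cm: count).
lengthed = {10: 5, 12: 40, 14: 120, 16: 30, 18: 5}
proportional = allocate_proportional(lengthed, 50)
stratified = allocate_fixed_per_bin(lengthed, 10)
```

With these made-up counts, the proportional scheme puts most ages in the 14 cm bin, while the stratified scheme caps the abundant bins and fully samples the tails.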

If you are sampling ages from those that are lengthed and putting the ages into the model as marginal age-composition samples, your information is not as independent as the model assumes because it is double counting each fish, i.e., assuming a length measurement is from an independent fish from the population and assuming an age measurement is from a new independent sample of the population.

Sorry if this is a bit in the weeds.

@sgaichas
Contributor Author

Not at all, I'd like to design this so users have options for different biological sampling methods and this is definitely helping.

We can make the age sample independent of fish sampled for length, similarly to ss3sim, if we re-do sampling at the sample_fish stage with an effN that reflects the age sample. We can then run sample_ages on this to add ageing error if necessary. We can get mean weight at age for this sample by running calc_age2length and ignoring or discarding the length output if it isn't wanted. That is a lot of overhead, so I should write a simpler function to calculate mean weight at age only if lengths aren't used (extracting that bit from calc_age2length would probably work).
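
A minimal Python sketch of the independent-sampling option, with assumptions stated up front: `sample_comp` uses a simple multinomial-style draw as a stand-in for the sample_fish step, and `mean_weight_at_age` assumes the input is a list of (age, number of fish, weight per fish) records, for example one per spatial cell. Neither name nor data layout comes from atlantisom, and the real calc_age2length internals may differ.

```python
import random

def sample_comp(true_numbers_at_age, eff_n, rng):
    """Draw eff_n fish from the true numbers at age (multinomial-style draw)."""
    ages = sorted(true_numbers_at_age)
    weights = [true_numbers_at_age[a] for a in ages]
    counts = {a: 0 for a in ages}
    for a in rng.choices(ages, weights=weights, k=eff_n):
        counts[a] += 1
    return counts

def mean_weight_at_age(records):
    """Numbers-weighted mean weight at age from (age, n_fish, weight_per_fish) records."""
    num, den = {}, {}
    for age, n, w in records:
        num[age] = num.get(age, 0.0) + n * w
        den[age] = den.get(age, 0) + n
    return {age: num[age] / den[age] for age in den if den[age] > 0}

rng = random.Random(7)
truth = {1: 5000, 2: 3000, 3: 1000}             # hypothetical true numbers at age

length_sample = sample_comp(truth, 1000, rng)   # effN for lengths
age_sample = sample_comp(truth, 100, rng)       # separate, smaller effN for ages

# Hypothetical aged-fish records: (age, number of fish, weight per fish in g).
records = [(1, 60, 20.0), (1, 40, 30.0), (2, 50, 80.0)]
wtage = mean_weight_at_age(records)
```

Because the two calls to `sample_comp` are separate draws from the truth, the age sample is not a subset of the lengthed fish, and no length calculation is needed to get mean weight at age.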

I would also rather avoid an option that mimics length-stratified sampling for age, so I'm glad to hear ss3sim doesn't allow it. I think we are treating the survey ages as conditional age at length in the CC Atlantis-based sardine assessment, but @cstawitz can confirm.

@cstawitz
Contributor

cstawitz commented Oct 30, 2020 via email


3 participants