This repository has been archived by the owner on Jan 18, 2020. It is now read-only.
One problem with automated cohort generation is that end users have little control over cohorts. For example, a sample that is loaded twice can skew the numbers, and users have no obvious way to correct for it. This also affects #161, since the internal cohorts are updated constantly and are a moving target every night. Users should therefore be able to create their own cohorts with the following properties:
Cohorts can be created per-user, per project, and globally
Automatic cohort creation at load time should be eliminated
Cohorts created by end users should be versioned, so that any version of a cohort can be used in a query and cohort updates don't impact ongoing analyses.
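To make the properties above concrete, here is a minimal sketch of a scoped, versioned cohort. It is only an illustration under assumptions: the names `Scope`, `CohortVersion`, `Cohort`, `commit`, and `get` are hypothetical and are not part of Varify's existing models.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, Iterable, List, Optional


class Scope(Enum):
    """Hypothetical visibility levels: per-user, per-project, global."""
    USER = "user"
    PROJECT = "project"
    GLOBAL = "global"


@dataclass(frozen=True)
class CohortVersion:
    """An immutable snapshot of a cohort's samples."""
    number: int
    sample_ids: FrozenSet[str]


@dataclass
class Cohort:
    name: str
    scope: Scope
    owner: Optional[str] = None  # user or project identifier; None for global
    versions: List[CohortVersion] = field(default_factory=list)

    def commit(self, sample_ids: Iterable[str]) -> CohortVersion:
        """Record a new version; using a set drops duplicate sample ids."""
        samples = frozenset(sample_ids)
        if self.versions and self.versions[-1].sample_ids == samples:
            return self.versions[-1]  # nothing changed, so no new version
        version = CohortVersion(len(self.versions) + 1, samples)
        self.versions.append(version)
        return version

    def get(self, number: Optional[int] = None) -> CohortVersion:
        """Return a specific version, or the latest when number is None."""
        if not self.versions:
            raise LookupError("cohort has no versions")
        if number is None:
            return self.versions[-1]
        if not 1 <= number <= len(self.versions):
            raise LookupError(f"no such version: {number}")
        return self.versions[number - 1]
```

An analysis that pins itself to `cohort.get(1)` keeps seeing the same samples even after later calls to `commit`, which is the point of versioning here.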
Making this work has some practical implications:
There should be a "grace period" before a cohort change is processed, to prevent unnecessary churn in the database while someone is still adding and removing samples.
Somehow, the allele frequencies for every version of a cohort will need to be stored. This could become unwieldy. One option is to age out older versions and archive them: users could still move between the last several versions without issue, while older versions would be offloaded to a parallel table or offline storage.
Allele frequency calculations might need to happen outside the Varify application server (maybe even the database?) to preserve performance. This may not be a problem we have to worry about yet.
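The grace period and the age-out idea above could look roughly like the following. This is a sketch under assumptions, not a proposed implementation: `PendingCohortEdits` and `split_for_archive` are hypothetical names, the time values are plain numbers passed in by the caller, and how archived versions are actually stored is left open.

```python
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple


class PendingCohortEdits:
    """Buffers sample edits until no edit has arrived for grace_seconds."""

    def __init__(self, grace_seconds: float):
        self.grace_seconds = grace_seconds
        self.samples: Set[str] = set()
        self.last_edit: Optional[float] = None  # None means nothing pending

    def add(self, sample_id: str, now: float) -> None:
        self.samples.add(sample_id)
        self.last_edit = now

    def remove(self, sample_id: str, now: float) -> None:
        self.samples.discard(sample_id)
        self.last_edit = now

    def flush(self, now: float) -> Optional[FrozenSet[str]]:
        """Return the settled sample set once the grace period has passed.

        Returns None while edits are still arriving or nothing is pending,
        so the caller only recomputes allele frequencies once per burst.
        """
        if self.last_edit is None or now - self.last_edit < self.grace_seconds:
            return None
        self.last_edit = None
        return frozenset(self.samples)


def split_for_archive(
    version_numbers: Iterable[int], keep_latest: int
) -> Tuple[List[int], List[int]]:
    """Split versions into (kept, archived), keeping the newest keep_latest."""
    ordered = sorted(version_numbers)
    if keep_latest <= 0:
        return [], ordered
    return ordered[-keep_latest:], ordered[:-keep_latest]
```

A periodic job could call `flush` with the current time and, when it returns a sample set, create the next cohort version and then pass the existing version numbers to `split_for_archive` to decide which ones to offload.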