This repository has been archived by the owner on Jan 18, 2020. It is now read-only.

Provide custom cohorts #162

Open
mitalia opened this issue Mar 11, 2014 · 0 comments

mitalia commented Mar 11, 2014

One problem with automated cohort generation is that end users have little control over cohorts. Issues such as samples that were loaded more than once can skew the numbers, and users have no obvious way to mitigate this. It also affects #161, because the internal cohorts are updated every night and are therefore a moving target. Users should be able to create their own cohorts with the following properties:

  1. Cohorts can be created per user, per project, and globally.
  2. Automatic cohort creation at load time should be eliminated.
  3. Cohorts created by end users should be versioned, so that any version of a cohort can be used in a query and cohort updates don't affect ongoing analyses.
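
To make the versioning idea concrete, here is a minimal sketch of what a scoped, versioned cohort could look like. None of these names (`Scope`, `CohortVersion`, `Cohort`, `publish`, `get`) exist in Varify; they are hypothetical and only illustrate the three properties above, using plain Python rather than the actual data models.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, List, Optional


class Scope(Enum):
    """Hypothetical visibility levels matching property 1."""
    USER = "user"
    PROJECT = "project"
    GLOBAL = "global"


@dataclass(frozen=True)
class CohortVersion:
    """An immutable snapshot of the samples in a cohort."""
    number: int
    sample_ids: FrozenSet[str]


@dataclass
class Cohort:
    name: str
    scope: Scope
    owner: Optional[str] = None  # user or project name; None for global
    versions: List[CohortVersion] = field(default_factory=list)

    def publish(self, sample_ids):
        """Append a new version; storing a set drops duplicated samples."""
        version = CohortVersion(len(self.versions) + 1, frozenset(sample_ids))
        self.versions.append(version)
        return version

    def get(self, number=None):
        """Return the requested version, or the latest if none is given."""
        if not self.versions:
            raise LookupError("cohort has no versions")
        if number is None:
            return self.versions[-1]
        if not 1 <= number <= len(self.versions):
            raise LookupError(f"no version {number}")
        return self.versions[number - 1]
```

An ongoing analysis would pin a version number (`cohort.get(1)`), so publishing version 2 later does not change its inputs.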

Making this work has some practical implications:

  1. There should be a "grace period" before a cohort change is processed, to prevent unnecessary churn in the database while someone is still adding and removing samples.

  2. The allele frequencies for every version of a cohort will need to be stored somehow. This could become unwieldy. One option is to age out older versions and archive them. Users could still bounce between the last several versions without issue, while older versions are offloaded to a parallel table or offline storage.

  3. Allele frequency calculations might need to happen outside the Varify application server (maybe even the database?) to preserve performance. This may not be a problem we have to worry about yet.
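
The grace period and the age-out policy from points 1 and 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: `PendingCohort`, `grace_seconds`, `keep_versions`, and the in-memory `active`/`archived` dictionaries are all hypothetical stand-ins (the `archived` dictionary plays the role of a parallel table or offline storage), and timestamps are passed in explicitly as seconds to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class PendingCohort:
    grace_seconds: float          # quiet time required before processing
    keep_versions: int            # how many recent versions stay "active"
    samples: Set[str] = field(default_factory=set)
    last_edit: Optional[float] = None
    latest: Optional[frozenset] = None
    active: Dict[int, frozenset] = field(default_factory=dict)
    archived: Dict[int, frozenset] = field(default_factory=dict)
    next_version: int = 1

    def edit(self, now, add=(), remove=()):
        """Record an edit; nothing is published yet."""
        self.samples |= set(add)
        self.samples -= set(remove)
        self.last_edit = now

    def process(self, now):
        """Publish a version only after the grace period has elapsed."""
        if self.last_edit is None or now - self.last_edit < self.grace_seconds:
            return None  # no pending edits, or still inside the grace period
        self.last_edit = None
        snapshot = frozenset(self.samples)
        if snapshot == self.latest:
            return None  # edits cancelled out, so no new version is needed
        number = self.next_version
        self.next_version += 1
        self.latest = snapshot
        self.active[number] = snapshot
        # Age out: move the oldest versions beyond the limit to the archive.
        excess = max(len(self.active) - self.keep_versions, 0)
        for old in sorted(self.active)[:excess]:
            self.archived[old] = self.active.pop(old)
        return number
```

Here a periodic job would call `process(now)`. Allele frequencies would be recomputed only when it returns a new version number, which is where the expensive work could be handed off to something outside the application server.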
