Refactor workflow #76

Merged 257 commits on Mar 31, 2023
257 commits
94fb775
added titer models
rneher Dec 20, 2021
b72fe40
fix annotations
rneher Dec 21, 2021
84e7815
make titer rule wait for translations
rneher Dec 21, 2021
082de81
add wildcard constraints
rneher Dec 22, 2021
a4453ca
add standard build and second array build
rneher Dec 24, 2021
41f6eb6
small edits
rneher Jan 3, 2022
03a37b9
add glyc and lbi
rneher Jan 3, 2022
2f9e552
add vic and yam configs
rneher Jan 3, 2022
a500b95
add europe profile
rneher Jan 3, 2022
b37eabb
extend nextstrain profile
rneher Jan 3, 2022
f3e6ca9
fixes and add forgotten fitness.smk
rneher Jan 3, 2022
e85fa19
add simple example profile
rneher Jan 3, 2022
7297113
add vaccine markers
rneher Jan 4, 2022
e504092
remove some old scripts
rneher Jan 4, 2022
2220c22
add scicore cluster config
rneher Jan 5, 2022
b40e016
make download rules local rules
rneher Jan 5, 2022
ff480b1
small changes in profiles and strain selection
rneher Jan 20, 2022
efb8929
add stub to make report figures
rneher Jan 20, 2022
2d3ea21
add allflu profile
rneher Jan 20, 2022
5f75284
add missing report rules file
rneher Jan 20, 2022
b61f2f6
Fix misspelling of "glycosylation"
huddlej May 24, 2022
a487279
Remove unnecessary list comprehension
huddlej May 24, 2022
37f96a9
Define gene maps with a dict instead of a function
huddlej May 24, 2022
68e44c2
Remove unnecessary f-strings
huddlej May 24, 2022
af5e66e
Bump threads for alignment and use Snakemake's threads directive
huddlej May 24, 2022
6316a66
Do not automatically delete inputs after parse
huddlej May 24, 2022
d4d7e10
Fix conda environment
huddlej May 24, 2022
9b35b9c
Remove whitespace
huddlej May 24, 2022
ab69a71
Fix bug in augur parse
huddlej May 24, 2022
8260de5
Select strains with augur instead of seqkit
huddlej May 24, 2022
c311e63
Mask sequences with augur and note translation outputs
huddlej May 24, 2022
3ccbdce
Join metadata with a script
huddlej May 25, 2022
95d4f60
Document translation checkpoint
huddlej May 25, 2022
236e6a2
Get segment name from fauna downloads
huddlej May 25, 2022
1f8cbbc
Remove whitespace
huddlej May 25, 2022
c806eb2
Remove local rule annotation for metadata joining
huddlej May 25, 2022
3c5f7ef
Replace run blocks with shell blocks
huddlej Jun 10, 2022
5591eba
Make min date optional, enable max date for frequencies
huddlej Jun 10, 2022
c8f51d1
Rename "min-date" param to "min_date"
huddlej Jun 10, 2022
f3c53d1
Fill missing segment entries in joined metadata
huddlej Jun 10, 2022
b77744a
Quote filter strings for safer filter commands
huddlej Jun 10, 2022
5abb87a
Add config for nextflu-private builds
huddlej Jun 10, 2022
9836c30
Undo filter quoting
huddlej Jun 10, 2022
36cbdf1
Expand variables in subsample params for named builds
huddlej Jun 10, 2022
1731406
Allow overriding default tree builder args
huddlej Jun 10, 2022
b2e7d63
Fix references to ancestral translations
huddlej Jun 17, 2022
b97fb99
Add benchmarks and logs to all rules
huddlej Jun 17, 2022
c13649c
Remove unused parameter for align rule
huddlej Jun 17, 2022
52d271c
Remove weighted frequencies
huddlej Jun 24, 2022
65e552d
Use YAML alias to avoid duplicate subsampling rules
huddlej Jun 24, 2022
83de37c
Enable titer, fitness, distance annotations per build
huddlej Jun 30, 2022
e90168b
Add forecasts to builds
huddlej Jun 30, 2022
14e52d1
Add glycosylation coloring to all auspice configs
huddlej Jun 30, 2022
05a0505
Fix alignment inputs for distances
huddlej Jun 30, 2022
fa5d91b
Ignore builds directory in git
huddlej Jun 30, 2022
11f0a0d
Fix bug in alignments inputs
huddlej Jun 30, 2022
ec35f47
Make strain lists optional for epiweeks
huddlej Jun 30, 2022
965d015
Add recency and epiweek annotations to workflow
huddlej Jun 30, 2022
8ec3d51
Convert H3N2 HA clades to UNIX line endings
huddlej Jul 1, 2022
14caaa4
Allow users to include their own rules
huddlej Jul 6, 2022
46c772b
Use "emerging" clades for private builds
huddlej Jul 6, 2022
d3af35f
Add custom forecast config, rules, and scripts
huddlej Jul 6, 2022
bdb5338
Move parse and metadata join to strain selection
huddlej Jul 7, 2022
9b2e9e7
Bump Augur version and require seqkit
huddlej Jul 7, 2022
abae899
Keep header row in filtered titers
huddlej Jul 8, 2022
7b537fc
Add support for measurements panel of titers
huddlej Jul 8, 2022
92c40f8
Add measurements panels to nextflu-private builds
huddlej Jul 8, 2022
5714d28
Add CI build
huddlej Jul 8, 2022
c5214f9
Use Nextalign v2.2.0
huddlej Jul 8, 2022
7ca9129
Pull in outliers from master
huddlej Jul 8, 2022
33b4d31
Increase samples for nextflu-private builds
huddlej Jul 8, 2022
d4ca795
Fix gene order in alignment output
huddlej Jul 12, 2022
ff990d7
Fix nucleotide coordinate of H3N2 50K clade
huddlej Jul 12, 2022
477a2dd
Prettify fields and add contextual sequences for better tree structure
huddlej Jul 12, 2022
a11fa2d
Add custom colors for H3N2 clades
huddlej Jul 12, 2022
a5bef91
Add "is_reference" column to metadata
huddlej Jul 22, 2022
0a2aca4
Exclude egg-passaged samples, limit references to last 7 years
huddlej Jul 22, 2022
2a5b837
Gray out old clades
huddlej Jul 28, 2022
27a9f24
Require csvtk in conda environment
huddlej Jul 28, 2022
ab6081b
Use files with lists of clades for titer plots
huddlej Jul 28, 2022
5c5f6c4
Add custom rule to make antigenic distance plots
huddlej Jul 28, 2022
d038efe
Update references/clades for titer plots
huddlej Jul 28, 2022
94eb1bc
Improve titer plot aesthetics
huddlej Jul 28, 2022
f7b3c1f
Add egg-passaged builds and HI builds for H3N2
huddlej Jul 28, 2022
c518e81
Use correct submission date field for private forecasts
huddlej Aug 27, 2022
0b2a0c9
Update references to reflect TC1 report
huddlej Aug 27, 2022
d75c791
Allow tree method config, make tree args optional
huddlej Aug 27, 2022
4f7f2c5
Add rule to build all antigenic plots at once
huddlej Aug 27, 2022
af66636
Use RAxML for private builds
huddlej Aug 27, 2022
0f09251
Update config for private forecast replicates
huddlej Aug 27, 2022
a54f4a3
Add notebook used to plot lineage counts over time
huddlej Aug 27, 2022
8747767
Include more context from earlier to improve rooting
huddlej Aug 29, 2022
746d587
Update references for titer plots
huddlej Aug 29, 2022
feb0260
Exclude three swine flu infections of humans
huddlej Aug 29, 2022
9fae09a
Add new egg-passaged strain for H1N1pdm titer plot
huddlej Sep 6, 2022
4a16c2a
Annotate strains with titer data in metadata
huddlej Sep 7, 2022
8155ac4
Log why strains get dropped during subsampling
huddlej Sep 7, 2022
b403a24
Group and filter measurements by serum id
huddlej Sep 12, 2022
0610596
Run multiple forecast replicates
huddlej Sep 12, 2022
0be2c7d
Finalize forecast aggregation
huddlej Sep 12, 2022
db16a0e
Exclude outliers from references and titers
huddlej Sep 14, 2022
047b4a0
Limit contextual samples to just before pandemic
huddlej Sep 14, 2022
7a4e056
Clear notebook for plotting counts per lineage
huddlej Sep 14, 2022
3e40196
Increase samples for production forecasts
huddlej Sep 15, 2022
6ddeabb
Reorder colors for forecast plots
huddlej Sep 15, 2022
85dfdd1
Tune private build params
huddlej Sep 15, 2022
9049046
Add Slovenia reference to titer plots
huddlej Sep 16, 2022
0a9fba9
Include more contextual data to improve H1N1pdm rooting
huddlej Sep 16, 2022
c2a579f
Use IQ-TREE with ncov settings for faster tree building with similar …
huddlej Oct 3, 2022
cb73dde
Fix rooting for H1N1pdm and similar situations
huddlej Oct 3, 2022
7390747
Stub out first attempt at public build config
huddlej Oct 3, 2022
ad31737
Annotate nodes by haplotype
huddlej Oct 3, 2022
064dba8
Add haplotypes to measurements panel
huddlej Oct 5, 2022
9e96082
Define segment-specific Auspice config files
huddlej Oct 5, 2022
783f675
Fix inputs for haplotype annotation
huddlej Oct 5, 2022
61d4082
Add public version of nextflu-private builds
huddlej Oct 17, 2022
278dd95
Include titer annotations when running titer models, too
huddlej Oct 18, 2022
60f7b1e
Update H1N1pdm outliers
huddlej Oct 18, 2022
ac8374d
Stub out instructions for monthly CDC builds
huddlej Oct 21, 2022
81012b9
Add instructions to update the group overview
huddlej Oct 24, 2022
f117ec0
Use @victorlin's recommended command to upload/download
huddlej Oct 24, 2022
aef36f3
Add H3N2 outlier
huddlej Oct 27, 2022
20d4f6a
Fix command for downloading an earlier narrative
huddlej Oct 28, 2022
79fa222
Add resolution-specific configs for public builds
huddlej Nov 2, 2022
598314b
Color/filter trees by titer statuses
huddlej Nov 3, 2022
7d5c389
Correct command to download previous narrative
huddlej Nov 17, 2022
9e5c794
Add H3N2 outliers
huddlej Nov 28, 2022
3c9881b
Use the latest versions of Augur and Nextalign
huddlej Nov 28, 2022
0b8ca2d
Fix command to download previous narrative
huddlej Nov 28, 2022
2454d8e
WIP: Add support for multiple titer collections
huddlej Dec 7, 2022
efb3d4f
Don't annotate titer counts for test viruses
huddlej Dec 7, 2022
62617fd
Disambiguate node data JSON attribute names
huddlej Dec 7, 2022
d482562
Add colorings for new H1N1pdm titer attributes
huddlej Dec 7, 2022
b20495f
Enable Vic builds with new titer annotations
huddlej Dec 7, 2022
8bb132b
Update outliers for H3N2 and H1N1pdm
huddlej Dec 8, 2022
88f5b89
Enable fitness models with titer collections
huddlej Dec 8, 2022
1ff5092
Enable H3N2 builds with multiple titer collections
huddlej Dec 8, 2022
544d42d
Update clade names for 2a.2
huddlej Dec 8, 2022
c709445
Set best forecast model to cell FRA
huddlej Dec 8, 2022
eb8d8c5
Fix colors for new clades
huddlej Dec 8, 2022
773a389
Add forecast model based on human cell FRA data
huddlej Dec 8, 2022
e8c872d
Fix colorings and filters for H3N2 in private builds
huddlej Dec 8, 2022
f233227
Use nextstrain authorization command to update overview
huddlej Dec 8, 2022
6b2978a
Refine counts per lineage plots
huddlej Dec 8, 2022
34ecbd2
Fix filters for Vic
huddlej Dec 8, 2022
6b1a693
Replace old nextflu private config with new one
huddlej Dec 9, 2022
fea329d
Update forecasts config to use titer collections
huddlej Dec 10, 2022
079085c
Update outliers
huddlej Dec 12, 2022
437b0a9
Evenly sample titer strains from the last year
huddlej Dec 13, 2022
a9700f3
Define representative and titers private builds
huddlej Dec 17, 2022
12b8cd6
Use latest Augur and Nextalign versions
huddlej Dec 19, 2022
036ecf8
Convert CI build to use titer collections
huddlej Dec 19, 2022
d509a2d
Make titer attribute prefix optional
huddlej Dec 20, 2022
35779cd
Expand array build params in titer collections
huddlej Dec 20, 2022
3263070
Update build configs to use titer collections
huddlej Dec 20, 2022
2420cf6
Limit titers to ferret-based data
huddlej Dec 21, 2022
700d4bf
Use titer collection prefix for model attributes
huddlej Dec 21, 2022
d58fd4c
Add profile to upload sequences and titers to S3
huddlej Dec 21, 2022
b9b0789
Use sequences and titers on S3 for private builds
huddlej Dec 21, 2022
f979fda
Add back original antigenic model file
huddlej Dec 22, 2022
501c6a8
Increase default recursion limit for the workflow
huddlej Dec 23, 2022
45ca035
Update H3N2 outliers
huddlej Dec 23, 2022
4f0c741
Only try to annotate titer collections for HA
huddlej Dec 23, 2022
9640c49
Minify JSONs for private builds
huddlej Dec 23, 2022
97cc44c
Remove unused Auspice colorings
huddlej Dec 23, 2022
d7c0f85
Drop representative H1N1pdm and Vic private builds
huddlej Dec 23, 2022
8482289
Update H3N2 clades to newly proposed names
huddlej Dec 24, 2022
6b093ec
Exclude egg-passaged test viruses from cell-passaged titers
huddlej Dec 27, 2022
da057b9
Update H1N1pdm clades to newly proposed
huddlej Dec 27, 2022
533c29a
Fix clade colors for H1N1pdm and H3N2
huddlej Dec 28, 2022
ab67b11
Stub new upload workflow for GH Actions
huddlej Jan 18, 2023
867fc84
Add GitHub Action for uploading to S3
huddlej Jan 18, 2023
89cb181
Simplify and speed up upload action
huddlej Jan 19, 2023
b01302a
Update H3N2 outliers
huddlej Jan 20, 2023
e3c7f23
Update clades for H1N1pdm and H3N2
huddlej Jan 22, 2023
91131f5
Minify concatenated measurements JSONs
huddlej Jan 22, 2023
7f47591
Add "long" clade names for H1N1pdm and H3N2
huddlej Jan 22, 2023
9b39c44
Update references/colors for H1/H3 titers plots
huddlej Jan 22, 2023
0cefca3
Update titer plotting script and rule
huddlej Jan 22, 2023
9446960
Force include the latest B/Vic vaccine
huddlej Jan 22, 2023
b73896d
Skip Yam in plots of counts per lineage
huddlej Jan 22, 2023
ad7d1a1
Update Vic references to plot in titer figures
huddlej Jan 23, 2023
c48f865
Define custom subclades for Vic
huddlej Jan 23, 2023
b676089
Annotate serum strain's haplotype instead of clade
huddlej Jan 23, 2023
27e8d36
Simplify antigenic plot rule for WHO reports
huddlej Jan 23, 2023
3220449
Reorder reference sera in titer plots based on haplotype
huddlej Jan 23, 2023
377ee83
Allow egg-passaged reference strains to be force-included
huddlej Jan 23, 2023
dcaaa9d
Add new subclades for H3N2
huddlej Jan 23, 2023
b310d78
Update forecasts profile to use S3 data and new clades
huddlej Jan 23, 2023
d63376f
Refine aesthetics of forecast plots
huddlej Jan 24, 2023
92a618b
Use AWS CLI to download files from S3
huddlej Jan 24, 2023
e5e9511
Rename public builds to match current format
huddlej Jan 26, 2023
432d8e3
Add missing input for root sequence renaming
huddlej Jan 27, 2023
817f197
Hardlink instead of copying
huddlej Jan 27, 2023
0a4db33
Always use resolution variable in auspice names
huddlej Jan 27, 2023
6457b6f
Prune reference strain prior to time tree
huddlej Jan 27, 2023
7f2045c
Prune outgroup after tree building
huddlej Jan 28, 2023
4a10045
Download data from S3 for public builds
huddlej Jan 28, 2023
101153d
Clean up H3N2 NA reference name
huddlej Jan 28, 2023
0aa29c5
Update outliers
huddlej Feb 10, 2023
e4124f6
Bump Augur and Nextalign versions
huddlej Feb 10, 2023
78ac638
Order measurements by clade and reference date
huddlej Feb 10, 2023
3512807
Skip root node when collapsing haplotypes
huddlej Jan 30, 2023
6f2f747
Automate plotting of genome counts per lineage
huddlej Feb 13, 2023
23d946b
Update titer reference strains for H1N1pdm
huddlej Feb 14, 2023
ee70410
Limit forecasts to a requested subclade of a tree
huddlej Feb 14, 2023
cd25c81
Limit forecasts to a requested subclade of a tree
huddlej Feb 14, 2023
291bce0
Parameterize height per row for forecast plots
huddlej Feb 14, 2023
46cb0a7
Update outliers
huddlej Feb 15, 2023
2c0491e
Update H1 titer references
huddlej Feb 15, 2023
b237955
Update H3 titer references
huddlej Feb 15, 2023
5438e95
Fix order of H3 titer references
huddlej Feb 15, 2023
ccdc36e
Update Vic titer references
huddlej Feb 15, 2023
e616cae
Fix order of Vic references
huddlej Feb 15, 2023
c5a92a9
Add Southern hemisphere H3N2 vaccine
huddlej Feb 15, 2023
76ce5a8
Update outliers
huddlej Feb 16, 2023
f4b1ba0
Add new references to H3N2 titer plots
huddlej Feb 17, 2023
5a2e2ce
Pin BioPython to 1.80 in Conda environment
huddlej Feb 17, 2023
0f7f05e
Use same Auspice config in forecast builds as private builds and use …
huddlej Feb 17, 2023
9ea2da1
Reorder tooltip in measurements panel
huddlej Feb 21, 2023
09f8334
Limit columns displayed in measurements tooltips
huddlej Feb 21, 2023
1c978f5
Allow "vidrl" wildcard value for "center"
huddlej Feb 23, 2023
c439434
Add config, rules, and scripts to support private.nextflu.org
huddlej Feb 24, 2023
f286cab
Make augur output optional for global frequencies
huddlej Feb 24, 2023
7d6b5bd
Fix quoting typo in config paths
huddlej Feb 24, 2023
28f612a
Port mutations rules from WHO workflow
huddlej Feb 24, 2023
7669fbf
Fix broken conda environment path
huddlej Feb 24, 2023
c365634
Move titer export logic to its own script
huddlej Feb 24, 2023
7460ad1
Standardize quotes
huddlej Feb 24, 2023
596a4ef
Fix bugs in diffusion frequency rules
huddlej Feb 25, 2023
aa39e27
Fix bugs with sequence export and entropy
huddlej Feb 25, 2023
56f1604
Pass space-delimited list of genes to entropy
huddlej Feb 27, 2023
bcbe56c
Use fauna to download data for public builds
huddlej Feb 28, 2023
36d39ce
Don't build recent Yam trees
huddlej Mar 1, 2023
38f3202
Don't run Yam builds at all
huddlej Mar 1, 2023
e1f455f
Use cross-immunity metric without cell FRA suffix
huddlej Mar 1, 2023
de78753
Only use emerging Vic clades in private group
huddlej Mar 3, 2023
164421f
Allow regions to be superset of given frequencies
huddlej Mar 14, 2023
0761b70
Estimate global frequencies from non-empty region translations
huddlej Mar 14, 2023
7b74e56
Sort measurements by min y-axis position of clade
huddlej Mar 16, 2023
5c9a1a5
Update vaccine strains
huddlej Mar 16, 2023
f3ccb9e
Fix clade definitions based on nucleotides
huddlej Mar 16, 2023
34344c3
Annotate year/month for easier date filtering
huddlej Mar 17, 2023
94a3ed4
Merge branch 'master' into refactor-workflow
joverlee521 Mar 31, 2023
236ae40
Correct and document Vic HA gene coordinates
huddlej Mar 31, 2023
62659e2
Sort outliers with sort -k 1,1
huddlej Mar 31, 2023
936689b
Add outliers from master branch
huddlej Mar 31, 2023
692a4cc
Sort reference strains
huddlej Mar 31, 2023
b0f444f
Add reference strains from master branch
huddlej Mar 31, 2023
8bdf310
Add `auspice_renamed/` to .gitignore
joverlee521 Mar 31, 2023
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
@@ -8,4 +8,4 @@ jobs:
   ci:
     uses: nextstrain/.github/.github/workflows/pathogen-repo-ci.yaml@master
     with:
-      build-args: auspice/flu_seasonal_h3n2_ha_12y.json auspice/flu_seasonal_h3n2_ha_12y_tip-frequencies.json
+      build-args: --configfile profiles/ci/builds.yaml -p
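For orientation, the new build-args line swaps two explicit Auspice JSON targets for a config file plus Snakemake's `-p` (`--printshellcmds`) flag. The sketch below shows a roughly equivalent local command. It assumes the reusable CI workflow appends `build-args` to `nextstrain build .`, which this diff does not show, and it only prints the command rather than running it.

```shell
# build-args value copied from the new ci.yaml line.
build_args="--configfile profiles/ci/builds.yaml -p"

# Assumption: the reusable CI workflow appends build-args to `nextstrain build .`.
# -p is Snakemake's short form of --printshellcmds.
ci_command="nextstrain build . $build_args"

# Print instead of executing, so the sketch is safe without Nextstrain installed.
echo "$ci_command"
```

To try the build locally, run the printed command from the repository root.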
32 changes: 32 additions & 0 deletions .github/workflows/upload.yaml
@@ -0,0 +1,32 @@
+name: Upload data from fauna to S3
+
+# Only support manual trigger of this workflow.
+on: workflow_dispatch
+
+jobs:
+  upload:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          ref: refactor-workflow
+      # Install Nextstrain CLI, so we can run the flu workflow.
+      - uses: nextstrain/.github/actions/setup-nextstrain-cli@master
+      # Run the flu workflow that downloads titers and sequences from fauna and
+      # uploads to S3.
+      - name: Download from fauna and upload to S3
+        run: |
+          set -x
+
+          nextstrain build \
+            --docker \
+            . \
+            -j 4 \
+            upload_all_titers \
+            upload_all_sequences \
+            --configfile profiles/upload.yaml
+        env:
+          RETHINK_HOST: ${{ secrets.RETHINK_HOST }}
+          RETHINK_AUTH_KEY: ${{ secrets.RETHINK_AUTH_KEY }}
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
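The upload step depends on four secrets being exposed as environment variables. The sketch below shows one way to reproduce that step outside GitHub Actions with a preflight check. The helper names `missing_upload_env` and `run_upload` are hypothetical and not part of this PR. The `nextstrain build` invocation is copied from the workflow above.

```shell
# Hypothetical helper (not in the PR): print the names of any required
# variables that are unset or empty, separated by spaces.
missing_upload_env() {
    missing=""
    for name in RETHINK_HOST RETHINK_AUTH_KEY AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # eval gives portable indirect expansion in POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Strip the leading space left by the loop.
    echo "${missing# }"
}

# Hypothetical wrapper: refuse to start the build when credentials are missing,
# otherwise run the same command as the workflow step.
run_upload() {
    missing_names=$(missing_upload_env)
    if [ -n "$missing_names" ]; then
        echo "Missing environment variables: $missing_names" >&2
        return 1
    fi
    nextstrain build --docker . -j 4 \
        upload_all_titers upload_all_sequences \
        --configfile profiles/upload.yaml
}
```

After sourcing these definitions from the repository root and exporting the four variables, `run_upload` starts the build. With any variable missing, it returns a non-zero status before invoking Nextstrain.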
6 changes: 6 additions & 0 deletions .gitignore
@@ -1,9 +1,11 @@
 # Files created by the pipeline, which we want to keep out of git
 # (or at least out of _this_ git repo).
 data/
+builds/
 results/
 auspice/
 auspice-who/
+auspice_renamed/
 build/
 logs/
 figures/
@@ -51,3 +53,7 @@ nohup.out
 
 # cluster logs
 slurm-*
+
+# Jupyter/Altair droppings
+.ipynb_checkpoints
+geckodriver.log
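One way to confirm that ignore patterns like these behave as intended is `git check-ignore`. The sketch below uses a throwaway repository containing four of the patterns from the hunks above. The example paths and the `is_ignored` helper are made up for illustration.

```shell
# Scratch repository, so the check never touches a real checkout.
scratch=$(mktemp -d)
git init -q "$scratch"

# Four patterns taken from the .gitignore hunks above.
printf '%s\n' 'builds/' 'auspice_renamed/' '.ipynb_checkpoints' 'geckodriver.log' \
    > "$scratch/.gitignore"

# Illustrative files (hypothetical names) to test the patterns against.
mkdir -p "$scratch/builds/h3n2" "$scratch/auspice_renamed" \
    "$scratch/notebooks/.ipynb_checkpoints"
: > "$scratch/builds/h3n2/tree.nwk"
: > "$scratch/auspice_renamed/example.json"
: > "$scratch/notebooks/.ipynb_checkpoints/plot.ipynb"
: > "$scratch/geckodriver.log"
: > "$scratch/Snakefile"

# Hypothetical helper: exit status 0 when git would ignore the path.
is_ignored() {
    git -C "$scratch" check-ignore -q "$1"
}

for path in builds/h3n2/tree.nwk auspice_renamed/example.json \
    notebooks/.ipynb_checkpoints/plot.ipynb geckodriver.log Snakefile; do
    if is_ignored "$path"; then
        echo "ignored: $path"
    else
        echo "tracked: $path"
    fi
done
```

In a real checkout, `git check-ignore -v <path>` also reports which `.gitignore` line matched. Remove the scratch directory with `rm -rf "$scratch"` when done.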