Releases: eead-csic-compbio/get_homologues
get_homologues-est
This release ships with several changes and fixes:
20042022: added parse_pangenome_matrix.pl -n for matrices that do not include cloud clusters
21042022: transcripts2cdsCPP.pl not tested by default with make test, as bioconda lacks Inline::CPP
21042022: if bin/ is not in place, phyTools::set_phyTools_env assumes binary dependencies are in PATH
21042022: updated install.pl so that it copes with binary dependencies in PATH
21042022: install.pl no_databases COGS only gets/compiles COGS, so that other binaries can be taken from PATH
22042022: added CODE_OF_CONDUCT
28042022: added test_swiss to Makefile
28042022: added compila_conda.pl to bin/COGsoft & added $(CXX) to Makefiles to use in conda recipe (macosx-intel copy still the same)
28042022: COGsoft source released as https://github.com/eead-csic-compbio/get_homologues/releases/download/v3.4.6/COGsoft.tgz
28042022: released updated bin.tgz (v3.5)
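The PATH fallback described in the 21042022 entries can be pictured roughly as follows. This is a minimal Python sketch of the idea only, not the Perl code in phyTools::set_phyTools_env; the function name resolve_binary and the example tool names are hypothetical.

```python
import os
import shutil
from typing import Optional


def resolve_binary(name: str, bin_dir: str = "bin") -> Optional[str]:
    """Return a path to `name`, preferring a bundled copy under bin_dir.

    Hypothetical helper: it illustrates falling back to PATH when the
    bundled bin/ folder is missing; it is not the project's actual code.
    """
    bundled = os.path.join(bin_dir, name)
    # Use the bundled copy only if it exists and is executable
    if os.path.isfile(bundled) and os.access(bundled, os.X_OK):
        return bundled
    # Otherwise look the tool up in the directories listed in PATH
    return shutil.which(name)


if __name__ == "__main__":
    for tool in ("blastp", "mcl"):  # example tool names, assumed
        print(tool, "->", resolve_binary(tool) or "not found")
```

With a layout like this, an installation that has no bin/ folder (for instance a conda environment) would still find its dependencies as long as they are on PATH.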
binaries
Compressed TAR file with binaries to be downloaded after cloning the source repository. This should be done with install.pl
COGsoft/*/Makefile in bin.tgz now have $(CXX) instead of g++
COGsoft
This is the COGtriangles code with minor include tweaks so that it compiles in conda.
Running $ perl conda_compile.pl creates new binaries and puts them in bin/
The original code can be obtained from:
https://ftp.ncbi.nih.gov/pub/wolf/COGs/COGsoft/
https://sourceforge.net/projects/cogtriangles/
Please see Readme.2012.04.txt for credits and terms of use.
get_homologues-est
This release ships with several changes and fixes:
16092021: updated hcluster_pangenome_matrix.sh: added functions cleanup_R_script & print_version; checked and fixed options and associated names; added options -v, -h; improved output file checking; 100% check compliance
05112021: added explanation to annotate_cluster.pl regarding MVIEW non-reported deletions in longest sequences
05112021: improved manual description of MVIEW alignments within annotate_cluster.pl
05112021: compare_clusters.pl now prints all duplicated clusters to duplicated.cluster_list ; explained in manual
08112021: updated manual/HOWTOsge.txt
03122021: genbank files produced by PATRIC/RAST2 can now be parsed (thanks Irene Ortega!)
09122021: updated format_BLAST[NP]_command_aligns to take a max number of hits to report;
09122021: this fixes a bug in annotate_cluster.pl that affected clusters > 250 seqs (thanks carolynzi!)
22122021: made %feature_output non-redundant
15022022: make_nr_pangenome_matrix.pl now produces logfile to see date of redundant sequences (thanks carolynzi!)
21022022: updated transpose oneliner in make_nr_pangenome_matrix.pl (thanks carolynzi!)
23022022: make_nr_pangenome_matrix.pl now logs recursive list of redundant clusters (thanks carolynzi!)
23022022: get_homologues.pl stops if no input sequences are parsed and reminds the user of accepted extensions (thanks apoorva004!)
28022022: removed unused vars from lib/marfil_homology.pm (thanks https://metacpan.org/pod/App::perlvars)
09032022: annotate_cluster.pl stops if < 2 sequences
16032022: parse_pangenome_matrix.pl can read pangene_matrix.tab produced at https://github.com/Ensembl/plant-scripts/tree/master/pangenes
17032022: format_BLASTN_command_aligns now takes only 1 hsp per query (thanks carolynzi!)
17032022: annotate_cluster.pl now prints the name of longest sequence
05042022: updated TODO
Bhybridum_suppl_files
This compressed folder contains datasets used in the paper:
SP Gordon, B Contreras-Moreira, JJ Levy, A Djamei, A Czedik-Eysenberg et al (2020)
Gradual polyploid genome evolution revealed by pan-genomic analysis of Brachypodium
hybridum and its diploid progenitors. Nat Commun 11(1):3670. doi: 10.1038/s41467-020-17302-5
The original location of these files was https://floresta.eead.csic.es/plant-pan-genomes/Bhybridum
plant-pan-genomes_suppl_files
This compressed folder contains datasets used in the paper describing pan-genome analyses in plants:
Contreras-Moreira B, Cantalapiedra CP, García-Pereira M, Gordon SP, Vogel JP,
Igartua E, Casas AM, Vinuesa P (2017) Analysis of plant pan-genomes and
transcriptomes with GET_HOMOLOGUES-EST, a clustering solution for sequences of
the same species. Front. Plant Sci. 8:184. doi: 10.3389/fpls.2017.00184
The original location of these files was https://floresta.eead.csic.es/plant-pan-genomes/
- Athaliana/
Contains original locations of genome-based CDS sets.
- Athaliana_denovo/
Contains FASTA files with nucleotide sequences of de-novo assembled transcripts.
- barley_transcripts/
Contains FASTA files with nucleotide sequences of 14 de-novo assembled transcriptomes and transcripts/cDNA sequences annotated in reference accessions Morex and Haruna Nijo.
- suppl_scripts/
Contains protocols and scripts used to generate results and plots in the paper, such as dN/dS (omega) estimates. See also:
https://github.com/eead-csic-compbio/get_homologues/tree/master/user_utils/dNdS
- Please refer to the main publication describing this work for further details.
Thanks
B Contreras-Moreira, CP Cantalapiedra, MJ Garcia-Pereira, SP Gordon, JP Vogel, E Igartua, AM Casas, P Vinuesa
get_homologues-est
This release removes redundant batch jobs from dryrun.txt files when using -m dryrun -c
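Removing redundant jobs amounts to dropping repeated command lines while keeping the first occurrence of each, in order. The following is a minimal Python sketch of that idea, assuming dryrun.txt holds one batch command per line (the actual file layout is not described here); dedupe_jobs and the placeholder job strings are hypothetical.

```python
from typing import Iterable, List


def dedupe_jobs(lines: Iterable[str]) -> List[str]:
    """Drop repeated job lines, keeping the first occurrence in order."""
    seen = set()
    unique = []
    for line in lines:
        job = line.strip()
        # Skip blank lines and commands that were already recorded
        if not job or job in seen:
            continue
        seen.add(job)
        unique.append(job)
    return unique


if __name__ == "__main__":
    # Placeholder commands; real dryrun.txt entries will look different
    jobs = ["job_A", "job_B", "job_A", "job_C", "job_B"]
    for job in dedupe_jobs(jobs):
        print(job)
```

A set gives constant-time membership checks, while the separate list preserves the order in which jobs were first listed.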
get_homologues-est
This release adds dryrun examples to the manuals
get_homologues-est
This release ships with several changes and fixes:
03032020: improved regex at sub read_cluster_config (lib/HPCluster.pm)
10032020: fixed annotate_cluster.pl -D in cases with no Pfam hits
10032020: updated regex to detect peptide sequences in lib/phyTools.pm
08042020: updated mview version to 1.67 which now shows masked sequences with Xs and not gaps
08042020: updated annotate_cluster.pl to use mview 1.67
09042020: added make clean to mcl compilation in install.pl
10042020: v2.4_10Apr20, added option -b for par(mar()) and improved options documentation to hcluster_pangenome_matrix.sh
20042020: updated hcluster_pangenome_matrix.sh to v2.4_10Apr20 to eliminate warning message
20042020: acknowledged possible bug in section 4 of plot_matrix_heatmap.sh
20042020: added arg $testR to hcluster_pangenome_matrix.sh and plot_matrix_heatmap.sh tests to check R packages
20042020: updated test_get_homologues.t, Makefile & .travis.yml
20042020: updated ubuntu to bionic (18.04) in travis.yml to match docker OS
20042020: added libgd-dev to travis.yml
21042020: added check to get_homologues.pl when printing aligned coords
21042020: fixed index seek for EST jobs in check_BDBHs.pl
27042020: updated annotate_cluster.pl to escape forbidden chars in sequence names that trouble mview
29042020: updated install instructions for R modules imported by hcluster_pangenome_matrix.sh
02062020: added -n dryrun to get_homologues.pl
09072020: install.pl now tries curl before wget in macOS
31072020: downgraded hcluster_pangenome_matrix.sh and plot_matrix_heatmap.sh to simplify R deps
25082020: improved warning for gbk files with no nucleotide seqs in get_homologues.pl
08092020: fixed indentation of R code within compare_clusters.pl
30102020: fixed -m dryrun
10112020: replaced ' with - in $cluster_name in get_homologues.pl
10112020: compare_clusters.pl stops if intersection==0
10112020: updated if else Venn R code in compare_clusters.pl
16112020: added real test of transcripts2cdsCPP.pl, not used for travis
17112020: fixed bug and improved documentation of pfam_enrich.pl
17022021: added -m dryrun to get_homologues-est.pl
17022021: fixed _cluster_makeIsoform.pl
17022021: updated bin/mcl-14-137 and made mview executable
17022021: added get_homologues-est.pl -m dryrun to test and manuals
binaries
Compressed TAR file with binaries to be downloaded after cloning the source repository. This should be done with install.pl
22042022: added COGsoft/README.txt