v0.1.0
0.1.0 (2024-05-10)
⚠ BREAKING CHANGES
return data as NDJSON instead of JSON
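For clients affected by this change, the following is a minimal sketch of reading an NDJSON body. It assumes that each non-empty line of the response is one standalone JSON object. The helper name `parse_ndjson`, the sample payload and its field names are hypothetical and are not taken from SILO's actual output.

```python
import json


def parse_ndjson(body: str) -> list[dict]:
    """Parse an NDJSON payload: one JSON value per non-empty line."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


# Hypothetical response body; the field names are illustrative only.
sample = '{"count": 3, "country": "A"}\n{"count": 5, "country": "B"}\n'
rows = parse_ndjson(sample)
```

Unlike a single JSON array, such a payload can also be consumed line by line as it arrives, without buffering the whole response first.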
Features
AAMutations with multiple sequences (0def8b2 )
Action for amino acid distribution (a0a4cf1 )
add limit orderBy and offset to all query actions (13b7e01 )
add log statements to loadDatabaseState (46a0421 )
add more tests, make less flaky and viable with large dataset (7772ae3 )
add unit test for findIllegalNucleotideChar, unique test case name for insertion contains invalid pattern tests (99c9c4b )
added amino acid insertion search, added many test cases and fixed various bugs (d1e4b2b )
allow default preprocessing config along with user defined preprocessing config (ee9f20e )
allow reading fasta files with missing segments and genes #220 (8ea9893 )
allow reading segments and genes that are null from ndjson file #220 (d0a3a7e )
also get Runtime Config options from environment variables (33bdd65 )
also log to stdout (54b8a47 )
also return the mutation in destructured form so that it does not need to be reparsed (a93abbf )
Alternative templating of symbol classes (6b61985 )
automatically detect file endings for fasta files (75bd14e )
be more lenient on input data, ignore superfluous sequences and fill missing sequences with Ns (ee12186 )
Better test coverage for SymbolEquals filter (42c685c )
boolean columns, resolves #384 : const declaration (685db9f )
boolean columns: actions/tuple: update assignTupleField() (b020109 )
boolean columns: add and use JsonValueType (2ad1268 )
boolean columns: add bool to JsonValueType, update tuple (3c990e2 )
boolean columns: add expression_type "BooleanEquals" (b82ec69 )
boolean columns: add filter_expressions/bool_equals (ff2a138 )
boolean columns: add optional_bool (a2e47f8 )
boolean columns: add storage/column/bool_column (5eb3430 )
boolean columns: column_group: update ColumnPartitionGroup (d7adecf )
boolean columns: column_group.h: add {ColumnPartitionGroup,ColumnGroup}::bool_columns fields (af44f1a )
boolean columns: database (c555a68 )
boolean columns: database_config: add "bool" case to DatabaseConfigReader::readConfig() (12fc7b8 )
boolean columns: database_config: add "boolean" case to de/serialisation (e6d5363 )
boolean columns: database_config: add BOOL to ValueType (0b47b82 )
boolean columns: database_config: update DatabaseMetadata::getColumnType() (30febdd )
boolean columns: database_partition (0aef8f0 )
boolean columns: optional_bool: add == (f0aa3e8 )
boolean columns: selection (e52f2dc )
build metadata in parallel to sequences. Do not create unaligned sequence tables in preprocessing, rather hive-partition them directly to disk. Better (debug-)logging (c1cdfeb )
bulk Tuple allocations now possible (902ec04 )
clearer Operator::negate and Expression::toString, logical Equivalents for debug printing/logging for the Leaf Operators IndexScan and BitmapSelection (026b639 )
consistent behavior of configs when starting SILO with both --preprocessing and --api (847ec7e )
declutter README.md from linting option, which is now disabled by default and enforced in the CI for the Linter (9220435 )
details no longer shows insertions (#354 ) (473cd98 )
display database info after loading new database state (0249416 )
display preprocessing duration in logs in human-readable format (not in microseconds) #296 (a2499af )
do not enforce building with clang-tidy by default. Linter will still be enforced (7134e45 )
FastaAligned action (50776c8 )
faster builds by copying @corneliusroemer image caching for our dependency images, which rarely change (#374 ) (7867bc7 )
filter for amino acids (b52aabd )
fix sorting (1ed18ae )
flipped bitmap can now be set before insertion (f61c803 )
format DatabaseConfig (4fb8f1b )
format PreprocessingConfig (ee35207 )
generalize mutations action to have consistent behavior for different symbols (9834aea )
generalizing symbol and mutation filters. Clear handling of ambiguous symbols (aa9ad4d )
Generalizing the config for multiple nucleotide sequences and multiple genes (9a80204 )
have structured and destructured insertion in insertions response (0a7e46a )
hide intermediate results of the preprocessing - don't put it in the output (44327b0 )
implement basic request id to trace requests #303 (4defb59 )
implement data updates at runtime. More resilient to superfluous or missing directory separators (dc5dfaa )
implement insertion columns and search (9167236 )
improve loadDB speeds (2b7cd7d )
improve validation error message of some actions on orderByFields (a0da5b5 )
insertion action targets all insertion columns by default (6b70241 )
insertion columns for amino acids and multiple sequence names (3cc8fee )
insertions action (e067062 )
insertions contains action now targets all columns if the column name is missing (32a6951 )
introduce new storage type for Sequence Positions, where the most numerous symbol is deleted (6e15204 )
introduce storage of unaligned sequences from either ndjson file or fasta file and make them queryable via the Fasta action (44df849 )
load table lazily. Unaligned Sequences do not need to load the table (c2a8439 )
log databaseConfig and preprocessingConfig (d2dc58c )
logging for partition (e75a925 )
logging improvements (4c12a88 )
make database serializable again (2523e67 )
make pangoLineageDefinitionFilename in preprocessing config optional, linter errors (0f3dc53 )
make partition_by field in config optional (3942418 )
make SILO Docker image by default read data from /data (e83b910 )
make threads and max queued http connections available through optional parameter (3ecde68 )
migration to duckdb 0.10.1 (c1426ef )
mine data version at beginning of preprocessing (362fe0f )
More robust InputStreamWrapper (305dd36 )
multiple performance improvements for details endpoint (28f41d0 )
optimize bitmaps before finishing partition (5b06d58 )
order all actions by default (a2f5c04 )
preparation of insertion columns (c14a370 )
put output and logs to gitignore (789e489 )
reenable bitmap inversion (75ac20f )
reenable pushdown of And expressions through selections (802bec0 )
refactor saving and loading database to not require preprocessing structs anymore (45bf7ed )
reintroduce randomize for all query actions (166045c )
reserve space in columns when bulk inserting rows (e3c9620 )
return data as NDJSON instead of JSON (c236ba4 ), closes #126
return data version on each query (be5c886 )
return only aliased pango lineages (abf0844 )
reveal some more details when reading YAML fails (1f8d9db )
run preprocessing in github ci (f53eddb )
save database state into folder with name <data version> (41923eb )
separate preprocessing and starting silo (9808e2d )
serialization of partition descriptor to json (472c1da )
some suggestions for the insertion search (ae900da )
Specifiable nucSequence query target (7cc609f )
statically disable deleted symbol optimisations because of performance penalty (4e522f4 )
stick to the default of having config value keys in camel case (a1cae40 )
storing amino acids (f11a330 )
support for nullable columns (6f78e3a )
support recombinant lineages (3e848a5 )
template class for sequence store (cef4d48 )
templatized Symbol classes (6b9d734 )
test set with amino acid insertions (de2c4f8 )
throw an error when no initialized database is loaded yet #295 (b17f72a )
tidying up CLI and file configuration for runtime config. Added option for specifying the port (c3a88a0 )
Unit tests for Tuple (4fc06e8 )
update conan version (5540a67 )
use 'pragma once' as include guards instead of 'ifndef...' (bc49aa5 )
use own scope for preprocessing (2a93846 )
use same default min proportions for mutations actions as the old LAPIS (f42f830 )
Bug Fixes
adapt randomize query results to target architecture. x86 and ARM have possibly different std::hash results (#355 ) (600000d )
add bash dependency which is required by conan build of pkgconf and is not installed on alpine by default (1b3f51c )
add insertion to database_config test (aec40d0 )
add missing file for test (262bedc )
add missing sequenceName field to mutation action "orderBy" (06c8c86 )
add sleep statement before row call (8fa8efb )
add workaround so insertions are read correctly (8a2bfa8 )
allow sql keywords for metadata field names #259 (6fbeee5 )
also consider 'missing' symbols in the mutation action. Bugfix where Position invariant was broken because of 'missing' symbols (fab72a6 )
alternative non-exhaustive three mer index (467086f )
always build dependency image for amd64 platform (4f73cb0 )
bug when filtering for indexedStringColumns which are not present in some partitions (62d4e08 )
bug where sequence reconstruction is incorrect when the flipped bitmap is different from the reference sequence symbol (edac58c )
change random ordering for gcc hashing (78055ff )
change test to reflect new optimisations (cb63010 )
compiling And: append selection_child->predicates to predicates - not vice versa (bdebca5 )
deterministic order for e2e test (02427a2 )
divergence between mac and linux info test results, fix memory leaks in Threshold.cpp (ffcefd0 )
do not exclude zstd filter from boost installation (c974230 )
do not use std::filesystem::path::relative_path() to also support absolute paths (b9ff422 )
endToEndResults (c807a33 )
error when the Mutations action looked for sequences but the filter was empty (91e5d52 )
fix memory leaks in indexed_string_column.cpp and insertion_contains.cpp (c2eeac8 )
floatEquals and floatBetween with null values (47b436e )
hide nucleotide sequence for default sequence (584715c )
insertion column, remove reference (d940c52 )
insertion search e2e and insertion column tests, don't allow non-empty value for insertion search (19af04d )
linking error on linux (ad9076c )
linter (54d3c34 )
linter (e6b6ab7 )
linter (b31851c )
linter (34830ab )
linter errors (e9e1bbf )
Linter throws again and added clang-format option (87cb4a5 )
Make C++ flags in CMake compatible for MacOS (38cae69 )
metadata info test accessing getMetadataFields output no longer directly but over the address of a const (0dcdee1 )
missing include (0773999 )
new linter errors (ea6934c )
no longer have regression when no bitmap flipped is most efficient (b653753 )
nodiscard (silence warnings) (69421c1 ), closes #390
non default unaligned nucleotide sequence prefix (93b4829 )
nucleotide symbol equals with dot (6ad623e )
only apply order-by if the field is set, validate orderBy fields for all operations (405a7f1 )
pango lineage filter with null values (b7238a8 )
parse error messages for mutation filter expressions (9e4612d )
put compressors into sql function to avoid static variables (c1a11c8 )
quoting {} in "x.{}" SQL struct accesses, as a string starting with a number leads to parser errors (#409 ) (f3ba6db )
random (but deterministic for a version) result can depend on internal state, which was changed with duckdb update (9d9351b )
recursive file reading for nodejs<20 (9ac0ffe )
remove caching for linter. Docker image too large on GitHub Actions. (8eaeee6 )
response format (bca4961 )
revert duckdb migration due to it being unable to build the new version on the GitHub runner (2a62c54 )
revert test numbers to pre-optimization (99b600a )
Roaring from 1.3.0 -> 1.0.0 because of broken CI (ce82c77 )
serialization for insertion_index and insertion_column (ee99f8d )
single partition build fix (a8af1c9 )
specify namespace fmt in calls to format_to (#353 ) (62ffa3b )
specifying apk versions (2c2354e )
test cases verifying that the positions index for mutation distribution are now 1-indexed (fdf972a )
test with deterministic results, remove 2 unused variables (96a424b )
unit test info number updates for new pango-lineages in test data (39d07c2 )
unit tests and mock fixtures (db506a1 )
update cmake version on ubuntu (227a2dd )
Upper and lower bound should be inclusive in DateBetween filter (53c6c05 )
Wrong compare function used in multi-threaded case, which displayed wrong tuples in the details endpoint when a limit was used (650bf36 )
zstd dependency (c145722 )