v0.1.0

github-actions released this 13 May 07:49

0.1.0 (2024-05-10)

⚠ BREAKING CHANGES

  • return data as NDJSON instead of JSON
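For clients that previously parsed the whole response body as a single JSON document, the switch to NDJSON means each line now has to be parsed on its own. The following is a minimal sketch of that, not SILO's own client code: the helper name `parse_ndjson` and the field names in the sample body are hypothetical and chosen only for illustration.

```python
import json


def parse_ndjson(body: str) -> list:
    """Parse a newline-delimited JSON body into a list of decoded values.

    Each non-empty line is treated as one complete JSON document.
    """
    records = []
    # Split on "\n" only: str.splitlines() would also split on characters
    # such as U+2028, which may legally appear inside JSON strings.
    for line in body.split("\n"):
        line = line.strip()  # also drops a trailing "\r" from CRLF bodies
        if not line:
            continue  # tolerate blank lines and the trailing newline
        records.append(json.loads(line))
    return records


# Hypothetical response body; the field names are illustrative assumptions.
sample_body = '{"country": "A", "count": 10}\n{"country": "B", "count": 3}\n'
rows = parse_ndjson(sample_body)
```

Calling `json.loads` on the complete body would fail as soon as it contains more than one line, which is why existing clients need to be adapted.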

Features

  • AAMutations with multiple sequences (0def8b2)
  • Action for amino acid distribution (a0a4cf1)
  • add limit orderBy and offset to all query actions (13b7e01)
  • add log statements to loadDatabaseState (46a0421)
  • add more tests, make less flaky and viable with large dataset (7772ae3)
  • add unit test for findIllegalNucleotideChar, unique test case name for insertion contains invalid pattern tests (99c9c4b)
  • added amino acid insertion search, added many test cases and fixed various bugs (d1e4b2b)
  • allow default preprocessing config along with user defined preprocessing config (ee9f20e)
  • allow reading fasta files with missing segments and genes #220 (8ea9893)
  • allow reading segments and genes that are null from ndjson file #220 (d0a3a7e)
  • also get Runtime Config options from environment variables (33bdd65)
  • also log to stdout (54b8a47)
  • also return the mutation in destructured form, so that it does not need to be reparsed (a93abbf)
  • Alternative templating of symbol classes (6b61985)
  • automatically detect file endings for fasta files (75bd14e)
  • be more lenient on input data, ignore superfluous sequences and fill missing sequences with Ns (ee12186)
  • Better test coverage for SymbolEquals filter (42c685c)
  • boolean columns, resolves #384: const declaration (685db9f)
  • boolean columns: actions/tuple: update assignTupleField() (b020109)
  • boolean columns: add and use JsonValueType (2ad1268)
  • boolean columns: add bool to JsonValueType, update tuple (3c990e2)
  • boolean columns: add expression_type "BooleanEquals" (b82ec69)
  • boolean columns: add filter_expressions/bool_equals (ff2a138)
  • boolean columns: add optional_bool (a2e47f8)
  • boolean columns: add storage/column/bool_column (5eb3430)
  • boolean columns: column_group: update ColumnPartitionGroup (d7adecf)
  • boolean columns: column_group.h: add {ColumnPartitionGroup,ColumnGroup}::bool_columns fields (af44f1a)
  • boolean columns: database (c555a68)
  • boolean columns: database_config: add "bool" case to DatabaseConfigReader::readConfig() (12fc7b8)
  • boolean columns: database_config: add "boolean" case to de/serialisation (e6d5363)
  • boolean columns: database_config: add BOOL to ValueType (0b47b82)
  • boolean columns: database_config: update DatabaseMetadata::getColumnType() (30febdd)
  • boolean columns: database_partition (0aef8f0)
  • boolean columns: optional_bool: add == (f0aa3e8)
  • boolean columns: selection (e52f2dc)
  • build metadata in parallel to sequences. Do not create unaligned sequence tables in preprocessing, rather hive-partition them directly to disk. Better (debug-)logging (c1cdfeb)
  • bulk Tuple allocations now possible (902ec04)
  • clearer Operator::negate and Expression::toString, logical Equivalents for debug printing/logging for the Leaf Operators IndexScan and BitmapSelection (026b639)
  • consistent behavior of configs when starting SILO with both --preprocessing and --api (847ec7e)
  • declutter README.md from linting option, which is now disabled by default and enforced in the CI for the Linter (9220435)
  • details no longer shows insertions (#354) (473cd98)
  • display database info after loading new database state (0249416)
  • display preprocessing duration in logs in human-readable format (not in microseconds) #296 (a2499af)
  • do not enforce building with clang-tidy by default. Linter will still be enforced (7134e45)
  • FastaAligned action (50776c8)
  • faster builds by copying @corneliusroemer image caching for our dependency images, which rarely change (#374) (7867bc7)
  • filter for amino acids (b52aabd)
  • fix sorting (1ed18ae)
  • flipped bitmap can now be set before insertion (f61c803)
  • format DatabaseConfig (4fb8f1b)
  • format PreprocessingConfig (ee35207)
  • generalize mutations action to have consistent behavior for different symbols (9834aea)
  • generalizing symbol and mutation filters. Clear handling of ambiguous symbols (aa9ad4d)
  • Generalizing the config for multiple nucleotide sequences and multiple genes (9a80204)
  • have structured and destructured insertion in insertions response (0a7e46a)
  • hide intermediate results of the preprocessing - don't put it in the output (44327b0)
  • implement basic request id to trace requests #303 (4defb59)
  • implement data updates at runtime. More resilient to superfluous or missing directory separators (dc5dfaa)
  • implement insertion columns and search (9167236)
  • improve loadDB speeds (2b7cd7d)
  • improve validation error message of some actions on orderByFields (a0da5b5)
  • insertion action targets all insertion columns by default (6b70241)
  • insertion columns for amino acids and multiple sequence names (3cc8fee)
  • insertions action (e067062)
  • insertions contains action now targets all columns if the column name is missing (32a6951)
  • introduce new storage type for Sequence Positions, where the most numerous symbol is deleted (6e15204)
  • introduce storage of unaligned sequences from either ndjson file or fasta file and make them queryable via the Fasta action (44df849)
  • load table lazily. Unaligned Sequences do not need to load the table (c2a8439)
  • log databaseConfig and preprocessingConfig (d2dc58c)
  • logging for partition (e75a925)
  • logging improvements (4c12a88)
  • make database serializable again (2523e67)
  • make pangoLineageDefinitionFilename in preprocessing config optional, linter errors (0f3dc53)
  • make partition_by field in config optional (3942418)
  • make SILO Docker image by default read data from /data (e83b910)
  • make threads and max queued http connections available through optional parameter (3ecde68)
  • migration to duckdb 0.10.1 (c1426ef)
  • mine data version at beginning of preprocessing (362fe0f)
  • More robust InputStreamWrapper (305dd36)
  • multiple performance improvements for details endpoint (28f41d0)
  • optimize bitmaps before finishing partition (5b06d58)
  • order all actions by default (a2f5c04)
  • preparation of insertion columns (c14a370)
  • put output and logs to gitignore (789e489)
  • reenable bitmap inversion (75ac20f)
  • reenable pushdown of And expressions through selections (802bec0)
  • refactor saving and loading database to not require preprocessing structs anymore (45bf7ed)
  • reintroduce randomize for all query actions (166045c)
  • reserve space in columns when bulk inserting rows (e3c9620)
  • return data as NDJSON instead of JSON (c236ba4), closes #126
  • return data version on each query (be5c886)
  • return only aliased pango lineages (abf0844)
  • reveal some more details when reading YAML fails (1f8d9db)
  • run preprocessing in github ci (f53eddb)
  • save database state into folder with name <data version> (41923eb)
  • separate preprocessing and starting silo (9808e2d)
  • serialization of partition descriptor to json (472c1da)
  • some suggestions for the insertion search (ae900da)
  • Specifiable nucSequence query target (7cc609f)
  • statically disable deleted symbol optimisations because of performance penalty (4e522f4)
  • stick to the default of having config value keys in camel case (a1cae40)
  • storing amino acids (f11a330)
  • support for nullable columns (6f78e3a)
  • support recombinant lineages (3e848a5)
  • template class for sequence store (cef4d48)
  • templatized Symbol classes (6b9d734)
  • test set with amino acid insertions (de2c4f8)
  • throw an error when no initialized database is loaded yet #295 (b17f72a)
  • tidying up CLI and file configuration for runtime config. Added option for specifying the port (c3a88a0)
  • Unit tests for Tuple (4fc06e8)
  • update conan version (5540a67)
  • use 'pragma once' as include guards instead of 'ifndef...' (bc49aa5)
  • use own scope for preprocessing (2a93846)
  • use same default min proportions for mutations actions as the old LAPIS (f42f830)

Bug Fixes

  • adapt randomize query results to target architecture. x86 and ARM have possibly different std::hash results (#355) (600000d)
  • add bash dependency which is required by conan build of pkgconf and is not installed on alpine by default (1b3f51c)
  • add insertion to database_config test (aec40d0)
  • add missing file for test (262bedc)
  • add missing sequenceName field to mutation action "orderBy" (06c8c86)
  • add sleep statement before row call (8fa8efb)
  • add workaround so insertions are read correctly (8a2bfa8)
  • allow sql keywords for metadata field names #259 (6fbeee5)
  • also consider 'missing' symbols in the mutation action. Bugfix where Position invariant was broken because of 'missing' symbols (fab72a6)
  • alternative non-exhaustive three mer index (467086f)
  • always build dependency image for amd64 platform (4f73cb0)
  • bug when filtering for indexedStringColumns which are not present in some partitions (62d4e08)
  • bug where sequence reconstruction is incorrect when the flipped bitmap is different from the reference sequence symbol (edac58c)
  • change random ordering for gcc hashing (78055ff)
  • change test to reflect new optimisations (cb63010)
  • compiling And: append selection_child->predicates to predicates - not vice versa (bdebca5)
  • deterministic order for e2e test (02427a2)
  • divergence between mac and linux info test results, fix memory leaks in Threshold.cpp (ffcefd0)
  • do not exclude zstd filter from boost installation (c974230)
  • do not use std::filesystem::path::relative_path() to also support absolute paths (b9ff422)
  • endToEndResults (c807a33)
  • error when the Mutations action looked for sequences but the filter was empty (91e5d52)
  • fix memory leaks in indexed_string_column.cpp and insertion_contains.cpp (c2eeac8)
  • floatEquals and floatBetween with null values (47b436e)
  • hide nucleotide sequence for default sequence (584715c)
  • insertion column, remove reference (d940c52)
  • insertion search e2e and insertion column tests, don't allow non-empty value for insertion search (19af04d)
  • linking error on linux (ad9076c)
  • linter (54d3c34)
  • linter (e6b6ab7)
  • linter (b31851c)
  • linter (34830ab)
  • linter errors (e9e1bbf)
  • Linter throws again and added clang-format option (87cb4a5)
  • Make C++ flags in CMake compatible for MacOS (38cae69)
  • metadata info test accessing getMetadataFields output no longer directly but over the address of a const (0dcdee1)
  • missing include (0773999)
  • new linter errors (ea6934c)
  • no longer have regression when no bitmap flipped is most efficient (b653753)
  • nodiscard (silence warnings) (69421c1), closes #390
  • non default unaligned nucleotide sequence prefix (93b4829)
  • nucleotide symbol equals with dot (6ad623e)
  • only apply order-by if the field is set, validate orderBy fields for all operations (405a7f1)
  • pango lineage filter with null values (b7238a8)
  • parse error messages for mutation filter expressions (9e4612d)
  • put compressors into sql function to avoid static variables (c1a11c8)
  • quoting {} in "x.{}" SQL struct accesses, as a string starting with a number leads to parser errors (#409) (f3ba6db)
  • random (but deterministic for a version) result can depend on internal state, which was changed with duckdb update (9d9351b)
  • recursive file reading for nodejs<20 (9ac0ffe)
  • remove caching for linter. Docker image too large on GitHub Actions. (8eaeee6)
  • response format (bca4961)
  • revert duckdb migration due to it being unable to build the new version on the GitHub runner (2a62c54)
  • revert test numbers to pre-optimization (99b600a)
  • Roaring from 1.3.0 -> 1.0.0 because of broken CI (ce82c77)
  • serialization for insertion_index and insertion_column (ee99f8d)
  • single partition build fix (a8af1c9)
  • specify namespace fmt in calls to format_to (#353) (62ffa3b)
  • specifying apk versions (2c2354e)
  • test cases verifying that the positions index for mutation distribution are now 1-indexed (fdf972a)
  • test with deterministic results, remove 2 unused variables (96a424b)
  • unit test info number updates for new pango-lineages in test data (39d07c2)
  • unit tests and mock fixtures (db506a1)
  • update cmake version on ubuntu (227a2dd)
  • Upper and lower bound should be inclusive in DateBetween filter (53c6c05)
  • Wrong compare function used in multi-threaded case, which displayed wrong tuples in the details endpoint when a limit was used (650bf36)
  • zstd dependency (c145722)
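
The DateBetween fix listed above makes both bounds inclusive. A minimal sketch of what inclusive bounds mean in practice follows; it is not taken from the SILO source. The function name `date_between`, the treatment of a missing bound as open-ended, and the decision that a null value never matches are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional


def date_between(
    value: Optional[date],
    lower: Optional[date],
    upper: Optional[date],
) -> bool:
    """Return True if value lies in the closed interval [lower, upper].

    Assumptions (not confirmed by the release notes): a bound of None is
    open-ended, and a value of None never matches.
    """
    if value is None:
        return False
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


# Both endpoints themselves are matched.
on_lower = date_between(date(2024, 5, 1), date(2024, 5, 1), date(2024, 5, 10))
on_upper = date_between(date(2024, 5, 10), date(2024, 5, 1), date(2024, 5, 10))
outside = date_between(date(2024, 5, 11), date(2024, 5, 1), date(2024, 5, 10))
```

With exclusive bounds, a query for a single day (equal lower and upper bound) could never match anything, which is the kind of surprise an inclusive interval avoids.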