Sourced from org.rocksdb:rocksdbjni's releases.
RocksDB 9.7.3
9.7.3 (10/16/2024)
Behavior Changes
- The OPTIONS file to be loaded by a remote worker is now preserved so that it does not get purged by the primary host. This uses a technique similar to the one that keeps new SST files from being purged: min_options_file_numbers_ is tracked the same way pending_outputs_ is.
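The bookkeeping described in this entry can be pictured with a small model. This is only a sketch, not RocksDB code: the class and method names are hypothetical, and it assumes that, as with pending_outputs_, any file whose number is at or above the smallest registered number is protected from purging.

```python
class OptionsFilePurger:
    """Toy model of protecting in-use OPTIONS files (hypothetical names)."""

    def __init__(self):
        # Assumed analogue of min_options_file_numbers_: one entry per
        # in-flight remote job, holding the OPTIONS file number it may load.
        self._min_options_file_numbers = []

    def reserve(self, file_number):
        """Record that a remote worker may still load this OPTIONS file."""
        self._min_options_file_numbers.append(file_number)

    def release(self, file_number):
        """Drop the reservation once the remote worker is done."""
        self._min_options_file_numbers.remove(file_number)

    def purgeable(self, options_files, latest):
        """Return the OPTIONS file numbers that are safe to delete.

        The latest OPTIONS file is always kept. Older files are kept only
        while a reservation at or below their number is outstanding.
        """
        floor = min(self._min_options_file_numbers, default=latest)
        return sorted(n for n in options_files if n < min(floor, latest))
```

In this model, reserving file 7 while files 5, 7, 9 and 12 exist leaves only file 5 purgeable; releasing the reservation makes 7 and 9 purgeable as well.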
9.7.2 (10/08/2024)
Bug Fixes
- Fix a bug for surfacing write unix time: Iterator::GetProperty("rocksdb.iterator.write-time") for non-L0 files.
9.7.1 (09/26/2024)
Bug Fixes
- Several DB option settings could be lost through GetOptionsFromString(), possibly elsewhere as well. Affected options, now fixed: background_close_inactive_wals, write_dbid_to_manifest, write_identity_file, prefix_seek_opt_in_only
- Fix undercounting of allocated memory in the compressed secondary cache. The accounting looked at the compressed block size rather than the actual memory allocated, which could be larger due to internal fragmentation.
- Skip insertion of compressed blocks in the secondary cache if the lowest_used_cache_tier DB option is kVolatileTier.
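To illustrate why charging by compressed block size undercounts memory, here is a minimal sketch. It assumes a hypothetical allocator that rounds every request up to a fixed size class; the size classes and function names are invented for illustration and do not reflect RocksDB's or any particular allocator's real behavior.

```python
import bisect

# Hypothetical allocator size classes in bytes; real allocators differ.
SIZE_CLASSES = [64, 128, 256, 512, 1024, 2048, 4096]


def allocated_size(requested):
    """Round a request up to the smallest size class that fits it."""
    i = bisect.bisect_left(SIZE_CLASSES, requested)
    if i == len(SIZE_CLASSES):
        raise ValueError("request larger than the largest modeled size class")
    return SIZE_CLASSES[i]


def cache_charge(block_sizes, use_allocated):
    """Total charge for a set of compressed blocks under either accounting."""
    if use_allocated:
        # Charge what the allocator actually hands out.
        return sum(allocated_size(s) for s in block_sizes)
    # Charge only the compressed payload size (the undercounting case).
    return sum(block_sizes)
```

With blocks of 100, 300 and 1500 bytes, the payload-based charge is 1900 bytes while the modeled allocations total 2688 bytes, so a cache that charges by payload would hold more memory than its accounting shows.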
9.7.0 (09/20/2024)
New Features
- Make Cache a customizable class that can be instantiated by the object registry.
- Add new option prefix_seek_opt_in_only that makes iterators generally safer when you might set a prefix_extractor. When prefix_seek_opt_in_only=true, which is expected to be the future default, prefix seek is only used when prefix_same_as_start or auto_prefix_mode is set. Also, prefix_same_as_start and auto_prefix_mode now allow prefix filtering even with total_order_seek=true.
- Add a new table property "rocksdb.key.largest.seqno" which records the largest sequence number of all keys in file. It is verified to be zero during SST file ingestion.
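The interaction between prefix_seek_opt_in_only and the read options can be summarized as a small decision function. This is a simplified model rather than RocksDB code: the ReadSettings type and the function name are hypothetical, and the final branch encodes an assumption about the behavior when prefix_seek_opt_in_only=false (prefix seek applies whenever a prefix_extractor is set and total_order_seek is false).

```python
from dataclasses import dataclass


@dataclass
class ReadSettings:
    """Hypothetical bundle of the options named in the release note."""
    has_prefix_extractor: bool = False
    prefix_seek_opt_in_only: bool = False
    total_order_seek: bool = False
    prefix_same_as_start: bool = False
    auto_prefix_mode: bool = False


def prefix_filtering_allowed(s):
    """Return True if an iterator may use prefix seek/filtering in this model."""
    if not s.has_prefix_extractor:
        return False
    if s.prefix_same_as_start or s.auto_prefix_mode:
        # Explicit opt-in: allowed even with total_order_seek=true.
        return True
    if s.prefix_seek_opt_in_only:
        # Opt-in-only mode: no explicit opt-in, so no prefix seek.
        return False
    # Assumed behavior without opt-in-only: prefix seek unless total order
    # is requested.
    return not s.total_order_seek
```

Under this model, merely setting a prefix_extractor no longer changes iterator behavior once prefix_seek_opt_in_only=true; a reader has to ask for prefix behavior through prefix_same_as_start or auto_prefix_mode.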
Behavior Changes
- Changed the semantics of the BlobDB configuration option blob_garbage_collection_force_threshold to define a threshold for the overall garbage ratio of all blob files currently eligible for garbage collection (according to blob_garbage_collection_age_cutoff). This can provide better control over space amplification at the cost of slightly higher write amplification.
- Set write_dbid_to_manifest=true by default. This means DB ID will now be preserved through backups, checkpoints, etc. by default. Also add write_identity_file option which can be set to false for anticipated future behavior.
- In FIFO compaction, compactions for changing file temperature (configured by option file_temperature_age_thresholds) will compact one file at a time, instead of merging multiple eligible files together (#13018).
- Support ingesting DB-generated files using hard link, i.e. IngestExternalFileOptions::move_files/link_files and IngestExternalFileOptions::allow_db_generated_files.
- Add a new file ingestion option IngestExternalFileOptions::link_files to hard link input files and preserve original file links after ingestion.
- DB::Close now untracks files in SstFileManager, making available any space used by them. Prior to this change they would be orphaned until the DB is re-opened.
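The new meaning of blob_garbage_collection_force_threshold can be sketched as a ratio over all eligible blob files. The model below is illustrative only: the function names are hypothetical, and it assumes that the eligible set is the oldest fraction of blob files given by blob_garbage_collection_age_cutoff (truncated to a whole number of files) and that the comparison against the threshold is inclusive.

```python
def eligible_blob_files(blob_files, age_cutoff):
    """Return the oldest fraction of blob files.

    blob_files is a list of (total_bytes, garbage_bytes), oldest first.
    """
    count = int(age_cutoff * len(blob_files))  # assumed truncation
    return blob_files[:count]


def should_force_gc(blob_files, age_cutoff, force_threshold):
    """Decide whether the overall garbage ratio of eligible files is high enough."""
    eligible = eligible_blob_files(blob_files, age_cutoff)
    total = sum(t for t, _ in eligible)
    if total == 0:
        return False
    garbage = sum(g for _, g in eligible)
    # Ratio over all eligible files together, not per file.
    return garbage / total >= force_threshold
```

For example, with four 100-byte files holding 90, 10, 50 and 0 bytes of garbage and an age cutoff of 0.5, the two oldest files are eligible and their combined ratio is 0.5. A threshold of 0.8 is therefore not met, even though the oldest file alone is 90% garbage.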
Bug Fixes
- Fix a bug in CompactRange() where result files may not be compacted in any future compaction. This can only happen when users configure CompactRangeOptions::change_level to true and the change level step of manual compaction fails (#13009).
- Fix handling of dynamic change of prefix_extractor with memtable prefix filter. Previously, prefix seek could mix different prefix interpretations between memtable and SST files. Now the latest prefix_extractor at the time of iterator creation or refresh is respected.
- Fix a bug with manual_wal_flush and auto error recovery from WAL failure that may cause CFs to be inconsistent (#12995). The fix will set potential WAL write failure as fatal error when manual_wal_flush is true, and disables auto error recovery from these errors.
Sourced from org.rocksdb:rocksdbjni's changelog.
9.6.0 (08/19/2024)
New Features
- Best efforts recovery supports recovering to an incomplete Version with a clean seqno cut that presents a valid point-in-time view from the user's perspective, if versioning history doesn't include atomic flush.
- New option BlockBasedTableOptions::decouple_partitioned_filters should improve efficiency in serving read queries because filter and index partitions can consistently target the configured metadata_block_size. This option is currently opt-in.
- Introduce a new mutable CF option paranoid_memory_checks. It enables additional validation on data integrity during reads/scanning. Currently, skip list based memtable will validate key ordering during look up and scans.
Public API Changes
- Add ticker stats to count file read retries due to checksum mismatch
- Adds optional installation callback function for remote compaction
Behavior Changes
- There may be less intra-L0 compaction triggered by total L0 size being too small. We now use compensated file size (tombstones are assigned some value size) when calculating L0 size and reduce the threshold for L0 size limit. This is to avoid accumulating too much data/tombstones in L0.
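The idea of a compensated file size can be shown with a short calculation. This is a simplified sketch, not RocksDB's actual formula: it assumes each tombstone is charged a fixed, hypothetical value size, and the function names are invented for illustration.

```python
def compensated_file_size(file_bytes, num_tombstones, assumed_value_bytes):
    """File size plus an assumed per-tombstone charge (hypothetical formula)."""
    return file_bytes + num_tombstones * assumed_value_bytes


def l0_compensated_size(files, assumed_value_bytes):
    """Sum compensated sizes over L0 files given as (file_bytes, num_tombstones)."""
    return sum(
        compensated_file_size(size, tombstones, assumed_value_bytes)
        for size, tombstones in files
    )
```

With one 1000-byte file holding no tombstones and one 200-byte file holding 50 tombstones, the raw L0 size is 1200 bytes, while charging 100 bytes per tombstone gives a compensated size of 6200 bytes. A tombstone-heavy L0 therefore looks larger to the size check than its raw bytes suggest.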
Bug Fixes
- Make DestroyDB support slow deletion when it's configured in SstFileManager. The slow deletion is subject to the configured rate_bytes_per_sec, but not subject to the max_trash_db_ratio.
- Fixed a bug where we set unprep_seqs_ even when WriteImpl() fails. This was caught by stress test write fault injection in WriteImpl(). This may have incorrectly caused iteration creation failure for unvalidated writes or returned wrong result for WriteUnpreparedTxn::GetUnpreparedSequenceNumbers().
... (truncated)
- 0e2801a Version and HISTORY.md update for 9.7.3 patch
- 2647d5c Fix Compaction Stats (#13071)
- 11f21cf Preserve Options File (#13074)
- eca4f10 Add file_checksum from FileChecksumGenFactory and Tests for corrupted output ...
- 5bb363e Print unknown writebatch tag (#13062)
- b5cde68 Update HISTORY for 9.7.2
- d978726 Update version.h
- 2fef013 Fix a bug for surfacing write unix time (#13057)
- a245672 Update HISTORY and version for 9.7.1
- 786ac6a Bug fix and test BuildDBOptions (#13038)