-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Monster PR #1197
Monster PR #1197
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed in doc/source/tutorials/format-description.rst that the current format version number is said to be "2". Looks like constants.cc changed it to 3.
56e4523
to
076d42d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished RTree review; did not see anything major.
RTree changes LGTM. Moving on to the next review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API review done (I didn't really look at the associated unit test sources though). Mostly these are questions, I don't know how much time @stavrospapadopoulos and @jakebolewski you want to spend on the API at this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subarray and SubarrayPartitioner reviewed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VFS changes LGTM -- @joe-maley is also taking a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Read algorithm and Reader
class review done -- LGTM.
b8f82aa
to
1c06178
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
* Added Subarray class to implement tiledb_subarray_t. * Added functionality for estimating the result size in Subarray. * Added RTree class to expedite checking for tile overlap when estimating the result size. * Now the MBRs in FragmentMetadata are organized into an RTree after loading.
…d consolidator. Also fixed an issue with consolidating kv. Deleted consistency test, since now flushing the fragment metadata after writing to an array requires an exclusive lock. The TileDB consistency model is now better captured by the tiledb_array_open_at functionality anyway.
…in consolidator. Added \bigobj flag for Windows build.
…he read batch stitching. If there are fewer than vfs.batch_gap_size bytes between two batches, the batches get stitched. The current default is 500KB
…arrays. Fixes issue with indefinite growing of the URI to encryption key mapping in `StorageManager` (the mapping is no longer needed).
536cf0f
to
f778783
Compare
This PR removes some unused code, specifically: * A patch file that should have been removed in #4553. * The `EncryptionKeyValidation` class that is unused since #1197. * Setting CMake policies to NEW that are already set by `cmake_minimum_required(VERSION 3.21)` --- TYPE: NO_HISTORY (cherry picked from commit 49e8752)
This is unfortunately the outcome of numerous changes we had to experiment with to address various important performance and functionality challenges during a contract with a customer. It should have been better modularized into separate commits, but this did not happen as the work got interleaved as we were progressing with the contract. To be absolutely avoided in the future!
To make the reviewing progress somewhat easier, I am breaking down the changes per meaningful group of files.
High-level Changes
tiledb_array_open_at
functionality anyway.vfs.min_batch_size
.vfs.batch_gap_size
) to control the read batch stitching. If there are fewer thanvfs.batch_gap_size
bytes between two batches, the batches get stitched. The current default is 500KBTODO after merge
SubarrayPartitioner
to the C/C++ API viatiledb_subarray_partitioner_t
.StorageManager::get_fragment_info
(there are some TODOs there regarding xlocking)StorageManager
. Use an alternative thread pool instead.Review Help
tiledb/sm/rtree/rtree.cc
tiledb/sm/rtree/rtree.h
test/src/unit-rtree.cc
RTree
class.tiledb/sm/subarray/subarray.cc
tiledb/sm/subarray/subarray.h
tiledb/sm/misc/tile_overlap.h
tiledb/sm/cpp_api/deleter.h
tiledb/sm/cpp_api/query.h
tiledb/sm/cpp_api/subarray.h
tiledb/sm/cpp_api/tiledb
tiledb/sm/query/query.h
tiledb/sm/query/query.cc
Subarray
, which now handles multi-range subarrays.tiledb/sm/subarray/subarray_partitioner.cc
tiledb/sm/subarray/subarray_partitioner.h
SubarrayPartitioner
, which manages the result size and facilitates incomplete queries.tiledb/sm/c_api/tiledb.h
tiledb/sm/c_api/tiledb.cc
test/src/unit-capi-subarray.cc
test/src/unit-capi-subarray-partitioner.cc
test/src/unit-cppapi-subarray.cc
doc/source/c-api.rst
tiledb_subarray_t
(which wrapsSubarray
)tiledb_subarray_partitioner_t
(which wrapsSubarrayPartitioner
)tiledb/sm/vfs/vfs.h
tiledb/sm/vfs/vfs.cc
test/src/unit-vfs.cc
VFS::read_all
takes as input a thread pooltiledb/sm/storage_manager/config.h
tiledb/sm/storage_manager/config.cc
tiledb/sm/cpp_api/config.h
test/src/unit-config.cc
docs/source/tutorials/config.rst
docs/source/tutorials/performance-factors.rst
tiledb/sm/tile/tile.cc
tiledb/sm/fragment_metadata/fragment_metadata.h
tiledb/sm/fragment_metadata/fragment_metadata.cc
doc/source/tutorials/format-description.rst
FragmentMetadata
instead ofStorageManager
tiledb/sm/storage_manager/open_array.h
tiledb/sm/storage_manager/open_array.cc
OpenArray
instead ofStorageManager
StorageManager
tiledb/sm/storage_manager/consolidator.h
tiledb/sm/storage_manager/consolidator.cc
test/src/unit-capi-consolidation.cc
tiledb/sm/fragment/fragment_info.h
void*
for non empty domain.tiledb/sm/array_schema/domain.h
tiledb/sm/misc/uri.h
tiledb/sm/misc/uri.cc
tiledb/sm/misc/utils.h
tiledb/sm/misc/utils.cc
tiledb/sm/query/writer.h
tiledb/sm/query/writer.cc
void*
subarrays.void*
subarrays and useSubarray
instead for writes.tiledb/sm/query/reader.h
tiledb/sm/query/reader.cc
test/src/unit-capi-incomplete-2.cc
test/src/unit-capi-sparse_array-2.cc
test/src/unit-capi-sparse_neg-2.cc
test/src/unit-capi-sparse_real-2.cc
Read::read_2
. This will substituteRead::read
once the dense multi-range subarray are implemented.void*
subarray still works).*-2.cc
, which will substitute the old ones once the dense multi-range subarray are implemented.tiledb/sm/storage_manager/storage_manager.h
tiledb/sm/storage_manager/storage_manager.cc
Writer
.tiledb/sm/misc/constants.h
tiledb/sm/misc/constants.cc
tiledb/sm/misc/parallel_functions.h
tiledb/sm/misc/status.h
tiledb/sm/misc/status.cc
tiledb/sm/array.h
tiledb/sm/array.cc
test/src/unit-capi-array.cc
test/src/unit-dense-array.cc
test/src/unit-kv.cc
test/src/unit-cppapi-array.cc
test/src/unit-capi-consistency.cc
open_{array,kv}_at
functions.HISTORY.md