[dbnode] Large tiles aggregation flow v2 (#2643)
* Refactor and cleanup
* Refactor interfaces to more closely match design
* Update frame iterator read from an encoding.ReaderIterator
* Removing unnecessary files
* az
* Add utility to apply tile calculations on data.
* test fix
* Added concurrency
* Concurrency logging
* [dbnode] A noop AggregateTiles thrift RPC
* Add AggregateTilesRequest.rangeType
* sourceNameSpace / targetNameSpace
* Drop AggregateTilesRequest.shardId
* A partial implementation of AggregateTiles
* Open DataFileSetReader and iterate through it
* Decompress the data read
* Add explicit FileSetType
* [dbnode] Add OrderedByIndex option for DataFileSetReader.Open
* Remove dbShard.TagsFromSeriesID
* Regenerate mocks
* Unit tests
* Mockgen
* Fix test
* Resurrect rpc_mock.go
* Remove accidentally committed files
* Trigger build
* Add step parameter
* Write aggregated data to other namespace
* Fix tests
* Introduced AggregateTilesOptions
* Minor improvements
* Cleanup
* PR response
* Add headers
* Remove unnecessary stuff.
* [dbnode] A noop AggregateTiles thrift RPC
* Add AggregateTilesRequest.rangeType
* sourceNameSpace / targetNameSpace
* Drop AggregateTilesRequest.shardId
* A partial implementation of AggregateTiles
* Open DataFileSetReader and iterate through it
* Decompress the data read
* Add explicit FileSetType
* Remove dbShard.TagsFromSeriesID
* Regenerate mocks
* Unit tests
* Mockgen
* Fix test
* Resurrect rpc_mock.go
* Remove accidentally committed files
* Trigger build
* Add step parameter
* Write aggregated data to other namespace
* Fix tests
* Introduced AggregateTilesOptions
* Minor improvements
* Cleanup
* [dbnode] Integrate arrow iterators into tile aggregation
* Fix close error after EOF
* Can already close the SeriesBlockIterator
* Update to use concurrent iteration and prefer single metadata
* [dbnode] Cross block series reader
* Assert on OrderedByIndex
* Tests
* Mocks
* Don't test just the happy path
* Compute and validate block time frames
* [dbnode] Integration test for large tiles (#2478)
* [dbnode] Create a virtual reverse index for a computed namespace
* Return processedBlockCount from AggregateTiles
* Improve error handling
* Validate AggregateTilesOptions
* Unnest read locks
* Use default instead of constant
* Fix test
* minor refactoring
* Addressed review feedback
* Legal stuff
* Refactor recorder
* Allow using flat buffers rather than arrow
* [dbnode] persist manager for large tiles
* revert of .ci
* minor
* Adding better comparisons for arrow vs flat
* Some fixes for query_data_files
* An option to read from all shards
* Fix large_tiles_test
* Fix TestDatabaseAggregateTiles
* Read data ordered by index
* Generate mocks
* Fix TestAggregateTiles
* Group Read() results by id
* Remodel CrossBlockReader as an Iterator
* Mockgen
* Erase slice contents before draining them
* Resolve merge conflicts
* Align with master
* Integrate CrossBlockReader
* Make a defensive copy of dataFileSetReaders
* avoid panics
* Improve TestNamespaceAggregateTiles
* Added TODO on TestAggregateTiles
* Align query_data_files
* Mockgen
* Added cross block iterator to be able to read multiple BlockRecords. Also removed concurrency from tile iterators and cleaned up utility
* Add HandleCounterResets to AggregateTilesOptions
* Additional tests and cleanup.
* [dbnode] Large Tiles fs.writer experimental implementation
* Implement DownsampleCounterResets
* Improve readability
* Use pointer arguments to get the results
* Reduce code duplication
* Refine comments
* Remove dependency on SeriesBlockFrame
* [dbnode] Add OrderedByIndex option for DataFileSetReader.Open (#2465)
* [dbnode] Cross-block series reader (#2481)
* Fix build
* Integrate DownsampleCounterResets
* Introduce DownsampledValue struct
* Update DownsampleCounterResets integration
* Preallocate capacity for downsampledValues
* Large tiles parallel indexing.
* Checkpoint fixed
* Successful write/fetch with some hardcoded values.
* Some FIXME solved
* [dbnode] AggregateTiles RPC - minimal E2E flow (#2466)
* minor fixes
* codegen fix
* Address feedback from PR 2477
* TestShardAggregateTiles using 2 readers
* Fix large_tiles_test.go
* integration test fix
* Bug fix and test
* [large tiles] Fix refcounting in CrossBlockReader
* Workaround for negative reference count
* Integration test fix
* [large-tiles] Try detect double finalize
* [dbnode] Large tiles concurrency test
* Batch writer is used for waster writes
* race fix
* Fix compilation error
* Comment out some noise
* Fix data races on time field (bootstrapManager, flushManager)
* Fix misplaced wd.Add
* Close context used for aggregation (in test code)
* Close encoders during large tile writes
* removing some debug code
* Close series in test code
* Move series.Close() after err check (test code)
* Update to series frame API
* Additional tests
* PR
* work on integ test
* tags close
* [large-tiles] Fix management of pooled objects
* Fix query_data_files tool
* Use mock checked bytes in cross_block_reader_test.go
* query test
* Query in place of fetch
* test fix
* Bug reproduced
* Heavier concurrency test
* Fix session related races in large_tiles_test.go
* Fix string conversion
* Use mocks for pooled objects in TestShardAggregateTiles
* Add build integration tag
* test fix
* minor refactoring
* increased the amount of series to 5k
* Remove a noop
* Log the finish of namespace aggregation
* some minor refactorings
* Streaming aggregation, reusing resources for each series
* Do not allocate minHeapEntries
* Cleanup
* Fix query_data_files
* Fix build
* go.sum
* Align with StreamingWriter API changes
* Add FIXME WRT segment.Tail finalizing
* Exclude query_data_files
* Exclude counter_resets_downsampler.go
* Remove arrow related code
* Cleanup
* Use explicit EncodedTags type
* Rename processedBlockCount to processedTileCount
* Fix build
* Exclude read_index_ids changes
* Address review feedback
* Increase fetch timeout to stabilize TestAggregationAndQueryingAtHighConcurrency
* Abort the writer in case of any error during aggregation

Co-authored-by: Artem <[email protected]>
Co-authored-by: Gediminas <[email protected]>