-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-15: Fix a naming typo for memory.AllocationManager.AllocationOutcome #4
Conversation
No other place in Arrow refers This is fairly a small change, @jacques-n @wesm would you mind taking a look when you have time? Cheers. |
array.h : Determine if a slot if null. For inner loops. Does not boundscheck “ is this should be " is null"....? |
@zhonghongxia yes, that's also a typo, happy to accept patches to improve the C++ library code comments or documentation =) |
+1 |
…instructions Author: dhirschf <[email protected]> Author: Korn, Uwe <[email protected]> Author: Uwe L. Korn <[email protected]> Closes #1277 from xhochy/ARROW-1764 and squashes the following commits: 63143bb [Uwe L. Korn] Merge pull request #4 from dhirschfeld/gflags bdb6b3f [dhirschf] Clarification of gflags channel 27f0896 [Korn, Uwe] ARROW-1764: [Python] Add -c conda-forge for Windows dev installation instructions
…lue data Modified BinaryBuilder::Resize(int64_t) so that when building BinaryArrays with a known size, space is also reserved for value_data_builder_ to prevent internal reallocation. Author: Panchen Xue <[email protected]> Closes #1481 from xuepanchen/master and squashes the following commits: 707b67b [Panchen Xue] ARROW-1712: [C++] Fix lint errors 360e601 [Panchen Xue] Merge branch 'master' of https://github.com/xuepanchen/arrow d4bbd15 [Panchen Xue] ARROW-1712: [C++] Modify test case for BinaryBuilder::ReserveData() and change arguments for offsets_builder_.Resize() 77f8f3c [Panchen Xue] Merge pull request #5 from apache/master bc5db7d [Panchen Xue] ARROW-1712: [C++] Remove unneeded data member in BinaryBuilder and modify test case 5a5b70e [Panchen Xue] Merge pull request #4 from apache/master 8e4c892 [Panchen Xue] Merge pull request #3 from xuepanchen/xuepanchen-arrow-1712 d3c8202 [Panchen Xue] ARROW-1945: [C++] Fix a small typo 0b07895 [Panchen Xue] ARROW-1945: [C++] Add data_capacity_ to track capacity of value data 18f90fb [Panchen Xue] ARROW-1945: [C++] Add data_capacity_ to track capacity of value data bbc6527 [Panchen Xue] ARROW-1945: [C++] Update test case for BinaryBuild data value space reservation 15e045c [Panchen Xue] Add test case for array-test.cc 5a5593e [Panchen Xue] Update again ReserveData(int64_t) method for BinaryBuilder 9b5e805 [Panchen Xue] Update ReserveData(int64_t) method signature for BinaryBuilder 8dd5eaa [Panchen Xue] Update builder.cc b002e0b [Panchen Xue] Remove override keyword from ReserveData(int64_t) method for BinaryBuilder de318f4 [Panchen Xue] Implement ReserveData(int64_t) method for BinaryBuilder e0434e6 [Panchen Xue] Add ReserveData(int64_t) and value_data_capacity() for methods for BinaryBuilder 5ebfb32 [Panchen Xue] Add capacity() method for TypedBufferBuilder 5b73c1c [Panchen Xue] Update again BinaryBuilder::Resize(int64_t capacity) in builder.cc d021c54 [Panchen Xue] Merge pull request #2 from xuepanchen/xuepanchen-arrow-1712 232024e [Panchen Xue] Update BinaryBuilder::Resize(int64_t capacity) in builder.cc c2f8dc4 [Panchen Xue] Merge pull request #1 from apache/master
This PR moves the `Table` class out of the Vector hierarchy and adds optimized dataframe operations to it. Currently implements an optimized `scan()` method, `filter(predicate)`, `count()`, and `countBy(column_name)` (only works on dictionary-encoded columns). Some usage examples, based on the file generated by `js/test/data/tables/generate.py`: ``` js > let table = Table.from(...); > table.count() 1000000 > table.filter(col('lat').gteq(0)).count() 499718 > table.countBy('origin').toJSON() { Charlottesville: 166839, 'New York': 166251, 'San Francisco': 166642, Seattle: 166659, 'Terre Haute': 166756, 'Washington, DC': 166853 } > table.filter(col('lng').gteq(0)).countBy('origin').toJSON() { Charlottesville: 83109, 'New York': 83221, 'San Francisco': 83515, Seattle: 83362, 'Terre Haute': 83314, 'Washington, DC': 83479 } ``` There are performance tests for the dataframe operations, to run them you must first generate the test data by running `npm run create:perfdata`. The PR also includes @trxcllnt's refactor of the JS implementation to make it more closely resemble the C++ implementation. This refactor resolves multiple JIRAs: ARROW-1903, ARROW-1898, ARROW-1502, ARROW-1952 (partially), and ARROW-1985 Author: Paul Taylor <[email protected]> Author: Brian Hulette <[email protected]> Author: Brian Hulette <[email protected]> Closes #1482 from TheNeuralBit/table-scan-perf and squashes the following commits: 52f1e0e [Brian Hulette] <, > are not commutative, misc cleanup 04b1838 [Brian Hulette] even more table tests 16b9ccb [Brian Hulette] Merge pull request #4 from trxcllnt/js-cpp-refactor fe300df [Paul Taylor] fix closure es5/umd toString() iterator 3d5240a [Paul Taylor] fix more externs 10c48ad [Paul Taylor] Merge branch 'table-scan-perf' of github.com:ccri/arrow into js-cpp-refactor dbe7f81 [Brian Hulette] Add more Table unit tests 1910962 [Brian Hulette] Add optional bind callback to scan 5bdf17f [Brian Hulette] Fix perf 8cf2473 [Brian Hulette] Merge remote-tracking branch 'origin/master' into table-scan-perf 4a41b18 [Paul Taylor] add src/predicate to the list of exports we should save from uglify 5a91fab [Paul Taylor] add more view, predicate externs f6adfb3 [Brian Hulette] Create predicate namespace f7bb0ed [Paul Taylor] Merge branch 'table-scan-perf' of github.com:ccri/arrow into js-cpp-refactor e148ee4 [Paul Taylor] Merge branch 'extern-woes' into js-cpp-refactor 25cdc4a [Paul Taylor] add src/predicate to the list of exports we should save from uglify dc7c728 [Paul Taylor] add more view, predicate externs 25e6af7 [Brian Hulette] Create predicate namespace 579ab1f [Brian Hulette] Merge pull request #2 from trxcllnt/js-cpp-refactor f3cde1a [Paul Taylor] fix lint 9769773 [Paul Taylor] fix vector perf tests 016ba78 [Brian Hulette] Merge pull request #1 from trxcllnt/js-cpp-refactor 272d293 [Paul Taylor] Merge pull request #4 from ccri/empty-table 7bc7363 [Brian Hulette] Fix exception for empty Table 8ddce0a [Paul Taylor] check bounds in getChildAt(i) to avoid NPEs f1dead0 [Paul Taylor] compute chunked nested childData list correctly 18807c6 [Paul Taylor] rename ChunkData's fields so it's more clear they're not semantically similar to other similarly named fields 7e43b78 [Paul Taylor] add test:integration npm script a5f200f [Paul Taylor] Merge pull request #3 from ccri/table-from-struct c8cd286 [Brian Hulette] Add Table.fromStruct a00415e [Brian Hulette] Fix perf 54d4f5b [Paul Taylor] lazily allocate table and recordbatch columns, support NestedView's getChildAt(i) method in ChunkedView 40b3638 [Paul Taylor] run integration tests with local data for coverage stats fe31ee0 [Paul Taylor] slice the flat data values before returning an iterator of them e537789 [Paul Taylor] make it easier to run all integration tests from local data c0fd2f9 [Paul Taylor] use the dictionary of the last chunked vector list for chunked dictionary vectors e33c068 [Paul Taylor] Merge pull request #2 from ccri/fixed-size-list 5bb63af [Brian Hulette] Don't read OFFSET vector for FixedSizeList 614b688 [Paul Taylor] add asEpochMs to date and timestamp vectors 87334a5 [Paul Taylor] Merge branch 'table-scan-perf' of github.com:ccri/arrow into js-cpp-refactor b7f5bfb [Paul Taylor] rename numRows to length, add table.getColumn() e81082f [Paul Taylor] export vector views, allow cloning data as another type 700a47c [Paul Taylor] export visitors e859e13 [Paul Taylor] fix package.json bin entry 0620cfd [Brian Hulette] use Math.fround 0126dc4 [Brian Hulette] Don't recompute total length e761eee [Brian Hulette] Rename asJSON to toJSON 6c91ed4 [Paul Taylor] Merge branch 'master' of github.com:apache/arrow into js-cpp-refactor-merge_with-table-scan-perf d2b18d5 [Paul Taylor] Merge remote-tracking branch 'ccri/table-scan-perf' into js-cpp-refactor-merge_with-table-scan-perf f3f3b86 [Paul Taylor] rename table.ts to recordbatch.ts in preparation for merging latest e3f629d [Paul Taylor] fix rest of the mangling issues fa7c17a [Paul Taylor] passing all tests except es5 umd mangler ones e20decd [Brian Hulette] Add license headers edcbdbe [Brian Hulette] cleanup 20717d5 [Brian Hulette] Fixed countBy(string) 7244887 [Brian Hulette] Add table unit tests... 6719147 [Brian Hulette] Add DataFrame.countBy operation 2f4a349 [Brian Hulette] Minor tweaks 2e118ab [Brian Hulette] linter a788db3 [Brian Hulette] Cleanup a9fff89 [Brian Hulette] Move Table out of the Vector hierarchy 1d60aa1 [Brian Hulette] Moved DataFrame ops to Table. DataFrame is now an interface e8979ba [Brian Hulette] Refactor DataFrame to extend Vector<StructRow> 6a41d68 [Brian Hulette] clean up table benchmarks 2744c63 [Brian Hulette] Remove Chunked/Simple DataFrame distinction aa999f8 [Brian Hulette] Add DictionaryVector optimization for equals predicate 4d9e8c0 [Brian Hulette] Add concept of predicates for filtering dataframes 796f45d [Brian Hulette] add DataFrame filter and count ops 30f0330 [Brian Hulette] Add basic DataFrame impl ... a1edac2 [Brian Hulette] Add perf tests for table scans d18d915 [Paul Taylor] fix struct and map rows 61dc699 [Paul Taylor] WIP -- refactor types to closer match arrow-cpp 62db338 [Paul Taylor] update dependencies and add es6+ umd targets to jest transform ignore patterns to fix ci 6ff18e9 [Paul Taylor] ship es2015 commonJS in main package to avoid confusion 74e828a [Paul Taylor] fix typings issues (ARROW-1903)
https://issues.apache.org/jira/browse/ARROW-3965 This creates an object which configures the BaseAllocator and Calendar used during to configure the translation from a JDBC ResultSet to an Arrow vector. Author: Mike Pigott <[email protected]> Author: Michael Pigott <[email protected]> Closes #3133 from mikepigott/jdbc-to-arrow-config and squashes the following commits: be95426 <Mike Pigott> ARROW-3965: JDBC-To-Arrow Config Builder javadocs. d6c64a7 <Mike Pigott> ARROW-3965: JdbcToArrowConfigBuilder d7ca982 <Mike Pigott> Merge branch 'master' into jdbc-to-arrow-config 789c8c8 <Michael Pigott> Merge pull request #4 from apache/master e5b19ee <Michael Pigott> Merge pull request #3 from apache/master 3b17c29 <Michael Pigott> Merge pull request #2 from apache/master 5b1b364 <Mike Pigott> Merge branch 'master' into jdbc-to-arrow-config 881c6c8 <Michael Pigott> Merge pull request #1 from apache/master bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions. 68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects. 8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config) 4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config) df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions. b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig. da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter.
https://issues.apache.org/jira/browse/ARROW-3923 Hello! I was reading through the JDBC source code and I noticed that a java.util.Calendar was required for creating an Arrow Schema and Arrow Vectors from a JDBC ResultSet, when none is required. This change makes the Calendar optional. Unit Tests: The existing SureFire plugin configuration uses a UTC calendar for the database, which is the default Calendar in the existing code. Likewise, no changes to the unit tests are required to provide adequate coverage for the change. Author: Michael Pigott <[email protected]> Author: Mike Pigott <[email protected]> Closes #3066 from mikepigott/jdbc-timestamp-no-calendar and squashes the following commits: 4d95da0 <Mike Pigott> ARROW-3923: Supporting a null Calendar in the config, and reverting the breaking change. cd9a230 <Mike Pigott> Merge branch 'master' into jdbc-timestamp-no-calendar 509a1cc <Michael Pigott> Merge pull request #5 from apache/master 789c8c8 <Michael Pigott> Merge pull request #4 from apache/master e5b19ee <Michael Pigott> Merge pull request #3 from apache/master 3b17c29 <Michael Pigott> Merge pull request #2 from apache/master 881c6c8 <Michael Pigott> Merge pull request #1 from apache/master 089cff4 <Mike Pigott> Format fixes a58a4a5 <Mike Pigott> Fixing calendar usage. e12832a <Mike Pigott> Allowing for timestamps without a time zone.
https://issues.apache.org/jira/browse/ARROW-3966 This change includes #3133, and supports a new configuration item called "Include Metadata." If true, metadata from the JDBC ResultSetMetaData object is pulled along to the Schema Field Metadata. For now, this includes: * Catalog Name * Table Name * Column Name * Column Type Name Author: Mike Pigott <[email protected]> Author: Michael Pigott <[email protected]> Closes #3134 from mikepigott/jdbc-column-metadata and squashes the following commits: 02f2f34 <Mike Pigott> ARROW-3966: Picking up lost change to support null calendars. 7049c36 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata e9a9b2b <Michael Pigott> Merge pull request #6 from apache/master 65741a9 <Mike Pigott> ARROW-3966: Code review feedback cc6cc88 <Mike Pigott> ARROW-3966: Using a 1:N loop instead of a 0:N-1 loop for fewer index offsets in code. cfb2ba6 <Mike Pigott> ARROW-3966: Using a helper method for building a UTC calendar with root locale. 2928513 <Mike Pigott> ARROW-3966: Moving the metadata flag assignment into the builder. 69022c2 <Mike Pigott> ARROW-3966: Fixing merge. 4a6de86 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata 509a1cc <Michael Pigott> Merge pull request #5 from apache/master 789c8c8 <Michael Pigott> Merge pull request #4 from apache/master e5b19ee <Michael Pigott> Merge pull request #3 from apache/master 3b17c29 <Michael Pigott> Merge pull request #2 from apache/master d847ebc <Mike Pigott> Fixing file location 1ceac9e <Mike Pigott> Merge branch 'master' into jdbc-column-metadata 881c6c8 <Michael Pigott> Merge pull request #1 from apache/master 03091a8 <Mike Pigott> Unit tests for including result set metadata. 72d64cc <Mike Pigott> Affirming the field metadata is empty when the configuration excludes field metadata. 7b4527c <Mike Pigott> Test for the include-metadata flag in the configuration. 7e9ce37 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions. a6fb1be <Mike Pigott> Fixing function call 5bfd6a2 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata 68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects. b5b0cb1 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata 8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config) 4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config) e34a9e7 <Mike Pigott> Fixing formatting. fe097c8 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions. b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig. da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter. a78c770 <Mike Pigott> Updating Javadocs. 523387f <Mike Pigott> Updating the API to support an optional 'includeMetadata' field. 5af1b5b <Mike Pigott> Separating out the field-type creation from the field creation.
https://issues.apache.org/jira/browse/ARROW-3923 Hello! I was reading through the JDBC source code and I noticed that a java.util.Calendar was required for creating an Arrow Schema and Arrow Vectors from a JDBC ResultSet, when none is required. This change makes the Calendar optional. Unit Tests: The existing SureFire plugin configuration uses a UTC calendar for the database, which is the default Calendar in the existing code. Likewise, no changes to the unit tests are required to provide adequate coverage for the change. Author: Michael Pigott <[email protected]> Author: Mike Pigott <[email protected]> Closes #3066 from mikepigott/jdbc-timestamp-no-calendar and squashes the following commits: 4d95da0 <Mike Pigott> ARROW-3923: Supporting a null Calendar in the config, and reverting the breaking change. cd9a230 <Mike Pigott> Merge branch 'master' into jdbc-timestamp-no-calendar 509a1cc <Michael Pigott> Merge pull request #5 from apache/master 789c8c8 <Michael Pigott> Merge pull request #4 from apache/master e5b19ee <Michael Pigott> Merge pull request #3 from apache/master 3b17c29 <Michael Pigott> Merge pull request #2 from apache/master 881c6c8 <Michael Pigott> Merge pull request #1 from apache/master 089cff4 <Mike Pigott> Format fixes a58a4a5 <Mike Pigott> Fixing calendar usage. e12832a <Mike Pigott> Allowing for timestamps without a time zone.
https://issues.apache.org/jira/browse/ARROW-3966 This change includes #3133, and supports a new configuration item called "Include Metadata." If true, metadata from the JDBC ResultSetMetaData object is pulled along to the Schema Field Metadata. For now, this includes: * Catalog Name * Table Name * Column Name * Column Type Name Author: Mike Pigott <[email protected]> Author: Michael Pigott <[email protected]> Closes #3134 from mikepigott/jdbc-column-metadata and squashes the following commits: 02f2f34 <Mike Pigott> ARROW-3966: Picking up lost change to support null calendars. 7049c36 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata e9a9b2b <Michael Pigott> Merge pull request #6 from apache/master 65741a9 <Mike Pigott> ARROW-3966: Code review feedback cc6cc88 <Mike Pigott> ARROW-3966: Using a 1:N loop instead of a 0:N-1 loop for fewer index offsets in code. cfb2ba6 <Mike Pigott> ARROW-3966: Using a helper method for building a UTC calendar with root locale. 2928513 <Mike Pigott> ARROW-3966: Moving the metadata flag assignment into the builder. 69022c2 <Mike Pigott> ARROW-3966: Fixing merge. 4a6de86 <Mike Pigott> Merge branch 'master' into jdbc-column-metadata 509a1cc <Michael Pigott> Merge pull request #5 from apache/master 789c8c8 <Michael Pigott> Merge pull request #4 from apache/master e5b19ee <Michael Pigott> Merge pull request #3 from apache/master 3b17c29 <Michael Pigott> Merge pull request #2 from apache/master d847ebc <Mike Pigott> Fixing file location 1ceac9e <Mike Pigott> Merge branch 'master' into jdbc-column-metadata 881c6c8 <Michael Pigott> Merge pull request #1 from apache/master 03091a8 <Mike Pigott> Unit tests for including result set metadata. 72d64cc <Mike Pigott> Affirming the field metadata is empty when the configuration excludes field metadata. 7b4527c <Mike Pigott> Test for the include-metadata flag in the configuration. 7e9ce37 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata bb3165b <Mike Pigott> Updating the function calls to use the JdbcToArrowConfig versions. a6fb1be <Mike Pigott> Fixing function call 5bfd6a2 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata 68c91e7 <Mike Pigott> Modifying the jdbcToArrowSchema and jdbcToArrowVectors methods to receive JdbcToArrowConfig objects. b5b0cb1 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata 8d6cf00 <Mike Pigott> Documentation for public static VectorSchemaRoot sqlToArrow(Connection connection, String query, JdbcToArrowConfig config) 4f1260c <Mike Pigott> Adding documentation for public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, JdbcToArrowConfig config) e34a9e7 <Mike Pigott> Fixing formatting. fe097c8 <Mike Pigott> Merge branch 'jdbc-to-arrow-config' into jdbc-column-metadata df632e3 <Mike Pigott> Updating the SQL tests to include JdbcToArrowConfig versions. b270044 <Mike Pigott> Updated validaton & documentation, and unit tests for the new JdbcToArrowConfig. da77cbe <Mike Pigott> Creating a configuration class for the JDBC-to-Arrow converter. a78c770 <Mike Pigott> Updating Javadocs. 523387f <Mike Pigott> Updating the API to support an optional 'includeMetadata' field. 5af1b5b <Mike Pigott> Separating out the field-type creation from the field creation.
…comments. The reset method allow the data structures to be re-used so they don't have to be allocated over and over again. Closes #6430 from richardartoul/ra/merge-upstream and squashes the following commits: 5a08281 <Richard Artoul> Add license to test file d76be05 <Richard Artoul> Add test for data reset d102b1f <Richard Artoul> Add tests d3e6e67 <Richard Artoul> cleanup comments c8525ae <Richard Artoul> Add Reset method to int array (#5) 489ca25 <Richard Artoul> Fix array.setData() to retain before release (#4) 88cd05f <Richard Artoul> Add reset method to Data (#3) 6d1b277 <Richard Artoul> Add Reset() method to String array (#2) dca2303 <Richard Artoul> Add Reset method to buffer and cleanup comments (#1) Lead-authored-by: Richard Artoul <[email protected]> Co-authored-by: Richard Artoul <[email protected]> Signed-off-by: Sebastien Binet <[email protected]>
This PR enables tests for `ARROW_COMPUTE`, `ARROW_DATASET`, `ARROW_FILESYSTEM`, `ARROW_HDFS`, `ARROW_ORC`, and `ARROW_IPC` (default on). #7131 enabled a minimal set of tests as a starting point. I confirmed that these tests pass locally with the current master. In the current TravisCI environment, we cannot see this result due to a lot of error messages in `arrow-utility-test`. ``` $ git log | head -1 commit ed5f534 % ctest ... Start 1: arrow-array-test 1/51 Test #1: arrow-array-test ..................... Passed 4.62 sec Start 2: arrow-buffer-test 2/51 Test #2: arrow-buffer-test .................... Passed 0.14 sec Start 3: arrow-extension-type-test 3/51 Test #3: arrow-extension-type-test ............ Passed 0.12 sec Start 4: arrow-misc-test 4/51 Test #4: arrow-misc-test ...................... Passed 0.14 sec Start 5: arrow-public-api-test 5/51 Test #5: arrow-public-api-test ................ Passed 0.12 sec Start 6: arrow-scalar-test 6/51 Test #6: arrow-scalar-test .................... Passed 0.13 sec Start 7: arrow-type-test 7/51 Test #7: arrow-type-test ...................... Passed 0.14 sec Start 8: arrow-table-test 8/51 Test #8: arrow-table-test ..................... Passed 0.13 sec Start 9: arrow-tensor-test 9/51 Test #9: arrow-tensor-test .................... Passed 0.13 sec Start 10: arrow-sparse-tensor-test 10/51 Test #10: arrow-sparse-tensor-test ............. Passed 0.16 sec Start 11: arrow-stl-test 11/51 Test #11: arrow-stl-test ....................... Passed 0.12 sec Start 12: arrow-concatenate-test 12/51 Test #12: arrow-concatenate-test ............... Passed 0.53 sec Start 13: arrow-diff-test 13/51 Test #13: arrow-diff-test ...................... Passed 1.45 sec Start 14: arrow-c-bridge-test 14/51 Test #14: arrow-c-bridge-test .................. Passed 0.18 sec Start 15: arrow-io-buffered-test 15/51 Test #15: arrow-io-buffered-test ............... Passed 0.20 sec Start 16: arrow-io-compressed-test 16/51 Test #16: arrow-io-compressed-test ............. Passed 3.48 sec Start 17: arrow-io-file-test 17/51 Test #17: arrow-io-file-test ................... Passed 0.74 sec Start 18: arrow-io-hdfs-test 18/51 Test #18: arrow-io-hdfs-test ................... Passed 0.12 sec Start 19: arrow-io-memory-test 19/51 Test #19: arrow-io-memory-test ................. Passed 2.77 sec Start 20: arrow-utility-test 20/51 Test #20: arrow-utility-test ...................***Failed 5.65 sec Start 21: arrow-threading-utility-test 21/51 Test #21: arrow-threading-utility-test ......... Passed 1.34 sec Start 22: arrow-compute-compute-test 22/51 Test #22: arrow-compute-compute-test ........... Passed 0.13 sec Start 23: arrow-compute-boolean-test 23/51 Test #23: arrow-compute-boolean-test ........... Passed 0.15 sec Start 24: arrow-compute-cast-test 24/51 Test #24: arrow-compute-cast-test .............. Passed 0.22 sec Start 25: arrow-compute-hash-test 25/51 Test #25: arrow-compute-hash-test .............. Passed 2.61 sec Start 26: arrow-compute-isin-test 26/51 Test #26: arrow-compute-isin-test .............. Passed 0.81 sec Start 27: arrow-compute-match-test 27/51 Test #27: arrow-compute-match-test ............. Passed 0.40 sec Start 28: arrow-compute-sort-to-indices-test 28/51 Test #28: arrow-compute-sort-to-indices-test ... Passed 3.33 sec Start 29: arrow-compute-nth-to-indices-test 29/51 Test #29: arrow-compute-nth-to-indices-test .... Passed 1.51 sec Start 30: arrow-compute-util-internal-test 30/51 Test #30: arrow-compute-util-internal-test ..... Passed 0.13 sec Start 31: arrow-compute-add-test 31/51 Test #31: arrow-compute-add-test ............... Passed 0.12 sec Start 32: arrow-compute-aggregate-test 32/51 Test #32: arrow-compute-aggregate-test ......... Passed 14.70 sec Start 33: arrow-compute-compare-test 33/51 Test #33: arrow-compute-compare-test ........... Passed 7.96 sec Start 34: arrow-compute-take-test 34/51 Test #34: arrow-compute-take-test .............. Passed 4.80 sec Start 35: arrow-compute-filter-test 35/51 Test #35: arrow-compute-filter-test ............ Passed 8.23 sec Start 36: arrow-dataset-dataset-test 36/51 Test #36: arrow-dataset-dataset-test ........... Passed 0.25 sec Start 37: arrow-dataset-discovery-test 37/51 Test #37: arrow-dataset-discovery-test ......... Passed 0.13 sec Start 38: arrow-dataset-file-ipc-test 38/51 Test #38: arrow-dataset-file-ipc-test .......... Passed 0.21 sec Start 39: arrow-dataset-file-test 39/51 Test #39: arrow-dataset-file-test .............. Passed 0.12 sec Start 40: arrow-dataset-filter-test 40/51 Test #40: arrow-dataset-filter-test ............ Passed 0.16 sec Start 41: arrow-dataset-partition-test 41/51 Test #41: arrow-dataset-partition-test ......... Passed 0.13 sec Start 42: arrow-dataset-scanner-test 42/51 Test #42: arrow-dataset-scanner-test ........... Passed 0.20 sec Start 43: arrow-filesystem-test 43/51 Test #43: arrow-filesystem-test ................ Passed 1.62 sec Start 44: arrow-hdfs-test 44/51 Test #44: arrow-hdfs-test ...................... Passed 0.13 sec Start 45: arrow-feather-test 45/51 Test #45: arrow-feather-test ................... Passed 0.91 sec Start 46: arrow-ipc-read-write-test 46/51 Test #46: arrow-ipc-read-write-test ............ Passed 5.77 sec Start 47: arrow-ipc-json-simple-test 47/51 Test #47: arrow-ipc-json-simple-test ........... Passed 0.16 sec Start 48: arrow-ipc-json-test 48/51 Test #48: arrow-ipc-json-test .................. Passed 0.27 sec Start 49: arrow-json-integration-test 49/51 Test #49: arrow-json-integration-test .......... Passed 0.13 sec Start 50: arrow-json-test 50/51 Test #50: arrow-json-test ...................... Passed 0.26 sec Start 51: arrow-orc-adapter-test 51/51 Test #51: arrow-orc-adapter-test ............... Passed 1.92 sec 98% tests passed, 1 tests failed out of 51 Label Time Summary: arrow-tests = 27.38 sec (27 tests) arrow_compute = 45.11 sec (14 tests) arrow_dataset = 1.21 sec (7 tests) arrow_ipc = 6.20 sec (3 tests) unittest = 79.91 sec (51 tests) Total Test time (real) = 79.99 sec The following tests FAILED: 20 - arrow-utility-test (Failed) Errors while running CTest ``` Closes #7142 from kiszk/ARROW-8754 Authored-by: Kazuaki Ishizaki <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
From a deadlocked run... ``` #0 0x00007f8a5d48dccd in __lll_lock_wait () from /lib64/libpthread.so.0 #1 0x00007f8a5d486f05 in pthread_mutex_lock () from /lib64/libpthread.so.0 #2 0x00007f8a566e7e89 in arrow::internal::FnOnce<void ()>::FnImpl<arrow::Future<Aws::Utils::Outcome<Aws::S3::Model::ListObjectsV2Result, Aws::S3::S3Error> >::Callback<arrow::fs::(anonymous namespace)::TreeWalker::ListObjectsV2Handler> >::invoke() () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #3 0x00007f8a5650efa0 in arrow::FutureImpl::AddCallback(arrow::internal::FnOnce<void ()>) () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #4 0x00007f8a566e67a9 in arrow::fs::(anonymous namespace)::TreeWalker::ListObjectsV2Handler::SpawnListObjectsV2() () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #5 0x00007f8a566e723f in arrow::fs::(anonymous namespace)::TreeWalker::WalkChild(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #6 0x00007f8a566e827d in arrow::internal::FnOnce<void ()>::FnImpl<arrow::Future<Aws::Utils::Outcome<Aws::S3::Model::ListObjectsV2Result, Aws::S3::S3Error> >::Callback<arrow::fs::(anonymous namespace)::TreeWalker::ListObjectsV2Handler> >::invoke() () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #7 0x00007f8a5650efa0 in arrow::FutureImpl::AddCallback(arrow::internal::FnOnce<void ()>) () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #8 0x00007f8a566e67a9 in arrow::fs::(anonymous namespace)::TreeWalker::ListObjectsV2Handler::SpawnListObjectsV2() () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #9 0x00007f8a566e723f in arrow::fs::(anonymous namespace)::TreeWalker::WalkChild(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so #10 0x00007f8a566e74b1 in arrow::fs::(anonymous namespace)::TreeWalker::DoWalk() () from /arrow/r/check/arrow.Rcheck/arrow/libs/arrow.so ``` The callback `ListObjectsV2Handler` is being called recursively and the mutex is non-reentrant thus deadlock. To fix it I got rid of the mutex on `TreeWalker` by using `arrow::util::internal::TaskGroup` instead of manually tracking the #/status of in-flight requests. Closes #9842 from westonpace/bugfix/arrow-12040 Lead-authored-by: Weston Pace <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
Before change: ``` Direct leak of 65536 byte(s) in 1 object(s) allocated from: #0 0x522f09 in #1 0x7f28ae5826f4 in #2 0x7f28ae57fa5d in #3 0x7f28ae58cb0f in #4 0x7f28ae58bda0 in ... ``` After change: ``` Direct leak of 65536 byte(s) in 1 object(s) allocated from: #0 0x522f09 in posix_memalign (/build/cpp/debug/arrow-dataset-file-csv-test+0x522f09) #1 0x7f28ae5826f4 in arrow::(anonymous namespace)::SystemAllocator::AllocateAligned(long, unsigned char**) /arrow/cpp/src/arrow/memory_pool.cc:213:24 #2 0x7f28ae57fa5d in arrow::BaseMemoryPoolImpl<arrow::(anonymous namespace)::SystemAllocator>::Allocate(long, unsigned char**) /arrow/cpp/src/arrow/memory_pool.cc:405:5 #3 0x7f28ae58cb0f in arrow::PoolBuffer::Reserve(long) /arrow/cpp/src/arrow/memory_pool.cc:717:9 #4 0x7f28ae58bda0 in arrow::PoolBuffer::Resize(long, bool) /arrow/cpp/src/arrow/memory_pool.cc:741:7 ... ``` Closes #10498 from westonpace/feature/ARROW-13027--c-fix-asan-stack-traces-in-ci Authored-by: Weston Pace <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
Before change: ``` Direct leak of 65536 byte(s) in 1 object(s) allocated from: #0 0x522f09 in apache#1 0x7f28ae5826f4 in apache#2 0x7f28ae57fa5d in apache#3 0x7f28ae58cb0f in apache#4 0x7f28ae58bda0 in ... ``` After change: ``` Direct leak of 65536 byte(s) in 1 object(s) allocated from: #0 0x522f09 in posix_memalign (/build/cpp/debug/arrow-dataset-file-csv-test+0x522f09) apache#1 0x7f28ae5826f4 in arrow::(anonymous namespace)::SystemAllocator::AllocateAligned(long, unsigned char**) /arrow/cpp/src/arrow/memory_pool.cc:213:24 apache#2 0x7f28ae57fa5d in arrow::BaseMemoryPoolImpl<arrow::(anonymous namespace)::SystemAllocator>::Allocate(long, unsigned char**) /arrow/cpp/src/arrow/memory_pool.cc:405:5 apache#3 0x7f28ae58cb0f in arrow::PoolBuffer::Reserve(long) /arrow/cpp/src/arrow/memory_pool.cc:717:9 apache#4 0x7f28ae58bda0 in arrow::PoolBuffer::Resize(long, bool) /arrow/cpp/src/arrow/memory_pool.cc:741:7 ... ``` Closes apache#10498 from westonpace/feature/ARROW-13027--c-fix-asan-stack-traces-in-ci Authored-by: Weston Pace <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
* add jni metrics interface * add metrics related protocol, update fbs file and generated header file * add Plasma metrics message functions * impl metrcis method (server side) * impl metrics method (client side) * impl metrics method (jni layer) * add ut * fix code style
…ncryption-cython-refactor Dataset parquet encryption cython refactor
…ormatting buffer With ASAN, this reproduces the issue. [ RUN ] Formatting.Timestamp ================================================================= ==4191383==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff1804c48f at pc 0x5608edffe39d bp 0x7fff1804c110 sp 0x7fff1804c108 WRITE of size 1 at 0x7fff1804c48f thread T0 #0 0x5608edffe39c in arrow::internal::detail::FormatOneChar(char, char**) /home/felipeo/code/arrow/cpp/src/arrow/util/formatting.h:132:67 apache#1 0x5608ee035c00 in arrow::internal::detail::FormatYYYY_MM_DD(arrow_vendored::date::year_month_day, char**) /home/felipeo/code/arrow/cpp/src/arrow/util/formatting.h:351:5 apache#2 0x5608ee05e8a0 in decltype(std::declval<arrow::StringAppender&>()(std::basic_string_view<char, std::char_traits<char> >{})) arrow::internal::StringFormatter<arrow::TimestampType, void>::operator()<std::chrono::duration<long, std::ratio<1l, 1000l> >, arrow::StringAppender&>(std::chrono::duration<long, std::ratio<1l, 1000l> >, long, arrow::StringAppender&) /home/felipeo/code/arrow/cpp/src/arrow/util/formatting.h :521:5 apache#3 0x5608ee05d60f in decltype(std::declval<arrow::internal::StringFormatter<arrow::TimestampType, void>&>()(std::chrono::duration<long, std::ratio<1l, 1l> >{}, std::declval<long&>(), std::declval<arrow::StringAppender&>())) arrow::util::VisitDuration<arrow::internal::StringFormatter<arrow::TimestampType, void>&, long&, arrow::StringAppender&>(arrow::TimeUnit::type, arrow::internal::StringFormatter<arrow::Timestam pType, void>&, long&, arrow::StringAppender&) /home/felipeo/code/arrow/cpp/src/arrow/util/time.h:60:14 apache#4 0x5608ee05d122 in decltype(std::declval<arrow::StringAppender&>()(std::basic_string_view<char, std::char_traits<char> >{})) arrow::internal::StringFormatter<arrow::TimestampType, void>::operator()<arrow::StringAppender&>(long, arrow::StringAppender&) /home/felipeo/code/arrow/cpp/src/arrow/util/formatting.h:527:12 apache#5 0x5608edfeffb3 in void arrow::AssertFormatting<arrow::internal::StringFormatter<arrow::TimestampType, void>, long>(arrow::internal::StringFormatter<arrow::TimestampType, void>&, long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/felipeo/code/arrow/cpp/src/arrow/util/formatting_util_test.cc:52:3 apache#6 0x5608edfece95 in arrow::Formatting_Timestamp_Test::TestBody() /home/felipeo/code/arrow/cpp/src/arrow/util/formatting_util_test.cc:540:5 apache#7 0x7fd95d7901de in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd901de) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#8 0x7fd95d784905 in testing::Test::Run() (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd84905) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#9 0x7fd95d784a84 in testing::TestInfo::Run() (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd84a84) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#10 0x7fd95d785038 in testing::TestSuite::Run() (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd85038) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#11 0x7fd95d78573e in testing::internal::UnitTestImpl::RunAllTests() (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd8573e) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#12 0x7fd95d7907a6 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd907a6) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#13 0x7fd95d784b4b in testing::UnitTest::Run() (/home/felipeo/code/arrow/cpp/ninja/debug/libarrow_testing.so.1600+0xd84b4b) (BuildId: dd9af0bafdb1786050262e8f6002568a9f08ecf6) apache#14 0x5608ee54506d in RUN_ALL_TESTS() /usr/include/gtest/gtest.h:2490:46 apache#15 0x5608ee544fb9 in main /home/felipeo/code/arrow/cpp/src/arrow/util/logging_test.cc:129:10 apache#16 0x7fd93de29d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 apache#17 0x7fd93de29e3f in __libc_start_main csu/../csu/libc-start.c:392:3 apache#18 0x5608ed7b4f64 in _start (/home/felipeo/code/arrow/cpp/ninja/debug/arrow-utility-test+0x1224f64) (BuildId: 81cfdc36b7a960a7249ecd5884beaa869140ab89) Address 0x7fff1804c48f is located in stack of thread T0 at offset 399 in frame #0 0x5608ee05dc9f in decltype(std::declval<arrow::StringAppender&>()(std::basic_string_view<char, std::char_traits<char> >{})) arrow::internal::StringFormatter<arrow::TimestampType, void>::operator()<std::chrono::duration<long, std::ratio<1l, 1000l> >, arrow::StringAppender&>(std::chrono::duration<long, std::ratio<1l, 1000l> >, long, arrow::StringAppender&) /home/felipeo/code/arrow/cpp/src/arrow/util/formatting.h :486
Rename FORCED_SUCESS to FORCED_SUC**C**ESS in memory.AllocationManager.AllocationOutcome.