forked from apache/arrow
Cache object code in memory instead of entire module. #4
Open: augustoasilva wants to merge 126 commits into master from feature/cache-object-code-in-memory
Closes apache#10157 from mathyingzhou/ARROW-9299 Lead-authored-by: Ying Zhou <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
For serial CSV readers, track the absolute row number and report it in errors encountered during parsing or converting. I did try to get row numbers for the parallel reader, but the only way I could see that working was to add delimiter counting to the Chunker, and that seemed to add more complexity than I wanted.
Closes apache#10321 from n3world/ARROW-12675-report_rows Authored-by: Nate Clark <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
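As an illustration of the idea in the commit message above, here is a minimal Python sketch of a serial reader that keeps one absolute row counter across blocks and puts it into error messages. It is not Arrow's C++ implementation; `RowError`, `parse_int_rows`, and the `Row #N` message format are hypothetical names chosen for this example.

```python
import csv
import io


class RowError(ValueError):
    """Hypothetical error type carrying the absolute (1-based) row number."""


def parse_int_rows(blocks, num_columns):
    """Parse CSV text blocks one after another, converting each field to int.

    `blocks` is an iterable of strings, each holding whole lines. Because the
    blocks are consumed strictly in order, a single counter is enough to know
    the absolute row number of any row that fails.
    """
    row_number = 0  # absolute count over all blocks seen so far
    rows = []
    for block in blocks:
        for fields in csv.reader(io.StringIO(block)):
            row_number += 1
            if len(fields) != num_columns:
                # Parsing error: wrong number of columns in this row.
                raise RowError(
                    f"Row #{row_number}: expected {num_columns} columns, "
                    f"got {len(fields)}"
                )
            try:
                rows.append([int(field) for field in fields])
            except ValueError as exc:
                # Conversion error: re-raise with the row number attached.
                raise RowError(f"Row #{row_number}: {exc}") from exc
    return rows
```

A parallel reader cannot use this single counter directly, since a block's starting row is unknown until every earlier block has been counted. That is the extra delimiter counting the commit message refers to.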
…_id's
Questions:
- This is my first PR in the parquet namespace, I'm not sure of all the special rules.
- The field ID generation doesn't happen on the `parquet::schema` -> `arrow::schema` phase but on the `parquet::format::schema` -> `parquet::schema` phase. So in order to test I had to add `#include "generated/parquet_types.h"` to `arrow_schema_test.cc` and I wasn't sure if I was allowed to reference the `generated/*` files like that.
- This PR simply allows user specified field id's to be persisted. Is that sufficient for PARQUET-1798 (the title is rather general) or should I open up a dedicated JIRA?
Closes apache#10289 from westonpace/feature/PARQUET-1798-field-id-assignment Lead-authored-by: Weston Pace <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
* Download URL is wrong
* Downloaded packages aren't removed
Closes apache#10418 from kou/release-csharp Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
Closes apache#10343 from thisisnic/ARROW-12758_examples Lead-authored-by: Nic Crane <[email protected]> Co-authored-by: Jonathan Keane <[email protected]> Signed-off-by: Jonathan Keane <[email protected]>
This runs reverse dependency checks using {revdepcheck}. It installs a release version of arrow and the current development version (i.e. from the git checkout), then runs checks on each of the reverse dependencies, first with the release (called "old" in {revdepcheck}'s terms) and then with the development version ("new" in {revdepcheck}'s terms). It then compares the outputs and fails only if there is a failure in the new check that is not in the old check. I've customized the output a bit so that it prints any errors that come up in either (in the revdepcheck problems step) so we can more easily diagnose, but it will only fail if there are new errors.
One thing that I tried and was unable to do is find a way to cache packages and info across runs. The GitHub cache action will create a cache, but because of how the jobs are run on crossbow (i.e. on different branches), the caches are never accessible in different runs. I've kept the caching step in for now. If we could find a way to (manually?) run this on the main branch, like https://github.com/ursacomputing/crossbow/blob/master/.github/workflows/cache_vcpkg.yml, before we use this heavily (i.e. likely only around a release), that would create a cache that could be used to speed up some of the jobs.
Closes apache#10345 from jonkeane/ARROW-12569-revdepcheck Authored-by: Jonathan Keane <[email protected]> Signed-off-by: Jonathan Keane <[email protected]>
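The pass/fail rule described in the commit message above (fail only on failures that appear with the new version but not with the old one) can be sketched as a set difference. This is a simplified Python illustration, not how {revdepcheck} is implemented; the function name and the dictionary layout are assumptions made for this example.

```python
def new_failures(old_results, new_results):
    """Return failures seen with the "new" version but not with the "old" one.

    Both arguments map a reverse-dependency package name to a collection of
    failure messages (hypothetical layout, chosen for this sketch).
    """
    regressions = {}
    for package, failures in new_results.items():
        # Failures already present with the release version are not counted.
        only_new = set(failures) - set(old_results.get(package, ()))
        if only_new:
            regressions[package] = sorted(only_new)
    return regressions


def job_should_fail(old_results, new_results):
    """The job fails only when at least one new failure exists."""
    return bool(new_failures(old_results, new_results))
```

With this rule, a reverse dependency that is already broken against the release does not fail the job, although its errors can still be printed for diagnosis.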
Adjust the R version used so that binary arrow packages can be installed from RSPM. Also make a small adjustment to the tests so that they no longer require a fixed order of attributes (the order changed slightly in version 3.0.0).
Closes apache#10409 from jonkeane/ARROW-12883-version-compatibility Authored-by: Jonathan Keane <[email protected]> Signed-off-by: Jonathan Keane <[email protected]>
Closes apache#10368 from thisisnic/ARROW-12841_examples_part_2 Authored-by: Nic Crane <[email protected]> Signed-off-by: Jonathan Keane <[email protected]>
…nd is_in Closes apache#10383 from thisisnic/ARROW-12777_match_arrow_is_in Authored-by: Nic Crane <[email protected]> Signed-off-by: Jonathan Keane <[email protected]>
Closes apache#10419 from raybellwaves/docs-np-import Authored-by: Ray Bell <[email protected]> Signed-off-by: David Li <[email protected]>
Closes apache#10413 from jonkeane/ARROW-12894 Authored-by: Jonathan Keane <[email protected]> Signed-off-by: Neal Richardson <[email protected]>
…ory' into feature/cache-object-code-in-memory
# Conflicts:
# cpp/src/gandiva/base_object_cache.h
# cpp/src/gandiva/cache.h
# cpp/src/gandiva/engine.h
# cpp/src/gandiva/lru_cache.h
# cpp/src/gandiva/projector.cc
# cpp/src/gandiva/projector.h
anthonylouisbsb pushed a commit that referenced this pull request Jun 16, 2021
Before change:
```
Direct leak of 65536 byte(s) in 1 object(s) allocated from:
    #0 0x522f09 in
    #1 0x7f28ae5826f4 in
    #2 0x7f28ae57fa5d in
    #3 0x7f28ae58cb0f in
    #4 0x7f28ae58bda0 in
    ...
```
After change:
```
Direct leak of 65536 byte(s) in 1 object(s) allocated from:
    #0 0x522f09 in posix_memalign (/build/cpp/debug/arrow-dataset-file-csv-test+0x522f09)
    #1 0x7f28ae5826f4 in arrow::(anonymous namespace)::SystemAllocator::AllocateAligned(long, unsigned char**) /arrow/cpp/src/arrow/memory_pool.cc:213:24
    #2 0x7f28ae57fa5d in arrow::BaseMemoryPoolImpl<arrow::(anonymous namespace)::SystemAllocator>::Allocate(long, unsigned char**) /arrow/cpp/src/arrow/memory_pool.cc:405:5
    #3 0x7f28ae58cb0f in arrow::PoolBuffer::Reserve(long) /arrow/cpp/src/arrow/memory_pool.cc:717:9
    #4 0x7f28ae58bda0 in arrow::PoolBuffer::Resize(long, bool) /arrow/cpp/src/arrow/memory_pool.cc:741:7
    ...
```
Closes apache#10498 from westonpace/feature/ARROW-13027--c-fix-asan-stack-traces-in-ci Authored-by: Weston Pace <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
No description provided.