Arrow2 02092022 #1795

Merged on Feb 15, 2022
51 commits
7b8d72c
feat: add join type for logical plan display (#1674)
xudong963 Jan 27, 2022
18ced8d
(minor) Reduce memory manager and disk manager logs from `info!` to `…
alamb Jan 28, 2022
ed1de63
Move `information_schema` tests out of execution/context.rs to `sql_i…
alamb Jan 28, 2022
ab145c8
Move timestamp related tests out of context.rs and into sql integrati…
alamb Jan 28, 2022
641338f
Add `MemTrackingMetrics` to ease memory tracking for non-limited memo…
yjshen Jan 29, 2022
0d6d1ce
Implement TableProvider for DataFrameImpl (#1699)
cpcloud Jan 30, 2022
75c7578
refine test in repartition.rs & coalesce_batches.rs (#1707)
xudong963 Jan 30, 2022
a7f0156
Fuzz test for spillable sort (#1706)
yjshen Jan 30, 2022
fecce97
Lazy TempDir creation in DiskManager (#1695)
alamb Jan 30, 2022
3494e9c
Incorporate dyn scalar kernels (#1685)
matthewmturner Jan 30, 2022
2512608
add annotation for select_to_plan (#1714)
xudong963 Jan 31, 2022
1caf52a
Support `create_physical_expr` and `ExecutionContextState` or `Defaul…
alamb Jan 31, 2022
f849968
Fix can not load parquet table form spark in datafusion-cli. (#1665)
Ted-Jiang Jan 31, 2022
d01d8d5
add upper bound for pub fn (#1713)
HaoYang670 Jan 31, 2022
7bec762
Create SchemaAdapter trait to map table schema to file schemas (#1709)
thinkharderdev Jan 31, 2022
cfb655d
approx_quantile() aggregation function (#1539)
domodwyer Jan 31, 2022
940d4eb
suppport bitwise and as an example (#1653)
liukun4515 Jan 31, 2022
b6ace16
fix: substr - correct behaivour with negative start pos (#1660)
ovr Jan 31, 2022
bacf10d
minor: fix cargo run --release error (#1723)
xudong963 Feb 1, 2022
b9a8f15
Convert boolean case expressions to boolean logic (#1719)
tustvold Feb 1, 2022
46879f1
substitute `parking_lot::Mutex` for `std::sync::Mutex` (#1720)
xudong963 Feb 2, 2022
e4a056f
Add Expression Simplification API (#1717)
alamb Feb 2, 2022
d1ebdbf
Add tests and CI for optional pyarrow module (#1711)
wjones127 Feb 3, 2022
aca855d
Update parking_lot requirement from 0.11 to 0.12 (#1735)
dependabot[bot] Feb 3, 2022
78c30b6
Prevent repartitioning of certain operator's direct children (#1731) …
tustvold Feb 3, 2022
b2eaee3
API to get Expr's type and nullability without a `DFSchema` (#1726)
alamb Feb 3, 2022
5124759
Fix typos in crate documentation (#1739)
r4ntix Feb 3, 2022
97a1b21
add `cargo check --release` to ci (#1737)
xudong963 Feb 4, 2022
15cfcbc
Move optimize test out of context.rs (#1742)
alamb Feb 4, 2022
40df55f
use clap 3 style args parsing for datafusion cli (#1749)
jimexist Feb 5, 2022
e52f844
Add partitioned_csv setup code to sql_integration test (#1743)
alamb Feb 5, 2022
4f4153b
use ordered-float 2.10 (#1756)
andygrove Feb 5, 2022
f139ef8
#1768 Support TimeUnit::Second in hasher (#1769)
jychen7 Feb 7, 2022
31d0adf
format (#1745)
xudong963 Feb 7, 2022
40c29e5
Create built-in scalar functions programmatically (#1734)
HaoYang670 Feb 7, 2022
fe46a1e
[split/1] split datafusion-common module (#1751)
jimexist Feb 7, 2022
d014ff2
fix: Case insensitive unquoted identifiers (#1747)
mkmik Feb 7, 2022
2e535f9
move dfschema and column (#1758)
jimexist Feb 7, 2022
a39a223
add datafusion-expr module (#1759)
jimexist Feb 7, 2022
2ec34cf
move column, dfschema, etc. to common module (#1760)
jimexist Feb 7, 2022
09c67d5
include window frames and operator into datafusion-expr (#1761)
jimexist Feb 7, 2022
3c39c72
move signature, type signature, and volatility to split module (#1763)
jimexist Feb 8, 2022
86dcb09
[split/10] split up expr for rewriting, visiting, and simplification …
jimexist Feb 8, 2022
4b68273
move built-in scalar functions (#1764)
jimexist Feb 8, 2022
f2615af
split expr type and null info to be expr-schemable (#1784)
jimexist Feb 8, 2022
e8c198b
rewrite predicates before pushing to union inputs (#1781)
korowa Feb 8, 2022
ed9b049
move accumulator and columnar value (#1765)
jimexist Feb 9, 2022
014e5e9
move accumulator and columnar value (#1762)
jimexist Feb 9, 2022
d23c873
merge latest datafusion on 02092022
Igosuki Feb 9, 2022
b2cfe2b
fix bad data type in test_try_cast_decimal_to_decimal
Igosuki Feb 9, 2022
7910765
added projections for avro columns
Igosuki Feb 12, 2022
59 changes: 57 additions & 2 deletions .github/workflows/rust.yml
@@ -58,12 +58,18 @@ jobs:
           rustup toolchain install ${{ matrix.rust }}
           rustup default ${{ matrix.rust }}
           rustup component add rustfmt
-      - name: Build Workspace
+      - name: Build workspace in debug mode
         run: |
           cargo build
         env:
           CARGO_HOME: "/github/home/.cargo"
-          CARGO_TARGET_DIR: "/github/home/target"
+          CARGO_TARGET_DIR: "/github/home/target/debug"
+      - name: Build workspace in release mode
+        run: |
+          cargo check --release
+        env:
+          CARGO_HOME: "/github/home/.cargo"
+          CARGO_TARGET_DIR: "/github/home/target/release"
       - name: Check DataFusion Build without default features
         run: |
           cargo check --no-default-features -p datafusion
@@ -230,6 +236,55 @@ jobs:
           # do not produce debug symbols to keep memory usage down
           RUSTFLAGS: "-C debuginfo=0"

+  test-datafusion-pyarrow:
+    needs: [linux-build-lib]
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        arch: [amd64]
+        rust: [stable]
+    container:
+      image: ${{ matrix.arch }}/rust
+      env:
+        # Disable full debug symbol generation to speed up CI build and keep memory down
+        # "1" means line tables only, which is useful for panic tracebacks.
+        RUSTFLAGS: "-C debuginfo=1"
+    steps:
+      - uses: actions/checkout@v2
+        with:
+          submodules: true
+      - name: Cache Cargo
+        uses: actions/cache@v2
+        with:
+          path: /github/home/.cargo
+          # this key equals the ones on `linux-build-lib` for re-use
+          key: cargo-cache-
+      - name: Cache Rust dependencies
+        uses: actions/cache@v2
+        with:
+          path: /github/home/target
+          # this key equals the ones on `linux-build-lib` for re-use
+          key: ${{ runner.os }}-${{ matrix.arch }}-target-cache-${{ matrix.rust }}
+      - uses: actions/setup-python@v2
+        with:
+          python-version: "3.8"
+      - name: Install PyArrow
+        run: |
+          echo "LIBRARY_PATH=$LD_LIBRARY_PATH" >> $GITHUB_ENV
+          python -m pip install pyarrow
+      - name: Setup Rust toolchain
+        run: |
+          rustup toolchain install ${{ matrix.rust }}
+          rustup default ${{ matrix.rust }}
+          rustup component add rustfmt
+      - name: Run tests
+        run: |
+          cd datafusion
+          cargo test --features=pyarrow
+        env:
+          CARGO_HOME: "/github/home/.cargo"
+          CARGO_TARGET_DIR: "/github/home/target"
+
   lint:
     name: Lint
     runs-on: ubuntu-latest
6 changes: 4 additions & 2 deletions Cargo.toml
@@ -18,6 +18,8 @@
 [workspace]
 members = [
     "datafusion",
+    "datafusion-common",
+    "datafusion-expr",
     "datafusion-cli",
     "datafusion-examples",
     "benchmarks",
@@ -33,5 +35,5 @@ lto = true
 codegen-units = 1

 [patch.crates-io]
-#arrow2 = { git = "https://github.com/jorgecarleitao/arrow2.git", branch = "main" }
-#parquet2 = { git = "https://github.com/jorgecarleitao/parquet2.git", branch = "main" }
+arrow2 = { git = "https://github.com/jorgecarleitao/arrow2.git", branch = "main" }
+parquet2 = { git = "https://github.com/jorgecarleitao/parquet2.git", branch = "main" }