#793 |
Update Jenkins scripts for release |
#798 |
Fix shims provider override config not being seen by executors |
#785 |
Make shuffle run on CPU if we do a join where we read from bucketed table |
#765 |
Add config to override shims provider class |
#759 |
Add CHANGELOG for release 0.2 |
#758 |
Skip the udf test that fails periodically. |
#752 |
Fix snapshot plugin jar version in docs |
#751 |
Correct the channel for cudf installation |
#754 |
Filter nulls from joins where possible to improve performance |
#732 |
Add a timeout for RapidsShuffleIterator to prevent jobs to hang infin… |
#637 |
Documentation changes for 0.2 release |
#747 |
Disable udf tests that fail periodically |
#745 |
Revert Null Join Filter |
#741 |
Fix issue with parquet partitioned reads |
#733 |
Remove GPU Types from github |
#720 |
Stop removing GpuCoalesceBatches from non-AQE queries when AQE is enabled |
#729 |
Fix collect time metric in CoalesceBatches |
#640 |
Support running Pandas UDFs on GPUs in Python processes. |
#721 |
Add some more checks to databricks build scripts |
#714 |
Move spark 3.0.1-shims out of snapshot-shims |
#711 |
fix blossom checkout repo |
#709 |
[BUG] fix unexpected indentation issue in blossom yml |
#642 |
Init workflow for blossom-ci |
#705 |
Enable configuration check for cast string to timestamp |
#702 |
Update slack channel for Jenkins builds |
#701 |
fix checkout-ref for automerge |
#695 |
Fix spark-3.0.1 shim to be released |
#668 |
refactor automerge to support merge for protected branch |
#687 |
Include the UDF compiler in the dist jar |
#689 |
Change shims dependency to spark-3.0.1 |
#677 |
Use multi-threaded parquet read with small files |
#638 |
Add Parquet-based cache serializer |
#613 |
Enable UCX + AQE |
#684 |
Enable test for literal string values in a select |
#686 |
Remove sorts when replacing sort aggregate if possible |
#675 |
Added TimeAdd |
#645 |
[window] Add GpuWindowExec requiredChildOrdering |
#676 |
fixUpJoinConsistency rule now works when AQE is enabled |
#683 |
Fix issues with canonicalization of WrappedAggFunction |
#682 |
Fix path to start-slave.sh script in docs |
#673 |
Increase build timeouts on nightly and premerge builds |
#648 |
add signoff-check using GitHub Actions |
#593 |
Add support for isNaN and datetime related instructions in UDF compiler |
#666 |
[window] Disable GPU for COUNT(exp) queries |
#655 |
Implement AQE unit test for InsertAdaptiveSparkPlan |
#614 |
Fix for aggregation with multiple distinct and non distinct functions |
#657 |
Fix verify build after integration tests are run |
#660 |
Add in neverReplaceExec and several rules for it |
#639 |
BooleanType test shouldn't xfail |
#652 |
Mark UVM config as internal until supported |
#653 |
Move to the cudf-0.15 release |
#647 |
Improve warnings about AQE nodes not supported on GPU |
#646 |
Stop reporting zero metrics for GpuCustomShuffleReader |
#644 |
Small fix for race in catalog where a buffer could get spilled while … |
#623 |
Fix issues with canonicalization |
#599 |
[FEA] changelog generator |
#563 |
cudf and spark version info in artifacts |
#633 |
Fix leak if RebaseHelper throws during Parquet read |
#632 |
Copy function isSearchableType from Spark because signature changed in 3.0.1 |
#583 |
Add udf compiler unit tests |
#617 |
Documentation updates for branch 0.2 |
#616 |
Add config to reserve GPU memory |
#612 |
[REVIEW] Fix incorrect output from averages with filters in partial only mode |
#609 |
fix minor issues with instructions for building ucx |
#611 |
Added in profile to enable shims for SNAPSHOT releases |
#595 |
Parquet small file reading optimization |
#582 |
fix #579 Auto-merge between branches |
#536 |
Add test for skewed join optimization when AQE is enabled |
#603 |
Fix data size metric always 0 when using RAPIDS shuffle |
#600 |
Fix calculation of string data for compressed batches |
#597 |
Remove the xfail for parquet test_read_merge_schema on Databricks |
#591 |
Add ucx license in NOTICE-binary |
#596 |
Add Spark 3.0.2 to Shim layer |
#594 |
Filter nulls from joins where possible to improve performance. |
#590 |
Move GpuParquetScan/GpuOrcScan into Shim |
#588 |
xfail the tpch spark 3.1.0 tests that fail |
#572 |
Update buffer store to return compressed batches directly, add compression NVTX ranges |
#558 |
Fix unit tests when AQE is enabled |
#580 |
xfail the Spark 3.1.0 integration tests that fail |
#565 |
Minor improvements to TPC-DS benchmarking code |
#567 |
Explicitly disable AQE in one test |
#571 |
Fix Databricks shim layer for GpuFileSourceScanExec and GpuBroadcastExchangeExec |
#564 |
Add GPU decode time metric to scans |
#562 |
getCatalog can be called from the driver, and can return null |
#555 |
Fix build warnings for ColumnViewAccess |
#560 |
Fix databricks build for AQE support |
#557 |
Fix tests failing on Spark 3.1 |
#547 |
Add GPU metrics to GpuFileSourceScanExec |
#462 |
Implement optimized AQE support so that exchanges run on GPU where possible |
#550 |
Document Parquet and ORC compression support |
#539 |
Update script to audit multiple Spark versions |
#543 |
Add metrics to GpuUnion operator |
#549 |
Move spark shim properties to top level pom |
#497 |
Add UDF compiler implementations |
#487 |
Add framework for batch compression of shuffle partitions |
#544 |
Add in driverExtraClassPath for standalone mode docs |
#546 |
Fix Spark 3.1.0 shim build error in GpuHashJoin |
#537 |
Use fresh SparkSession when capturing to avoid late capture of previous query |
#538 |
Revert "Temporary workaround for RMM initial pool size bug (#530)" |
#517 |
Add config to limit maximum RMM pool size |
#527 |
Add support for split and getArrayIndex |
#534 |
Fixes bugs around GpuShuffleEnv initialization |
#529 |
[BUG] Degenerate table metas were not getting copied to the heap |
#530 |
Temporary workaround for RMM initial pool size bug |
#526 |
Fix bug with nullability reporting in GpuFilterExec |
#521 |
Fix typo with databricks shim classname SparkShimServiceProvider |
#522 |
Use SQLConf instead of SparkConf when looking up SQL configs |
#518 |
Fix init order issue in GpuShuffleEnv when RAPIDS shuffle configured |
#514 |
Added clarification of RegExpReplace, DateDiff, made descriptive text consistent |
#506 |
Add in basic support for running tpcds like queries |
#504 |
Add ability to ignore tests depending on spark shim version |
#503 |
Remove unused async buffer spill support |
#501 |
disable codegen in 3.1 shim for hash join |
#466 |
Optimize and fix Api validation script |
#481 |
Codeowners |
#439 |
Check a PR has been committed using git signoff |
#319 |
Update partitioning logic in ShuffledBatchRDD |
#491 |
Temporarily ignore AQE integration tests |
#490 |
Fix Spark 3.1.0 build for HashJoin changes |
#482 |
Prevent bad practice in python tests |
#485 |
Show plan in assertion message if test fails |
#480 |
Fix link from README to getting-started.md |
#448 |
Preliminary support for keeping broadcast exchanges on GPU when AQE is enabled |
#478 |
Fall back to CPU for binary as string in parquet |
#477 |
Fix special case joins in broadcast nested loop join |
#469 |
Update HashAggregateSuite to work with AQE |
#475 |
Udf compiler pom followup |
#434 |
Add UDF compiler skeleton |
#474 |
Re-enable noscaladoc check |
#461 |
Fix comments style to pass scala style check |
#468 |
fix broken link |
#456 |
Add closeOnExcept to clean up code that closes resources only on exceptions |
#464 |
Turn off noscaladoc rule until codebase is fixed |
#449 |
Enforce NoScalaDoc rule in scalastyle checks |
#450 |
Enable scalastyle for shuffle plugin |
#451 |
Databricks remove unneeded files and fix build to not fail on rm when file missing |
#442 |
Shim layer support for Spark 3.0.0 Databricks |
#447 |
Add scalastyle plugin to shim module |
#426 |
Update BufferMeta to support multiple codec buffers per table |
#440 |
Run mortgage test both with AQE on and off |
#445 |
Added in StringRPad and StringLPad |
#422 |
Documentation updates |
#437 |
Fix bug with InSet and Strings |
#435 |
Add in checks for Parquet LEGACY date/time rebase |
#432 |
Fix batch use-after-close in partitioning, shuffle env init |
#423 |
Fix duplicate includes in assembly jar |
#418 |
CI Add unit tests running for Spark 3.0.1 |
#421 |
Make it easier to run TPCxBB benchmarks from spark shell |
#413 |
Fix download link |
#414 |
Shim Layer to support multiple Spark versions |
#406 |
Update cast handling to deal with new libcudf casting limitations |
#405 |
Change slave->worker |
#395 |
Databricks doc updates |
#401 |
Extended the FAQ |
#398 |
Add tests for GpuPartition |
#352 |
Change spark tgz package name |
#397 |
Fix small bug in ShuffleBufferCatalog.hasActiveShuffle |
#286 |
[REVIEW] Updated join tests for cache |
#393 |
Contributor license agreement |
#389 |
Added in support for RangeExec |
#390 |
Ucx getting started |
#391 |
Hide slack channel in Jenkins scripts |
#387 |
Remove the term whitelist |
#365 |
[REVIEW] Timesub tests |
#383 |
Test utility to compare SQL query results between CPU and GPU |
#380 |
Fix databricks notebook link |
#378 |
Added in FAQ and fixed spelling |
#377 |
Update heading in configs.md |
#373 |
Modifying branch name to conform with rapidsai branch name change |
#376 |
Add our session extension correctly if there are other extensions configured |
#374 |
Fix rat issue for notebooks |
#364 |
Update Databricks patch for changes to GpuSortMergeJoin |
#371 |
fix typo and use regional bucket per GCP's update |
#359 |
Karthik changes |
#353 |
Fix broadcast nested loop join for the no column case |
#313 |
Additional tests for broadcast hash join |
#342 |
Implement build-side rules for shuffle hash join |
#349 |
Updated join code to treat null equality properly |
#335 |
Integration tests on spark 3.0.1-SNAPSHOT & 3.1.0-SNAPSHOT |
#346 |
Update the Title Header for Fine Tuning |
#344 |
Fix small typo in readme |
#331 |
Adds iterator and client unit tests, and prepares for more fetch failure handling |
#337 |
Fix Scala compile phase to allow Java classes referencing Scala classes |
#332 |
Match GPU overwritten functions with SQL functions from FunctionRegistry |
#339 |
Fix databricks build |
#338 |
Move GpuPartitioning to a separate file |
#310 |
Update release Jenkinsfile for Databricks |
#330 |
Hide private info in Jenkins scripts |
#324 |
Add in basic support for GpuCartesianProductExec |
#328 |
Enable slack notification for Databricks build |
#321 |
update databricks patch for GpuBroadcastNestedLoopJoinExec |
#322 |
Add oss.sonatype.org to download the cudf jar |
#320 |
Don't mount passwd/group to the container |
#258 |
Enable running TPCH tests with AQE enabled |
#318 |
Build docker image with Dockerfile |
#309 |
Update databricks patch to latest changes |
#312 |
Trigger branch-0.2 integration test |
#307 |
[Jenkins] Update the release script and Jenkinsfile |
#304 |
[DOC][Minor] Fix typo in spark config name. |
#303 |
Update compatibility doc for -0.0 issues |
#301 |
Add info about branches in README.md |
#296 |
Added in basic support for broadcast nested loop join |
#297 |
Databricks CI improvements and support runtime env parameter to xfail certain tests |
#292 |
Move artifacts version in version-def.sh |
#254 |
Cleanup QA tests |
#289 |
Clean up GpuCollectLimitMeta and add in metrics |
#287 |
Add in support for right join and fix issues with build right |
#273 |
Added releases to the README.md |
#285 |
modify run_pyspark_from_build.sh to be bash 3 friendly |
#281 |
Add in support for Full Outer Join on non-null keys |
#274 |
Add RapidsDiskStore tests |
#259 |
Add RapidsHostMemoryStore tests |
#282 |
Update Databricks patch for 0.2 branch |
#261 |
Add conditional xfail test for DISTINCT aggregates with NaN |
#263 |
More time ops |
#256 |
Remove special cases for contains, startsWith, and endsWith |
#253 |
Remove GpuAttributeReference and GpuSortOrder |
#271 |
Update the versions for 0.2.0 properly for the databricks build |
#162 |
Integration tests for corner cases in window functions. |
#264 |
Add a local mvn repo for nightly pipeline |
#262 |
Refer to branch-0.2 |
#255 |
Revert change to make dependencies of shaded jar optional |
#257 |
Fix link to RAPIDS cudf in index.md |
#252 |
Update to 0.2.0-SNAPSHOT and cudf-0.15-SNAPSHOT |