[FEAT] Add initial Spark Connect support #3261

Merged on Nov 11, 2024 (12 commits)


@andrewgazelka (Contributor) commented Nov 11, 2024

Implements basic Spark Connect functionality in Daft with the following:

  • Add daft-connect crate to handle Spark Connect protocol
  • Implement configuration management via Spark Connect API
  • Add Python bindings with connect_start() function
  • Add integration tests for config operations

Currently supports:

  • Basic session management
  • Config operations (Set, Get, GetWithDefault, GetOption, GetAll, Unset)
  • Error handling and status reporting
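
To make the six config operations concrete, here is a minimal in-memory sketch of their semantics. It is a toy model for illustration only, not the code in daft-connect: the class name `ToyConfigStore`, its method names, and the choice to raise `KeyError` when Get is called on a missing key are all assumptions.

```python
from typing import Dict, List, Optional, Tuple


class ToyConfigStore:
    """Hypothetical in-memory model of the config operations (not Daft's code)."""

    def __init__(self) -> None:
        self._values: Dict[str, str] = {}

    def set(self, pairs: Dict[str, str]) -> None:
        # Set: store or overwrite each key/value pair.
        self._values.update(pairs)

    def get(self, key: str) -> str:
        # Get: assumption for this sketch: a missing key raises KeyError.
        return self._values[key]

    def get_with_default(self, key: str, default: Optional[str]) -> Optional[str]:
        # GetWithDefault: fall back to the caller-supplied default.
        return self._values.get(key, default)

    def get_option(self, key: str) -> Optional[str]:
        # GetOption: return None instead of failing when the key is absent.
        return self._values.get(key)

    def get_all(self, prefix: str = "") -> List[Tuple[str, str]]:
        # GetAll: every pair, optionally restricted to keys with a prefix.
        return sorted((k, v) for k, v in self._values.items() if k.startswith(prefix))

    def unset(self, key: str) -> None:
        # Unset: remove the key; removing an absent key is a no-op here.
        self._values.pop(key, None)
```

The difference between Get, GetWithDefault, and GetOption is only in how an absent key is reported, which is why the sketch keeps them as three thin wrappers over one dictionary.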

Notable changes:

  • New dependency on spark-connect protocol
  • Added tracing for debugging and monitoring
  • Integration with existing Daft infrastructure

Some operations, such as execute_plan, analyze_plan, and artifact handling, are
not yet implemented and return an error message instead of a result.
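
The pattern described above, where supported operations are handled and the rest fail with a descriptive error, might look roughly like the sketch below. The names `UnimplementedOperation`, `SUPPORTED`, and `dispatch`, as well as the error wording, are hypothetical; the actual handler lives in the Rust crate and may report errors differently.

```python
class UnimplementedOperation(Exception):
    """Hypothetical error type for operations the server does not support yet."""


# Assumption: only config handling is supported, per the PR description.
SUPPORTED = {"config"}


def dispatch(operation: str) -> str:
    """Handle a supported operation or fail with a message naming the operation."""
    if operation in SUPPORTED:
        return f"{operation}: ok"
    # Unsupported operations fail loudly instead of silently returning nothing.
    raise UnimplementedOperation(f"{operation} is not implemented yet")
```

Naming the operation in the error lets a client distinguish a missing feature from a malformed request.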

@github-actions bot added the enhancement (New feature or request) label Nov 11, 2024

codspeed-hq bot commented Nov 11, 2024

CodSpeed Performance Report

Merging #3261 will degrade performance by 34.58%

Comparing andrew/connect-config (a3c6037) with main (f290f40)

Summary

⚡ 1 improvement
❌ 2 regressions
✅ 14 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark                                   main       andrew/connect-config   Change
test_count[1 Small File]                    3.3 ms     4 ms                    -17.44%
test_iter_rows_first_row[100 Small Files]   177.3 ms   271 ms                  -34.58%
test_show[100 Small Files]                  51.1 ms    42.1 ms                 +21.5%
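
The Change column is consistent with a percentage computed relative to the branch time, that is (main - branch) / branch. This is inferred from the three rows above rather than taken from CodSpeed documentation, and the function name `relative_change` is made up for this sketch. Because the displayed timings are rounded, the recomputed values only match approximately.

```python
def relative_change(main_ms: float, branch_ms: float) -> float:
    """Percent change relative to the branch time (assumed convention)."""
    return (main_ms - branch_ms) / branch_ms * 100.0


# (main ms, branch ms, reported change %) copied from the table above.
rows = {
    "test_count[1 Small File]": (3.3, 4.0, -17.44),
    "test_iter_rows_first_row[100 Small Files]": (177.3, 271.0, -34.58),
    "test_show[100 Small Files]": (51.1, 42.1, 21.5),
}

for name, (main_ms, branch_ms, reported) in rows.items():
    computed = relative_change(main_ms, branch_ms)
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

Under this convention a negative value means the branch is slower than main, which is why the two regressions carry a minus sign.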


codecov bot commented Nov 11, 2024

Codecov Report

Attention: Patch coverage is 69.48052% with 94 lines in your changes missing coverage. Please review.

Project coverage is 77.78%. Comparing base (f290f40) to head (a3c6037).
Report is 1 commit behind head on main.

Files with missing lines          Patch %   Lines
src/daft-connect/src/lib.rs       58.67%    50 Missing ⚠️
src/daft-connect/src/main.rs      0.00%     33 Missing ⚠️
src/daft-table/src/lib.rs         0.00%     8 Missing ⚠️
src/daft-connect/src/config.rs    95.00%    3 Missing ⚠️
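
As a quick sanity check on the report, the per-file missing counts add up to the 94 lines quoted above, and the 69.48052% patch coverage then implies the size of the patch. The sketch below assumes patch coverage is simply covered lines divided by total changed lines; the implied totals are derived here and do not appear in the report itself.

```python
# Missing-line counts copied from the table above.
missing_by_file = {
    "src/daft-connect/src/lib.rs": 50,
    "src/daft-connect/src/main.rs": 33,
    "src/daft-table/src/lib.rs": 8,
    "src/daft-connect/src/config.rs": 3,
}

total_missing = sum(missing_by_file.values())

# Patch coverage as reported by Codecov.
patch_pct = 69.48052

# Assumption: coverage = hits / (hits + missing), so lines = missing / (1 - coverage).
implied_patch_lines = round(total_missing / (1 - patch_pct / 100))
implied_hits = implied_patch_lines - total_missing

print(total_missing, implied_patch_lines, implied_hits)
```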
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3261      +/-   ##
==========================================
+ Coverage   76.07%   77.78%   +1.70%     
==========================================
  Files         644      650       +6     
  Lines       81581    80350    -1231     
==========================================
+ Hits        62065    62501     +436     
+ Misses      19516    17849    -1667     
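
The figures in the diff can be reproduced from the Hits and Lines rows, assuming coverage is hits divided by lines. The reported percentages match when the result is truncated to two decimals rather than rounded; that is an inference from these numbers, not documented Codecov behaviour, and the helper name `truncated_pct` is made up for this sketch.

```python
import math


def truncated_pct(value: float) -> float:
    """Truncate (not round) a percentage to two decimals (assumed display rule)."""
    return math.floor(value * 100) / 100


# (lines, hits, misses) for base (main) and head (#3261), from the diff above.
base_lines, base_hits, base_misses = 81581, 62065, 19516
head_lines, head_hits, head_misses = 80350, 62501, 17849

base_cov = base_hits / base_lines * 100
head_cov = head_hits / head_lines * 100

print(truncated_pct(base_cov), truncated_pct(head_cov), truncated_pct(head_cov - base_cov))
```

Hits plus Misses equals Lines on both sides, so the improvement comes both from 436 more covered lines and from 1231 fewer total lines.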
Files with missing lines               Coverage Δ
src/daft-connect/src/session.rs        100.00% <100.00%> (ø)
src/daft-connect/src/util.rs           100.00% <100.00%> (ø)
src/daft-local-execution/src/lib.rs    92.85% <ø> (ø)
src/lib.rs                             95.77% <100.00%> (+0.06%) ⬆️
src/daft-connect/src/config.rs         95.00% <95.00%> (ø)
src/daft-table/src/lib.rs              88.59% <0.00%> (-1.17%) ⬇️
src/daft-connect/src/main.rs           0.00% <0.00%> (ø)
src/daft-connect/src/lib.rs            58.67% <58.67%> (ø)

... and 27 files with indirect coverage changes

Review thread on src/daft-connect/src/lib.rs (outdated, resolved)
@andrewgazelka enabled auto-merge (squash) November 11, 2024 23:29
@andrewgazelka changed the title from "[FEAT] daft-connect add config option" to "[FEAT] Add initial Spark Connect support" Nov 11, 2024
@andrewgazelka enabled auto-merge (squash) November 11, 2024 23:30
@andrewgazelka merged commit 16f5a8c into main Nov 11, 2024
47 of 48 checks passed
@andrewgazelka deleted the andrew/connect-config branch November 11, 2024 23:48
Labels
enhancement New feature or request
2 participants