Benchmark Truffle "add" for arrow-language and Column.+ #10056

Closed
JaroslavTulach opened this issue May 23, 2024 · 5 comments · Fixed by #10150
Assignees
Labels
-compiler -libs Libraries: New libraries to be implemented

Comments

JaroslavTulach commented May 23, 2024

There is already an implementation of arrow-language written by @hubertp. There is interest in cleaning up the current Storage architecture, particularly to clearly separate storage from operations. @JaroslavTulach claims that the most efficient (and, in the end, also the simplest) implementation can be achieved by using Truffle. Let's run an experiment to verify that claim!

The documentation of Column.+ mentions two examples:

             vector_plus = Examples.float_column + Examples.integer_column
             scalar_plus = Examples.integer_column + 10

Let's reimplement them in the Arrow language. Then let's generate a million-item column and benchmark the two implementations against each other.
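As a rough illustration of the benchmarking step, here is a minimal plain-Java sketch. Everything in it is an assumption made for illustration: the class `MillionItemBenchSketch`, its methods, and the use of `long[]` as a stand-in for a column. It does not call the Arrow language or `Column.+`; the two real implementations would be plugged in as the timed tasks. Hand-rolled timing like this is only indicative, since the JIT may optimize away unused results; a dedicated harness such as JMH would give more trustworthy numbers.

```java
// Hypothetical timing harness; names are illustrative, not part of Enso or the Arrow language.
final class MillionItemBenchSketch {
    /** Builds a column of the given size holding 0, 1, 2, ... */
    static long[] column(int size) {
        long[] c = new long[size];
        for (int i = 0; i < size; i++) c[i] = i;
        return c;
    }

    /** Stand-in for the vector_plus example: element-wise sum of two equally sized columns. */
    static long[] vectorPlus(long[] a, long[] b) {
        long[] r = new long[a.length];
        for (int i = 0; i < a.length; i++) r[i] = a[i] + b[i];
        return r;
    }

    /** Stand-in for the scalar_plus example: adds k to every element. */
    static long[] scalarPlus(long[] a, long k) {
        long[] r = new long[a.length];
        for (int i = 0; i < a.length; i++) r[i] = a[i] + k;
        return r;
    }

    /** Runs the task `warmup` times untimed, then returns the best of `runs` timed runs in nanoseconds. */
    static long bestNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) task.run();
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        long[] a = column(1_000_000);
        long[] b = column(1_000_000);
        System.out.println("vector_plus best: " + bestNanos(() -> vectorPlus(a, b), 10, 20) + " ns");
        System.out.println("scalar_plus best: " + bestNanos(() -> scalarPlus(a, 10), 10, 20) + " ns");
    }
}
```

Reporting the best of several runs after a warm-up reduces noise from JIT compilation, which matters when the code under test is compiled by Truffle.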

API

The Arrow language already supports two operations: new[type] and cast[type]. For example, one can write new[Int32] and get a function that, when called, allocates an array of the size passed as its argument. Let's introduce new syntax:

+[type]

Parsing such a statement is going to produce a function (i.e. an isExecutable object) that takes two arguments. When invoked, it checks whether both arguments report hasArrayElements. If so, they must have the same getArraySize(). The function then creates a new[type] array of the appropriate size and fills the Arrow column with the element-wise sum of both arrays. If one (or both) of the arguments are scalars, the function behaves the same way as scalar_plus.
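The behavior described above can be modeled in plain Java as a minimal sketch. It is not the Truffle implementation: `PlusSketch` and its methods are hypothetical names, `long[]` stands in for an argument that reports hasArrayElements, and a boxed `Number` stands in for a scalar. The issue does not spell out what happens when both arguments are scalars, so this sketch simply refuses that case.

```java
// Plain-Java model of the proposed +[type] semantics; names are hypothetical.
final class PlusSketch {
    /** Adds two arguments, each either a long[] (a column) or a Number (a scalar). */
    static long[] plus(Object a, Object b) {
        boolean aIsArray = a instanceof long[];
        boolean bIsArray = b instanceof long[];
        if (aIsArray && bIsArray) {
            long[] x = (long[]) a;
            long[] y = (long[]) b;
            // Both arguments are arrays, so they must have the same size.
            if (x.length != y.length) {
                throw new IllegalArgumentException(
                    "Array sizes differ: " + x.length + " vs " + y.length);
            }
            long[] result = new long[x.length]; // corresponds to new[type] of that size
            for (int i = 0; i < x.length; i++) result[i] = x[i] + y[i];
            return result;
        }
        if (aIsArray) return addScalar((long[]) a, asLong(b));
        if (bIsArray) return addScalar((long[]) b, asLong(a));
        // Assumption: the scalar + scalar case is left unmodeled in this sketch.
        throw new UnsupportedOperationException("scalar + scalar is not modeled here");
    }

    /** The scalar_plus case: adds the same value to every element. */
    private static long[] addScalar(long[] column, long scalar) {
        long[] result = new long[column.length];
        for (int i = 0; i < column.length; i++) result[i] = column[i] + scalar;
        return result;
    }

    /** Reads a scalar argument as a long; fractional parts would be truncated. */
    private static long asLong(Object value) {
        if (value instanceof Number) return ((Number) value).longValue();
        throw new IllegalArgumentException("Not a number: " + value);
    }

    public static void main(String[] args) {
        long[] vectorPlus = plus(new long[] {1, 2, 3}, new long[] {10, 20, 30});
        long[] scalarPlus = plus(new long[] {1, 2, 3}, 10L);
        System.out.println(java.util.Arrays.toString(vectorPlus));
        System.out.println(java.util.Arrays.toString(scalarPlus));
    }
}
```

In a real Truffle node the `instanceof` checks would be replaced by interop queries on the arguments, and the result would be an Arrow column rather than a Java array.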

Conversions are one of the important steps in the implementation. The argument values need to be converted to the target type of the resulting array. Such conversions can benefit from Truffle @Specialization and compute the result efficiently.
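To make the conversion step concrete, here is a minimal plain-Java sketch of converting argument values to an Int32 target. The class `ToInt32Sketch` is hypothetical, and the policy of rejecting out-of-range or fractional values is an assumption, not something the issue specifies. In a Truffle node each branch below would typically become its own `@Specialization` method, so that only the branches actually seen at run time get compiled; here the dispatch is written out by hand.

```java
// Hand-written type dispatch mirroring what per-type specializations would do; names are hypothetical.
final class ToInt32Sketch {
    /** Converts a boxed numeric value to int, rejecting values that do not fit exactly. */
    static int toInt32(Object value) {
        if (value instanceof Integer) {
            return (Integer) value; // already the target type, no conversion needed
        }
        if (value instanceof Long) {
            // Math.toIntExact throws ArithmeticException when the value overflows an int.
            return Math.toIntExact((Long) value);
        }
        if (value instanceof Double) {
            double d = (Double) value;
            // Assumption: fractional, NaN, infinite and out-of-range doubles are rejected.
            if (d != Math.rint(d) || d < Integer.MIN_VALUE || d > Integer.MAX_VALUE) {
                throw new ArithmeticException("Cannot represent " + d + " as Int32");
            }
            return (int) d;
        }
        throw new IllegalArgumentException("Unsupported value: " + value);
    }
}
```

For example, `ToInt32Sketch.toInt32(7L)` yields the int 7, while `ToInt32Sketch.toInt32(1L << 40)` throws an `ArithmeticException`.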

Expected Result

After final benchmarking we shall see that the Truffle results are faster, yet the code remains simple.

enso-bot bot commented Jun 4, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-06-03):

Progress: - arrow language +: #10150

@JaroslavTulach JaroslavTulach moved this from ⚙️ Design to 🔧 Implementation in Issues Board Jun 4, 2024

enso-bot bot commented Jun 5, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-06-04):

Progress: - RoundingUtil: #10150 (comment)

mergify bot pushed a commit that referenced this issue Jun 5, 2024
While working on #10056 I realized the names of method and closure nodes are incomprehensible to anyone. This PR replaces the infamous `<anonymous>` with a name hinting where the method actually is.

# Important Notes
I assume this change will be visible not only in IGV but also in _stacktraces_, and we may need to adjust a few tests.

enso-bot bot commented Jun 7, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-06-06):

Progress: - Hands on IGV lab! https://drive.google.com/file/d/11o2eHSy0Ptux0SWW9JpRiGQp5wtVZshh/view

enso-bot bot commented Jun 8, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-06-07):

Progress: - Hands on IGV lab recording: https://drive.google.com/file/d/11o2eHSy0Ptux0SWW9JpRiGQp5wtVZshh/view?usp=sharing

  • long with nulls benchmark
  • Using builder.append again
  • Speeding append up: 801184a
  • PutNode: f10bd06
  • proper (but slow) support for nulls: 72c89a1

It should be finished by 2024-06-07.

@JaroslavTulach JaroslavTulach moved this from 🔧 Implementation to 👁️ Code review in Issues Board Jun 10, 2024
@mergify mergify bot closed this as completed in #10150 Jun 11, 2024
mergify bot pushed a commit that referenced this issue Jun 11, 2024
Prototype of #10056 showing `+` operation implemented in the _Arrow language_.
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Jun 11, 2024