switch of backends #865
-
Hi, we use Triton in our OSS package (Kernl), and our understanding is that the backend will soon change to rely on MLIR. We are wondering how the switch to the new backend will happen. Our main concern is that, to keep output precision high, we rely on a bunch of workarounds, and we wonder how they will behave with the new backend, which ones we can drop, whether we can enable some optimizations we can't use right now, etc. Any info regarding the future release would be very welcome. Btw... thanks for all the great work! (and the rewrite 🎉)
-
Hey! The `master` branch is barely supported now, as we are very focused on the `triton-mlir` branch, and the plan is indeed to completely deprecate it when the merge happens. However, we don't want the merge to break anything fundamentally (there will be very minor changes, such as `view` -> `reshape`). Do you have a link to the specific workarounds you're concerned about? Does Kernl have some sort of test suite we could run before the merge happens?
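To illustrate the kind of minor change mentioned above: a kernel that currently flattens a block with `tl.view` would spell the same operation `tl.reshape` after the merge. A minimal sketch, assuming power-of-two block sizes; the kernel itself is hypothetical, only the renamed intrinsic comes from the reply:

```python
import triton
import triton.language as tl


@triton.jit
def flatten_block_kernel(x_ptr, out_ptr, BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr):
    # Load a 2D block of values (hypothetical example kernel).
    offs_m = tl.arange(0, BLOCK_M)
    offs_n = tl.arange(0, BLOCK_N)
    x = tl.load(x_ptr + offs_m[:, None] * BLOCK_N + offs_n[None, :])
    # Old backend: x = tl.view(x, (BLOCK_M * BLOCK_N,))
    # New (MLIR) backend: the same operation is spelled tl.reshape.
    x = tl.reshape(x, (BLOCK_M * BLOCK_N,))
    tl.store(out_ptr + tl.arange(0, BLOCK_M * BLOCK_N), x)
```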
-
Thank you, that's very clear. We do not track all the workarounds we have put in place; they may be strange things like declaring the same variable twice with the same name and value to avoid a segfault, avoiding certain block shapes that crash for no obvious reason (at least none we have understood), or just things we do a certain way because other apparently legitimate ways to program the same thing do not work as expected. (We have a -very- simple "debugger" that takes Triton code and executes it on PyTorch to check whether an issue is on our side or Triton's.) We do have unit tests: just type pytest in the package context and more than 2K tests run. Executing all of them is quite slow, around 1h30, so maybe you want to let us run them :-) Just ping me when you think it's time for it. I am very sorry for my next question... but I have to ask: is there a date for the release? (or at least for a feature-complete version of Triton)
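In spirit, that kind of cross-check boils down to comparing a Triton kernel's output against an eager PyTorch reference for the same computation. A rough, simplified sketch of the idea, not Kernl's actual tool (their debugger interprets the Triton code itself with PyTorch ops); the kernel and names here are illustrative and it assumes a CUDA device:

```python
import torch
import triton
import triton.language as tl


@triton.jit
def add_one_kernel(x_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    # Trivial kernel used only to illustrate the cross-check.
    offs = tl.program_id(0) * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n_elements
    x = tl.load(x_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + 1.0, mask=mask)


def check_against_pytorch(x: torch.Tensor) -> None:
    # Run the Triton kernel.
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    add_one_kernel[grid](x, out, n, BLOCK=1024)
    # Run the same computation in eager PyTorch as the reference.
    expected = x + 1.0
    # If this fails, the bug is likely on the Triton side (or in the kernel itself).
    torch.testing.assert_close(out, expected)


check_against_pytorch(torch.randn(4096, device="cuda"))
```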
-
Ah, I see. Yeah, the hope is that these workarounds won't be needed anymore when the merge happens (or shortly after), and that they won't break anything on the new backend. We're planning to merge mid-December. We've been very hard at work lately and it's starting to look good, but we still need a few changes to reach performance parity on dense matmul and to get flash attention working.