SystemZ Backend: Add support for operations such as FP16_TO_FP and FP_TO_FP16 #50374
Comments
Any updates from the community regarding this issue? Thanks!
Moving out of MLIR: this is a backend issue.
It looks like commit https://reviews.llvm.org/rG8cd8120a7b5d, which has been tagged as 13.0.0-rc, could solve this issue, since it adds support for arch14 and operations related to FP16 conversion to the SystemZ backend. Could anyone from the community help confirm this? Thanks!
This issue still persists on TensorFlow v2.8.0, which uses LLVM 15. It looks like specific half-precision (16-bit) operations are still missing in the SystemZ LLVM backend. Can anyone from the community take a look at this issue? Thanks very much!
Recently we've run test cases under Looks like We also found that when building TensorFlow with options We think this could be used as a workaround for now, but to address the root cause, Any thoughts or suggestions from the community reg this issue would be greatly appreciated. Thanks! |
The following function (compiler explorer):
define half @deref(ptr %p) {
  %x = load half, ptr %p
  ret half %x
}
currently fails to compile when compiling for
Other operations involving |
FWIW, this is the only remaining blocker I'm aware of for Zig to be able to target s390x:
❯ zig cc s390x.c -target s390x-linux-musl
LLVM ERROR: Cannot select: 0x6d24170: i32 = fp_to_fp16 0x6d23a00
  0x6d23a00: f32,ch = CopyFromReg 0x5e82a10, Register:f32 %10
    0x6c03da0: f32 = Register %10
In function: __fixhfsi
On s390x, every use of the f16 data type will currently ICE due to llvm/llvm-project#50374, causing doctest failures on the platform. Most doctests were already restricted to certain platforms, so fix this by likewise restricting the remaining five.
Rollup merge of rust-lang#127588 - uweigand:s390x-f16-doctests, r=tgross35 core: Limit remaining f16 doctests to x86_64 linux On s390x, every use of the f16 data type will currently ICE due to llvm/llvm-project#50374, causing doctest failures on the platform. Most doctests were already restricted to certain platforms, so fix this by likewise restricting the remaining five.
Patch in progress here: #109164
Extended Description
Hi,
We have recently been running the TensorFlow v2.5.0 test suite on s390x (Ubuntu 18.04).
The test case //tensorflow/compiler/tests:sort_ops_test_cpu fails with the following error:
LLVM ERROR: Cannot select: 0x3ff14167ca0: f32 = fp16_to_fp 0x3ff14167f10
0x3ff14167f10: i32,ch = load<(dereferenceable load 2 from %ir.4, !alias.scope !6, !noalias !4), zext from i16> 0x3ff14197548, 0x3ff141678f8, undef:i64
0x3ff141678f8: i64,ch = load<(load 8 from %ir.3)> 0x3ff14197548, 0x3ff14167890, undef:i64
0x3ff14167890: i64 = add nuw 0x3ff141674e8, Constant:i64<8>
0x3ff141674e8: i64,ch = CopyFromReg 0x3ff14197548, Register:i64 %2
0x3ff14167480: i64 = Register %2
0x3ff14167828: i64 = Constant<8>
0x3ff14167758: i64 = undef
0x3ff14167758: i64 = undef
In function: compare_lt_WCTTAtafbb4__.7
Other test cases, such as //tensorflow/python/keras/optimizer_v2:adam_test and //tensorflow/core/kernels/mlir_generated:abs_cpu_f16_f16_gen_test, also fail on s390x for similar reasons. A related issue (tensorflow/tensorflow#44362) has been filed in the TensorFlow GitHub issue tracker.
We think the root cause is that the SystemZ LLVM backend (llvm/lib/Target/SystemZ/SystemZISelLowering.cpp) lacks support for operations such as FP16_TO_FP and FP_TO_FP16, which promote and truncate half-precision (16-bit) floating-point numbers. Could support for these operations be added to the SystemZ LLVM backend? Thanks!
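For readers unfamiliar with these two nodes, here is a rough Python sketch of the value-level conversions they stand for. It is only an illustration, not how LLVM or the SystemZ backend implements them. The helper names fp16_to_fp and fp_to_fp16 are hypothetical and simply mirror the node names. The sketch also assumes Python's binary64 float as the wider type, whereas the errors above involve f32, and it relies on the "e" (binary16) format of Python's struct module for the rounding step.

```python
import math
import struct


def fp16_to_fp(bits: int) -> float:
    """Promote a raw IEEE 754 binary16 bit pattern to a Python float.

    Hypothetical helper for illustration; not an LLVM API.
    """
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F  # 5 exponent bits, bias 15
    fraction = bits & 0x3FF         # 10 fraction bits
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed scale of 2**-24.
        return sign * fraction * 2.0 ** -24
    if exponent == 0x1F:
        # All-ones exponent encodes infinity (fraction 0) or NaN.
        return sign * math.inf if fraction == 0 else math.nan
    # Normal number: implicit leading 1 plus the 10-bit fraction.
    return sign * (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)


def fp_to_fp16(value: float) -> int:
    """Truncate a Python float to binary16 and return the raw 16 bits.

    Assumption: struct's "e" format performs the rounding. It raises
    OverflowError for finite values too large for binary16.
    """
    return struct.unpack("<H", struct.pack("<e", value))[0]


if __name__ == "__main__":
    print(fp16_to_fp(0x3C00))    # 1.0
    print(hex(fp_to_fp16(1.0)))  # 0x3c00
```

Promotion is always exact because every binary16 value is representable in the wider format. Truncation is the lossy direction, which is why it needs a defined rounding behaviour.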