LLVM ERROR: floatFromInt cast #2

Closed
kassane opened this issue Feb 22, 2024 · 3 comments
Labels: bug, good first issue, help wanted

Comments

kassane (Owner) commented Feb 22, 2024

Building and linking with zig only:

./zig-x86_64-relsafe-espressif-linux-musl-baseline/zig build-exe start.zig -target xtensa-freestanding-none -mcpu=<cpuname>

  • esp32
  • esp32s3
  • cnl (intel cannonlake variant for xtensa)

Workaround: add -fno-compiler-rt to skip the error below.

LLVM Emit Object... LLVM ERROR: Cannot select: 0x7379c54b5200: f32 = fp16_to_fp 0x7379c5783cc0, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
  0x7379c5783cc0: i32 = or 0x7379c578d240, 0x7379c578d390, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
    0x7379c578d240: i32 = AssertZext 0x7379c54a17c0, ValueType:ch:i16, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
      0x7379c54a17c0: i32,ch = CopyFromReg 0x7379c5d20110, Register:i32 %1, float_from_int.zig:46:9 @[ floatsihf.zig:11:24 ]
        0x7379c54b60c0: i32 = Register %1
    0x7379c578d390: i32 = XtensaISD::PCREL_WRAPPER TargetConstantPool:i32<i32 31744> 0
      0x7379c5786680: i32 = TargetConstantPool<i32 31744> 0
In function: __floatsihf
[1]    3295 IOT instruction (core dumped)  ./zig-x86_64-relsafe-espressif-linux-musl-baseline/zig build-exe start.zig

// Compute exponent
if ((int_bits > max_exp) and (exp > max_exp)) // If exponent too large, overflow to infinity
    return @bitCast(sign_bit | @as(uT, @bitCast(inf)));

fn __floatdihf(a: i64) callconv(.C) f16 {
    return floatFromInt(f16, a);
}
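
One detail of the dump can be checked on the host: the constant-pool value 31744 that feeds the `or` node is 0x7C00, the IEEE 754 binary16 bit pattern for +infinity. That is consistent with the `sign_bit | @bitCast(inf)` overflow branch quoted above being the code that reaches `fp16_to_fp`. A minimal sketch, assuming a Zig version with the single-argument `@bitCast` (0.11 or later) and run with `zig test` on the host, not on Xtensa:

```zig
const std = @import("std");

test "31744 is the f16 +inf bit pattern" {
    // Reinterpret +inf as its 16 raw bits; no numeric conversion happens here.
    const inf_bits: u16 = @bitCast(std.math.inf(f16));
    try std.testing.expectEqual(@as(u16, 0x7C00), inf_bits);
    try std.testing.expectEqual(@as(u16, 31744), inf_bits);
}
```

This only confirms where the constant comes from; it does not reproduce the instruction selection failure, which needs the Xtensa backend.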

  • esp32s2 works!!
time report
===-------------------------------------------------------------------------===
                      Instruction Selection and Scheduling
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0000 seconds (0.0000 wall clock)

   ---User Time---   --User+System--   ---Wall Time---  --- Name ---
   0.0000 ( 34.8%)   0.0000 ( 34.8%)   0.0000 ( 34.4%)  Instruction Selection
   0.0000 ( 15.2%)   0.0000 ( 15.2%)   0.0000 ( 15.4%)  Instruction Creation
   0.0000 ( 13.0%)   0.0000 ( 13.0%)   0.0000 ( 13.3%)  Instruction Scheduling
   0.0000 ( 10.9%)   0.0000 ( 10.9%)   0.0000 ( 11.3%)  Vector Legalization
   0.0000 ( 10.9%)   0.0000 ( 10.9%)   0.0000 ( 10.8%)  DAG Combining 1
   0.0000 (  4.3%)   0.0000 (  4.3%)   0.0000 (  4.6%)  DAG Legalization
   0.0000 (  4.3%)   0.0000 (  4.3%)   0.0000 (  4.1%)  Type Legalization
   0.0000 (  4.3%)   0.0000 (  4.3%)   0.0000 (  4.1%)  DAG Combining 2
   0.0000 (  2.2%)   0.0000 (  2.2%)   0.0000 (  2.1%)  Instruction Scheduling Cleanup
   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)  Total

===-------------------------------------------------------------------------===
                          Pass execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0004 seconds (0.0004 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0002 ( 48.3%)   0.0000 (  0.0%)   0.0002 ( 45.2%)   0.0002 ( 44.6%)  Xtensa DAG->DAG Pattern Instruction Selection
   0.0001 ( 15.9%)   0.0000 (  0.0%)   0.0001 ( 14.9%)   0.0001 ( 14.3%)  Xtensa Assembly Printer
   0.0000 (  7.8%)   0.0000 (  0.0%)   0.0000 (  7.3%)   0.0000 (  7.0%)  Live DEBUG_VALUE analysis
   0.0000 (  3.6%)   0.0000 (  0.0%)   0.0000 (  3.4%)   0.0000 (  3.3%)  Prologue/Epilogue Insertion & Frame Finalization
   0.0000 (  3.0%)   0.0000 (  0.0%)   0.0000 (  2.8%)   0.0000 (  2.8%)  MachineDominator Tree Construction
   0.0000 (  0.0%)   0.0000 ( 13.0%)   0.0000 (  0.8%)   0.0000 (  1.4%)  Lower constant intrinsics
   0.0000 (  1.2%)   0.0000 (  0.0%)   0.0000 (  1.1%)   0.0000 (  1.1%)  Fast Register Allocator
   0.0000 (  0.9%)   0.0000 (  0.0%)   0.0000 (  0.8%)   0.0000 (  0.8%)  Free MachineFunction
   0.0000 (  0.9%)   0.0000 (  0.0%)   0.0000 (  0.8%)   0.0000 (  0.8%)  Machine Natural Loop Construction
   0.0000 (  0.0%)   0.0000 (  8.7%)   0.0000 (  0.6%)   0.0000 (  0.8%)  Remove unreachable blocks from the CFG #2
   0.0000 (  0.9%)   0.0000 (  0.0%)   0.0000 (  0.8%)   0.0000 (  0.8%)  Stack Frame Layout Analysis
   0.0000 (  0.9%)   0.0000 (  0.0%)   0.0000 (  0.8%)   0.0000 (  0.8%)  Finalize ISel and expand pseudo-instructions
   0.0000 (  0.9%)   0.0000 (  0.0%)   0.0000 (  0.8%)   0.0000 (  0.8%)  Xtensa instruction size reduction pass
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.6%)  Two-Address instruction pass
   0.0000 (  0.3%)   0.0000 (  4.3%)   0.0000 (  0.6%)   0.0000 (  0.6%)  Safe Stack instrumentation pass
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.6%)  Insert stack protectors
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.6%)  Machine Natural Loop Construction #2
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.6%)  Branch relaxation pass
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.6%)  Scalarize Masked Memory Intrinsics
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.6%)  Fixup Statepoint Caller Saved
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.6%)  Expand vector predication intrinsics
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.6%)  StackMap Liveness Analysis
   0.0000 (  0.3%)   0.0000 (  8.7%)   0.0000 (  0.8%)   0.0000 (  0.6%)  Shadow Stack GC Lowering
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.6%)  MachineDominator Tree Construction #2
   0.0000 (  0.0%)   0.0000 (  8.7%)   0.0000 (  0.6%)   0.0000 (  0.6%)  Pre-ISel Intrinsic Lowering
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Eliminate PHI nodes for register allocation
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.5%)  Expand large div/rem
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Machine Optimization Remark Emitter
   0.0000 (  0.3%)   0.0000 (  4.3%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Expand large fp convert
   0.0000 (  0.3%)   0.0000 (  4.3%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Expand Atomic instructions
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Analyze Machine Code For Garbage Collection
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Insert fentry calls
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.5%)  Insert XRay ops
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Implement the 'patchable-function' attribute
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Xtensa bool reg fixup pass
   0.0000 (  0.3%)   0.0000 (  4.3%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Assignment Tracking Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.5%)  Lower invoke and unwind, for unwindless code generators
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Machine Optimization Remark Emitter #2
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Lazy Machine Block Frequency Analysis #2
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Contiguously Lay Out Funclets
   0.0000 (  0.0%)   0.0000 ( 13.0%)   0.0000 (  0.8%)   0.0000 (  0.5%)  Remove unreachable blocks from the CFG
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.5%)  Lower Garbage Collection Instructions
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.5%)  Prepare callbr
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.5%)  Xtensa Hardware Loop Fixup
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Assumption Cache Tracker
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Local Stack Slot Allocation
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Xtensa fix PSRAM cache issue in the ESP32 chips
   0.0000 (  0.0%)   0.0000 (  4.3%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Expand reduction intrinsics
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Xtensa Hardware Loops
   0.0000 (  0.6%)   0.0000 (  0.0%)   0.0000 (  0.6%)   0.0000 (  0.3%)  Post-RA pseudo instruction expansion pass
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Lazy Machine Block Frequency Analysis
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Create Garbage Collector Module Metadata
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Target Library Information
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Target Transform Information
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Remove Redundant DEBUG_VALUE analysis
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Profile summary info
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.3%)  Machine Sanitizer Binary Metadata
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.2%)  Xtensa Constant Islands
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.2%)  Machine Branch Probability Analysis
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)  Target Pass Configuration
   0.0000 (  0.3%)   0.0000 (  0.0%)   0.0000 (  0.3%)   0.0000 (  0.0%)  Machine Module Information
   0.0003 (100.0%)   0.0000 (100.0%)   0.0004 (100.0%)   0.0004 (100.0%)  Total

===-------------------------------------------------------------------------===
                                 DWARF Emission
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0002 seconds (0.0002 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0001 (100.0%)   0.0001 (100.0%)   0.0002 (100.0%)   0.0002 (100.0%)  Debug Info Emission
   0.0001 (100.0%)   0.0001 (100.0%)   0.0002 (100.0%)   0.0002 (100.0%)  Total

===-------------------------------------------------------------------------===
                        Analysis execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0000 seconds (0.0000 wall clock)

   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0000 ( 60.0%)   0.0000 ( 60.0%)   0.0000 ( 61.9%)  TargetLibraryAnalysis
   0.0000 ( 20.0%)   0.0000 ( 20.0%)   0.0000 ( 19.0%)  InnerAnalysisManagerProxy<FunctionAnalysisManager, Module>
   0.0000 ( 20.0%)   0.0000 ( 20.0%)   0.0000 ( 19.0%)  ProfileSummaryAnalysis
   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)  Total

===-------------------------------------------------------------------------===
                          Pass execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0000 seconds (0.0000 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0000 ( 20.0%)   0.0000 ( 34.8%)   0.0000 ( 32.1%)   0.0000 ( 38.5%)  AlwaysInlinerPass
   0.0000 ( 40.0%)   0.0000 ( 34.8%)   0.0000 ( 35.7%)   0.0000 ( 34.4%)  AnnotationRemarksPass
   0.0000 ( 40.0%)   0.0000 ( 30.4%)   0.0000 ( 32.1%)   0.0000 ( 27.0%)  CoroConditionalWrapper
   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)  Total

===-------------------------------------------------------------------------===
                          Pass execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0000 seconds (0.0000 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0000 ( 20.0%)   0.0000 ( 34.8%)   0.0000 ( 32.1%)   0.0000 ( 38.5%)  AlwaysInlinerPass
   0.0000 ( 40.0%)   0.0000 ( 34.8%)   0.0000 ( 35.7%)   0.0000 ( 34.4%)  AnnotationRemarksPass
   0.0000 ( 40.0%)   0.0000 ( 30.4%)   0.0000 ( 32.1%)   0.0000 ( 27.0%)  CoroConditionalWrapper
   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)  Total

===-------------------------------------------------------------------------===
                        Analysis execution timing report
===-------------------------------------------------------------------------===
  Total Execution Time: 0.0000 seconds (0.0000 wall clock)

   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   0.0000 ( 60.0%)   0.0000 ( 60.0%)   0.0000 ( 61.9%)  TargetLibraryAnalysis
   0.0000 ( 20.0%)   0.0000 ( 20.0%)   0.0000 ( 19.0%)  InnerAnalysisManagerProxy<FunctionAnalysisManager, Module>
   0.0000 ( 20.0%)   0.0000 ( 20.0%)   0.0000 ( 19.0%)  ProfileSummaryAnalysis
   0.0000 (100.0%)   0.0000 (100.0%)   0.0000 (100.0%)  Total

Reference

kassane added the bug, help wanted, and good first issue labels on Feb 22, 2024
kassane changed the title from "LLVM ERROR: during fp16 to fp32 cast" to "LLVM ERROR: floatFromInt cast" on Feb 23, 2024
kassane (Owner, Author) commented Feb 26, 2024

Maybe here's where the problem lies:

/// This governs whether to use these symbol names for f16/f32 conversions
/// rather than the standard names:
/// * __gnu_f2h_ieee
/// * __gnu_h2f_ieee
/// Known correct configurations:
/// x86_64-freestanding-none => true
/// x86_64-linux-none => true
/// x86_64-linux-gnu => true
/// x86_64-linux-musl => true
/// x86_64-linux-eabi => true
/// arm-linux-musleabihf => true
/// arm-linux-gnueabihf => true
/// arm-linux-eabihf => false
/// wasm32-wasi-musl => false
/// wasm32-freestanding-none => false
/// x86_64-windows-gnu => true
/// x86_64-windows-msvc => true
/// any-macos-any => false
pub const gnu_f16_abi = switch (builtin.cpu.arch) {
    .wasm32,
    .wasm64,
    .riscv64,
    .riscv32,
    => false,
    .x86, .x86_64 => true,
    .arm, .armeb, .thumb, .thumbeb => switch (builtin.abi) {
        .eabi, .eabihf => false,
        else => true,
    },
    else => !builtin.os.tag.isDarwin(),
};

/// AArch64 is the only ABI (at the moment) to support f16 arguments without the
/// need for extending them to wider fp types.
/// TODO remove this; do this type selection in the language rather than
/// here in compiler-rt.
pub fn F16T(comptime OtherType: type) type {
    return switch (builtin.cpu.arch) {
        .arm, .armeb, .thumb, .thumbeb => if (std.Target.arm.featureSetHas(builtin.cpu.features, .has_v8))
            switch (builtin.abi.floatAbi()) {
                .soft => u16,
                .hard => f16,
            }
        else
            u16,
        .aarch64, .aarch64_be, .aarch64_32 => f16,
        .riscv64 => if (builtin.zig_backend == .stage1) u16 else f16,
        .x86, .x86_64 => if (builtin.target.isDarwin()) switch (OtherType) {
            // Starting with LLVM 16, Darwin uses different abi for f16
            // depending on the type of the other return/argument..???
            f32, f64 => u16,
            f80, f128 => f16,
            else => unreachable,
        } else f16,
        else => u16,
    };
}
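
Read literally, neither switch lists Xtensa, so an Xtensa target takes both `else` branches: `gnu_f16_abi` is true on any non-Darwin OS, and `F16T` yields `u16`. The sketch below mirrors only the first switch so it can be checked on the host. `gnuF16AbiFor` and its `is_darwin` parameter are hypothetical names introduced for illustration, and it assumes that `.xtensa` is a member of `std.Target.Cpu.Arch` in the Zig version used:

```zig
const std = @import("std");
const Arch = std.Target.Cpu.Arch;
const Abi = std.Target.Abi;

/// Hypothetical helper: the same decision table as `gnu_f16_abi` above, but
/// taking the target as parameters instead of reading `builtin`.
fn gnuF16AbiFor(arch: Arch, abi: Abi, is_darwin: bool) bool {
    return switch (arch) {
        .wasm32, .wasm64, .riscv64, .riscv32 => false,
        .x86, .x86_64 => true,
        .arm, .armeb, .thumb, .thumbeb => switch (abi) {
            .eabi, .eabihf => false,
            else => true,
        },
        // Every architecture not listed above lands here, Xtensa included.
        else => !is_darwin,
    };
}

test "xtensa falls through to the else branch" {
    try std.testing.expect(gnuF16AbiFor(.xtensa, .none, false));
    // riscv32 is listed explicitly and opts out of the GNU symbol names.
    try std.testing.expect(!gnuF16AbiFor(.riscv32, .none, false));
}
```

Whether these defaults are actually wrong for Xtensa is not established in this thread; the sketch only shows which branches apply.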

kassane (Owner, Author) commented Feb 26, 2024

According to https://github.com/kassane/zig-espressif-bootstrap/blob/xtensa/zig/lib/std/Target/xtensa.zig, esp32s2 does not use the S32C1I and Single FP features, so I tried disabling them for the other CPUs:

$> ./zig-x86_64-relsafe-espressif-linux-musl-baseline/zig build-exe start.zig \
    -target xtensa-freestanding-none -mcpu=<esp32|esp32s3>-fp
LLVM Emit Object... error: <unknown>:0: Undefined temporary symbol .LBB385_2

$> ./zig-x86_64-relsafe-espressif-linux-musl-baseline/zig build-exe start.zig \
    -target xtensa-freestanding-none -mcpu=<esp32|esp32s3>-fp-s32c1i

kassane (Owner, Author) commented Apr 1, 2024

kassane closed this as completed on Oct 22, 2024