ia32/exceptions: improve FPU exceptions #624

Open
wants to merge 6 commits into
base: master

badochov

@badochov badochov commented Nov 18, 2024

Description

This PR aims to improve the handling of exceptions caused by the FPU on Intel (ia32).

The first commit resolves an issue where, recently, exception 16 was presented on the screen on an FPU error instead of a triple fault, because the flags were set by the wrong instruction.
The next commit fixes saving context in exceptions.
The following commit changes fsave in the context-saving procedure to fnsave, to prevent infinite recursion on an FPU exception, which caused a triple fault. With this change, exception 16 is reported correctly as exception 16 rather than as a triple fault. Thus, exception 16 is now expected to appear, not a triple fault.
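For context, `fsave` is assembled as `fwait` followed by `fnsave`, and `fwait` reports any pending unmasked x87 exception before the state is stored, whereas `fnsave` stores the state without that check and then reinitialises the FPU. The sketch below is only a toy C model of that difference, not the kernel code; `fpu_model_t`, `handle_exc16` and `MAX_DEPTH` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Toy model of the x87 state relevant here (hypothetical, not the kernel's). */
typedef struct {
	bool pendingException; /* an unmasked exception is flagged in the status word */
} fpu_model_t;

enum { MAX_DEPTH = 8 }; /* stands in for exhausting the kernel stack */

/* Returns the nesting depth at which the exception 16 handler completes,
 * or -1 if it keeps re-entering itself (the triple-fault case in this model). */
static int handle_exc16(fpu_model_t *fpu, bool useWaitingSave, int depth)
{
	if (depth >= MAX_DEPTH) {
		return -1;
	}
	if (useWaitingSave && fpu->pendingException) {
		/* fsave = fwait + fnsave: fwait reports the pending exception first,
		 * so the handler is re-entered before anything is saved. */
		return handle_exc16(fpu, useWaitingSave, depth + 1);
	}
	/* fnsave: store the state without checking, then reinitialise the FPU,
	 * which also clears the pending exception. */
	fpu->pendingException = false;
	return depth;
}
```

In this model the waiting variant never finishes with an exception pending, while the non-waiting variant finishes on the first entry, which matches the behaviour change described above.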

The following commits are optimizations.

I tried to dig into the root cause. In my testing I have found:

  • the failure always happens on the first scheduling of the program after returning from the program handler
  • the context on the user stack in affected runs is corrupted just after entering the _signal_trampoline, I have confirmed it is correct in
    hal_spinlockClear(&threads_common.spinlock, &sc);
  • the corruption affects two fields: savesp and cr0Bits. In affected runs, cr0Bits is corrupted to hold the same value as eax.

The failure occurs due to a bad cr0Bits value, as we remove the TS flag from cr0 without initializing the FPU.
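To illustrate why a corrupted cr0Bits matters: while CR0.TS (bit 3) is set, the first FPU instruction traps with exception 7 (#NM), which gives the kernel a chance to set up the FPU. The following C sketch models that under stated assumptions; `cpu_model_t`, `restore_cr0_bits` and `first_fpu_instruction` are hypothetical names, and the way the TS bit is merged back is an assumption, not the actual restore code.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_TS (1u << 3) /* Task Switched: FPU instructions trap with #NM while set */

/* Hypothetical model of the CPU state that matters for this failure. */
typedef struct {
	uint32_t cr0;
	bool fpuInitialised;
} cpu_model_t;

typedef enum { fpu_ok, fpu_trap_nm, fpu_uninitialised_use } fpu_result_t;

/* Assumed restore step: take the TS bit from the saved cr0Bits field. */
static void restore_cr0_bits(cpu_model_t *cpu, uint32_t cr0Bits)
{
	cpu->cr0 = (cpu->cr0 & ~CR0_TS) | (cr0Bits & CR0_TS);
}

/* What the first FPU instruction executed by the thread would experience. */
static fpu_result_t first_fpu_instruction(const cpu_model_t *cpu)
{
	if ((cpu->cr0 & CR0_TS) != 0) {
		return fpu_trap_nm; /* the kernel can initialise the FPU in the #NM handler */
	}
	return cpu->fpuInitialised ? fpu_ok : fpu_uninitialised_use;
}
```

If the saved field is overwritten with a register value whose bit 3 happens to be clear, the model skips the #NM trap and the thread uses an FPU that was never initialised, which is the failure mode described above.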

Additionally, I have noticed that in

movl %eax, FPU_CONTEXT_SIZE(%esp)
and
movl %eax, FPU_CONTEXT_SIZE(%esp)
cr0Bits is temporarily replaced with eax. Additionally, ia32 is the only platform that does not disable interrupts during context restoration.

Unfortunately, I was not able to understand the real reason behind it. Initially this PR included a commit with a potential solution, disabling interrupts during stack restoration, but it proved insufficient. At first I thought it worked, but it did not, as the interrupts were not properly re-enabled before passing control to user space.

Perhaps the interleaving that causes the problem is a timer interrupt arriving just after the iret instruction that passes control to the user in hal_jmp.

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (refactoring, style fixes, git/CI config, submodule management, no code logic changes)

How Has This Been Tested?

  • Already covered by automatic testing.
  • New test added: (add PR link here).
  • Tested by hand on: ia32-generic-qemu

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing linter checks and tests passed.
  • My changes generate no new compilation warnings for any of the targets.

Special treatment

  • This PR needs additional PRs to work (list the PRs, preferably in merge-order).
  • I will merge this PR by myself when appropriate.

jnz .exception_pushRegisters should have flags from andl not subl
instruction.

JIRA: RTOS-954
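The point of this commit can be pictured with a small C model of the zero flag. Both `andl` and `subl` write ZF, so a `jnz` placed after both sees only the result of the later one. The sketch below is an illustration only: the tested value, the mask, and the stack adjustment are hypothetical, since the actual operands are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* ZF as left by `andl src, dst`: set when the bitwise AND is zero. */
static bool zf_after_and(uint32_t dst, uint32_t src)
{
	return (dst & src) == 0;
}

/* ZF as left by `subl src, dst`: set when the difference is zero. */
static bool zf_after_sub(uint32_t dst, uint32_t src)
{
	return (dst - src) == 0;
}

/* Ordering andl, subl, jnz: the branch sees the flags of subl. */
static bool jnz_taken_sub_last(uint32_t value, uint32_t mask, uint32_t esp, uint32_t size)
{
	bool zf = zf_after_and(value, mask);
	zf = zf_after_sub(esp, size); /* overwrites the flag produced by andl */
	return !zf;
}

/* Ordering subl, andl, jnz: the branch sees the flags of andl. */
static bool jnz_taken_and_last(uint32_t value, uint32_t mask, uint32_t esp, uint32_t size)
{
	bool zf = zf_after_sub(esp, size);
	zf = zf_after_and(value, mask); /* the test result is what jnz consumes */
	return !zf;
}
```

With a tested bit that is clear and a non-zero stack pointer after adjustment, the first ordering takes the branch and the second does not.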
savesp saved in exception_pushContext pointed to savesp instead of edi.

JIRA: RTOS-954
Checking for an FPU exception in the preamble of exception handling causes
infinite recursion of FPU exceptions if one is found.

JIRA: RTOS-954
@badochov badochov marked this pull request as ready for review November 18, 2024 14:37

github-actions bot commented Nov 18, 2024

Unit Test Results

7 949 tests  ±0   7 231 ✅ ±0   40m 11s ⏱️ +2s
  461 suites ±0     718 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 1967f0f. ± Comparison against base commit e2a6b1f.

♻️ This comment has been updated with latest results.

@badochov badochov removed the request for review from agkaminski November 18, 2024 15:03
@badochov badochov marked this pull request as draft November 18, 2024 15:03
As we use an interrupt gate for exceptions, interrupts are already
disabled.

JIRA: RTOS-954
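As background for this commit: on IA-32, a 32-bit interrupt gate (type 0xE) makes the CPU clear EFLAGS.IF on entry, while a 32-bit trap gate (type 0xF) leaves it unchanged. The sketch below encodes an IDT gate descriptor in C to show where that type field lives; `idt_gate` and `gate_disables_interrupts` are hypothetical helpers, not functions from this repository.

```c
#include <stdint.h>

#define GATE_TYPE_INTR32 0xEu /* 32-bit interrupt gate: CPU clears EFLAGS.IF on entry */
#define GATE_TYPE_TRAP32 0xFu /* 32-bit trap gate: EFLAGS.IF is left unchanged */
#define GATE_PRESENT     0x80u

/* Build the two 32-bit words of an IA-32 IDT gate descriptor. */
static void idt_gate(uint32_t handler, uint16_t selector, uint8_t type, uint8_t dpl, uint32_t out[2])
{
	/* Access byte: P (bit 7), DPL (bits 6:5), 0 (bit 4), gate type (bits 3:0). */
	uint32_t flags = GATE_PRESENT | ((uint32_t)(dpl & 3u) << 5) | (type & 0xFu);

	out[0] = ((uint32_t)selector << 16) | (handler & 0xFFFFu); /* selector, offset 15:0 */
	out[1] = (handler & 0xFFFF0000u) | (flags << 8);          /* offset 31:16, access byte */
}

/* Non-zero when entering through this gate disables maskable interrupts. */
static int gate_disables_interrupts(const uint32_t gate[2])
{
	return ((gate[1] >> 8) & 0xFu) == GATE_TYPE_INTR32;
}
```

A handler installed through a gate for which `gate_disables_interrupts` is true does not need its own `cli`, which is the redundancy this commit removes.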
Apply same optimizations as to exceptions_pushContext

JIRA: RTOS-954
@badochov badochov changed the title ia32/exceptions: fix FPU exceptions ia32/exceptions: improve FPU exceptions Nov 18, 2024
@badochov badochov marked this pull request as ready for review November 18, 2024 16:25