Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPAL_FLASH_READ can take 1003ms #207

Open
ghost opened this issue Dec 12, 2018 · 13 comments
Open

OPAL_FLASH_READ can take 1003ms #207

ghost opened this issue Dec 12, 2018 · 13 comments
Assignees
Labels

Comments

@ghost
Copy link

ghost commented Dec 12, 2018

Spotted on a simple boot-to-OS test on Habanero, a OPAL_FLASH_READ call took 1003ms to complete.

My guess is this is opal-prd starting up and doing a bunch of flash things.

@ghost ghost added the bug label Dec 12, 2018
@debmc
Copy link
Collaborator

debmc commented Apr 24, 2019

Moving this to -> https://github.ibm.com/open-power/opal-project-mgmt/issues/61

I think this one can be closed and we can track under the opal-project-mgmt issue.

@ghost
Copy link
Author

ghost commented Apr 29, 2019 via email

@debmc
Copy link
Collaborator

debmc commented Apr 29, 2019

Tracking here

Ok, please assign to me (it would not allow me).

@debmc
Copy link
Collaborator

debmc commented Apr 29, 2019

Subject: libflash API: Add 'Prepare' call to allow asynchronous behaviour

We need to be able to perform asynchronous flash operations across the various backends.

Suggestions to look at:
1 - https://github.com/stewart-ibm/skiboot/commits/async-flash
2 - posix_fadvise(POSIX_FADV_WILLNEED)

My plan to approach this task:

1 - Learn the existing libflash and related code
2 - Understand the limitations of each backend
3 - Understand the async-flash patch for applicability
4 - Design/Code a prototype
5 - Deliver patch for review and comment

@ghost ghost assigned debmc Apr 30, 2019
@oohal
Copy link
Contributor

oohal commented Jul 15, 2019

@debmc any updates on this?

@debmc
Copy link
Collaborator

debmc commented Jul 15, 2019

updates

Working async tree -> https://github.com/debmc/skiboot/tree/async

Prototype testing is looking good, need to circle back and review with current prototype implementation if the goals are achieved across the backends.

@debmc
Copy link
Collaborator

debmc commented Jul 15, 2019

 hiomap: Enable Async IPMI messaging

To provide higher layer async operations to access
the target flash, enable hiomap to perform async
ipmi messaging (post boot).

Special considerations and constraints are to prevent
recursive locking and/or polling inside OPAL calls.

Some summary op-test Unit of Work Comparisons

Good Operations Faster
Bad Operations (test only) Penalty on waiting for timeout

START runTestReadEraseWriteNVRAM
STOP runTestReadEraseWriteNVRAM
  8.3 secs SYNC
  6.6 secs ASYNC

START runTestReadWritePAYLOAD
STOP runTestReadWritePAYLOAD
  1.87 secs SYNC
  - differences are timeouts on bad ops
  2.75 secs ASYNC

START runTestWriteTOC
STOP runTestWriteTOC
  1.55 secs SYNC
  - differences are timeouts on bad ops
  2.52 secs ASYNC

@debmc
Copy link
Collaborator

debmc commented Aug 2, 2019

Refreshed the async tree -> https://github.com/debmc/skiboot/tree/async

Tests look good, may be worth some review if anyone has some time.

Outstanding work is to circle back on error path testing.

@amboar
Copy link
Contributor

amboar commented Aug 6, 2019

I had a quick look. There's a bunch of formatting and logging changes in the diff which bloats out the patch, I suggest separating this out as much as possible. Additionally a sizeable chunk of callback handling has been implemented alongside what already exists, and then there are various conditional blocks to ensure one path or another is taken. Maybe I just need to take the time to read the patch properly, but I guess I was missing the "why" on first skim. I think putting a summary of the approach in the commit message would be useful.

I was also a little concerned by the addition of "moving" to the window states - the window state is meant to reflect those described by the protocol, whereas this moving state is an implementation detail, so my gut instinct is we need to think about this a bit more. A similar concern was the introduction of state to track sequence ids of specific in-flight commands; the protocol only supports one active window and all operations against that window should to be serialised, so there should not currently be a case where multiple messages are in-flight. I suspect that needs more thought as well, or a convincing story to be written in comments or the commit message.

@hegdevasant
Copy link

Refreshed the async tree -> https://github.com/debmc/skiboot/tree/async

Tests look good, may be worth some review if anyone has some time.

Can you please post patches to mailing list? It becomes easy to review in mailing list.

-Vasant

@debmc
Copy link
Collaborator

debmc commented Aug 6, 2019

the protocol only supports one active window and all operations against that window should to be serialised, so there should not currently be a case where multiple messages are in-flight

I'll add some comments to the code as suggested, some overview here:

The tracking of sequence id's is for debug of ipmi messages in OPAL, so when tracing is turned on with log-level-memory and the BT Queue we can identify any queue handling or timing issues to map hiomap ipmi messages to other OPAL ipmi non-sync messages, etc. All operations on the window are serialized.

The "moving" is an internal state, however its needed to help control the state transition since we are now waiting on the move ('ll add some comments to the code as well).

The callback handling is needing to be kicked sometimes by sync messages (during boot) and post boot by async messages, therefore we need to conditionally identify when each method type would be called at the specific point in time.

@debmc
Copy link
Collaborator

debmc commented Sep 7, 2019

Refreshed the async tree -> https://github.com/debmc/skiboot/tree/async

Still need to do some more testing and regression testing.

@debmc
Copy link
Collaborator

debmc commented Oct 17, 2019

Sent patch series to skiboot mailing list:

[Skiboot] [PATCH 00/12] ipmi-hiomap: Enablement for Async opal_flash_op's
https://lists.ozlabs.org/pipermail/skiboot/2019-October/015624.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants