Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding manual sample entries #193

Closed
JCash opened this issue Apr 13, 2022 · 13 comments
Closed

Adding manual sample entries #193

JCash opened this issue Apr 13, 2022 · 13 comments

Comments

@JCash
Copy link
Contributor

JCash commented Apr 13, 2022

In our use case, we have projects that have large Lua code bases, and sometimes they need to figure out where the time is spent in their scripts.

Lua has it's own way of adding debug callbacks, which allows the client to gather the data they need.
I think it would be beneficial to be able to add these "externally" collected samples into the profiler data.

On the C side, it might be good to mimic the current C code flow with the Begin/End pairs:

rmt_BeginManualSample(name, flags)

    rmt_ManualSampleSetTime(time)
    rmt_ManualSampleSetStart(start)

rmt_EndManualSample()

As a first step, the question is what do you think about the idea, and if it would fit Remotery?

@dwilliamson
Copy link
Collaborator

You've bumped into the lack of an extendable Sample API here.

Right now the solution is to add your own SampleType specific to the target platform and implement the exact same code as the CPU samplers, along with the same macros. It's very verbose and not friendly to external extension, like you require.

So the "plan" is to simplify the Sample implementations and expose custom callbacks to add your own Sample types without having to add anything to the implementation. That's not going to solve your immediate problem, however.

I don't want to add this extra kind of API but let me think about it all day today. It may be that you'll have to maintain a fork for this kind of custom modification but maybe there's a way to avoid that.

@JCash
Copy link
Contributor Author

JCash commented Apr 13, 2022

Ok, good to know. 👍
Please post any design ideas here, and I can try to keep our fork in line with those thoughts to try to make it more seamless later on when there's an update to the official api.

I'd also like to point out that this is not an immediate concern, but I would like to to have something in place for us during this year.

@dwilliamson
Copy link
Collaborator

Can you give me an example of the Lua functions you want to call that generates this debug/timing info for you?

@JCash
Copy link
Contributor Author

JCash commented Jun 12, 2022

Tbh, I'm not sure I need/want this feature request anymore.

Currently, I've implemented a profiler.scope_begin(name)/profiler.scope_end(() in Lua, which will be enough for our use case. One concern I had was regarding the fact that the Lua scope in question might miss the end scope due to a Lua error being thrown.
But, we've decided to let that responsibility be on the user side.

As for Lua, it has the capability of adding a debug hook, but I couldn't get it to work reliably with LuaJIT (only did a first test the other day).
It seemed to work fine with vanilla Lua 5.1 though.

Since we have the Lua begin/end, and it seems to solve our use case, I'll close this feature request.

@JCash JCash closed this as completed Jun 12, 2022
@dwilliamson
Copy link
Collaborator

You can quickly detect an unbalanced tree due to missing ends by marking your root with RMTSF_Root

@JCash
Copy link
Contributor Author

JCash commented Jul 16, 2022

Coming back to this, I believe I need to be able to make it a bit more "robust" than simply asserting on unmatched scopes.
GIven the scriptability in Lua, it's always possible that the rmt_EndCpuSample() is simply skipped (due to Lua's longjmp() call).

What would be the best way to achieve this?

@JCash
Copy link
Contributor Author

JCash commented Jul 17, 2022

I believe in that case, I should be able to protect that scope with a Lua PCall, so that I can catch any lua error, and then gracefully exit the profile scope.
I'll try that first.

@dwilliamson
Copy link
Collaborator

Can you inject a to-be-closed variable?

@JCash
Copy link
Contributor Author

JCash commented Jul 18, 2022

Do you mean like a struct with constructor/destructor (i.e. RAII behavior)?

Yes, we use those on the C++ side, however, the problem is that Lua can make calls to longjmp() (when it throws an error) which will simply skip any destructor code.

But, I can set my own position to jump to by making my own protected Lua call.
It however forces me to wrap a Lua function and not allow the user to make separate begin/end scope calls.

@dwilliamson
Copy link
Collaborator

to-be-closed variables call their __close metamethod even when they exit scope via error: https://www.lua.org/manual/5.4/manual.html#3.3.8

I was thinking you inject some lua bytecode to introduce the variable on each sampling scope, or do it via the API, if possible.

@JCash
Copy link
Contributor Author

JCash commented Jul 18, 2022

Aha, I didn't know about that functionality!

Unfortunately, we're stricly Lua 5.1 due to the compatibility with LuaJIT.

@dwilliamson
Copy link
Collaborator

dwilliamson commented Jul 18, 2022

If you could debug hook pcall and error I'm sure there's something you could do with stack depth intersection to detect when a scope has not been closed. But they may just be language primitives, or debug hooks may be very slow.

The only alternative I can see is to make stack imbalances less fatal. That will require some more book keeping per sample:

  • When you Push a sample, keep count of "open" children.
  • When you Pop a sample, decrement and check to see if open children is zero.
  • If not; there is an imbalance.
  • Walk all children and close unclosed samples.

It could work... but it will require yet another integer storage with increment/decrement. Annoying, but more robust.

Though that said, if the tree is reconstructed elsewhere, instead of when you call Push/Pop, that cost won't factor in the threads you are timing. But that's for a later date.

And to be fair... I'm usually in favour of this kind of redundancy as it makes the library easier to adopt and maintain.

And we already have CloseOpenSamples.

Yet more comments: and the integer could be bitwise very small unsigned as wrap-around will ensure we'll always hit zero if there tree is balanced. If it's 8-bits it means somebody would need 257 children to break the balance detection, 16-bits even more.

@JCash
Copy link
Contributor Author

JCash commented Jul 19, 2022

My favorite so far is the book keeping with the open/closed children.

For the workarounds I had in mind in Lua/C all involved wrapping functions.
It has some disadvantages I don't like.

I'd have to put your code into a function.
It makes the code unnecessarily split up. And it makes you a bit more reluctant to profile smaller scopes.

The wrapper call signature is a bit clunky.
Also, it doesn't lend itself to the "Release" version of the game.
E.g. having a line like this in a released version won't work, since the profiler is disabled in release mode.

profiler.scoped_pcall("scope_name", function, args)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants