Profiles will not work with a `traces_sampler` instead of `traces_sample_rate` #1813
As a side note: I am using gevent. It's possible the workaround does not work for me due to gevent, but seeing as the code Sentry's profiler was based on was originally written to be used with gevent, I hope that's not the case.
I didn't manage to make the workaround work using any of the timing methods. Some debug logging statements (like with tracing) would be helpful to know whether or not profiling has been attempted at all.

Also, on a more general note: have you considered integrating with an existing profiler that is both more feature-complete and imposes less of a performance overhead? I specifically have scalene in mind, which can also do memory debugging, can measure how much time is spent in native code as opposed to the Python interpreter (e.g. for C extensions), and works with multiprocessing (i.e. most of the profiling work could then run on a separate CPU thread, without affecting the main application). It might be a little more difficult to achieve, though, seeing as scalene has to hook into the application very early on, before anything else gets loaded, to achieve its low overhead. But making the profiler as accurate and feature-complete as possible should be a win for everybody.
@Daverball Profiling is currently only on for WSGI; ASGI support has an open pull request. What framework/server are you using?
@sl0thentr0py I'm using my own framework (unfortunately I currently cannot share the source code), but it is WSGI and uses gevent.
@Daverball There are some gevent-specific changes that need to be ported for the profiler to collect samples correctly, which is being looked into, but the profile should still be sent to Sentry. Are you not seeing any profiles being sent at all? You can confirm this by enabling debug mode and looking for the debug logs about envelopes being sent.
If there are envelopes being sent with 2 items, one of them being a profile, it is correctly being sent to Sentry. After that, you should check the stats page from the sidebar in Sentry and look at the profiles category. It's possible that the profiles you're sending are being dropped, potentially because they contain only a single sample. For context, if a profile contains only a single sample (taken at ~10ms intervals), it will be rejected.
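For illustration, a minimal sketch of turning on the SDK's debug output so those envelope logs show up (the DSN below is a placeholder):

```python
import sentry_sdk

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    debug=True,  # print SDK debug logs, including envelopes being sent
    traces_sample_rate=1.0,
    _experiments={"profiles_sample_rate": 1.0},
)
```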
Yes, this is something that will be added to help with debugging.
We have looked into existing profilers, but many of them are not very suitable for production profiling as they induce a rather large overhead. The current implementation of our profiler has deviated quite significantly from what we based it on, and looking at Scalene, it seems they are actually similar in the way CPU profiling is done (I'll need to look at it in more detail to see how they differ). Our implementation is also tuned to better suit the way Sentry processes profiles server-side, so we likely will not be integrating existing profilers directly into the SDK. In terms of overhead, we are aiming for ~5-10% in most cases.
10ms as a sampling interval seems a bit high; I was expecting something in the neighbourhood of 1-2ms. I suppose part of the theory is that aggregating a bunch of profiles with a low sampling rate, given enough requests, will eventually yield a profile that is just as detailed and accurate as one taken at a higher sampling rate. I'm not fully convinced that still holds once you go below a certain sampling rate, since the likelihood of missing certain parts of the code completely increases the larger the gap between samples is, although in most cases those parts probably wouldn't contribute to the total runtime in any significant way anyway. I will have to play around with it myself and compare it with scalene's results to find out.
This is precisely the reason why we started with a 10ms sampling interval. At the current 10ms interval, we've already seen success with identifying bottlenecks down to lines of code. It's definitely possible that some of the shorter calls will be missed, but to be able to sample at 1-2ms intervals in production, the overhead naturally has to increase. So that's not something we'll be changing in the near term, though it may change in the future.
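As a rough illustration of that trade-off (the request duration below is a hypothetical value):

```python
# Back-of-the-envelope: expected number of samples per request at different
# sampling intervals, assuming a 50ms request (an illustrative value).
request_duration_ms = 50

for interval_ms in (10, 2, 1):
    print(f"{interval_ms}ms interval -> ~{request_duration_ms // interval_ms} samples")

# 10ms interval -> ~5 samples
# 2ms interval -> ~25 samples
# 1ms interval -> ~50 samples
```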
This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever! "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀
I managed to get one profile to show up as a complete fluke. The issue definitely is that the scheduler doesn't work correctly with gevent, though, so for pretty much any view it just does not get enough chances to record a sample. I will have to upgrade to 1.14 and retest, now that gevent is supported properly. That being said, I think the originally reported issue has been fixed and the workaround should no longer be necessary. But I will test with 1.14 (as soon as I have the time) before closing the issue.
@Zylphrex Looks like even with 1.14 I'm not getting any profiles to show up. It is always sending an envelope with a profile, but I guess I'm just not getting enough samples. Potentially the reason is the use of gevent's `ThreadPool` for the sampler.

One potential problem with that is that the profiler thread won't release the GIL when it sleeps using the gevent-patched `time.sleep`. That being said, this doesn't really explain to me why I'm getting too few samples; even with gevent the interpreter should release the GIL every 5ms, so the sampling thread should get a chance to run as often as it takes samples.

It's probably also a bit better to use the existing gevent thread pool, rather than creating a new instance, since I believe there are some caveats to using the `ThreadPool` that will be taken care of automatically if you use the existing instance.

Somewhat unrelated question: do you do anything to ensure that overlapping transactions and any other scheduled tasks running in other greenlets don't show up in the profile? You could do the same thing you do with threads and just track the one greenlet that was active when the transaction was started. You can still get at this specific greenlet's frame even when it's not currently active when you're sampling.
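A small sketch of the last two points, assuming gevent and greenlet are installed (the names here are illustrative, not the SDK's actual internals):

```python
import gevent
import greenlet

# Reuse gevent's existing hub thread pool instead of constructing a separate
# gevent.threadpool.ThreadPool instance.
pool = gevent.get_hub().threadpool
result = pool.spawn(sum, [1, 2, 3]).get()  # runs in a real OS thread

# Reading the frame of a greenlet that is not currently running: gr_frame is
# available while the greenlet is suspended (it is None for the running one).
def frame_of(g):
    if isinstance(g, greenlet.greenlet) and not g.dead:
        return g.gr_frame
    return None
```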
Good point about the patched `time.sleep`. I can take a look at using the existing threadpool, but that threadpool implementation can be user-defined (see the gevent docs) and we would permanently take up one of its threads by default, so that may have other unintended side effects.
Not at this time, one of the reasons being exactly what you're describing. It is definitely something to investigate more and improve on.
As pointed out in #1813 (comment), gevent patches the `time` module, and `time.sleep` will only release the GIL if there are no other greenlets ready to run. This ensures that we always use the builtin `time.sleep` and not the patched version provided by `gevent`.
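A sketch of one way to guarantee the unpatched sleep, using gevent's `monkey.get_original` helper (the linked change may implement this differently):

```python
try:
    # Fetch the original time.sleep even if gevent.monkey.patch_all() already ran.
    from gevent.monkey import get_original
    thread_sleep = get_original("time", "sleep")
except ImportError:
    from time import sleep as thread_sleep

thread_sleep(0.01)  # sleeps the OS thread and releases the GIL as usual
```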
Did some quick testing again with 1.15.0. Looks like it still will not work, but at least it's being honest about it now, since it won't include the payload unless there are enough samples, so it's quicker for me to see whether it works.

I've noticed that for testing the gevent code path you don't really seem to be patching anything. Ideally the whole test suite would run monkeypatched (there's pytest-gevent, which provides a patched pytest runner, but you can also just use the monkey utility from gevent directly to invoke the pytest module in a fully patched interpreter). Otherwise you are not really testing much of anything, other than the imports not crashing and the functions doing what they're advertised to do.

If I find some time this weekend or next weekend I will probably make a local fork of the codebase to see if I can get it to work myself with a few tweaks. It would also be interesting to see if I can prototype something so that all the greenlets get tracked like individual threads. I'm not sure whether that's possible without using the greenlet C API, though.

It's probably also worth noting that I'm using uwsgi, which uses the C API directly from its own C extension to spawn the greenlets, which means the parent frame doesn't really exist for those greenlets. I saw a little comment about that in the gevent codebase, in some optional instrumentation code that tracks the greenlet call tree.
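For reference, one possible way to run the whole suite patched (a sketch; the patching has to happen before pytest or the code under test imports any threading-related modules):

```python
# run_patched_tests.py -- hypothetical helper script, not part of the repo.
from gevent import monkey

monkey.patch_all()  # patch before anything else gets imported

import sys

import pytest  # imported only after patching

if __name__ == "__main__":
    sys.exit(pytest.main(sys.argv[1:]))
```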
I wonder if your issue is actually with uwsgi rather than with gevent itself.
If anything it would be the combination of the two. We use uwsgi with gevent workers.
@Zylphrex Looks like something very strange is happening with the scheduler. I'm not really sure how this is possible. The best idea I could come up with is that the instance of the scheduler that is running the sampler thread is not the same one that is used to start profiling.
Are you able to create a reproduction that you can share? I know you mentioned you use your own framework, but perhaps you can reproduce it with uwsgi + gevent somehow? That would help me a lot in debugging this.
repro.py

```python
from gevent import monkey; monkey.patch_all()

import sentry_sdk
from sentry_sdk.integrations.wsgi import SentryWsgiMiddleware
from time import sleep

SENTRY_DSN = 'https://{}.ingest.sentry.io/{}'

sentry_sdk.init(
    dsn=SENTRY_DSN,
    environment='development',
    traces_sample_rate=1.0,
    _experiments={
        'profiles_sample_rate': 1.0,
    }
)

# Main WSGI application
def application(env, start_response):
    response = b'Hello world'
    for byte in response:
        sleep(0.1)
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(response)))
    ])
    return [response]

application = SentryWsgiMiddleware(application)
```

repro.ini
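The repro.ini contents were not captured above; a minimal uwsgi configuration for this kind of setup might look roughly like the following (all values are illustrative):

```ini
[uwsgi]
http = :8000
wsgi-file = repro.py
gevent = 100
```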
Although I'm not sure sentry-sdk will work at all: the default transport complains about disabled threading in uwsgi, but you are not allowed to turn on uwsgi threads if you enable gevent. I've implemented my own transport that works with gevent.
Actually, it looks like the minimal repro does not reproduce the problem and does record samples periodically, like it's supposed to, although the profiles are completely useless because the frame only shows what the Hub greenlet is doing. But it's at least good to know that I no longer need my custom transport. Looks like the default transport works just fine now with my setup.
@Zylphrex Alright, I think it might be related to uwsgi's builtin static mapping and/or cache. If I enable them for the minimal repro case, then profiles stop working as well:

repro.ini
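Again as an illustrative sketch rather than the original file, adding the static mapping and cache options being described would look something like this:

```ini
[uwsgi]
http = :8000
wsgi-file = repro.py
gevent = 100
static-map = /static=./static
cache2 = name=default,items=100
```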
Yes. If I remove the static mapping and cache from my application, then profiles start working there as well. I assume the way uwsgi forks the processes happens a bit differently with static maps.
@Zylphrex I've made a little proof-of-concept for sampling all the currently active greenlets. It's a little bit messy still, but it works. Cleaning it up properly will probably require some backend support for an optional second identifier in addition to the thread ID.
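The PoC itself isn't included above; a rough sketch of the general idea (enumerating suspended greenlets via the garbage collector and reading their frames) might look like this:

```python
import gc

import greenlet

def sample_active_greenlets():
    """Collect the current frame of every live, suspended greenlet."""
    frames = {}
    for obj in gc.get_objects():
        if isinstance(obj, greenlet.greenlet) and not obj.dead:
            frame = obj.gr_frame  # None for the greenlet that is running right now
            if frame is not None:
                # id(obj) stands in for the "second identifier" mentioned above,
                # alongside the thread id the profiler already tracks.
                frames[id(obj)] = frame
    return frames
```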
@Daverball Thanks for this and the repro! I'll take a look at this soon.
Hey @Daverball, I'm still looking into a more permanent solution to this issue, but I think I've narrowed down what's happening. When running uwsgi in master mode, which your cache options implicitly enable, threads that were started in the master process do not run in the forked worker processes. This means that the sampler only runs in the master process while the profiles are being pushed onto the queue in the worker process, so no samples are ever taken.

As a short-term solution, you can potentially try enabling uwsgi's `lazy-apps` option. I'll look into a more permanent fix to this issue. Let me know if that works for you in the meantime. And I'll also be looking into your PoC with the greenlets soon!
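Illustratively, that uwsgi change would be along these lines (assuming the rest of the config stays as before):

```ini
[uwsgi]
# loading the app lazily makes each worker initialize the SDK (and its sampler
# thread) after the fork, instead of relying on a thread started in the master
lazy-apps = true
```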
@Daverball If you are able to test out this branch to see if it solves your problem, that'd be much appreciated 😄
@Zylphrex Looks like delaying the start of the sampler thread does indeed do the trick.
This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever! "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀
@Zylphrex Since the original issue has been resolved, I will close this issue. I may open another one for greenlet tracking improvements. I might take a look at the backend code myself to see what's needed there to support tracking fibers in addition to threads.
Yes, do open a separate issue and we can discuss how we can improve greenlet tracking there 👍
How do you use Sentry?
Sentry Saas (sentry.io)
Version
1.12.1
Steps to Reproduce

1. Use a `TracesSampler` instead of setting `traces_sample_rate`
2. Set `_experiments['profiles_sample_rate']` to 1.0

Expected Result
Every request with a trace also has a corresponding profile
Actual Result
No profiles are being recorded
P.S. From glancing at the code, it doesn't look like there is a hard dependency on the more "dumb" `traces_sample_rate` over a `TracesSampler`. It's just that `init` checks for `traces_sample_rate` instead of doing the more thorough check of `has_tracing_enabled`, which will check either option. As a workaround I seem to be able to just pass a dummy fixed sampling rate, since the `TracesSampler` always takes priority.
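A sketch of that workaround, with a placeholder DSN and illustrative sampler logic:

```python
import sentry_sdk

def traces_sampler(sampling_context):
    # illustrative sampling logic
    return 1.0

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    traces_sampler=traces_sampler,
    traces_sample_rate=1.0,  # dummy fixed rate; the traces_sampler still takes priority
    _experiments={"profiles_sample_rate": 1.0},
)
```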
P.P.S. Looks like the workaround might not actually work, since I'm still seeing no profiles for traced transactions. Unless they take a while to process. But I will try some alternative sampling methods for the profiler before giving up.