java.lang.UnsupportedOperationException: Unexpected message id from the client #16664
Comments
Hi, thanks for the issue. What PUSH transport are you using? Will you be able to provide also the browser console logs, if you can reproduce the error in development mode? Is there something your application does on the first page load? Do you have some particular Vaadin setting on (e.g. |
Hi! I use:
Also, I use:
because I implemented In case I'll reproduce it on my dev env, I'll try to provide browser logs |
Is it possible that you are somehow invoking |
I checked all of the code, there are no such issues. I always use the following pattern:
|
Also, I noticed that after some idle time PUSH notifications stop working at all, until I manually refresh the entire web page in the browser. I do not see any issues in the log (INFO level). Could you please help me figure out this error? What logs or something else should I enable or provide in order to figure out the reason for that? |
Thank you @alexanoid! Were you able to reproduce it maybe in dev environment? |
Hi @czp13 , thanks for your response! Unfortunately, I am unable to replicate it in the development environment. While working in the development environment, I periodically see the following warning in the console:
Btw, I exported logs from FF and Chrome consoles: Additionally, you may test the actual website I am running: https://decisionwanted.com/ If needed, I can enable there any logs you request to help figure out the issue. |
Unfortunately, I can't see anything suspicious in the attached logs. I tried to navigate the application, but I couldn't see any error or page reload. Where is PUSH used in the application? It is quite difficult to understand what is happening, without being able to reproduce the issue. |
PUSH is used almost everywhere on the website. For example, on the following page: https://decisionwanted.com/jobs By pressing this link, the user invokes a Job Catalog update via async PUSH. I was able to navigate the website via the router links, but the asynchronous PUSH-based content was not delivered to the client. As soon as I refresh the web page by pressing F5, everything starts working correctly. It is definitely hard to reproduce, but potentially something sometimes goes wrong when restoring the PUSH connection. If you have any ideas about what may help to catch the issue, such as which logs should be enabled, I'll be glad to do it and collect any additional info. |
Some additional info:
I have no idea what this means:
but maybe it will be also helpful for you |
The "Fallback ..." message is not relevant to this issue, and it is not a signal of problems. So, it would be interesting to understand why nginx responded with 502 and why we get an unauthorized response on the push connection |
I think this is because I redeployed the app on prod... Enabled Vaadin and Atmosphere logs: logging.level.com.vaadin=DEBUG |
Also, I reduced vaadin.pushLongPollingSuspendTimeout= from 180000 to 60000 server.servlet.session.timeout=reduced to 30m (previously I had 7 days) This is my current application properties:
some of NGINX parameters:
|
I tried to open one page and then click something after a couple of hours, but unfortunately did not notice any error. |
@mcollovati Thank you! I think I know how to reproduce it:
Looks like something is broken after sleep mode.. There are 3 possible places for the issue:
I do not see any errors in the application logs. I log everything (I hope :) ) |
And logs for another resync issue:
FF_27_localhost_console-export-2023-4-27_0-47-52.zip I often encounter this issue during development mode when I relaunch the application. After restarting, my application tries to reload itself in the Firefox browser but fails with the error mentioned above. Therefore, I have to manually refresh the page with F5. |
Well, if you redeploy the application, you are literally killing the backend, and the client side can never be in sync when the browser is still open and sends another request. So that's totally normal. (Except that somebody could expect the application to refresh itself in this case to ease development.) |
From the logs, I can see a new UI instance is created because the server session is expired. Can you please share the code related to the click on the tab, such as listeners, push calls related to that click, etc ?
Be aware that this will significantly increase the logs size |
@knoobie I use Keycloak as an SSO solution, and its session does not expire. So when I redeploy my application, it automatically redirects to the Keycloak login page. Since there is an active SSO session, the user is automatically redirected back to the application. This is how I "emulate" a user re-logging into the application. The resync occurs after this redirect from SSO to the application. In this case, I can only explain that the resync happens, because the application is not fully started yet. But the Vaadin index.html is correctly delivered to the browser at this moment... |
@mcollovati In the case of clicking on the Candidate Catalog tab, the
now, I'm going to enable
and redeploy the application on the prod |
After these steps in the application logs when I click on the Candidate Catalog with TRACE I see the following:
|
Yes, that trace tells us who originated the server sync id increment, but we need to trace all increments, not just one. |
This is the only trace I have in the application logs after clicking the Candidate Catalog tab with the issue reproduced. In the browser's Network console I only see one:
and the UI shows infinite progress bars. The issue is reliably reproducible right now |
Looks like after sleep mode and wake-up, PUSH stops working entirely. Only XHR requests are working |
Nice catch. But it seems to me this is a different issue from the one causing the resynchronization.
Now that you can reproduce the issue, do you mind creating a simple project so that we can debug locally? |
Sure, let me see what I can do with a sample project. The biggest problem is that this project is a huge code base. I'll try |
That's crazy, but I'm unable to reproduce issue #1 on localhost. However, this issue is reproducible on the production environment where NGINX is involved. So, I suspect right now that there is some NGINX misconfiguration or something like that. |
Also, issue #1 is only reproducible in the FF browser. In Chrome and MS Edge everything works fine. I even tried to change the Linux settings to:
it doesn't help. For some unknown reason after sleep mode and session expiration, PUSH stops working in Vaadin application opened in FF browser (I test it under Win 10). In FF in this case it only sends the following requests: https://decisionwanted.com/?v-r=uidl&v-uiId=7 and doesn't send the https://decisionwanted.com/VAADIN/push?v-r=push&v-uiId=2&v-.... Chrome sends both.. |
That is really weird. Please double-check that the application is effectively using web sockets and has not fallen back to long polling |
Push over XHR uses websockets as protocol, so it's highly possible your nginx config has to be adapted. |
I already have the following config in place:
Websockets work fine, when they work. I tried to debug my iPhone with the dev console on a MacBook, and I do not see a fallback to long polling |
I apologize, I accidentally clicked the close button here |
I don't know if it is related, but I found a few errors (actually alerts) in the NGINX error log: 2023/05/23 20:19:30 [alert] 1354970#1354970: *499843 could not allocate new session in SSL session shared cache "le_nginx_SSL" while SSL handshaking, client: *****, server: **** Applied the fix for letsencrypt from here: https://serverfault.com/questions/1045597/could-not-allocate-new-session-in-ssl-session-shared-cache-le-nginx-ssl-while
|
The same issue again: after the resync, XHR works with the WEBSOCKET_XHR transport, but PUSH doesn't. In the application log I see the following:
|
Hi @mcollovati ! I apologize for disturbing you, but is there any progress on this issue? I checked my logs and there are still a bunch of the following errors there:
|
Hi, I'm sorry but without a setup that can be used to consistently reproduce the issue there's not much we can do. |
I'm trying to catch it somehow. What is the most recent 24.1 version that I should use right now? Is it 24.1.0.rc2? |
Do we have ping-pong functionality in Vaadin Flow for WEBSOCKET_XHR to detect stale/broken connections? If not, maybe there is an example of how to implement it? |
IIRC there's no such functionality implemented in Flow. BTW, I don't remember if you have tried to set the |
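Since Flow reportedly has no built-in ping-pong, the staleness check asked about above could be approximated at the application level. The following is only a minimal sketch of the bookkeeping side, assuming the application already has some way to send a ping to each client and to be told when the matching pong arrives; `HeartbeatMonitor`, `onPong` and `staleConnections` are hypothetical names, not Vaadin or Atmosphere APIs.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical helper: tracks the last pong time per connection and
// reports connections that have been silent for longer than a timeout.
final class HeartbeatMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastPongMillis = new ConcurrentHashMap<>();

    HeartbeatMonitor(Duration timeout) {
        this.timeoutMillis = timeout.toMillis();
    }

    // Call whenever a pong (or any other sign of life) arrives from a client.
    void onPong(String connectionId, long nowMillis) {
        lastPongMillis.put(connectionId, nowMillis);
    }

    // Connections whose last pong is older than the timeout.
    Set<String> staleConnections(long nowMillis) {
        return lastPongMillis.entrySet().stream()
                .filter(e -> nowMillis - e.getValue() > timeoutMillis)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    // Call when a connection is closed or has been handled as stale.
    void remove(String connectionId) {
        lastPongMillis.remove(connectionId);
    }
}
```

A scheduled task could call `staleConnections(System.currentTimeMillis())` periodically and decide what to do with the result (for example, close the session or ask the client to reload). How the ping and pong actually travel over the push channel is deliberately left out here.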
@mcollovati thank you! I'll try! One more question - do I need to use your fix with the patched AtmosphereBroadcaster in version 24.1, or was it already incorporated there? |
The code I posted has currently not been integrated. It requires more research |
@mcollovati Is it possible to catch the 'resync' event/state somehow in the application, in order to attempt a full page refresh instead of trying to recover through an XHR request? |
The resynchronization request comes from the client when it does not receive a missing message after waiting for it for a while, so you cannot intercept it on the server side before it happens |
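To make the description above concrete, here is a rough model of that client-side behaviour: messages carry an increasing id, out-of-order messages are held back, and a resync is requested once a gap has stayed open for too long. It is a sketch under assumptions, not the real Flow client code; `ResyncDetector`, its methods and the timeout value are all hypothetical.

```java
import java.util.TreeMap;

// Hypothetical model of "wait for the missing message, then resync".
final class ResyncDetector {
    private final long maxWaitMillis;
    private final TreeMap<Integer, String> pending = new TreeMap<>();
    private int nextExpectedId;
    private long waitingSinceMillis = -1; // -1 means no gap is open

    ResyncDetector(int firstExpectedId, long maxWaitMillis) {
        this.nextExpectedId = firstExpectedId;
        this.maxWaitMillis = maxWaitMillis;
    }

    // Accepts a message and returns how many messages could be applied in order.
    int onMessage(int syncId, String payload, long nowMillis) {
        if (syncId < nextExpectedId) {
            return 0; // already applied earlier, ignore
        }
        pending.put(syncId, payload);
        int applied = 0;
        while (pending.containsKey(nextExpectedId)) {
            pending.remove(nextExpectedId); // "apply" the message
            nextExpectedId++;
            applied++;
        }
        if (pending.isEmpty()) {
            waitingSinceMillis = -1;        // gap closed
        } else if (waitingSinceMillis < 0 || applied > 0) {
            waitingSinceMillis = nowMillis; // a (new) gap starts now
        }
        return applied;
    }

    // True once a gap has been open for longer than the allowed wait.
    boolean shouldResync(long nowMillis) {
        return waitingSinceMillis >= 0 && nowMillis - waitingSinceMillis > maxWaitMillis;
    }
}
```

In this model the decision to resync is made entirely on the receiving side, which matches the point that the server cannot intercept it before it happens.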
Thanks, I mean I'd like to intercept this on the client in order to reload the page from JS call. |
As root cause analysis is proving elusive it would be great to prioritise the ability to intercept this condition so the client and backend can effectively reset and recover. I experienced this issue with v24.2.4 when using the default websocket transport and manual push. The Vaadin debug monitor does detect an error, but the browser just continued to show a stale view state while remaining in an "offline" state. My environment setup is comparatively simple: running on Windows local host using Chrome browser tools and switching connection throttle profile to “offline” (just for a few seconds) before reverting to “none”. I can confirm calls to update component state and calls to push are guarded by access(). It's worth noting my application code aggressively debounces UI state updates and calls to push, so as to batch updates and limit websocket bandwidth utilisation. Though I apply jitter some batching does occur and I suppose can result in "large" UIDL messages from time to time. Have included the associated stack trace: 2023-12-08T16:07:11.665+08:00 INFO 678172 --- [nio-8080-exec-2] c.v.f.s.communication.ServerRpcHandler : Ignoring old duplicate message from the client. Expected: 137, got: 136 java.lang.UnsupportedOperationException: Unexpected message id from the client. Expected sync id: 137, got 138. more details logged on DEBUG level. |
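The two log lines quoted above (duplicate ignored at 136, exception at 138 while 137 was expected) are consistent with a strict sequence check on client message ids. The following is a simplified model of such a check, written only to illustrate the three outcomes; it is not Vaadin's actual `ServerRpcHandler` code, and `SyncIdChecker` and its members are hypothetical names.

```java
// Simplified, hypothetical model of a server-side message id check.
final class SyncIdChecker {
    enum Outcome { PROCESSED, IGNORED_DUPLICATE }

    private int expectedId;

    SyncIdChecker(int firstExpectedId) {
        this.expectedId = firstExpectedId;
    }

    Outcome handle(int clientMessageId) {
        if (clientMessageId == expectedId) {
            expectedId++; // in sequence: process and advance
            return Outcome.PROCESSED;
        }
        if (clientMessageId < expectedId) {
            // Already seen (for example a resend after a network hiccup).
            return Outcome.IGNORED_DUPLICATE;
        }
        // A message from the "future": at least one message was lost.
        throw new UnsupportedOperationException(
                "Unexpected message id from the client. Expected sync id: "
                        + expectedId + ", got " + clientMessageId + ".");
    }

    int expectedId() {
        return expectedId;
    }
}
```

With `new SyncIdChecker(137)`, `handle(136)` is ignored as a duplicate and `handle(138)` throws, mirroring the order of events in the log; only `handle(137)` would have advanced the counter. This suggests that message 137 never reached the server during the brief offline period, although the sketch cannot confirm that.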
Any progress here? We have also noticed these kinds of exceptions on our production TomEE application server. |
Description of the bug
From time to time I may see the following PUSH error in the logs:
Expected behavior
There are no such issues in the Vaadin application
Minimal reproducible example
n/a. I see this issue on the prod env. from time to time
Versions