WebSocket sometimes fails to reconnect until browser restart or website access from incognito mode #15410

Open
davidhq opened this issue Apr 21, 2021 · 68 comments
Labels
OS/Desktop priority/P3 The next thing for us to work on. It'll ride the trains.

Comments

@davidhq

davidhq commented Apr 21, 2021

Description

WebSocket sometimes fails to reconnect. The only solution is to access the website in incognito mode or restart the browser.

WebSocket is closed before the connection is established.

Steps to Reproduce

  1. Have a tab open.
  2. Close the laptop lid, reopen it, and reconnect to Wi-Fi.
  3. The WebSocket stays disconnected and keeps retrying to reconnect with no success.

Actual result:

The WebSocket keeps trying to reconnect but is not able to.

Expected result:

Should reconnect

Reproduces how often:

Intermittent issue

Brave version (brave://version info)

Latest: Version 1.23.71 Chromium: 90.0.4430.72 (Official Build) (x86_64)

Other Additional Information:

Miscellaneous Information:

@davidhq
Author

davidhq commented Apr 23, 2021

It happened again now, latest screens:

Screenshot 2021-04-23 at 13 36 04

Screenshot 2021-04-23 at 13 35 40

Screenshot 2021-04-23 at 13 35 26

Screenshot 2021-04-23 at 13 34 04

@csadai

csadai commented May 27, 2021

+1

@davidhq
Author

davidhq commented May 30, 2021

It has been happening to me all the time recently. Too bad nobody has acknowledged this problem for more than a year now :/
Such core functionality, and the browser otherwise works OK.

@davidhq
Author

davidhq commented Jul 5, 2021

So ... today I tried using Firefox after some time, and surprise: websockets there are even more broken!

So is this tech doomed? See this: https://stackoverflow.com/questions/14140414/websocket-was-interrupted-while-page-is-loading-on-firefox-for-socket-io/68260938

Why don't the Brave and (even more so) Firefox teams care about this even one bit, and why do they assign the lowest possible priority to it or ignore the issue entirely?

What is the working WebSocket alternative one can use in the same way? There probably isn't any... so many websites use WebSockets for crucial parts of their functionality, and this means that one third of the internet is half broken in some of the biggest browsers... wtf!

@davidhq
Author

davidhq commented Jul 5, 2021

https://medium.com/axiomzenteam/websockets-http-2-and-sse-5c24ae4d9d96

> I would encourage every team to explore SSE. Perhaps it doesn’t completely cover your use case, or maybe it’s the missing piece of your HTTP/2 compliant stack you’ve been waiting for. Either way, it’s fun and lightweight, and as we start seriously considering parting ways with our dear old friend WebSockets, it’s time to start calling up older friends to see what they have to offer.

?

@rebron added the labels repros-on-chrome, priority/P5 (Not scheduled. Don't anticipate work on this any time soon.) and Chromium/waiting upstream (Issue is in Chromium; we'll likely wait for the fix) on Aug 6, 2021
@rebron
Collaborator

rebron commented Aug 6, 2021

@davidhq Do you have an issue filed here: https://bugs.chromium.org/p/chromium/issues/list
This is not an issue that we'd address separately from the chromium team.

@davidhq
Author

davidhq commented Aug 7, 2021

@rebron no, the issue is not in that list and the problem does not happen in Chrome, it's a Brave issue

@davidhq
Author

davidhq commented Aug 7, 2021

You somehow broke the core functionality which is working in Chromium

@wanton7

wanton7 commented Oct 20, 2021

This seems to be a big issue for things that run through WebSockets only, like Phoenix LiveView: https://elixirforum.com/t/websocket-is-closed-before-the-connection-is-established/40481/5

> chrismccord, Creator of Phoenix:
> This is the third report I have seen with Brave hanging in the 101 websocket upgrade. Unfortunately it doesn’t appear to be anything on our side and by all things I’ve seen the bits never make it to the server. One person had it happen only when the adguard extension was enabled, but disabling didn’t fix it until they restarted Brave. Both previous reports I’ve seen restarting Brave fixed it, but this looks like some obscure issues squarely on Brave’s side. I have wondered if we are attempting to connect too aggressively for brave, but it is a total guess. If you are able to reliably recreate, you could try instantiating the liveSocket and liveSocket.connect inside a window.addEventListener("DOMContentLoaded", ...)
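
A minimal sketch of that last suggestion, assuming a standard Phoenix LiveView `app.js`. `connectWhenReady` is a hypothetical helper, not part of Phoenix; it takes the document as a parameter so the logic can be exercised outside a browser, and nothing here is confirmed to avoid the Brave hang.

```javascript
// Hypothetical helper: run `connect` once the DOM has been parsed.
// `doc` is expected to look like the browser's `document`.
function connectWhenReady(doc, connect) {
  if (doc.readyState === "loading") {
    // DOM not parsed yet: wait for DOMContentLoaded and fire only once.
    doc.addEventListener("DOMContentLoaded", () => connect(), { once: true });
  } else {
    // DOMContentLoaded has already fired ("interactive" or "complete").
    connect();
  }
}

// In a LiveView app this might be used as follows (assumed setup):
//   connectWhenReady(document, () => {
//     const liveSocket = new LiveSocket("/live", Socket, { params: { _csrf_token: csrfToken } });
//     liveSocket.connect();
//   });
```

Checking `readyState` first matters because a listener registered after `DOMContentLoaded` has fired would never run.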

@chrismccord

chrismccord commented Oct 28, 2021

@rebron I'm getting more reports about this affecting Phoenix users, and I'm wondering if you were able to verify this is actually reproducible on Chromium? I see the repros-on-chromium and awaiting-chromium-upstream labels have been added here, but I have yet to hear about anyone hitting this kind of issue on Chrome. Narrowing down where this might be occurring would be really helpful. Does Brave implement any websocket-specific handling or security checks? I've long wondered if we connect too aggressively and trip some Brave threshold, but I have no evidence to suggest that's the case. In any case, I would love to see if we can at least pinpoint whether this is on the Brave or Chromium side. All reports point to Brave at the moment. Thanks!

@davidhq
Author

davidhq commented Oct 28, 2021

My guess would be that they think many other issues should get priority or that this is not happening / is not important.

The nature of this issue is that it happens once every few months perhaps, sometimes more, under the right conditions.

Not trying to be negative, but I have indeed stopped using Brave for anything, because with such unpredictable issues you have to stop and think whether the problem is your own code or not (it is not)... it's just not worth it.

For developers of Elixir / Phoenix it must be even more frustrating, because from the outside it looks like their system is not reliable when it's the browser's fault. If they in turn point a finger at Chromium, then it shows a lack of organization or responsibility over there.

I hope they make it work though :) Would be good for everyone.

@Thireus

Thireus commented Oct 28, 2021

macOS 12.0.1 Intel here. Such a weird issue indeed with wss:// completely stuck with "WebSocket is closed before the connection is established" despite restarting the app several times. Private Window seemed to work just fine though.

Had that issue with:
Version 1.32.81 Chromium: 95.0.4638.54 (Official Build) beta (x86_64)

Just upgraded to a new release, the issue is still there but less frequent now:
Version 1.32.84 Chromium: 95.0.4638.54 (Official Build) beta (x86_64)

Give it a try: https://cryptowat.ch/charts/POLONIEX:SHIB-USDT

Websocket_Issue_Brave

@davidhq
Author

davidhq commented Oct 28, 2021

> macOS 12.0.1 Intel here. Such a weird issue indeed with wss:// completely stuck with "WebSocket is closed before the connection is established". Private Window seems to work just fine.
>
> Had that issue with: Version 1.32.81 Chromium: 95.0.4638.54 (Official Build) beta (x86_64)
>
> Just upgraded to a new release, the issue seems to be gone (at least for now): Version 1.32.84 Chromium: 95.0.4638.54 (Official Build) beta (x86_64)

we commented almost at the same time...

so for you did it actually happen on Chrome and not on Brave?

For me, in 2 years with this issue, it never happened in Chrome... or rather MAYBE once... in Brave every few weeks, sometimes days, sometimes months.

Surely it must be complicated and elusive and a lot of work for maintainers...

Just pointing out again after 6 months that it's a rather nasty issue (at least on Brave), and it seems many others are experiencing this or something similar.

@Thireus

Thireus commented Oct 28, 2021

> > macOS 12.0.1 Intel here. Such a weird issue indeed with wss:// completely stuck with "WebSocket is closed before the connection is established". Private Window seems to work just fine.
> > Had that issue with: Version 1.32.81 Chromium: 95.0.4638.54 (Official Build) beta (x86_64)
> > Just upgraded to a new release, the issue seems to be gone (at least for now): Version 1.32.84 Chromium: 95.0.4638.54 (Official Build) beta (x86_64)
>
> we commented almost at the same time...
>
> so for you did it actually happen on Chrome and not on Brave?
>
> For me, in 2 years with this issue, it never happened in Chrome... or rather MAYBE once... in Brave every few weeks, sometimes days, sometimes months.

Same, same... more than 2 years I would say really...

This is Brave Beta version that I use. But indeed it happens every few weeks, sometimes days as you mention. This is driving me crazy, having to manually reload websites that use live streams with websockets every now and then because of that issue.

But today on Brave Beta version 1.32.81, no matter what I tried, it would not connect to wss at all (only in Private Window mode). I noticed there was a new Beta version 1.32.84 a few minutes ago, so I upgraded. Now wss works, but I bet this is just a rollback to the previous behaviour with wss getting stuck every few weeks, sometimes days... we shall see.

Edit: issue is back... oh well :(

@davidhq
Author

davidhq commented Oct 28, 2021

I guess WebSockets are tricky :) even more so the underlying implementation of the protocol (it seems)

I'm sure the Brave team is very competent and if it wasn't tricky, they would have already made it work.

Not sure why they had to add or change anything in the Chromium implementation then?

A good start would be, as @chrismccord recommends, to actually find out what the difference from vanilla Chromium is.

@davidhq
Author

davidhq commented Oct 28, 2021

Maybe everyone is moving to WebTransport which will replace WebSockets soonish!

https://www.w3.org/TR/webtransport/

@wanton7

wanton7 commented Oct 29, 2021

@davidhq in the Elixir forum thread there was talk about Brave proxying WebSocket connections. So this could be an issue in their proxy implementation here https://github.com/brave/brave-core/blob/master/browser/net/brave_proxying_web_socket.cc or in the code that uses it.

@grahac

grahac commented Nov 3, 2021

Don't know if this helps or not but I have seen this primarily when a site opens multiple different websockets on multiple tabs. It seems like Brave (on windows primarily) only likes one websocket from each domain?

@wanton7

wanton7 commented Nov 3, 2021

@grahac I think issue #19080 is related to this one. It explains that Brave fails to close WebSocket connections properly and gets into an invalid state where WebSocket connections don't connect anymore.

@megalithic

@jonathansampson and @brave-dev team, this would be suuuuuch an important thing to fix, since more and more apps and frameworks are using websockets for data transports from client to server (Phoenix LiveView being the one top of mind).

Any thoughts/updates on this?

@Karunamon

Karunamon commented Oct 16, 2023

I finally figured out that the strange behavior I have been seeing is WebSocket related, and a look at the console confirms it is this problem, so +1 from me as well. Windows 10 22H2, browser 1.59.117, though it's been happening for months across multiple versions.

  • It does not matter whether or not there are updates pending.
  • I am running a pretty restrictive software firewall (simplewall), but this problem started before I began using it. Brave is whitelisted besides, so this is not a firewall problem.
  • It's also not an environment problem. This happens on popular public websites (the usual indication that I hit this bug is the inability to connect to chat on Twitch) but also on sites in my LAN (I run an instance of uptime kuma, and if web sockets are broken, the application is unusable)
  • In my case it seems to be time-related. I keep my machine on during the day and in sleep mode at night. It has never happened twice in the same day (after a full browser restart), so there is a minimum time between recurrences of about 12 hours. Sleep mode does not reliably trigger it.

Hope this helps

@pejrich

pejrich commented Mar 3, 2024

> Sorry for the wrong prioritization, we're discussing this internally. I adjusted the priority/p5 label to priority/p3.

Hey, curious how much longer that discussion will be going on for? I was gonna grab some lunch, do you think you'll be discussing this for another 18 months or so? Just wanted to get a gauge on time, so I don't accidentally miss some movement on it. Y'all make the Tar Pitch Drop look like Formula 1. https://en.wikipedia.org/wiki/Pitch_drop_experiment

@jokaorgua

Is there any kind of progress here? I guess in a couple of days we are going to celebrate 3 years of this bug. :)) Really? It seems it is time to move to Google Chrome or Chromium if Brave doesn't want to move on with fixing bugs.

@bezenson

Any chance it's going to be fixed?

I'm a developer and my app uses WebSocket. I'm a little bit tired of reloading the browser during the day to fix the issue. And I was really surprised that this issue has existed for 3 years...

@wanton7

wanton7 commented Apr 26, 2024

@bezenson highly unlikely, because the browser has been broken for over three years when it comes to WebSockets. A lot of sites nowadays rely on WebSockets for their real-time UI updates. I would expect Brave users have experienced all kinds of odd issues in all these years.

You can detect the Brave browser with navigator.brave.isBrave(). Just show a warning when someone is using Brave, explain that the browser team hasn't fixed this known bad issue in three years, and add a link to this issue.
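
A rough sketch of that detection, assuming `navigator.brave.isBrave()` returns a promise resolving to `true` in Brave and that `navigator.brave` is absent elsewhere. `isBraveBrowser` and `braveWarning` are hypothetical names, and the warning text is only an illustration.

```javascript
// Resolves to true only when the browser identifies itself as Brave.
// `nav` defaults to the global `navigator` if the runtime has one.
async function isBraveBrowser(nav = globalThis.navigator) {
  // `navigator.brave` is absent in other browsers, so guard each step.
  if (!nav || !nav.brave || typeof nav.brave.isBrave !== "function") {
    return false;
  }
  try {
    return (await nav.brave.isBrave()) === true;
  } catch {
    return false; // treat any failure as "not Brave"
  }
}

// Hypothetical helper: returns the text to show, or null for no warning.
function braveWarning(isBrave) {
  return isBrave
    ? "Brave has a known WebSocket reconnect bug (issue #15410). " +
      "If live updates stop, restart the browser or use a private window."
    : null;
}

// Usage in a page (showBanner stands in for your own UI code):
//   isBraveBrowser().then((b) => { const msg = braveWarning(b); if (msg) showBanner(msg); });
```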

@pejrich

pejrich commented May 1, 2024

Maybe the 2nd best programmer in the world, according to this list: https://unstop.com/blog/best-programmers-in-the-world could lend us a hand. Because like so many people out there, when I hear the names Ritchie, Knuth, Kernighan, Torvalds, I yawn, and say "you think they're smart?! What about that guy who owns that browser company that can't even do Websockets! He's the 2nd best programmer in the world!*" (*unless you happen to be someone who uses websockets)

@BootsSiR

Developing a project that uses websockets and it will randomly stop working until I restart Brave. Pretty annoying.

@bezenson

@BootsSiR I know that feel, bro

image

@mareczek

+1

@aguadeowo

+1

@markg85

markg85 commented Jun 14, 2024

Create a site with a websocket connection. Probably best to use your own server for this.
Rapidly refresh so you get lots of closed-before-open connections.

Now enjoy waiting for what seems like forever until any websocket connection is established again, depending entirely on how long you kept bashing refresh.

Something somewhere now blocks the websocket connection. I'm 99.99% confident that this is in the browser itself. Why?

  • open the same site in a different browser: works
  • close the browser and reopen: works

Now this specific test isn't what's reported here. You don't even see the socket error message as that is gone before you can see it due to the refresh.
But... it does provide a stable point to search from. Suffice it to say, rapidly opening and closing a socket connection is what's going on here and is causing a weird timeout to kick in. If a site (without refreshing) shows the same behavior (rapidly opening and closing a socket), you'll likely have the very issue that is reported here.
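
If rapid open/close cycles are indeed the trigger, a client can at least avoid producing them itself by spacing out its reconnect attempts. Below is a minimal sketch of exponential backoff with full jitter. The function names, the 500 ms base and the 30 s cap are illustrative assumptions, and nothing here has been confirmed to avoid the bug.

```javascript
// Delay (ms) before reconnect attempt number `attempt` (0-based).
// The ceiling doubles per attempt up to `maxMs`; "full jitter" then picks
// a random value below that ceiling so clients do not retry in lockstep.
function reconnectDelay(attempt, baseMs = 500, maxMs = 30000, random = Math.random) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Keeps one socket alive, reconnecting with backoff after each close.
// `createSocket` is a caller-supplied factory, e.g. () => new WebSocket(url).
function keepConnected(createSocket, schedule = setTimeout) {
  let attempt = 0;
  let stopped = false;
  function open() {
    if (stopped) return;
    const ws = createSocket();
    ws.onopen = () => { attempt = 0; }; // reset backoff after a success
    ws.onclose = () => {
      if (stopped) return;
      schedule(open, reconnectDelay(attempt++));
    };
  }
  open();
  return () => { stopped = true; }; // call this to stop reconnecting
}
```

A socket that closes before opening never resets `attempt`, so repeated "closed before the connection is established" failures back off further each time instead of hammering the browser.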

@babashark

image

image

please help.... >_<

@rroller

rroller commented Jul 20, 2024

This is happening to me when using Brave to access Frigate (Live streams from cameras stop working) and Unraid (CPU stats stop working which are driven from a ws)

@jokaorgua

@brave-dev

are there any plans to fix this?

@cheald

cheald commented Aug 27, 2024

I've been experiencing this in local development. I am using Vite, which uses a websocket channel to do hot module reloads. When the websocket connection breaks, my hot reloads stop working until I restart Brave, which is a serious bummer.

#19990

I do tend to keep a large number of sites open, and who knows how many of them use websockets. I closed a bunch of tabs, and my websockets immediately began working again, which suggests to me that this is a global pool limit of some sort.

My development happens against localhost:port, and I don't have Brave's shields on for localhost, but it doesn't seem to matter either way.

My running suspicion is that another of the sites I use is exhausting the websocket pool, and closing a bunch of tabs closes down whatever's exhausting it, but I haven't validated that yet.

@Inlustra

Inlustra commented Sep 11, 2024

Do yourselves a favour and move to ungoogled-chromium. More of the privacy, none of the pain.

Appreciate that a lot of people have been saying that they have issues with their users using Brave, I guess all you can do in that situation is ask your users to stop, add a banner to your webpage explaining that you don't support Brave.

I've been subscribed to this issue for over 2 years and will finally be unsubscribing from it.

@acnebs

acnebs commented Sep 12, 2024

@Inlustra I'm fairly certain this is actually an upstream Chromium issue – I've also noticed it happening with Electron apps and similar. I think you'll eventually see the same things in ungoogled-chromium when you've spent enough time with it.

@markg85

markg85 commented Sep 12, 2024

> @Inlustra I'm fairly certain this is actually an upstream Chromium issue – I've also noticed it happening with Electron apps and similar. I think you'll eventually see the same things in ungoogled-chromium when you've spent enough time with it.

Any chromium-based browser seems to have this effect.

This is just speculation on my part, but I would not be at all surprised if the actual root cause of this behavior is a security feature or something done for security purposes. Why? Most of my developer annoyances with Chrome eventually end up being "security" related. Can't load a resource from a server (CORS, had to fix headers), can't load a site as a file (had to spin up a server so it's localhost instead of file://...), etc...

It would be super if someone knowledgeable about the Chrome code could dive into this one.

@pejrich

pejrich commented Sep 30, 2024

@markg85 those things you mention are legitimate security concerns. CORS is in place to stop JS code making requests as a signed-in user to a website that is expecting requests only to come from itself. The inability of the browser to have free rein over your filesystem is equally a valid security measure. Are these things sometimes annoying to work around? Sure, but nevertheless they are completely valid and sensible security concerns. They are there to stop very real, and very easy, attacks that would exist if they weren't there.

Randomly breaking websockets on a regular basis however is NOT a security feature.

@Inlustra

Inlustra commented Sep 30, 2024

@acnebs @markg85 I've been using ungoogled-chromium for the past year and haven't seen the issue at all, and I leave my browser windows open for weeks in code-server (I was experiencing this issue weekly in Brave)

My personal dashboard uses Websockets, they're open on every new tab. I use code-server, also driven by websockets. I use Unraid, Frigate, all of which are powered by websockets. If there was an issue in Chrome or ungoogled-chromium, I would have noticed it already.

Not saying you're wrong. I did see this issue in Vivaldi too while hopping around for a replacement, so it's likely a Chromium issue, but Chrome and ungoogled-chromium have this fixed, at least from my anecdotal evidence. Would be good to see if anyone else can reproduce in these browsers.

@markg85

markg85 commented Sep 30, 2024

@pejrich Thank you for your analysis but that's unnecessary.

I hadn't put more thought into it other than to go by my own grievances. But apparently that provokes an explanation of why the causes of those grievances exist, which wasn't the point at all. The point is that I assume - without knowing - that a security mitigation at some point in time had an effect on websocket behavior. DoS/DDoS protection could very realistically be a thing. I think (I couldn't find the actual limit, though) that even pressing F5 (or CTRL+F5 or SHIFT+F5) has a maximum number of refreshes it allows per second to prevent flooding.

The issue here resembles DOS/DDOS so it's only logical to assume some preventative measure in the browser itself is affecting this.

@pejrich

pejrich commented Sep 30, 2024

@markg85 You mentioned those two specific examples of CORS and file system access followed by "security" (in quotes), as if to suggest that they are not actually security features, when in reality they are in place for a very good reason. If my comment correcting your inaccuracies is unnecessary, then what, pray tell, is the necessity of your original comment?

@markg85

markg85 commented Sep 30, 2024

Hi @pejrich, I'll happily explain it! We are drifting off-topic though, sorry for that.

To me CORS is a security measure that frustrates development. Why can't I do local development if my URL is file://? It can be assumed that I'm working on my local filesystem, so why throw in CORS to ruin it? That - note the file:// in the URL bar too! - is just pointless security nerfing for no reason, as running it on a localhost webserver is merely a step to satisfy the browser.

I genuinely hate CORS because there are some sites that provide JSON data that is meant to be fetched, but their not having set up CORS correctly makes that hard. So I made a "cors proxy" site that wraps a request in a CORS request and fetches the data behind the scenes, outside the browser, to circumvent such ill-configured sites. It's frustrating that the browser decides to block a request in such cases. In my opinion the browser should stay out of my way and let me do what I tell it to do. If a site doesn't like that, then they should fix their server settings to prohibit it. Now it works like this: "hey, I'm Chrome, do you allow me to request this info? No or you don't know? Then I'll preemptively block the request for my user!" while I want it to be: "hey, I'm Chrome, give me the data for this resource. Not blocked? Sweet, here's the data!". Or to put it differently, I think CORS should be server-side configuration that, when configured, is something the client listens to. No CORS on the server should mean a free, unobstructed fetch.

I'm not questioning the protection CORS offers, i just find it to be too strict.

The point of my original post was to highlight my educated guess that looking for security related measures along with websocket connections could well be a cause of the symptoms we're seeing. It's a "I've got a hunch" starting point that might be useful if someone decided to take a look at the code.

@pejrich

pejrich commented Oct 1, 2024

@markg85 You seem to fundamentally misunderstand why these measures are in place.

If the browser were to simply assume that because you opened a local HTML file, that it should serve any local file, then merely getting someone to open an html file on their computer(which by default would open in the browser) would be enough for me to copy the entire contents of their hard drive and send it to myself. I think it's safe to assume that not everyone who deliberately or accidentally opens an HTML file wants whoever wrote that file to have access to their entire computer.

And your "CORS bypass" proxy is not in any way bypassing CORS protection. Sure, you might be able to access the information on the server, but CORS isn't merely there to stop you accessing the information on the server, it's there to stop a request to that server looking like the user made it. After all, any requests in JS are coming from the user's browser. If they just logged into their bank before visiting my website, and then my site makes a JS request to "bank.com/transfers?amount=1000000&to=MY_ACCOUNT_NUMBER", the browser would include the user's cookie and to the bank it would look like a request from the user. Your "CORS bypass" has no such issue, since when you request the CORS bypass server, the browser won't be including a cookie for the website you ultimately want to access, and therefore there's no security risk.

It's best to understand why these measures are in place before you just assume they exist only to annoy you, or that merely because they annoy you, that they're therefore "too strict"
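
For reference, whether cookies accompany a `fetch` call is controlled by the request's `credentials` mode, which defaults to `same-origin`. A small sketch, assuming a runtime with the standard `Request` class (browsers, or Node 18 and later); `bank.example` is a placeholder host.

```javascript
// The Fetch standard's `credentials` mode decides whether cookies are attached.
const byDefault = new Request("https://bank.example/transfers");
const withCookies = new Request("https://bank.example/transfers", {
  credentials: "include", // opt in to sending cookies on cross-origin requests
});

console.log(byDefault.credentials);   // "same-origin"
console.log(withCookies.credentials); // "include"
```

With `include`, the CORS check additionally requires the server to answer with `Access-Control-Allow-Credentials: true` and a non-wildcard `Access-Control-Allow-Origin` before the calling page may read the response.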

@markg85

markg85 commented Oct 2, 2024

@pejrich There is a persistent misunderstanding here.
I don't mean to say or imply that a local file:// site should allow fetch('file://...'). No, just no! That would be a severe security failure; I can totally see that and I'm not advocating for that at all.

What I mean is that running a site (we're just talking about html/css/js from the same folder) from file://... should relax CORS rules as if you were running on localhost. Hope that makes more sense.

> After all, any requests in JS are coming from the user's browser. If they just logged into their bank before visiting my website, then my site makes a JS request to "bank.com/transfers?amount=1000000&to=MY_ACCOUNT_NUMBER", the browser would include the users cookie and to the bank it would look like a request from the user.

My understanding is different. I wasn't aware a CORS request also sends the browser's cookies. In fact, I'm fairly sure that's not happening when you use fetch. I just want to send a fetch request to something; it should allow that. It doesn't because of CORS, so CORS is getting the beating for being too strict. Please do explain the cookie case, as it seems like an edge case to me. And in that mindset it seems like insanity to let the browser block a request just to handle a possible nefarious edge case. The browser knows a lot about cookies (like who placed them), so why not use that knowledge in requests to allow or deny even getting the cookie? This all smells to me like a "fixed the symptoms, not the cause" case. But then again, I might be completely misunderstanding it, so I'd happily hear your thoughts on it!

> It's best to understand why these measures are in place before you just assume they exist only to annoy you, or that merely because they annoy you, that they're therefore "too strict"

It is surprisingly difficult to find the actual reasons for CORS. All I can find is "because of security", and some include very minimal example use cases. It's hard to find a true in-depth explanation of the actual symptoms CORS was meant to fix. You're doing a better job at that than most of the results I could find! Thank you for that :) (even though we're drifting off-course from the topic; feel free to mail me instead, my email is on my GitHub profile).

@dirkhas

dirkhas commented Dec 6, 2024

I see the broken websocket behavior in Brave too, using the site humaans.io. It's an HR service we use at my workplace. Due to this bug, it fails every. single. day.

@BootsSiR

BootsSiR commented Dec 6, 2024

The Proxmox VM console also breaks because of the WebSocket implementation in Brave.
