This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

CRITICAL: ICE Procedure failed in 4.3.x #526

Open
MrSmith87 opened this issue May 23, 2020 · 24 comments

Comments

@MrSmith87

Hi Guys,

I have been facing a connectivity issue with the ICS MCU roughly once in every 7-8 connection attempts. I believe this started after I deployed your latest branch last month to pick up the TLS 1.2 change (the 4.3 release had stopped working in Safari).
Ref
https://github.com/open-webrtc-toolkit/owt-server/pull/466

This is critically impacting our production services, so I request that you look into it as a priority.
Please find the client-side and server-side errors below.

WEBRTC LOG

2020-05-23 06:39:39.939 - INFO: WorkingNode - pid: 21638
2020-05-23 06:39:39.941 - INFO: WorkingNode - Connecting to rabbitMQ server...
2020-05-23 06:39:39.953 - INFO: AmqpClient - Connecting to rabbitMQ server OK, options: { host: 'localhost', port: 5672 }
2020-05-23 06:39:39.976 - INFO: InternalConnectionFactory - QUIC is not enabled for internal IO
2020-05-23 06:39:39.983 - INFO: WorkingNode - webrtc-********* as rpc server ready
2020-05-23 06:39:39.987 - INFO: WorkingNode - webrtc-********* as monitor ready
2020-05-23 06:43:35,894 - WARN: dtls.SSL - failed in error
2020-05-23 06:43:35,894 - ERROR: dtls.DtlsSocket - SSL error 1
2020-05-23 06:43:35,894 - WARN: dtls.DtlsSocketContext - DTLS Handshake Failure error:14102412:SSL routines:dtls1_read_bytes:sslv3 alert bad certificate
2020-05-23 06:43:35,894 - WARN: DtlsTransport - id: 748717144211681300, message: Handshake failed, transportName:video, openSSLerror: error:14102412:SSL routines:dtls1_read_bytes:sslv3 alert bad certificate
2020-05-23 06:43:35,895 - ERROR: WebRtcConnection - id: 748717144211681300, message: Transport Failed, transportType: video
2020-05-23 06:43:35.895 - WARN: Connection - message: failed the ICE process, code: 502, id: 748717144211681300
2020-05-23 06:43:35.896 - WARN: WrtcConnection - ICE failed, 500 748717144211681300
2020-05-23 06:54:20.144 - WARN: WorkingNode - Exiting on SIGTERM

CONFERENCE LOGS

2020-05-23 06:39:39.698 - DEBUG: WorkingNode - No native logger for reconfiguration
2020-05-23 06:39:39.738 - INFO: WorkingNode - pid: 21597
2020-05-23 06:39:39.738 - INFO: WorkingNode - Connecting to rabbitMQ server...
2020-05-23 06:39:39.749 - INFO: AmqpClient - Connecting to rabbitMQ server OK, options: { host: 'localhost', port: 5672 }
2020-05-23 06:39:39.957 - INFO: WorkingNode - conference-f5ec6862554b1609bfec@xxxx as rpc server ready
2020-05-23 06:39:39.960 - INFO: WorkingNode - conference-f5ec6862554b1609bfec@xxxx as monitor ready
2020-05-23 06:43:35.897 - INFO: AccessController - onFailed, sessionId: 748717144211681300 reason: Ice procedure failed.

Client side error:

[Image: Screenshot 2020-05-23 at 11 59 54 AM]

@starwarfan
Collaborator

Hi,
We cannot reproduce this issue on the latest 4.3.x branch. Did this happen on a specific browser or on all browsers?

@MrSmith87
Author

I am able to replicate it multiple times, at least in Chrome.

@starwarfan
Collaborator

According to the logs, the client sent a bad certificate to the server.
Can you replicate it by refreshing the sample page on the latest Chrome?
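
For reference, the hex code in the logged `error:14102412:...` string can be unpacked to see which alert is involved. The following is a minimal sketch, assuming the OpenSSL 1.0.x/1.1.x packing (8-bit library, 12-bit function, 12-bit reason); OpenSSL 3.x packs the code differently. The function name `decode_openssl_error` and the small alert table are illustrative and not part of OWT. In OpenSSL 1.x, reason values of 1000 plus an alert number are generally reported when that alert is read from the peer.

```python
import re

# Partial table of TLS alert numbers (from the TLS specification).
TLS_ALERTS = {42: "bad_certificate", 50: "decode_error"}

# OpenSSL 1.x offsets alert-derived reason codes by this amount.
SSL_AD_REASON_OFFSET = 1000


def decode_openssl_error(text):
    """Split an OpenSSL 1.x 'error:XXXXXXXX:' code into its packed fields."""
    match = re.search(r"error:([0-9A-Fa-f]{8}):", text)
    if match is None:
        raise ValueError("no OpenSSL error code found")
    code = int(match.group(1), 16)
    reason = code & 0xFFF
    # Reasons at or above the offset encode a TLS alert number.
    alert = reason - SSL_AD_REASON_OFFSET if reason >= SSL_AD_REASON_OFFSET else None
    return {
        "lib": (code >> 24) & 0xFF,    # 20 is the SSL library
        "func": (code >> 12) & 0xFFF,  # function id inside that library
        "reason": reason,
        "alert": alert,
        "alert_name": TLS_ALERTS.get(alert),
    }


line = ("DTLS Handshake Failure error:14102412:SSL routines:"
        "dtls1_read_bytes:sslv3 alert bad certificate")
print(decode_openssl_error(line))
```

For the log line above this yields reason 1042, that is, alert 42 (`bad_certificate`).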

@MrSmith87
Author

I just figured out that one of the intermediate certificates installed in the service had expired, while the root certificates were fine. I replaced it and will monitor this issue for the next few days before closing this ticket.
However, the odd part is this: if the expired certificate was the issue, why did it work most of the time and fail only occasionally?

@starwarfan
Collaborator

starwarfan commented Jun 5, 2020

Hi, what's your intermediate certificate?
I remember the certificate is generated during ICE in 4.3.x.

@MrSmith87
Author

Oh, my bad.
I was referring to the SSL certificates (domain, root, and intermediate certificates) installed in the sample service application that runs the REST APIs.
Now I understand that you meant the certificate used for the DTLS handshake.

BTW, do you have any idea under what scenarios such a failure can happen?

@starwarfan
Collaborator

Hi,
When certificate setting was supported a long time ago, a DTLS failure could happen when the certificate was larger than the MTU, because certificate segmentation was not handled. But I don't think this issue is similar.
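
To rule out the MTU case described above, one can compare the encoded size of a certificate against a datagram budget. This is a rough sketch only: the 1200-byte budget is an assumed value (real path MTU varies, and the DTLS record and handshake headers add overhead), and the helper names are hypothetical rather than anything provided by OWT.

```python
import ssl

# Assumed payload budget for one datagram; adjust to the actual path MTU.
ASSUMED_DATAGRAM_BUDGET = 1200


def certificate_der_size(pem_text):
    """Return the size in bytes of a PEM certificate once decoded to DER."""
    return len(ssl.PEM_cert_to_DER_cert(pem_text))


def exceeds_budget(pem_text, budget=ASSUMED_DATAGRAM_BUDGET):
    """True if the DER certificate alone is already larger than the budget."""
    return certificate_der_size(pem_text) > budget
```

With a real certificate, the PEM text would be read from the certificate file and passed to `exceeds_budget`. A result of `False` does not prove the handshake fits in one datagram, since a certificate chain and message headers also count.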

@MrSmith87
Author

I see.
Can you relate it to some change made in Release 4.3, given that Release 4.2.1 never had this issue? I know people who downgraded from 4.3 to 4.2.1 in production because of similar issues, and things are fine for them now.

@MrSmith87
Author

Hi,

I have upgraded the server to the latest 4.3.x and also moved my deployment to a cloud instance with dedicated CPU (8 cores, 32 GB RAM) to make sure it's not an environment issue. I am getting the exact same error, and quite frequently. Please help!

@MrSmith87
Author

Hello Team,

This issue is severely impacting functionality. Given that this is surely a critical issue for everyone using your MCU in production, it's important to identify and resolve it as soon as possible.

@MrSmith87
Author

I was reviewing the server user guide and came across this line: "But if you meet DTLS SSL connection error in webrtc-agent, please use 1024-bit instead of 2048-bit private key because of a known network MTU issue." Do you think it's related, given that I am using a 2048-bit private key?
But if that is the case, shouldn't I be getting this error every time?
Kindly help!

@MrSmith87
Author

Hello Team,

Any update on this? I am sure you recognize how critical this issue is, so please treat it as a top priority.

Regards
Sumit

@mattskinosix

mattskinosix commented Aug 14, 2020

I have the same problem on Chrome beta build 85.0.4183.69 with OWT 4.3.1.
I don't have any problem with Chrome Stable.

@kurapatijayaram

Hi @mattskinosix,
I am facing this issue with Chrome stable versions 84 and 85.

@mattskinosix

mattskinosix commented Aug 26, 2020

The problem is solved by change #599. It changes the TLS version used.

@kurapatijayaram

Thanks for the reply, @mattskinosix. Do I have to pull the latest code from master to overcome this issue?

@mattskinosix

mattskinosix commented Aug 26, 2020 via email

@kurapatijayaram

One more question, @mattskinosix: does this problem also exist in v4.2.1?

@mattskinosix

mattskinosix commented Aug 26, 2020 via email

@ry28

ry28 commented Oct 12, 2020

I have the same issue. Setting the x509 version does not fix it either.

@ry28

ry28 commented Oct 16, 2020

Hello Team,

Any update on this?
Today I have the same problem again.

@mattskinosix

mattskinosix commented Oct 16, 2020 via email

@ry28

ry28 commented Oct 16, 2020

Thanks for the reply, @mattskinosix.

Version 4.3.1 fixes issue #614, whose error is "dtls1 read bytes: tlsv1 alert decode error".
But issue #526 is about "dtls1_read_bytes:sslv3 alert bad certificate".

So is this a separate problem?

@mattskinosix

mattskinosix commented Oct 16, 2020 via email
