Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recent BearSSL updates not honoring client.setInsecure() for Public WiFi that requires acceptance of "Terms and Conditions" at redirected address #5019

Closed
6 tasks done
gojimmypi opened this issue Aug 8, 2018 · 15 comments
Assignees
Labels
waiting for feedback Waiting on additional info. If it's not received, the issue may be closed.

Comments

@gojimmypi
Copy link
Contributor

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: NodeMCU 1.0 ESP-12E ESP82266
  • Core Version: commit 7dd2ca3 (working, recent git pull will cause failure)
  • Development Env: Visual Studio w/ Visual Micro
  • Operating System: Windows 10

Settings in IDE

  • Module: [Generic ESP8266 Module|Wemos D1 mini r2|Nodemcu|other]
  • Flash Mode: [qio|dio|other]
  • Flash Size: 4MB
  • lwip Variant: |v2 Lower Memory
  • Reset Method: [ck|nodemcu]
  • Flash Frequency: [40Mhz]
  • CPU Frequency: [160MHz]
  • Upload Using: [SERIAL]
  • Upload Speed: 256000] (serial upload only)

Problem Description

Hello.

Recent updates (since June 13) have apparently disabled or otherwise now ignore the (BearSSL::WiFiClientSecure)client.setInsecure() to no longer allow an "insecure" connection. Fortunately I had a backup of my \hardware\esp8266com\esp8266 to confirm this.

So why would one want to allow an insecure connection? Well, when you are connecting to a Public WiFi access point that asks to "accept the terms and conditions" of use. (It's our own WiFi, in a corporate environment). When doing a client.connect(host, httpPort) over SSL, the connection simply now fails. I suspect the problem is I'm trying to connect to a "real" TLS/SSL address, and the WiFi sneaks in and forces the "response" to instead come from 192.0.2.1 (a pseudo web server of sorts at the AP)- the nature of SSL under normal conditions, is that yes - it failed. But during intial connection time, I'm actually expecting this. (and it recently worked)

If interested in the details, I have a doAcceptTermsAndConditions() in my development branch:

https://github.com/gojimmypi/DesktopDashboard/blob/development/htmlHelper.cpp#L722

@devyte
Copy link
Collaborator

devyte commented Aug 9, 2018

CC @earlephilhower

@devyte
Copy link
Collaborator

devyte commented Aug 9, 2018

@gojimmypi your linked code is too long. Please reduce to a MCVE sketch for testing.

@devyte devyte added the waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. label Aug 9, 2018
@gojimmypi
Copy link
Contributor Author

gojimmypi commented Aug 9, 2018

ok, will do. I'm having some difficulty in coming up with a tiny demo that shows the problem, as once I am connected with working code, the AP allows me to reconnect for quite some time, no longer redirecting to the terms and conditions page. Please be patient and I will supply an example as soon as I can.

in the meantime for reference - the breaking code is in commit f776454

@earlephilhower
Copy link
Collaborator

The traveling today so can't get into details, but the referenced commit doesn't affect anything but a newline to the serial port. Can you check that's the right one?

SetInsecure just disables the SSL checking. It always returns OK when asked by the backed.

There is no concept of a redirect at the tcp level, only at the https one. You need to be able to negotiate a SSL connection and then handle the http protocol and the returned 301 (or whatever the permanent or temporary redirect code is...)

A reproducible sketch would help explain what you're trying to do...

@gojimmypi
Copy link
Contributor Author

@earlephilhower I should have been more clear; the working code is 7dd2ca3 - somewhere between there and f776454 a change was made to cause client.connect(host, httpPort) to fail when the intended destination is replaced with the AP pseudo-web "terms and conditions" page. It will be at least Aug 20 before I can revisit this.

@earlephilhower
Copy link
Collaborator

Ah, that would make more sense, @gojimmypi.

Just to be clear, setInsecure can and could never actually connect to a non-TLS server, it still requires encryption. It just does not validate the certificate (so it is insecure in that you could be subject to a MITM attack and would not notice it).

We'll follow up when you get some time after the 20th.

Thx
-EFP3

@gojimmypi
Copy link
Contributor Author

@earlephilhower could you please clarify:

can and could never actually connect to a non-TLS server

when attempting to connect to a valid TLS server with a current certificate and all... what would you expect to happen if the "terms and conditions" web server on the AP inserted itself to show that page before proceeding? I would think that the page would be flagged as "not secured", no? That would essentially be acting as MITM (thus why I thought setInsecure resolved this for me)

if it is not related to the setInsecure - then I wonder what is different between the two commits to allow the older version to connect, but the new one to fail. I've tested repeatedly and can confirm something changes to disallow the connection on that latest commit when the Terms and Conditions page is presented.

thanks for feedback, I'm certainly curious as to the cause and solution here.

@earlephilhower
Copy link
Collaborator

@gojimmypi I think you're saying the AP is serving fake DNS response with a short TTL so that, say, www.github.com points to 10.0.0.1 or whatever the portal is, yes? Or are you saying it does a redirect, which is a HTTP-level thing and not at the TCP stack level?

If www.site.com DNS resolves to 10.0.0.1, then the SSL object will try and open a TCP connection to 10.0.0.1:443 and begin the TLS handshake. As long as the AP on that port handles TLS negotiation and sends back any certificate, (i.e. one valid for 10.0.0.1 and not www.site.com) it should work. If the AP doesn't have a https login page, though, it won't connect since it can not fallback to :80 un-encrypted comms.

If that's the case, I'll need some wire dumps to see what's going on. I don't have any way to reproduce it otherwise except when I'm stuck in a hotel. While I like the 8266 and its quirks, I don't like it enough to bring it on business trips. :)

@earlephilhower
Copy link
Collaborator

One other thing, the setInsecure does not live through connect attempts.

That was an API change before the release where each connection reuse was set to "start fresh."

So, if you're only doing setInsecure one time at the start of the app, just move it to right before the connect() line and you may be all set.

@gojimmypi
Copy link
Contributor Author

@earlephilhower interesting questions: no, I didn't think that fake DNS is being used (although indeed I suppose that would be one way to implement that).

I'm thinking that this is purely a redirect; I recall when the new 1.1.1.1 dns came out, I was wondering what would happen to my code (that was the old location of the AP pseudo web server). Indeed the IT folks ended up changing the AP software, breaking my code, and now the address is http://192.0.2.1

see: https://github.com/gojimmypi/DesktopDashboard/blob/development/htmlHelper.cpp#L744

and here'e where I manually do the get/redirect (but on port 80, now making me wonder if this new problem is at all related the TLS):

https://github.com/gojimmypi/DesktopDashboard/blob/development/htmlHelper.cpp#L797

(Admittedly this is quite a hack, but I learned a lot; I definitely need to clean up the code)

I think this is the most promising comment:

So, if you're only doing setInsecure one time at the start of the app, just move it to right before the connect() line and you may be all set.

As yes, I believe I set it only once on the object at init time. Unfortunately I don't have access to this from home, but I am looking forward to exploring it more. I'm not sure I'll be able to provide any wire dumps, as I don't have access to the AP, and I certainly don't have WireShark for the ESP8266 ;)

I did play with this AP packet forwarder:

https://gojimmypi.blogspot.com/2017/08/openwrt-remote-network-wireshark-packet.html

but alas this would the forwarding router AP itself to be a client - and I'd lose the actual ESP8266/BearSSL-AP packet exchange.

Do you have any tips on how to sniff these packets, other than in my own code?

stuck at a hotel, eh? Hopefully you've been enjoying DefCon!

@earlephilhower
Copy link
Collaborator

Can you please simplify what you're doing and where it fails into a small MCVE?

I've looked over bits of the code and, from the descriptions I don't even see any SSL connections in the doAcceptTermsAndConditions() function you referenced, only htmlSend()s to port 80. (I found the delay(5000); comment inexplicably hilarious for some reason. :) )

I'm not sure what the comparison with client == NULL will boil down to, especially since client is a global variable and not a pointer and doesn't have an operator ==, only an operator bool():
https://github.com/gojimmypi/DesktopDashboard/blob/81143a9e8a5cdee7a89ad6bfa3dc326e18a5475c/DesktopDashboard.ino#L496-L512

It's always safe to setInsecure() on the client, and takes no time (sets 1 flag and returns), so you can move that down unconditionally right before the connect() a little bit later.

W/o the MCVE, though I'm not sure there's much we can help with here. If you can't control the AP, then you probably can't get a good packet trace. But at this point, I'd recommend making a MCVE to pinpoint the problem, which could very well be elsewhere...

@gojimmypi
Copy link
Contributor Author

Just a heads up that the sprinkling of more setInsecure() before each connect() did not resolve.

I'm not sure what the comparison with client == NULL will boil down to

This is half-baked code that will eventually pass a pointer to a client object in a helper-class wrapper.

(I found the delay(5000); comment inexplicably hilarious for some reason.

lol - that is actually a critical piece of code, without it the captcha gets triggered.

Thanks for your patience as I've not yet had time for the MCVE.

@earlephilhower
Copy link
Collaborator

Check PR #5120 . The connection flags got stomped on in some cases which may have caused setInsecure to be removed right when the connection was made.

@gojimmypi
Copy link
Contributor Author

Hm. Yes, #5120 is interesting and similar, however I'm not sure that's my exact problem. Even when using client.setInsecure() immediately before the client.connect() it still fails if having previously connected via "Accept Terms and Conditions". (or are you saying that settings still somehow get stomped on before I can connect?)

I've created a mini sample (not quite ready to call it an MCVE)... but at least it is vastly simpler:

https://github.com/gojimmypi/MCVE_5019

The client.connect(DASHBOARD_HOST, httpPort) in the loop() fails on port 443 if previously connected to an AP that required acceptance of terms and conditions during setup(). I am able to confirm download of public internet files on port 80 after acceptance of terms.

Code works fine with my AP without a guest prompt redirection.

I'll need to make some time soon at the day job to see if I can narrow this down any more.

Any suggestions as to client object properties to help diagnose?

@gojimmypi
Copy link
Contributor Author

gojimmypi commented Sep 24, 2018

@earlephilhower I finally discovered the problem with client.setInsecure() - the issue was that despite the Insecure setting, the cert was still being validated on a date basis, returning this information in getLastSSLError:

client status = 4
client available = 0
client getLastSSLError = 54 // BR_ERR_X509_EXPIRED
client getWriteError = 0

See tools/sdk/include/bearssl/bearssl_x509.h for all the BR_ERR codes

Compounding my problem is the fact that not only does my device not have a power-loss-tolerant RTC, but our internal network does not allow outbound ntp traffic. (see also #1679); there may also be issues with timezone (see #4637)

I've confirmed the cert date validation despite client.setInsecure() setting was definitely the case in 7dd2ca3 - however the problem seems to have since been fixed as of Sep 18 commit 1a44f79, as that version no longer exhibits this behavior - so I will close this issue.

I've updated my MCVE on this topic, and here's some time-setting code in case someone runs into a similar time problem:

#include <sys/time.h> // for struct timeval
#include <time.h>

#define RTC_TEST 1537577487 // GMT Epoch see https://www.epochconverter.com/

// *******************************************************************************************
void setupLocalTime()
// *******************************************************************************************
{
	// see https://github.com/esp8266/Arduino/issues/4637
	time_t now; 
	now = time(nullptr); // if there's no time, this will have a value of 28800; Thu Jan  1 08:00:00 1970 
	Serial.print("Initial time:"); Serial.println(now);
	Serial.println(ctime(&now));

	int myTimezone = -7;
	int dst = 0;
	int SecondsPerHour = 3600;
	int MAX_TIME_RETRY = 60;
	int i = 0;

	// it is unlikely that the time is already set since we have no battery; 
	// if no time is avalable, then try to set time from the network
	if (now <= 1500000000) {
		// try to set network time via ntp packets
		configTime(myTimezone * SecondsPerHour, dst * 0, "pool.ntp.org", "time.nist.gov");

		// Starting in 2007, most of the United States and Canada observe DST from
		// the second Sunday in March to the first Sunday in November.
		// example setting Pacific Time:
		setenv("TZ", "PST8PDT,M3.2.0/02:00:00,M11.1.0/02:00:00", 1); // see https://users.pja.edu.pl/~jms/qnx/help/watcom/clibref/global_data.html
		//                     | month 3, second Sunday at 2:00AM
		//                                    | Month 11 - first Sunday, at 2:00am  
		// Mm.n.d
		//   The dth day(0 <= d <= 6) of week n of month m of the year(1 <= n <= 5, 1 <= m <= 12, where 
		//     week 5 means "the last d day in month m", which may occur in the fourth or fifth week).
		//     Week 1 is the first week in which the dth day occurs.Day zero is Sunday.


		tzset();
		Serial.print("Waiting for time(nullptr).");
		i = 0;
		while (!time(nullptr)) {
			Serial.print(".");
			delay(1000);
			i++;
			if (i > MAX_TIME_RETRY) {
				Serial.println("Gave up waiting for time(nullptr) to have a valid value.");
				break;
			}
		}
	}
	Serial.println("");

	// wait and determine if we have a valid time from the network. 
	now = time(nullptr);
	i = 0;
	Serial.print("Waiting for network time.");
	while (now <= 1500000000) {
		Serial.print(".");
		delay(1000); // allow a few seconds to connect to network time.
		i++;
		now = time(nullptr);
		if (i > MAX_TIME_RETRY) {
			Serial.println("Gave up waiting for network time(nullptr) to have a valid value.");
			break;
		}
	}
	Serial.println("");

	// get the time from the system
	Serial.print("Network time:");  Serial.println(now);
	Serial.println(ctime(&now));

	// TODO - implement a web service that returns current epoch time to use when NTP unavailable (insecure SSL due to cert date validation)

	// some networks may not allow ntp protocol (e.g. guest networks) so we may need to fudge the time
	if (now <= 1500000000) {
		Serial.println("Unable to get network time. Setting to fixed value. \n");
		// set to RTC text value
		// see https://www.systutorials.com/docs/linux/man/2-settimeofday/
		//
		//struct timeval {
		//	time_t      tv_sec;     /* seconds */
		//	suseconds_t tv_usec;    /* microseconds */
		//};
		timeval tv = { RTC_TEST, 0 };
		//
		//struct timezone {
		//	int tz_minuteswest;     /* minutes west of Greenwich */
		//	int tz_dsttime;         /* type of DST correction */
		//};
		timezone tz = { myTimezone * 60 , 0 };  

		// int settimeofday(const struct timeval *tv, const struct timezone *tz);
		settimeofday(&tv, &tz);
	}

	now = time(nullptr);
	Serial.println("Final time:");  Serial.println(now);
	Serial.println(ctime(&now));
}

Edit: Added code comments to setenv. Also: Anyone interested in time issues should also see #4637 regarding "Serious issues with timezone support in time functions". I myself didn't see any timezone problems when using this to set Pacific time in my code, above:

setenv("TZ", "PST8PDT,M3.2.0/02:00:00,M11.1.0/02:00:00", 1)

(although I am curious if it will actually work properly when time changes later in the year)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for feedback Waiting on additional info. If it's not received, the issue may be closed.
Projects
None yet
Development

No branches or pull requests

3 participants