Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error after server returns connection close #1441

Closed
skoppe opened this issue Feb 24, 2016 · 13 comments
Closed

Error after server returns connection close #1441

skoppe opened this issue Feb 24, 2016 · 13 comments

Comments

@skoppe
Copy link
Contributor

skoppe commented Feb 24, 2016

Using vibe.d 0.7.29 and dmd 2.070

I am writing an adapter to move data from CouchDB into ElasticSearch. There are aroung 600 small databases that I need to move over. CouchDB is behind an nginx proxy with SSL termination.

I have determined that CouchDB sends a connection:close response header on api call no. 99 (due to some setting) Then things go haywire.

With libevent the app blocks at res.readJson() in the requestHttp. With libasync it fails on the next call to requestHttp with exception saying Socket operation on nonsocket.

What I think happens is that CouchDB closes the connection, and vibe.d or nginx isn't handling SSL finalizing properly.

This is my course of action:

  1. will try https://ma.ttias.be/enable-keepalive-connections-in-nginx-upstream-proxy-configurations/ as per unexpected client disconnction #1400 (didn't do much, although one time I got 299 calls before freezing)
  2. I will look in the nginx and couchdb logs. (didn't say much)
  3. I will get rid of the nginx proxy in the middle and try again. (confirmed this worked)
  4. Find a way how to get my adapter to close the connection before reaching 100, and force opening a new one.

Here is a log for a correct call (libevent):

[02BD0000:02BD3100 trc] returning HTTPClient connection 0 of 1
[02BD0000:02BD3100 trc] --------------------
[02BD0000:02BD3100 trc] HTTP client request:
[02BD0000:02BD3100 trc] --------------------
[02BD0000:02BD3100 trc] GET /a26c3188d-59f4-4aea-80b0-43c523be38da HTTP/1.1
[02BD0000:02BD3100 trc] User-Agent: vibe.d/0.7.27 (HTTPClient, +http://vibed.org/)
[02BD0000:02BD3100 trc] Connection: keep-alive
[02BD0000:02BD3100 trc] Accept-Encoding: gzip, deflate
[02BD0000:02BD3100 trc] Host: app-data.21brains.com
[02BD0000:02BD3100 trc] Accept: application/json
[02BD0000:02BD3100 trc] evbuffer_add (fd 556): 285 B
[02BD0000:02BD3100 trc] Content-Type: application/json
[02BD0000:02BD3100 trc] Authorization: Basic xxxx
[02BD0000:02BD3100 trc] --------------------
[02BD0000:02BD3100 trc] evbuffer_add (fd 556): 94 B
[02BD0000:02BD3100 trc] HTTP client waiting for response
[02BD0000:00000000 trc] socket 556 write event (false)!
[02BD0000:00000000 trc] socket 556 read event!
[02BD0000:02BD3100 trc] HTTP client reading status line
[02BD0000:02BD3100 trc] stln: HTTP/1.1 200 OK
[02BD0000:02BD3100 trc] ---------------------
[02BD0000:02BD3100 trc] HTTP client response:
[02BD0000:02BD3100 trc] ---------------------
[02BD0000:02BD3100 trc] HTTP/1.1 200 OK
[02BD0000:02BD3100 trc] Server: nginx/1.9.5
[02BD0000:02BD3100 trc] Date: Wed, 24 Feb 2016 09:58:46 GMT
[02BD0000:02BD3100 trc] Content-Type: application/json
[02BD0000:02BD3100 trc] Content-Length: 266
[02BD0000:02BD3100 trc] Connection: keep-alive
[02BD0000:02BD3100 trc] Cache-Control: must-revalidate
[02BD0000:02BD3100 trc] Strict-Transport-Security: max-age=31536000
[02BD0000:02BD3100 trc] ---------------------
[02BD0000:00000000 trc] socket 556 read event!

Here the failing one (libevent):

[02BD0000:02BD3100 trc] returning HTTPClient connection 0 of 1
[02BD0000:02BD3100 trc] --------------------
[02BD0000:02BD3100 trc] HTTP client request:
[02BD0000:02BD3100 trc] --------------------
[02BD0000:02BD3100 trc] GET /a26e0aa8c-ca37-4fd0-add7-432b991da964 HTTP/1.1
[02BD0000:02BD3100 trc] User-Agent: vibe.d/0.7.27 (HTTPClient, +http://vibed.org/)
[02BD0000:02BD3100 trc] Connection: keep-alive
[02BD0000:02BD3100 trc] Accept-Encoding: gzip, deflate
[02BD0000:02BD3100 trc] Host: app-data.21brains.com
[02BD0000:02BD3100 trc] Accept: application/json
[02BD0000:02BD3100 trc] evbuffer_add (fd 556): 285 B
[02BD0000:02BD3100 trc] Content-Type: application/json
[02BD0000:02BD3100 trc] Authorization: Basic xxxx
[02BD0000:02BD3100 trc] --------------------
[02BD0000:02BD3100 trc] evbuffer_add (fd 556): 94 B
[02BD0000:02BD3100 trc] HTTP client waiting for response
[02BD0000:00000000 trc] socket 556 write event (false)!
[02BD0000:00000000 trc] socket 556 read event!
[02BD0000:02BD3100 trc] HTTP client reading status line
[02BD0000:02BD3100 trc] stln: HTTP/1.1 200 OK
[02BD0000:02BD3100 trc] ---------------------
[02BD0000:02BD3100 trc] HTTP client response:
[02BD0000:02BD3100 trc] ---------------------
[02BD0000:02BD3100 trc] HTTP/1.1 200 OK
[02BD0000:02BD3100 trc] Server: nginx/1.9.5
[02BD0000:02BD3100 trc] Date: Wed, 24 Feb 2016 09:58:47 GMT
[02BD0000:02BD3100 trc] Content-Type: application/json
[02BD0000:02BD3100 trc] Content-Length: 260
[02BD0000:02BD3100 trc] Connection: close
[02BD0000:02BD3100 trc] Cache-Control: must-revalidate
[02BD0000:02BD3100 trc] Strict-Transport-Security: max-age=31536000
[02BD0000:02BD3100 trc] ---------------------
[02BD0000:00000000 trc] socket 556 read event!
[02BD0000:02BD3100 trc] OpenSSLStream finalize
[02BD0000:02BD3100 trc] evbuffer_add (fd 556): 31 B
[02BD0000:02BD3100 dbg] TCP close request true open
[02BD0000:02BD3100 dia] Actively closing TCP connection
[02BD0000:02BD3100 trc] Closing socket 556...
[02BD0000:00000000 dbg] Socket event on fd 556: 17 (DF4044 vs DF4044)
[02BD0000:00000000 dbg] Connection was closed (fd 556).

and the failing libasync one

[02CA0000:02CA3200 trc] returning HTTPClient connection 0 of 1
[02CA0000:02CA3200 trc] Async resolved address xxx.xxx.xxx.xxx:0
[02CA0000:02CA3200 trc] Acquire Writer
[02CA0000:00000000 trc] Handler
[02CA0000:00000000 trc] onClose
[02CA0000:00000000 trc] exception trap
[02CA0000:02CA3200 CRITICAL] Task terminated with uncaught exception: Socket operation on nonsocket.
[02CA0000:02CA3200 dbg] Full error: object.Exception@C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\drivers\libasync.d(1451): Socket operation on nonsocket.
[02CA0000:02CA3200 dbg] ----------------
0x004FCDF7 in @safe void vibe.core.core.VibeDriverCore.processDeferredExceptions(vibe.core.task.Task) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\core.d(1327)
0x004FC47F in @safe void vibe.core.core.VibeDriverCore.yieldForEvent() at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\core.d(1203)
0x004EA110 in vibe.core.drivers.libasync.LibasyncTCPConnection vibe.core.drivers.libasync.LibasyncDriver.connectTCP(libasync.windows.NetworkAddress) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\drivers\libasync.d(292)
0x004E626E in vibe.core.net.TCPConnection vibe.core.net.connectTCP(libasync.windows.NetworkAddress)
0x004E080D in vibe.core.net.TCPConnection vibe.core.net.connectTCP(immutable(char)[], ushort) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\net.d(96)
0x004965A9 in bool vibe.http.client.HTTPClient.doRequest(scope void delegate(vibe.http.client.HTTPClientRequest), bool*, bool, std.datetime.SysTime) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\http\client.d(534)
0x004963C5 in bool vibe.http.client.HTTPClient.doRequestWithRetry(scope void delegate(vibe.http.client.HTTPClientRequest), bool, out bool, out std.datetime.SysTime) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\http\client.d(472)
0x00495F87 in void vibe.http.client.HTTPClient.request(scope void delegate(scope vibe.http.client.HTTPClientRequest), scope void delegate(scope vibe.http.client.HTTPClientResponse)) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\http\client.d(406)
0x004A392D in void vibe.http.client.requestHTTP(vibe.inet.url.URL, scope void delegate(scope vibe.http.client.HTTPClientRequest), scope void delegate(scope vibe.http.client.HTTPClientResponse), vibe.http.client.HTTPClientSettings) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\http\client.d(137)
0x0049AEAD in void vibe.http.client.requestHTTP(immutable(char)[], scope void delegate(scope vibe.http.client.HTTPClientRequest), scope void delegate(scope vibe.http.client.HTTPClientResponse), vibe.http.client.HTTPClientSettings) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\http\client.d(107)
0x004318F3 in couch.DBInfo couch.getDbInfo!(settings.Settings).getDbInfo(settings.Settings, immutable(char)[]) at c:\Users\Bas\dev\21brains\sourcing-tool-log-adapter-service\source\couch.d(145)
0x00406CE1 in void app.main().__lambda1() at c:\Users\Bas\dev\21brains\sourcing-tool-log-adapter-service\source\app.d(88)
0x00420BBA in void vibe.core.core.makeTaskFuncInfo!(void delegate()).makeTaskFuncInfo(ref void delegate()).callDelegate(vibe.core.core.TaskFuncInfo*) at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\core.d(499)
0x004D98B2 in void vibe.core.core.CoreTask.run() at C:\Users\Bas\AppData\Roaming\dub\packages\vibe-d-0.7.27\source\vibe\core\core.d(1085)
0x005AC99D in void core.thread.Fiber.run()
[02CA0000:02CA3200 trc] Emitting signal
[02CA0000:02CA3200 trc] Looping signals. found: 0
@skoppe
Copy link
Contributor Author

skoppe commented Feb 24, 2016

I can confirm that removing the nginx proxy resolved the issue. I will try fiddling with nginx's settings, as there seems to be going something wrong there.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 24, 2016

I reduced my code to this:

import vibe.d;

void main(){
    import std.stdio;
    runTask({
        writeln("Making Request");
        requestHTTP("https://www.google.com",
            (scope req)
            {
                writeln("Sending Headers");
                req.method = HTTPMethod.GET;
                req.headers["Connection"] = "close";
            },
            (scope res)
            {
                writeln("Reading Response");
                scope(exit) { res.dropBody(); }
                if (res.statusCode < 200 || res.statusCode > 299)
                {
                    writeln(res);
                }
                writeln(res.headers);
                writeln(cast(char[])res.bodyReader.readAll); // gets stuck here
            }
        );
        writeln("Never reaches here!");
        exitEventLoop();
    });
    runEventLoop();
};

I added Connection: close header to simulate what I get back from CouchDB.

Is this normal?

Edit: to cause libasync to crash you need to do the requestHTTP(...) twice.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 25, 2016

I keep returning to https://github.com/rejectedsoftware/vibe.d/blob/3d851408a813523b73a003805a58f5c5f14c53aa/source/vibe/core/drivers/libevent2_tcp.d#L191.

In my case the request method in HTTPClient finds the Connection: close headers and calls the disconnect method. That one in turn calls finalize on the m_stream object, which happens to be
an instance of the OpenSSLStream since I'm making SSL requests. In the finalize method OpenSSL's SSL_shutdown function is called, which, per the specs, sends a notification to the peer that its shutting down. The next thing the disconnect method does is calling the close method on the - in my case - Libevent2TCPConnection instance. Everybody tidies up, except execution never continues after calling yieldForEvent().

I changed the onBioWrite function in openssl.d to not write the closing notification and behold - all is well.

Somehow it seems vibe.d is waiting for some event to never happen, or some queue to drain that never does.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 25, 2016

I changed this master...skoppe:ssl-conn-close and it worked for libevent.

@s-ludwig
Copy link
Member

This issue might also explain dlang/dub#760, even though the trace logs seem to indicate something else, but I'll have to double-check.

So the problem appears to be that during the active connection close, the connection got reset by the peer before that TLS close notification got delivered (at least for libevent). Due to that, an onSocketEvent gets emitted before onSocketWrite, which actually then never gets called. The reason for this bug seems to be the fix for #1376. I'll have to take a closer look to find a correct and clean fix for both.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 26, 2016

Yep that is correct. onSockerWrite is never called and the yieldForEvent() in the close function never resumes.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 26, 2016

The reason yieldForEvent() doens't resume is because the task isn't resumed in onSocketEvent due to the if (ctx.readOwner && ctx.readOwner.running && ctx.state != ConnectionState.activeClose) evaluating to false.

@s-ludwig
Copy link
Member

Exactly, that check got introduced to fix #1376. I think there simply needs to be an additional state to keep track of the resumption state of the task.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 26, 2016

What about removing res.headers.get("Connection") == "close" from if (user_exception || res.headers.get("Connection") == "close") disconnect(); in client.d ??

I mean, we are actively closing a connection, that is already closed on the other end. We just need to yield to receive the BEV_EVENT_EOF event (and maybe set ctx.state to ConnectionState.passiveClose to cause further reads/writes to throw.)

@s-ludwig
Copy link
Member

The thing is that there is definitely a bug in the libevent driver, so that has to be fixed anyway. Letting the server do the disconnect would probably still be a good idea, but it would have to be somewhat more complex (i.e. actually wait for the server to close the connection and actively close after a timeout if necessary).

I've quickly tested with an additional state, but then I stumbled over another issue involving onSocketRead+onSocketEvent triggering a "task already scheduled to be resumed" assertion. I think I'll just have to hack around it using a separate Boolean. On the plus side, the next major version of the vibe.core module won't have this issue, because it uses a unified approach to yield+resume that works a little differently.

@s-ludwig
Copy link
Member

PR #1443 looks like it works. I'll also test that w.r.t. the dub-registry issue.

@skoppe
Copy link
Contributor Author

skoppe commented Feb 26, 2016

will test tomorrow

@skoppe
Copy link
Contributor Author

skoppe commented Feb 27, 2016

test is green

s-ludwig added a commit that referenced this issue Feb 27, 2016
(cherry picked from commit ac11197)
s-ludwig added a commit that referenced this issue Feb 29, 2016
This assertion got triggered in a number of harmless situations and it's not always easy to avoid. Trying to avoid it, it has already caused a rather bad task starvation bug (#1441), so at this point it seems to be better to get rid of it.

See also #1407
s-ludwig added a commit that referenced this issue Apr 10, 2016
(cherry picked from commit ac11197)
s-ludwig added a commit that referenced this issue Apr 10, 2016
This assertion got triggered in a number of harmless situations and it's not always easy to avoid. Trying to avoid it, it has already caused a rather bad task starvation bug (#1441), so at this point it seems to be better to get rid of it.

See also #1407

(cherry picked from commit 621e34b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants