
Occasional delays caused by TCP NoDelay config option defaulting to false #334

Closed

deblockt opened this issue Oct 5, 2020 · 25 comments

Labels: type: regression (A regression from a previous release)
Milestone: 0.8.6.RELEASE

Comments

deblockt commented Oct 5, 2020

Bug Report

Versions

  • Driver: 0.8.3.RELEASE - 0.8.4.RELEASE
  • Database: postgres
  • Java: 11
  • OS: Ubuntu

Current Behavior

Upgrading r2dbc-postgresql from 0.8.3.RELEASE to 0.8.4.RELEASE causes performance issues.
On our API we see roughly a doubling of response times.

Example Zipkin traces from the API.
With 0.8.3.RELEASE

[Screenshot: Zipkin trace, 2020-10-05 15-29-30]

With 0.8.4.RELEASE

[Screenshot: Zipkin trace, 2020-10-05 15-29-45]

You can see that, with 0.8.4.RELEASE, the SQL queries are executed faster, but there are gaps between the individual executions.

Table schema

Steps to reproduce

I have created a project to reproduce this issue.

0.8.3.RELEASE: https://github.com/deblockt/r2dbc-issue-reproducer/tree/perf-issue-0.8.3
0.8.4.RELEASE: https://github.com/deblockt/r2dbc-issue-reproducer/tree/perf-issue-0.8.4

This project performs a number of insertions into the database (a table with one column).
On my laptop the script takes 200ms with 0.8.3.RELEASE and 500ms with 0.8.4.RELEASE.

Expected behavior/code

No performance regression with the 0.8.4.RELEASE version.

Possible Solution

Additional context

deblockt added the status: waiting-for-triage (An issue we've not yet triaged) label on Oct 5, 2020
mp911de (Collaborator) commented Oct 5, 2020

Between 0.8.3 and 0.8.4 we fixed a few deadlocks and ensured proper backpressure propagation. Do you have an idea what's happening in the huge gap between the two blocks of queries? Can you take a thread dump and optionally add a bit of logging to see where the application gets stuck?

Other than that, we need to debug/profile the driver to find out what's going on.

deblockt (Author) commented Oct 5, 2020

I don't know what's happening in the gap. There is nothing there; the code looks like this:

firstBlock()
 .flatMap(result -> secondBlock())

I will try to take a thread dump.
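
For illustration only: one way to add the logging mp911de suggested is Reactor's log() operator, which timestamps onSubscribe/onNext/onComplete signals so the gap becomes visible in the application log. This is a minimal sketch; firstBlock/secondBlock are placeholders standing in for the application's real publishers:

import reactor.core.publisher.Mono;

public class GapLogging {

    public static void main(String[] args) {
        // wrap both blocks with log(); the gap then shows up between
        // firstBlock's onComplete and secondBlock's onSubscribe
        firstBlock()
                .log("firstBlock")
                .flatMap(result -> secondBlock().log("secondBlock"))
                .block();
    }

    // placeholder stand-ins for the real reactive blocks of the application
    static Mono<String> firstBlock() {
        return Mono.just("first");
    }

    static Mono<String> secondBlock() {
        return Mono.just("second");
    }
}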

deblockt (Author) commented Oct 6, 2020

No thread dump for now, but I have created a very simple endpoint (only one insertion) that shows the issue.

The example is available in this project: https://github.com/deblockt/r2dbc-issue-reproducer/tree/perf-issue-0.8.3 for version 0.8.3 and https://github.com/deblockt/r2dbc-issue-reproducer/tree/perf-issue-0.8.4 for version 0.8.4.

I will try to take a thread dump this evening.

mp911de (Collaborator) commented Oct 7, 2020

Thanks a lot. What immediately becomes visible is a gap of several tens of milliseconds between the log outputs:

2020-10-07 10:50:17.674 DEBUG [-,93cf6dd2e00c1412,93cf6dd2e00c1412,true] 521405 --- [tor-tcp-epoll-5] io.r2dbc.postgresql.QUERY                : Executing query: INSERT INTO applied_message (id) VALUES ($1) RETURNING *
2020-10-07 10:50:17.715  INFO [-,cde0e5665f4fe46d,cde0e5665f4fe46d,true] 521405 --- [tor-tcp-epoll-5] r2dbc.reproducer.TestController          : after the insertion

deblockt (Author) commented Oct 7, 2020

Yes, these tens of milliseconds appear only on versions from 0.8.4.RELEASE onward,
but nothing happens between the log statement and the query:

this.insert(id)
    .doOnNext(it -> log.info("after the insertion"))

public Mono<String> insert(String id) {
    return databaseClient
            .insert()
            .into("applied_message")
            .value("id", id)
            .then()
            .thenReturn(id);
}

mp911de (Collaborator) commented Oct 8, 2020

That's weird, locally I'm not able to reproduce the issue:

2020-10-08 11:17:51.123 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [ctor-http-nio-3] o.s.d.r.c.R2dbcTransactionManager        : Creating new transaction with name [r2dbc.reproducer.TestController.simulate]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT
2020-10-08 11:17:51.123 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [ctor-http-nio-3] o.s.d.r.c.R2dbcTransactionManager        : Acquired Connection [MonoLift] for R2DBC transaction
2020-10-08 11:17:51.123 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [ctor-http-nio-3] o.s.d.r.c.R2dbcTransactionManager        : Switching R2DBC Connection [PooledConnection-proxy [PooledConnection[PostgresqlConnection{client=io.r2dbc.postgresql.client.ReactorNettyClient@371f0b2a, codecs=io.r2dbc.postgresql.codec.DefaultCodecs@47cc9ad9}]]] to manual commit
2020-10-08 11:17:51.123 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [ctor-http-nio-3] io.r2dbc.postgresql.QUERY                : Executing query: BEGIN
2020-10-08 11:17:51.124  INFO [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] r2dbc.reproducer.TestController          : before insert data
2020-10-08 11:17:51.124 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r2dbc.core.DefaultDatabaseClient   : Executing SQL statement [INSERT INTO applied_message (id) VALUES ($1)]
2020-10-08 11:17:51.124 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] io.r2dbc.postgresql.QUERY                : Executing query: INSERT INTO applied_message (id) VALUES ($1) RETURNING *
2020-10-08 11:17:51.126  INFO [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] r2dbc.reproducer.TestController          : after the insertion
2020-10-08 11:17:51.126  INFO [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] r2dbc.reproducer.TestController          : after creating the response entity
2020-10-08 11:17:51.126 TRACE [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Triggering beforeCommit synchronization
2020-10-08 11:17:51.126 TRACE [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Triggering beforeCompletion synchronization
2020-10-08 11:17:51.126 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Initiating transaction commit
2020-10-08 11:17:51.126 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Committing R2DBC transaction on Connection [PooledConnection-proxy [PooledConnection[PostgresqlConnection{client=io.r2dbc.postgresql.client.ReactorNettyClient@371f0b2a, codecs=io.r2dbc.postgresql.codec.DefaultCodecs@47cc9ad9}]]]
2020-10-08 11:17:51.126 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] io.r2dbc.postgresql.QUERY                : Executing query: COMMIT
2020-10-08 11:17:51.126 TRACE [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Triggering afterCommit synchronization
2020-10-08 11:17:51.126 TRACE [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Triggering afterCompletion synchronization
2020-10-08 11:17:51.127 DEBUG [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [actor-tcp-nio-2] o.s.d.r.c.R2dbcTransactionManager        : Releasing R2DBC Connection [PooledConnection-proxy [PooledConnection[PostgresqlConnection{client=io.r2dbc.postgresql.client.ReactorNettyClient@371f0b2a, codecs=io.r2dbc.postgresql.codec.DefaultCodecs@47cc9ad9}]]] after transaction
2020-10-08 11:17:51.127  INFO [-,d2c3ca9a1d73d4ce,d2c3ca9a1d73d4ce,true] 67184 --- [ctor-http-nio-3] r.reproducer.StartAndEndLogWebFilter     : INTERNAL end request to /test with status code 200 OK. duration: 4ms

I'm not sure whether/how much Spring Cloud Sleuth and others play into that. I'd suggest reducing the test case to a minimum (e.g. removing Sleuth, which decorates a lot of publishers) and optionally using the R2DBC SPI directly to pin the issue down.

mp911de added the status: waiting-for-feedback (We need additional information before we can continue) label on Oct 8, 2020
deblockt (Author) commented Oct 9, 2020

Hi, I have removed some useless dependencies from the test project; nothing changes locally.

I have tried a few scenarios to see whether the issue comes from r2dbc:
- remove @Transactional and replace the request with a Mono.delay of 5ms
- keep @Transactional and replace the request with a Mono.delay of 5ms
- use @Transactional and the request
- use @Transactional and run the request through the R2DBC SPI
- remove @Transactional and keep the request

The performance results are in this table:

| usecase | 0.8.3 | 0.8.4 |
| --- | --- | --- |
| transaction + request | avg: 4ms, min: 2ms, max: 216ms | avg: 31ms, min: 4ms, max: 217ms |
| request without @Transactional | avg: 3ms, min: 1ms, max: 217ms | avg: 21ms, min: 2ms, max: 216ms |
| R2DBC SPI + @Transactional with ConnectionAccessor | avg: 4ms, min: 2ms, max: 35ms | avg: 31ms, min: 4ms, max: 101ms |
| transaction + Mono.delay 5ms | avg: 9ms, min: 6ms, max: 217ms | avg: 9ms, min: 6ms, max: 217ms |
| Mono.delay 5ms | avg: 7ms, min: 5ms, max: 133ms | avg: 7ms, min: 5ms, max: 116ms |

It's strange that you don't get the same results as me; I will try to run the API in Docker so you can reproduce.

I use this code to test with the R2DBC SPI:

package r2dbc.reproducer;

import io.r2dbc.spi.Result;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.r2dbc.core.ConnectionAccessor;
import org.springframework.data.r2dbc.core.DatabaseClient;
import org.springframework.http.ResponseEntity;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

import java.time.Duration;
import java.util.UUID;

@Slf4j
@RestController
public class TestController {
    private final ConnectionAccessor databaseClient;

    public TestController(DatabaseClient client) {
        this.databaseClient = (ConnectionAccessor) client;
    }

    @Transactional
    @GetMapping("/test")
    public Mono<ResponseEntity<String>> simulate() {
        final var id = UUID.randomUUID().toString();
        return
            Mono.defer(() -> {log.info("before insert data (R2DBC 0.8.4)"); return Mono.empty();})
                .then(this.insert(id))
                .doOnNext(it -> log.info("after the insertion"))
                .map(r -> ResponseEntity.ok("ok " + id))
                .doOnNext(it -> log.info("after creating the response entity"));
    }

    public Mono<String> insert(String id) {
        return databaseClient.inConnection(con -> {
            final var query = con.createStatement("insert into applied_message(id) values($1)")
                    .bind(0, id)
                    .execute();
            return Mono.from(query)
                    .flatMap(result -> Mono.from(result.getRowsUpdated()));
        })
        .thenReturn(id);
    }
}

deblockt (Author) commented Oct 9, 2020

I set the log level to TRACE, and I can see this:


2020-10-09 09:22:38.141  INFO 233168 --- [tor-tcp-epoll-2] r2dbc.reproducer.TestController          : before insert data (R2DBC 0.8.4)
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] .s.t.r.TransactionSynchronizationManager : Retrieved value [org.springframework.data.r2dbc.connectionfactory.ConnectionHolder@e8ea1a2] for key [ConnectionPool[PostgreSQL]] bound to context [5dc7bef7-9d3a-49f0-b2ec-335a7a2e9344: r2dbc.reproducer.TestController.simulate]
2020-10-09 09:22:38.141 DEBUG 233168 --- [tor-tcp-epoll-2] io.r2dbc.postgresql.QUERY                : Executing query: insert into applied_message(id) values($1)
2020-10-09 09:22:38.141 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Request:  Bind{name='B_2119', parameterFormats=[FORMAT_TEXT], parameters=[CompositeByteBuf(ridx: 0, widx: 36, cap: 36, components=1)], resultFormats=[], source='S_0'}
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] WRITE: 64B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 42 00 00 00 3f 42 5f 32 31 31 39 00 53 5f 30 00 |B...?B_2119.S_0.|
|00000010| 00 01 00 00 00 01 00 00 00 24 66 65 30 63 66 39 |.........$fe0cf9|
|00000020| 35 31 2d 38 64 65 37 2d 34 39 38 34 2d 39 31 61 |51-8de7-4984-91a|
|00000030| 64 2d 65 65 33 66 30 32 66 35 38 39 64 32 00 00 |d-ee3f02f589d2..|
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] FLUSH
2020-10-09 09:22:38.141 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Request:  Describe{name='B_2119', type=PORTAL}
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] WRITE: 13B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 44 00 00 00 0c 50 42 5f 32 31 31 39 00          |D....PB_2119.   |
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] FLUSH
2020-10-09 09:22:38.141 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Request:  Execute{name='B_2119', rows=0}
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] WRITE: 16B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 45 00 00 00 0f 42 5f 32 31 31 39 00 00 00 00 00 |E....B_2119.....|
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] FLUSH
2020-10-09 09:22:38.141 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Request:  Close{name='B_2119', type=PORTAL}
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] WRITE: 13B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 43 00 00 00 0c 50 42 5f 32 31 31 39 00          |C....PB_2119.   |
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] FLUSH
2020-10-09 09:22:38.141 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Request:  Sync{}
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] WRITE: 5B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 53 00 00 00 04                                  |S....           |
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] FLUSH
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] READ COMPLETE
2020-10-09 09:22:38.183 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] READ: 37B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 32 00 00 00 04 6e 00 00 00 04 43 00 00 00 0f 49 |2....n....C....I|
|00000010| 4e 53 45 52 54 20 30 20 31 00 33 00 00 00 04 5a |NSERT 0 1.3....Z|
|00000020| 00 00 00 05 54                                  |....T           |
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.183 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Response: BindComplete{}
2020-10-09 09:22:38.183 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Response: NoData{}
2020-10-09 09:22:38.183 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Response: CommandComplete{command=INSERT, rowId=0, rows=1}
2020-10-09 09:22:38.183 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Response: CloseComplete{}
2020-10-09 09:22:38.183 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Response: ReadyForQuery{transactionStatus=TRANSACTION}
2020-10-09 09:22:38.183 TRACE 233168 --- [tor-tcp-epoll-2] .s.t.r.TransactionSynchronizationManager : Retrieved value [org.springframework.data.r2dbc.connectionfactory.ConnectionHolder@e8ea1a2] for key [ConnectionPool[PostgreSQL]] bound to context [5dc7bef7-9d3a-49f0-b2ec-335a7a2e9344: r2dbc.reproducer.TestController.simulate]
2020-10-09 09:22:38.183  INFO 233168 --- [tor-tcp-epoll-2] r2dbc.reproducer.TestController          : after the insertion

Sometimes (not every time) this step can take 20ms:

2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] FLUSH
2020-10-09 09:22:38.141 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] READ COMPLETE
2020-10-09 09:22:38.183 TRACE 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : [id: 0x64f967a8, L:/127.0.0.1:53332 - R:localhost/127.0.0.1:5432] READ: 37B
         +-------------------------------------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+--------+-------------------------------------------------+----------------+
|00000000| 32 00 00 00 04 6e 00 00 00 04 43 00 00 00 0f 49 |2....n....C....I|
|00000010| 4e 53 45 52 54 20 30 20 31 00 33 00 00 00 04 5a |NSERT 0 1.3....Z|
|00000020| 00 00 00 05 54                                  |....T           |
+--------+-------------------------------------------------+----------------+
2020-10-09 09:22:38.183 DEBUG 233168 --- [tor-tcp-epoll-2] i.r.p.client.ReactorNettyClient          : Response: BindComplete{}

mp911de (Collaborator) commented Oct 14, 2020

There's a lot of additional background detail available in spring-projects/spring-data-r2dbc#477. The recorded packet capture shows that the TCP ACK for BIND has a 20-40ms delay, which is weird in the first place because it suggests that the server receive buffers are saturated or something of that sort.

mp911de (Collaborator) commented Oct 15, 2020

@cambierr I pushed a change where several command flow sequences are combined into single TCP packets (e.g. BIND + Describe). That comes closer to what the JDBC driver does, although some bits (combining bind+describe with execute/close/sync) require a bit more effort to avoid individual TCP packets where possible.

Care to retest against 0.8.6.BUILD-SNAPSHOT, available from repo.spring.io?

cambierr commented:

Just tested and... no change at all: I graphed both results and they are exactly the same :/

mp911de (Collaborator) commented Oct 15, 2020

Do you mind joining the Postgres Slack at https://postgres-slack.herokuapp.com/? I think we need someone from the Postgres core team who can help us; the delayed ACK in particular is something that draws my attention.

cambierr commented Oct 15, 2020

Just joined and saw you posted already :)

mp911de (Collaborator) commented Oct 16, 2020

FTR: Increasing demand (ensuring there's always demand in FluxReceive to avoid ChannelConfig.setAutoRead(false)) and switching to NIO (instead of epoll) do not fix the issue.

The issue can be reproduced by using a single connection and R2DBC Postgres directly (R2DBC Pool and Spring Data R2DBC don't affect the issue).
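
For reference, a minimal sketch of such a direct-driver reproducer (illustrative only: host, credentials, and the iteration count are assumptions; the applied_message table is the one used by the reproducer project):

import io.r2dbc.postgresql.PostgresqlConnectionConfiguration;
import io.r2dbc.postgresql.PostgresqlConnectionFactory;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.time.Duration;
import java.util.UUID;

public class DirectDriverReproducer {

    public static void main(String[] args) {
        PostgresqlConnectionFactory connectionFactory = new PostgresqlConnectionFactory(
                PostgresqlConnectionConfiguration.builder()
                        .host("localhost")       // placeholder connection settings
                        .port(5432)
                        .username("postgres")
                        .password("postgres")
                        .database("postgres")
                        .build());

        long start = System.currentTimeMillis();

        connectionFactory.create()
                .flatMapMany(connection -> Flux.range(0, 100)
                        // run the inserts sequentially on the single connection
                        .concatMap(i -> Mono.from(connection.createStatement(
                                        "INSERT INTO applied_message (id) VALUES ($1)")
                                .bind("$1", UUID.randomUUID().toString())
                                .execute())
                                .flatMap(result -> Mono.from(result.getRowsUpdated())))
                        .then(Mono.from(connection.close())))
                .blockLast(Duration.ofSeconds(30));

        System.out.println("took " + (System.currentTimeMillis() - start) + "ms");
    }
}

On an affected Linux system the elapsed time should differ noticeably between 0.8.3.RELEASE and 0.8.4/0.8.5, even though the code is identical.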

cambierr commented:

Here is a network capture of both the 0.8.3 and 0.8.5 benchmarks:
benchmark.zip

cambierr commented:

Simplest reproducer available at https://github.com/cambierr/r2dbc-perf-issue

mp911de (Collaborator) commented Oct 16, 2020

Update: The reason the issue was not reproducible is that there's a behavioral difference between MacOS and Linux. Can reproduce the issue on Ubuntu 20, Postgres 12, Java 8.

mp911de (Collaborator) commented Oct 16, 2020

The commit introducing this behavior is 233f16e (TCP KeepAlive and TCP NoDelay)

mp911de (Collaborator) commented Oct 16, 2020

Yeah, enabling TCP NoDelay makes the issue disappear.

mp911de added the type: regression (A regression from a previous release) label and removed the status: waiting-for-feedback and status: waiting-for-triage labels on Oct 16, 2020
mp911de changed the title from "performance issue between 0.8.3.RELEASE and 0.8.4.RELEASE" to "Occasional delays caused by TCP NoDelay config option defaulting to false" on Oct 16, 2020
mp911de added this to the 0.8.6.RELEASE milestone on Oct 16, 2020
mp911de (Collaborator) commented Oct 16, 2020

FWIW, https://www.extrahop.com/company/blog/2016/tcp-nodelay-nagle-quickack-best-practices describes the delayed ACK behavior.
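
For illustration, TCP NoDelay is the standard socket option that disables Nagle's algorithm; at the plain java.net.Socket level (unrelated to the driver's own configuration API) it corresponds to the following sketch:

import java.io.IOException;
import java.net.Socket;

public class NoDelayIllustration {

    public static void main(String[] args) throws IOException {
        // TCP_NODELAY disables Nagle's algorithm, so small writes are sent
        // immediately instead of being held back until the previous segment
        // has been ACKed; combined with delayed ACKs on the receiver, Nagle
        // can introduce exactly the 20-40ms stalls observed above.
        try (Socket socket = new Socket("localhost", 5432)) {
            socket.setTcpNoDelay(true);
            System.out.println("TCP_NODELAY enabled: " + socket.getTcpNoDelay());
        }
    }
}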

mp911de (Collaborator) commented Oct 16, 2020

PGJDBC has TCP NoDelay enabled so it makes sense to flip the default here as well.

mp911de (Collaborator) commented Oct 16, 2020

We're going to flip the configuration default and enable TCP NoDelay by default. A workaround is enabling TCP NoDelay on affected systems.
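
For affected systems, a sketch of that workaround on the connection configuration (assuming the tcpNoDelay builder option introduced together with tcpKeepAlive in 0.8.4; connection settings are placeholders):

import io.r2dbc.postgresql.PostgresqlConnectionConfiguration;
import io.r2dbc.postgresql.PostgresqlConnectionFactory;

public class TcpNoDelayWorkaround {

    // Enable TCP NoDelay explicitly until the default is flipped in 0.8.6.RELEASE.
    static PostgresqlConnectionFactory connectionFactory() {
        return new PostgresqlConnectionFactory(
                PostgresqlConnectionConfiguration.builder()
                        .host("localhost")     // placeholder connection settings
                        .port(5432)
                        .username("postgres")
                        .password("postgres")
                        .database("postgres")
                        .tcpNoDelay(true)      // assumed option name, added alongside tcpKeepAlive
                        .build());
    }
}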

mp911de added a commit that referenced this issue Oct 16, 2020
We now enable TCP NoDelay by default to avoid delayed TCP ACKs. Our flushing behavior isn't ideal yet, but combining messages allows us to reduce the number of TCP packets.

[resolves #334][#138]
deblockt (Author) commented:

It seems that response times are better, but still slower than with the 0.8.3 version (there are still gaps between requests).
I will try to provide a sample and open another issue if needed.

Thanks.

mp911de (Collaborator) commented Oct 16, 2020

Thanks @deblockt. Feel free to file a new ticket once you can provide a reproducer, since this one already has quite a few comments regarding delayed TCP ACKs.
