
RFC: logging: experimental batching writer #2910

Closed

wants to merge 1 commit into from

Conversation

@pongad (Contributor) commented Feb 15, 2018

Manually written batching implementation. Unlike GAX batching,
it implements flushing and does not deal with partition keys.

If we're OK with where this is heading,
we should make this work with batching settings (shouldn't be hard)
and load-test before migrating things to it.

If not, this can serve as a starting point for a better GAX batching
implementation.

In an experiment, I published 1M messages of 300 bytes each.
Using LoggingHandler gives me ~14,000 msg/s;
BatchingWriter gives ~67,000 msg/s using similar BatchingSettings.
Letting users configure these settings might be important, though:
I have observed ~300K msg/s given enough CPU and memory.
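The idea described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the PR's actual code: entries are batched by count, a Semaphore bounds the number of in-flight batches (simple flow control), and sending happens off the caller's thread on an executor, mirroring the design the description outlines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of the batching writer idea; names are illustrative.
final class BatchingWriter<T> implements AutoCloseable {
  private final int batchSize;
  private final Consumer<List<T>> sender;   // stand-in for the WriteLogEntries RPC
  private final ExecutorService executor = Executors.newFixedThreadPool(4);
  private final Semaphore inFlight;         // bounds outstanding batches
  private List<T> pending = new ArrayList<>();

  BatchingWriter(int batchSize, int maxInFlight, Consumer<List<T>> sender) {
    this.batchSize = batchSize;
    this.inFlight = new Semaphore(maxInFlight);
    this.sender = sender;
  }

  synchronized void add(T entry) throws InterruptedException {
    pending.add(entry);
    if (pending.size() >= batchSize) {
      flushLocked();
    }
  }

  private void flushLocked() throws InterruptedException {
    if (pending.isEmpty()) return;
    final List<T> batch = pending;
    pending = new ArrayList<>();
    inFlight.acquire();                     // block when too many batches are outstanding
    executor.execute(() -> {
      try {
        sender.accept(batch);               // serialize and send off the caller's thread
      } finally {
        inFlight.release();
      }
    });
  }

  @Override
  public synchronized void close() throws InterruptedException {
    flushLocked();                          // drain whatever is still pending
    executor.shutdown();
    executor.awaitTermination(10, TimeUnit.SECONDS);
  }
}
```

A real implementation would also batch by byte size and elapsed time (as BatchingSettings does), but the count-only version above shows the shape of the hot path.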

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Feb 15, 2018
@pongad (Contributor Author) commented Feb 15, 2018

Updates #2796

```java
// Whoever calls send serializes the proto; so we do it off-thread.
// This gives better CPU utilization if there are few producer threads
// on a many-core machine.
executor.execute(
```
@igorbernstein2 commented:
I still think that we should try refactoring gax code instead of writing batching code from scratch. There are a lot of useful features in the gax implementation that are lost here.

Before the bigtable client is ready I will need the following features in batching:

  • request byte size flow control
  • ability to know when either the whole batch request failed or when an individual entry failed
  • ability to retry individual entries

General feedback on the PR:

  • add() should return a future so that the caller can find out the result of the write
  • writer should be closeable & throw an error if any rpcs failed
  • It would be nice to provide a means to do nonblocking back pressure: adding a bool isReady() and an onReady(Callback) on the Writer
  • there should be a flush() method that can send all pending RPCs and wait for all of them to return
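Taken together, the wishlist above suggests a Writer surface along these lines. The interface and the in-memory implementation below are illustrative only, not an actual gax or logging API: add() returns a per-entry future, flush() drains pending work, and isReady()/onReady() expose nonblocking back pressure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical Writer contract assembled from the bullet points above.
interface BatchWriter<T> extends AutoCloseable {
  CompletableFuture<Void> add(T entry);   // caller learns the fate of this entry
  void flush();                           // send all pending, wait for completion
  boolean isReady();                      // nonblocking back-pressure poll
  void onReady(Runnable callback);        // one-shot readiness callback
}

// Trivial in-memory implementation, just to demonstrate the contract.
final class InMemoryWriter<T> implements BatchWriter<T> {
  private final int capacity;
  private final List<T> pending = new ArrayList<>();
  private final List<CompletableFuture<Void>> futures = new ArrayList<>();
  final List<T> sent = new ArrayList<>();
  private Runnable readyCallback;

  InMemoryWriter(int capacity) { this.capacity = capacity; }

  @Override public synchronized CompletableFuture<Void> add(T entry) {
    pending.add(entry);
    CompletableFuture<Void> f = new CompletableFuture<>();
    futures.add(f);
    return f;
  }

  @Override public synchronized void flush() {
    sent.addAll(pending);                 // "send" the batch
    pending.clear();
    futures.forEach(f -> f.complete(null));
    futures.clear();
    if (readyCallback != null) { readyCallback.run(); readyCallback = null; }
  }

  @Override public synchronized boolean isReady() { return pending.size() < capacity; }

  @Override public synchronized void onReady(Runnable cb) {
    if (isReady()) cb.run(); else readyCallback = cb;
  }

  // A real writer's close() would rethrow if any RPC failed; elided here.
  @Override public void close() { flush(); }
}
```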

@pongad (Contributor Author) commented Feb 15, 2018

@igorbernstein2 Thank you for your thoughts. I agree that putting this in gax will be valuable. I'm mostly using this to figure out what logging needs from whatever ends up in gax. This is why it's an RFC after all.

To answer your points:

request byte size flow control

I think this is easily done by changing the Semaphore to a FlowController, right?
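As a rough illustration of what byte-size flow control adds over a plain count Semaphore: gax's FlowController tracks both outstanding elements and outstanding bytes. The standalone sketch below approximates that with two semaphores; it is not the gax API, just the concept.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: flow control on both element count and request bytes.
final class ByteFlowController {
  private final Semaphore elements;
  private final Semaphore bytes;

  ByteFlowController(int maxElements, int maxBytes) {
    this.elements = new Semaphore(maxElements);
    this.bytes = new Semaphore(maxBytes);
  }

  // Blocks until an element slot AND the request's bytes are both available.
  void reserve(int byteCount) throws InterruptedException {
    elements.acquire();
    bytes.acquire(byteCount);
  }

  void release(int byteCount) {
    bytes.release(byteCount);
    elements.release();
  }

  int availableBytes() { return bytes.availablePermits(); }
}
```

A producer that tries to enqueue a large entry now blocks on bytes even when few entries are outstanding, which is the behavior a count-only Semaphore cannot express.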

ability to know when either the the whole batch request failed or when an entry failed
ability to retry individual entries
add() should return a future so that the caller can find out the result of the write

Please correct me if I'm wrong. I think these three are somewhat related; at least, they're predicated on the ability to identify individual requests within a batch. Pubsub needs this too.

However, logging doesn't (writes are all or nothing, and failures are reported to error manager). I haven't measured how much CPU work this would cost. If it's significant, we should find a way for logging to not pay for this cost.

writer should be closeable & throw an error if any rpcs failed

I don't think this makes sense for logging. The docs explicitly discourage throwing exceptions.

there should be a flush() method that can send all pending RPCs and wait for all of them to return

Definitely makes sense as a convenience method. I believe it can be implemented by:

```java
initFlush();
for (Future<?> future : pendingRpcs()) {
  future.get(); // error handling elided
}
```

It would be nice to provide a means to do nonblocking back pressure: adding a bool isReady() and an onReady(Callback) on the Writer

Does Bigtable require a single producer or multiple producers per writer? I think isReady makes sense in the single-producer case, but could cause a thundering-herd problem with multiple producers. I need to think more about onReady.
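One way to avoid the thundering herd in the multi-producer case is to queue readiness callbacks and wake exactly one waiter per freed slot, instead of notifying every blocked producer. A hypothetical sketch, not anything from the PR:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical readiness signal: release() wakes at most one queued producer.
final class ReadySignal {
  private final Deque<Runnable> waiters = new ArrayDeque<>();
  private int freeSlots;

  ReadySignal(int slots) { this.freeSlots = slots; }

  // Returns true if a slot was taken immediately; otherwise parks the
  // producer's callback to be run when a slot frees up.
  synchronized boolean tryAcquire(Runnable whenReady) {
    if (freeSlots > 0) {
      freeSlots--;
      return true;
    }
    waiters.addLast(whenReady);
    return false;
  }

  synchronized void release() {
    Runnable next = waiters.pollFirst();
    if (next != null) {
      next.run();        // wake exactly one waiter; it inherits the freed slot
    } else {
      freeSlots++;
    }
  }
}
```

With this shape, onReady never fans out to every producer at once, so a single freed slot cannot trigger a stampede of retries.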

@pongad (Contributor Author) commented Feb 23, 2018

@garrettjonesgoogle Are you OK with me submitting this as-is to a branch?

@pongad (Contributor Author) commented Feb 26, 2018

I'll merge this into a branch.

@pongad pongad closed this Feb 26, 2018
@pongad pongad deleted the logging-expr branch February 26, 2018 05:32