
Os source buffer backoff retry #2849

Merged

Conversation

Member

@graytaylor0 graytaylor0 commented Jun 8, 2023

Description

This change uses the BufferAccumulator class from the s3 source to write and flush documents to the buffer with backoff and retry.

The records_to_accumulate is currently the same as the batch_size (pagination size) for searching.

Waiting on #2847 before rebasing and pushing non-draft PR
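For readers unfamiliar with the pattern the description refers to, the accumulate-and-flush behavior can be sketched as a self-contained toy model (illustrative only; class and method names here are hypothetical and this is not Data Prepper's actual Buffer/BufferAccumulator API):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the accumulate-and-flush pattern (illustrative only; not the
// actual Data Prepper BufferAccumulator API).
class ToyAccumulator<T> {
    private final List<T> pending = new ArrayList<>();
    private final List<T> flushed = new ArrayList<>(); // stands in for the real buffer
    private final int recordsToAccumulate;

    ToyAccumulator(final int recordsToAccumulate) {
        this.recordsToAccumulate = recordsToAccumulate;
    }

    // Collects records locally and flushes once the threshold is reached,
    // mirroring how records_to_accumulate drives flushing per the description.
    void add(final T record) {
        pending.add(record);
        if (pending.size() >= recordsToAccumulate) {
            flush();
        }
    }

    void flush() {
        flushed.addAll(pending);
        pending.clear();
    }

    int flushedCount() {
        return flushed.size();
    }
}
```

With records_to_accumulate equal to the search batch_size, each page of search results fills the local collection and triggers one flush to the buffer.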

Issues Resolved

Related to #1985

Check List

  • New functionality includes testing.
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@graytaylor0 graytaylor0 force-pushed the OsSourceBufferBackoffRetry branch 2 times, most recently from 14068c0 to 598202c Compare June 9, 2023 15:17
@graytaylor0 graytaylor0 marked this pull request as ready for review June 9, 2023 15:17
@graytaylor0 graytaylor0 force-pushed the OsSourceBufferBackoffRetry branch from 598202c to 168227d Compare June 9, 2023 15:18
@@ -0,0 +1,130 @@
/*
Collaborator

Could we pull out the original BufferAccumulator into a place where multiple plugins can consume it? It'd be better to maintain the logic in a single place.

Member Author
@graytaylor0 graytaylor0 Jun 9, 2023

Agreed, will do in a follow-up PR.

Member Author
Here is the followup PR after this one is merged (#2857)

try {
bufferAccumulator.add(record);
} catch (Exception e) {
LOG.error("Failed writing OpenSearch documents to buffer due to: {}", e.getMessage());
Collaborator
Should we emit a metric here and/or log the document that failed? The user won't be able to take action without knowing which documents are failing.

Member Author
@graytaylor0 graytaylor0 Jun 9, 2023

I was following the s3 source flow, which just logs and moves on. I'll log the last added record's index and document ID.

Collaborator

Looking back at the code, I think my comment was wrong. BufferAccumulator::add adds to the local collection and flushes if needed. There might be a case where adding to the local collection fails, but this exception should almost always be because the flush failed. I thought we were silently dropping the record if this path occurred, but that's not the case.

Having the document ID and index in the log is nonetheless useful for tracking progress.
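The logging improvement agreed on above might look roughly like the following (a hypothetical sketch; the real record metadata accessors and log message in the PR may differ):

```java
// Illustrative sketch: include the last added record's index and document ID
// in the failure log so the user can tell which documents are affected.
class FailureLogFormatter {
    static String format(final String index, final String documentId, final String cause) {
        return String.format(
            "Failed writing OpenSearch documents to buffer (last record: index=%s, documentId=%s) due to: %s",
            index, documentId, cause);
    }
}
```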

*
* @param <T> Type of record to accumulate
*/
public class BufferAccumulator<T extends Record<?>> {
Collaborator

You can use the javax annotation @NotThreadSafe here.

long nextDelay = INITIAL_FLUSH_RETRY_DELAY_ON_IO_EXCEPTION.toMillis();
boolean flushedSuccessfully;

for (int retryCount = 0; retryCount < MAX_FLUSH_RETRIES_ON_IO_EXCEPTION; retryCount++) {
Collaborator
Do we want this MAX_FLUSH_RETRIES_ON_IO_EXCEPTION to be configurable?

Member Author
For now I don't think there's any value in it, but it's always an option in the future.
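The retry loop excerpted above can be sketched end to end as follows (a simplified, self-contained sketch using the same constant and variable names as the excerpt; the delay value, retry count, and FlushAction interface are assumptions, not the PR's exact code):

```java
import java.time.Duration;

// Sketch of the flush-with-backoff loop from the excerpt above (simplified;
// the real implementation flushes accumulated records to Data Prepper's Buffer).
class BackoffFlusher {
    // Assumed values for illustration; the PR's actual constants may differ.
    static final Duration INITIAL_FLUSH_RETRY_DELAY_ON_IO_EXCEPTION = Duration.ofMillis(10);
    static final int MAX_FLUSH_RETRIES_ON_IO_EXCEPTION = 5;

    // Hypothetical stand-in for the buffer write being retried.
    interface FlushAction {
        void flush() throws Exception;
    }

    static boolean flushWithBackoff(final FlushAction action) {
        long nextDelay = INITIAL_FLUSH_RETRY_DELAY_ON_IO_EXCEPTION.toMillis();
        for (int retryCount = 0; retryCount < MAX_FLUSH_RETRIES_ON_IO_EXCEPTION; retryCount++) {
            try {
                action.flush();
                return true; // flushed successfully
            } catch (final Exception e) {
                try {
                    Thread.sleep(nextDelay); // back off before retrying
                } catch (final InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
                nextDelay *= 2; // exponential backoff: double the delay each attempt
            }
        }
        return false; // gave up after MAX_FLUSH_RETRIES_ON_IO_EXCEPTION attempts
    }
}
```

Keeping the retry count a constant, as discussed above, keeps the surface area small; making it configurable later only requires threading a value through the builder.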


private final Collection<T> recordsAccumulated;

private BufferAccumulator(final Buffer<T> buffer, final int numberOfRecordsToAccumulate, final Duration bufferTimeout) {
Collaborator
There are other Buffer Accumulator implementations. Do you think they can all be combined to one implementation?

Member Author
@graytaylor0 graytaylor0 Jun 11, 2023

Yes, I have a follow-up PR to pull this out into a reusable class (#2857).

@graytaylor0 graytaylor0 merged commit d4cc0bb into opensearch-project:main Jun 12, 2023
@graytaylor0 graytaylor0 deleted the OsSourceBufferBackoffRetry branch June 12, 2023 14:40
MaGonzalMayedo pushed a commit to MaGonzalMayedo/data-prepper that referenced this pull request Jun 21, 2023
Use buffer accumulator in opensearch source to backoff and retry

Signed-off-by: Taylor Gray <[email protected]>
Signed-off-by: Marcos_Gonzalez_Mayedo <[email protected]>