Offloading large ledgers (>2GB) fail with Google Cloud Storage #15159

Closed
lhotari opened this issue Apr 13, 2022 · 4 comments · Fixed by #22220 or #22554
Labels
lifecycle/stale, type/bug

Comments

@lhotari
Member

lhotari commented Apr 13, 2022

Describe the bug

Offloading large ledgers to Google Cloud Storage fails with the default settings. Ledgers over 2GB cannot be offloaded to GCS because of a limitation in JClouds (reported as [JCLOUDS-1606] Cannot upload more than 32 parts to GCS), which limits a multipart upload to 32 parts. With the default 64MB offload block size, 32 parts cap an offloaded ledger at 2GB. GCS supports multipart uploads of up to 10000 parts, but JClouds doesn't use the API in a way that allows more than 32 parts.
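
The arithmetic behind the 2GB limit can be sketched as follows. This is only an illustration, not offloader code: the class and method names are hypothetical, it assumes one uploaded part per offload block, a 64MB default block size (inferred from the workaround note that 128MB doubles the limit to 4GB), and it ignores any per-block header overhead.

```java
// Hypothetical helper, not part of Pulsar or JClouds.
final class GcsOffloadPartEstimate {
    // Limit reported by GCS in the error message: at most 32 source components per compose.
    static final int GCS_COMPOSE_MAX_COMPONENTS = 32;
    // Assumed default offload block size (64MB).
    static final long DEFAULT_BLOCK_SIZE_BYTES = 64L * 1024 * 1024;

    /** Number of parts needed for a ledger, assuming one part per block (ceiling division). */
    static long partCount(long ledgerBytes, long blockSizeBytes) {
        if (ledgerBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (ledgerBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    /** Largest ledger that still fits into a single 32-component compose request. */
    static long maxOffloadableBytes(long blockSizeBytes) {
        return GCS_COMPOSE_MAX_COMPONENTS * blockSizeBytes;
    }

    static boolean fitsInSingleCompose(long ledgerBytes, long blockSizeBytes) {
        return partCount(ledgerBytes, blockSizeBytes) <= GCS_COMPOSE_MAX_COMPONENTS;
    }

    public static void main(String[] args) {
        long ledgerBytes = 2200L * 1024 * 1024; // hypothetical ledger of about 2.15GB
        System.out.println(partCount(ledgerBytes, DEFAULT_BLOCK_SIZE_BYTES));           // 35
        System.out.println(fitsInSingleCompose(ledgerBytes, DEFAULT_BLOCK_SIZE_BYTES)); // false
        System.out.println(maxOffloadableBytes(DEFAULT_BLOCK_SIZE_BYTES));              // 2147483648
    }
}
```

Under these assumptions, a ledger of roughly 2200MB needs 35 parts, which is the same component count that appears in the error message in the log.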

Here's an example log entry of the problem:

java.util.concurrent.CompletionException: org.jclouds.http.HttpResponseException: command: POST https://www.googleapis.com/storage/v1/b/somebucket/o/ff553922-1fa3-4ceb-abcd-60106603b5c8-object-123456/compose HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [{
  "error": {
    "code": 400,
    "message": "The number of source components provided (35) exceeds the maximum (32)",
    "errors": [
      {
        "message": "The number of source components provided (35) exceeds the maximum (32)",
        "domain": "global",
        "reason": "invalid"
      }
    ]
  }
}
]
        at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1019) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
        at org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader.lambda$offload$0(BlobStoreManagedLedgerOffloader.java:237) ~[?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) [com.google.guava-guava-31.0.1-jre.jar:?]
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74) [com.google.guava-guava-31.0.1-jre.jar:?]
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) [com.google.guava-guava-31.0.1-jre.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
        at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.jclouds.http.HttpResponseException: command: POST https://www.googleapis.com/storage/v1/b/somebucket/o/ff553922-1fa3-4ceb-abcd-60106603b5c8-object-123456/compose HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [{
  "error": {
    "code": 400,
    "message": "The number of source components provided (35) exceeds the maximum (32)",
    "errors": [
      {
        "message": "The number of source components provided (35) exceeds the maximum (32)",
        "domain": "global",
        "reason": "invalid"
      }
    ]
  }
}
]
        at org.jclouds.googlecloudstorage.handlers.GoogleCloudStorageErrorHandler.handleError(GoogleCloudStorageErrorHandler.java:40) ~[?:?]
        at org.jclouds.http.handlers.DelegatingErrorHandler.handleError(DelegatingErrorHandler.java:65) ~[?:?]
        at org.jclouds.http.internal.BaseHttpCommandExecutorService.shouldContinue(BaseHttpCommandExecutorService.java:138) ~[?:?]
        at org.jclouds.http.internal.BaseHttpCommandExecutorService.invoke(BaseHttpCommandExecutorService.java:107) ~[?:?]
        at org.jclouds.rest.internal.InvokeHttpMethod.invoke(InvokeHttpMethod.java:91) ~[?:?]
        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:74) ~[?:?]
        at org.jclouds.rest.internal.InvokeHttpMethod.apply(InvokeHttpMethod.java:45) ~[?:?]
        at org.jclouds.reflect.FunctionalReflection$FunctionalInvocationHandler.handleInvocation(FunctionalReflection.java:117) ~[?:?]
        at org.apache.pulsar.jcloud.shade.com.google.common.reflect.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:89) ~[?:?]
        at com.sun.proxy.$Proxy144.composeObjects(Unknown Source) ~[?:?]
        at org.jclouds.googlecloudstorage.blobstore.GoogleCloudStorageBlobStore.completeMultipartUpload(GoogleCloudStorageBlobStore.java:405) ~[?:?]
        at jdk.internal.reflect.GeneratedMethodAccessor373.invoke(Unknown Source) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at org.apache.pulsar.jcloud.shade.com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:50) ~[?:?]
        at com.sun.proxy.$Proxy88.completeMultipartUpload(Unknown Source) ~[?:?]
        at org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader.lambda$offload$0(BlobStoreManagedLedgerOffloader.java:226) ~[?:?]
        ... 11 more

To Reproduce
Steps to reproduce the behavior:

  1. Configure offloading to use GCS
  2. Produce enough messages that a single ledger grows beyond 2GB
  3. Trigger ledger offloading

Expected behavior

Ledger offloading should succeed for ledgers larger than 2GB.

Workaround

# ensure that ledgers don't grow over 1500MB
managedLedgerMaxSizePerLedgerMbytes=1500
managedLedgerMinLedgerRolloverTimeMinutes=0

or

# increase block size to 128MB; the maximum offloadable ledger size doubles to 4GB
gcsManagedLedgerOffloadMaxBlockSizeInBytes=134217728
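
As a rough sanity check of both workarounds, the sketch below computes the smallest block size that keeps a ledger of a given size within 32 parts, and the part count for a 1500MB ledger. The class and method names are hypothetical, and it assumes one part per block with a 64MB default block size and no header overhead.

```java
// Hypothetical helper for checking the workaround values; not part of Pulsar.
final class GcsOffloadWorkaroundCheck {
    static final int MAX_PARTS = 32;

    /** Ceiling division for non-negative a and positive b. */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** Smallest block size (in bytes) that keeps a ledger of the given size within 32 parts. */
    static long minBlockSizeBytes(long maxLedgerBytes) {
        return ceilDiv(maxLedgerBytes, MAX_PARTS);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;

        // Workaround 1: ledgers capped at 1500MB, assumed 64MB blocks.
        System.out.println(ceilDiv(1500 * mb, 64 * mb));  // 24 parts, below the limit of 32

        // Workaround 2: the block size needed for ledgers of up to 4GB.
        System.out.println(minBlockSizeBytes(4096 * mb)); // 134217728 (128MB)
    }
}
```

The second value matches the `gcsManagedLedgerOffloadMaxBlockSizeInBytes=134217728` setting above. The same calculation can be used to pick a block size for other ledger size limits.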

Additional context

JClouds issue reported as [JCLOUDS-1606] Cannot upload more than 32 parts to GCS.
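
For context, one general way to stay within the 32-component limit is to compose in batches: GCS accepts composite objects as sources of a later compose request, so parts can be merged in rounds of at most 32. The sketch below only illustrates that idea. It is not the change made in JClouds or Pulsar, the `ComposeCall` interface and all names are hypothetical stand-ins for a real compose request, and cleanup of intermediate objects is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of batched composition; not JClouds or Pulsar code.
final class BatchedCompose {
    static final int MAX_SOURCES_PER_COMPOSE = 32;

    /** Hypothetical stand-in for one compose request against the object store. */
    interface ComposeCall {
        void compose(List<String> sources, String target);
    }

    /**
     * Composes all parts into finalName using requests of at most 32 sources each,
     * preserving part order. Returns the number of compose requests issued.
     */
    static int composeAll(List<String> parts, String finalName, ComposeCall call) {
        if (parts.isEmpty()) {
            throw new IllegalArgumentException("no parts to compose");
        }
        List<String> current = new ArrayList<>(parts);
        int requests = 0;
        int round = 0;
        while (current.size() > MAX_SOURCES_PER_COMPOSE) {
            List<String> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += MAX_SOURCES_PER_COMPOSE) {
                int end = Math.min(i + MAX_SOURCES_PER_COMPOSE, current.size());
                List<String> batch = new ArrayList<>(current.subList(i, end));
                if (batch.size() == 1) {
                    // A single leftover object needs no compose request; carry it over.
                    next.add(batch.get(0));
                    continue;
                }
                String intermediate = finalName + ".tmp-" + round + "-" + next.size();
                call.compose(batch, intermediate);
                requests++;
                next.add(intermediate);
            }
            current = next;
            round++;
        }
        call.compose(current, finalName);
        return requests + 1;
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        for (int i = 1; i <= 35; i++) {
            parts.add("part-" + i);
        }
        int requests = composeAll(parts, "ledger-object",
                (sources, target) -> System.out.println(sources.size() + " sources -> " + target));
        System.out.println("compose requests: " + requests); // compose requests: 3
    }
}
```

For the 35 parts from the log above, this issues one request for the first 32 parts, one for the remaining 3, and a final request that joins the two intermediate objects.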

@github-actions

The issue had no activity for 30 days, mark with Stale label.

@pgier
Contributor

pgier commented Mar 7, 2024

The related JClouds issue has been fixed in version 2.6.0.

pgier added a commit to pgier/pulsar that referenced this issue Mar 7, 2024
pgier added a commit to pgier/pulsar that referenced this issue Mar 12, 2024
pgier added a commit to pgier/pulsar that referenced this issue Mar 12, 2024
pgier added a commit to pgier/pulsar that referenced this issue Mar 12, 2024
@lhotari
Member Author

lhotari commented Apr 22, 2024

#22220 alone doesn't fix the issue. Created #22554 to fix the remaining issue.

@lhotari lhotari reopened this Apr 22, 2024