Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] upgrade jclouds 2.5.0 -> 2.6.0 #22220

Merged
merged 8 commits into from
Mar 15, 2024

Conversation

pgier
Copy link
Contributor

@pgier pgier commented Mar 7, 2024

Upgrade jclouds to fix offload issue on GCP clusters.

Fixes #15159

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • [ X ] doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: pgier#11

Copy link

github-actions bot commented Mar 7, 2024

@pgier Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@github-actions github-actions bot added doc-label-missing doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Mar 7, 2024
@pgier
Copy link
Contributor Author

pgier commented Mar 7, 2024

Hmm, I guess this PR was premature, 2.6.0 has only released an RC but not the final release yet. I'll leave this open for when they do release 2.6.0.

Copy link
Member

@dave2wave dave2wave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the vote thread: https://lists.apache.org/thread/47zpc6sqjhmwwsrvtl1kz5mmjjxk12hm

It looks like once Andrew votes it will pass and the release will happen.

@lhotari lhotari requested review from lhotari and eolivelli March 7, 2024 17:29
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dao-jun dao-jun closed this Mar 11, 2024
@dao-jun dao-jun reopened this Mar 11, 2024
@lhotari
Copy link
Member

lhotari commented Mar 11, 2024

@dao-jun it seems that there are failures after upgrading to 2.6.0 and closing and reopening to get the latest master changes won't make a difference.

@eolivelli
Copy link
Contributor

It seems that there is some work to do, I am seeing these errors on CI:

BlobStoreBackedInputStreamTest.start
Unable to create injector, see the following errors:

1) [Guice/MissingImplementation]: No implementation for inject.Provider<Gson> was bound.

Did you mean?
    Gson bound at GsonModule.provideGson(GsonModule.java:103)
      \_ installed by: RestModule -> GsonModule

Requested by:
1  : GsonModule$PropertiesAdapter.<init>(GsonModule.java:260)
      \_ for 1st parameter
     at GsonModule.provideGson(GsonModule.java:103)
      \_ for 5th parameter
     at GsonModule.provideGson(GsonModule.java:103)
      \_ installed by: RestModule -> GsonModule

@lhotari
Copy link
Member

lhotari commented Mar 11, 2024

@pgier do you have a chance to check the test failures and fix them? thanks

@pgier pgier force-pushed the issue-15159-upgrade-jclouds branch 2 times, most recently from 5dfd0b0 to a31e415 Compare March 12, 2024 16:57
@pgier pgier force-pushed the issue-15159-upgrade-jclouds branch from a31e415 to 8bdb24c Compare March 12, 2024 17:25
@pgier
Copy link
Contributor Author

pgier commented Mar 12, 2024

Looks like jclouds upgrade requires a gson upgrade.
Edit: I tried updating the global gson, but it seems that this causes other failures.

@dao-jun
Copy link
Member

dao-jun commented Mar 12, 2024

Looks like jclouds upgrade requires a gson upgrade.

Please help resolve these failures,

@dao-jun
Copy link
Member

dao-jun commented Mar 13, 2024

Since I upgraded guice to 7.0.0, seems still cannot fix these failures

@lhotari
Copy link
Member

lhotari commented Mar 14, 2024

Since I upgraded guice to 7.0.0, seems still cannot fix these failures

@dao-jun It should be possible to set the versions specificly to jclouds-shaded/pom.xml, in a dependencyManager block so that the versions are only applied to jclouds-shaded within Pulsar.
Example of this approach:

<dependencyManagement>
<dependencies>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-bom</artifactId>
<version>${netty.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
<dependency>
<groupId>io.grpc</groupId>
<artifactId>grpc-bom</artifactId>
<version>${grpc.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-jvm</artifactId>
<version>${metrics.version}</version>
</dependency>
</dependencies>
</dependencyManagement>
.

@lhotari lhotari merged commit 73dc213 into apache:master Mar 15, 2024
47 of 49 checks passed
@lhotari
Copy link
Member

lhotari commented Mar 15, 2024

@pgier @dave2wave This has been merged to master branch now. do you have a chance to run some tests with the changes to see if it really fixes the issues with >2GB on GCP?

@lhotari
Copy link
Member

lhotari commented Mar 19, 2024

@pgier @dave2wave This has been merged to master branch. do you have a chance to run some tests with the changes to see if it really fixes the issues with >2GB on GCP?

@pgier
Copy link
Contributor Author

pgier commented Apr 2, 2024

@lhotari Thanks for fixing this. We should be able to test it, but maybe not for a couple more weeks.

nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 3, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 4, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 4, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
@lhotari
Copy link
Member

lhotari commented Apr 19, 2024

@lhotari Thanks for fixing this. We should be able to test it, but maybe not for a couple more weeks.

@pgier do you have any test results to share about JClouds 2.6.0 ? I'm thinking of cherry-picking to maintenance branches after there's more confirmation that JClouds 2.6.0 works well with Pulsar. /cc @dave2wave @eolivelli @nicoloboschi

@nikhil-ctds
Copy link

nikhil-ctds commented Apr 22, 2024

@lhotari With the above fix I tried testing GCS offloader with ledgers of size >= 2.5gb. I'm still facing the same error

Error in offload
null

Reason: Error offloading: org.apache.bookkeeper.mledger.ManagedLedgerException: java.util.concurrent.CompletionException: org.jclouds.http.HttpResponseException: command: POST https://www.googleapis.com/storage/v1/b/bucket-cognitree-ls1258/o/d7f69abd-dc45-4fc1-bbe3-bd9daaa8fea7-ledger-9/compose HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [{
  "error": {
    "code": 400,
    "message": "The number of source components provided (38) exceeds the maximum (32)",
    "errors": [
      {
        "message": "The number of source components provided (38) exceeds the maximum (32)",
        "domain": "global",
        "reason": "invalid"
      }
    ]
  }
}
]

I have also written a simple code using Jclouds 2.6.0 to upload a large file of size 3gb to GCS bucket, it has uploaded successfully.

@lhotari
Copy link
Member

lhotari commented Apr 22, 2024

@lhotari With the above fix I tried testing GCS offloader with ledgers of size >= 2.5gb. I'm still facing the same error

Error in offload
null

Reason: Error offloading: org.apache.bookkeeper.mledger.ManagedLedgerException: java.util.concurrent.CompletionException: org.jclouds.http.HttpResponseException: command: POST https://www.googleapis.com/storage/v1/b/bucket-cognitree-ls1258/o/d7f69abd-dc45-4fc1-bbe3-bd9daaa8fea7-ledger-9/compose HTTP/1.1 failed with response: HTTP/1.1 400 Bad Request; content: [{
  "error": {
    "code": 400,
    "message": "The number of source components provided (38) exceeds the maximum (32)",
    "errors": [
      {
        "message": "The number of source components provided (38) exceeds the maximum (32)",
        "domain": "global",
        "reason": "invalid"
      }
    ]
  }
}
]

I have also written a simple code using Jclouds 2.6.0 to upload a large file of size 3gb to GCS bucket, it has uploaded successfully.

@nikhil-ctds Do you have a chance to rerun the test with gcsManagedLedgerOffloadMaxBlockSizeInBytes=134217728 ?
The default value gcsManagedLedgerOffloadMaxBlockSizeInBytes=67108864 will limit the max upload size to 2048MiB (64MiB * 32). Doubling gcsManagedLedgerOffloadMaxBlockSizeInBytes to 134217728 should lift the limit to 4096MiB.
Can you please verify? Thanks

@lhotari
Copy link
Member

lhotari commented Apr 22, 2024

@nikhil-ctds I created #22554 that changes the default value of gcsManagedLedgerOffloadMaxBlockSizeInBytes. It would be helpful if you could test that gcsManagedLedgerOffloadMaxBlockSizeInBytes=134217728 fixes the issue. thanks

@nikhil-ctds
Copy link

@lhotari I have tested with gcsManagedLedgerOffloadMaxBlockSizeInBytes=134217728, and was able to offload ledger of 3.5gb to gcs bucket.

lhotari pushed a commit that referenced this pull request Apr 22, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
lhotari pushed a commit that referenced this pull request Apr 22, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
lhotari pushed a commit that referenced this pull request Apr 22, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Apr 23, 2024
Co-authored-by: 道君 <[email protected]>
Co-authored-by: Lari Hotari <[email protected]>
(cherry picked from commit 73dc213)
(cherry picked from commit fc136e8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Offloading large ledgers (>2GB) fail with Google Cloud Storage
6 participants