Endpoint url error #4811
Can be reproduced with a custom S3 endpoint hosted on Scaleway.
aws {
    accessKey = 'ACCESSKEY'
    secretKey = 'SECRETKEY'
    region = 'fr-par'
    client {
        endpoint = 'https://s3.fr-par.scw.cloud'
        protocol = 'https'
        s3PathStyleAccess = true
    }
}

This is very important to be able to use private S3 implementations in Europe when analyzing data from hospitals that are forbidden from using AWS services for GDPR and regulatory reasons.
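As a side note on what this configuration maps to: below is a minimal sketch, assuming the AWS SDK v1 API that nf-amazon builds on, of a client pointed at the custom endpoint. The class name is illustrative and credentials come from the default provider chain in this sketch.

import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Illustrative sketch: how the config above translates to an SDK v1 client.
public class CustomEndpointClient {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                // endpoint + region from the config map to an EndpointConfiguration;
                // setting it disables the SDK's own region-based endpoint resolution
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
                        "https://s3.fr-par.scw.cloud", "fr-par"))
                // s3PathStyleAccess = true keeps the bucket name out of the host name
                .withPathStyleAccessEnabled(true)
                .build();
        // credentials are resolved by the default provider chain here
        s3.listBuckets().forEach(b -> System.out.println(b.getName()));
    }
}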
The issue is that at some point something adds the "amazonaws.com" suffix back to the host name instead of using the custom S3 endpoint URI provided.
The stack trace is reproduced further below.
We can look at what is happening in nextflow.cloud.aws.nio.S3Client.putObject (S3Client.java:209), which sits at the boundary between the Nextflow code and the AWS SDK:

public PutObjectResult putObject(String bucket, String keyName, InputStream inputStream, ObjectMetadata metadata, List<Tag> tags, String contentType) {
    PutObjectRequest req = new PutObjectRequest(bucket, keyName, inputStream, metadata);
    if( cannedAcl != null ) {
        req.withCannedAcl(cannedAcl);
    }
    if( tags != null && tags.size()>0 ) {
        req.setTagging(new ObjectTagging(tags));
    }
    if( kmsKeyId != null ) {
        req.withSSEAwsKeyManagementParams( new SSEAwsKeyManagementParams(kmsKeyId) );
    }
    if( storageEncryption!=null ) {
        metadata.setSSEAlgorithm(storageEncryption.toString());
    }
    if( contentType!=null ) {
        metadata.setContentType(contentType);
    }
    if( log.isTraceEnabled() ) {
        log.trace("S3 PutObject request {}", req);
    }
    return client.putObject(req);
}

The exception is raised at the last line, by the call to the AWS SDK's client.putObject(req). Note that the PutObjectRequest only carries the bucket, key and metadata; the target host is determined by the client itself.
Instrumenting the same method to list the accessible buckets just before the put:

public PutObjectResult putObject(String bucket, String keyName, InputStream inputStream, ObjectMetadata metadata, List<Tag> tags, String contentType) {
    PutObjectRequest req = new PutObjectRequest(bucket, keyName, inputStream, metadata);
    if( cannedAcl != null ) {
        req.withCannedAcl(cannedAcl);
    }
    if( tags != null && tags.size()>0 ) {
        req.setTagging(new ObjectTagging(tags));
    }
    if( kmsKeyId != null ) {
        req.withSSEAwsKeyManagementParams( new SSEAwsKeyManagementParams(kmsKeyId) );
    }
    if( storageEncryption!=null ) {
        metadata.setSSEAlgorithm(storageEncryption.toString());
    }
    if( contentType!=null ) {
        metadata.setContentType(contentType);
    }
    if( log.isTraceEnabled() ) {
        log.trace("S3 PutObject request {}", req);
    }
    // debug: list the buckets visible to the configured client
    for (Bucket b : client.listBuckets()) {
        System.out.println("bucket "+b.getName());
    }
    return client.putObject(req);
}

We can see that the accessible buckets are in fact correctly listed on standard output.
So we can conclude that the AWS S3 client can actually reach the custom S3 endpoint and gets correct answers, at least enough to be able to list buckets. How strange! Let's now try listing the objects inside a bucket instead:

for (S3ObjectSummary obj : client.listObjects("turing", "db").getObjectSummaries()) {
    System.out.println("object "+obj.getKey());
}

This time it fails, and we get an exception raised inside the AWS SDK, coming from the listObjects method:

com.amazonaws.SdkClientException: Unable to execute HTTP request: s3.fr-par.amazonaws.com
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5558)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5505)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:950)
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:915)
at nextflow.cloud.aws.nio.S3Client.putObject(S3Client.java:210)
at nextflow.cloud.aws.nio.S3FileSystemProvider.createDirectory(S3FileSystemProvider.java:492)
at java.base/java.nio.file.Files.createDirectory(Files.java:700)
at java.base/java.nio.file.Files.createAndCheckIsDirectory(Files.java:807)
at java.base/java.nio.file.Files.createDirectories(Files.java:753)
at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321)
at nextflow.extension.FilesEx.mkdirs(FilesEx.groovy:493)
at nextflow.Session.init(Session.groovy:406)
at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:129)
at nextflow.cli.CmdRun.run(CmdRun.groovy:372)
at nextflow.cli.Launcher.run(Launcher.groovy:503)
at nextflow.cli.Launcher.main(Launcher.groovy:657)
Caused by: java.net.UnknownHostException: s3.fr-par.amazonaws.com
at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
at com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
at com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy27.connect(Unknown Source)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
... 23 common frames omitted

Conclusion

My guess at this point is that this is a bug inside the AWS SDK: the S3 client appears to be well configured, with the right endpoint URI, and works for listing buckets and for any query that does not attempt to read or write objects inside buckets. Some piece inside the AWS SDK must overwrite part of the endpoint URI for some reason; note that the unresolvable host s3.fr-par.amazonaws.com combines the configured region fr-par with the default amazonaws.com suffix rather than the configured scw.cloud endpoint.

Why not AWS SDK > 2?

Looking at the Gradle dependencies, a question: is there a particular reason why Nextflow still uses AWS SDK 1.12.70, even though it is clearly stated to be deprecated and AWS SDK 2 has been available for a long time?
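For completeness, here is a minimal standalone sketch, independent of Nextflow, that should exercise the same two calls directly against AWS SDK v1 and either confirm or refute the SDK hypothesis. The bucket name turing and prefix db are taken from the snippets above; the class name and credentials are placeholders.

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.Bucket;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public class ScalewayListObjectsRepro {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(
                        new BasicAWSCredentials("ACCESSKEY", "SECRETKEY")))
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
                        "https://s3.fr-par.scw.cloud", "fr-par"))
                .withPathStyleAccessEnabled(true)
                .build();

        // Works in the report above: the request goes to the custom endpoint
        for (Bucket b : s3.listBuckets()) {
            System.out.println("bucket " + b.getName());
        }

        // Fails in the report above: the SDK tries to resolve
        // s3.fr-par.amazonaws.com instead of the configured endpoint
        for (S3ObjectSummary obj : s3.listObjects("turing", "db").getObjectSummaries()) {
            System.out.println("object " + obj.getKey());
        }
    }
}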
@rjb32 thanks for the triage. SDK v2 is on our roadmap but we just haven't gotten to it yet. It is not a trivial change.
See #4741 |
Bug report
Nextflow file main.nf
nextflow.conf
command
Versions:
nextflow: 23.10.1
nf-amazon: 2.1.4
Expected behavior and actual behavior
I expected the file to be uploaded to the S3 bucket, just as with the MinIO S3 server that I tested successfully (minio image, endpoint = "http://localhost:9000").
I tested with the AWS CLI, which showed that I have permission to put objects.
The error showed that it failed to parse the endpoint, with no additional information.
I think it fails because the Nextflow plugin nf-amazon 2.1.4, or the AWS SDK it depends on, fails to parse the endpoint.
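For comparison, the working MinIO configuration presumably looked something like the following, using the same config shape as the Scaleway example above (keys and region are placeholders, and I am assuming path-style access, which MinIO typically requires):

aws {
    accessKey = 'ACCESSKEY'
    secretKey = 'SECRETKEY'
    region = 'us-east-1'
    client {
        endpoint = 'http://localhost:9000'
        s3PathStyleAccess = true
    }
}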
Steps to reproduce the problem
I cannot share the actual endpoint URL and its credentials; they follow the same patterns as the examples above.
Program output
Environment
Additional context
Is there any documentation for recompiling the nf-amazon plugin? I tried to compile Nextflow after modifying the nf-amazon plugin, but it created a build folder structure that is quite different from the nf-amazon@2.1.4 plugin downloaded by Nextflow. I cloned the nextflow repo at tag v23.10.1, then ran: