
Minio s3 ListMultipartUploads #5613

Closed
rvolykh opened this issue Mar 7, 2018 · 21 comments

@rvolykh

rvolykh commented Mar 7, 2018

Hello, I'm trying to list multipart uploads with the Python boto library (boto==2.48.0), but I always get a response without any multipart uploads, even though in minio's storage directory I can see my incomplete uploads (.minio.sys/multipart/test_bucket/file1/ contains fs.json, object1, and object2, since I uploaded two parts). The same behavior occurs with Postman.

Expected Behavior

https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadListMPUpload.html
An Upload entry with key file1 should be listed.

Current Behavior

No Uploads tag at all

Steps to Reproduce (for bugs)

  1. Initiate multipart uploads (https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadInitiate.html)
  2. Upload part (https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html)
  3. List multipart uploads (https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadListMPUpload.html)

CompleteMultipartUpload was not called.
Boto example:

import os.path
import boto.s3
from boto.s3.connection import S3Connection

conn = S3Connection("<my-access-key>", "<my-secret-key>", is_secure=False, port=9000, host="localhost",
    calling_format='boto.s3.connection.OrdinaryCallingFormat')
bucket = conn.get_bucket('test_bucket')
mp = bucket.initiate_multipart_upload('file1')
with open('/tmp/largefile.zip', 'rb') as f:
    part1 = mp.upload_part_from_file(f, 1, size=10*1024*1024)
    part2 = mp.upload_part_from_file(f, 2, size=10*1024*1024)
print(bucket.list_multipart_uploads(upload_id_marker=mp.id))
mp.complete_upload()

Context

I've got integration tests which fail when run against minio. This could probably also be an issue for various S3 browsers.

Your Environment

  • Version used (minio version): minio.RELEASE.2018-01-18T20-33-21Z
  • Environment name and version (e.g. nginx 1.9.1): direct connection but inside docker image
  • Operating System and version (uname -a): Darwin data_race 17.4.0 Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64 x86_64
@benagricola

benagricola commented Mar 7, 2018

I'm seeing something very similar using duplicity, which uses boto 2.48.0 under the hood. Duplicity believes the chunks have been uploaded successfully but mp.get_all_parts() returns an empty list and causes the upload to fail.

I'm using the following version:

Version: 2018-02-09T22-40-05Z
Release-Tag: RELEASE.2018-02-09T22-40-05Z
Commit-ID: 289457568c2c812604484ae9b4efeca0f33aac2c

@aead aead added the triage label Mar 7, 2018
@krishnasrinivas
Contributor

@rvolykh The following script works fine with the latest release 2018-02-09T22-40-05Z. What release are you using? I suspect you are using an old minio version.

import os.path
import boto.s3
from boto.s3.connection import S3Connection

conn = S3Connection("minio", "minio123", is_secure=False, port=9000, host="localhost", calling_format='boto.s3.connection.OrdinaryCallingFormat')
bucket = conn.get_bucket('test')
mp = bucket.initiate_multipart_upload('file1')
with open('/tmp/largefile.zip', 'rb') as f:
    part1 = mp.upload_part_from_file(f, 1, size=10*1024*1024)
    part2 = mp.upload_part_from_file(f, 2, size=10*1024*1024)
print mp.get_all_parts()
mp.complete_upload()

@benagricola mp.get_all_parts() in the above script returns a list for me on the latest release:

krishna@escape:~/dev/py-scripts$ python mp.py 
[<Part 1>, <Part 2>]
krishna@escape:~/dev/py-scripts$

Can you set export MINIO_HTTP_TRACE=/dev/stdout, run minio, and share the trace logs?
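For reference, enabling the trace looks like this (sketch; /data is an example data directory, not from this thread):

```shell
# Log every HTTP request/response handled by the server to stdout
export MINIO_HTTP_TRACE=/dev/stdout
minio server /data
```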

@rvolykh
Author

rvolykh commented Mar 9, 2018

Thanks for responses, updates:

uname -a: Linux f036ec4a1098 4.9.60-linuxkit-aufs #1 SMP Mon Nov 6 16:00:12 UTC 2017 x86_64 Linux

trace:

[REQUEST (objectAPIHandlers).ListMultipartUploadsHandler-fm] [152058239.929682] [2018-03-09 07:59:59 +0000]
GET /test/?max-uploads=1000&upload-id-marker=b98c02f1-5f19-4f63-9a99-1508424cdb68&uploads=
Host: localhost:9000
User-Agent: Minio (linux; amd64) minio-go/4.0.8
Authorization: AWS4-HMAC-SHA256 Credential=aSh3eemaeg1heex5brec/20180309/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=bf1014c3e043be55c3d70bd6e1d06cbed1a1e3f0345931dab5287fdb63bf7355
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180309T075959Z


[RESPONSE] [152058239.929682] [2018-03-09 07:59:59 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151A31A2C1E5B24F
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
X-Amz-Bucket-Region: us-east-1
Accept-Ranges: bytes
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<ListMultipartUploadsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>test</Bucket><KeyMarker></KeyMarker><UploadIdMarker>b98c02f1-5f19-4f63-9a99-1508424cdb68</UploadIdMarker><NextKeyMarker></NextKeyMarker><NextUploadIdMarker></NextUploadIdMarker><Delimiter></Delimiter><Prefix></Prefix><MaxUploads>1000</MaxUploads><IsTruncated>false</IsTruncated></ListMultipartUploadsResult>
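As a quick sanity check on responses like the one above, here is a small sketch (not from the thread; the helper name is illustrative) that extracts the Upload entries from a ListMultipartUploadsResult body; an empty list is exactly what the trace shows:

```python
import xml.etree.ElementTree as ET

# S3 response elements live in this XML namespace
NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

def list_upload_ids(xml_body):
    """Return the UploadId of every <Upload> entry in a ListMultipartUploadsResult."""
    root = ET.fromstring(xml_body)
    return [u.findtext(NS + "UploadId") for u in root.findall(NS + "Upload")]

# A trimmed-down version of the empty response in the trace above
empty_response = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<ListMultipartUploadsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    '<Bucket>test</Bucket><MaxUploads>1000</MaxUploads>'
    '<IsTruncated>false</IsTruncated></ListMultipartUploadsResult>'
)
# A response that does list one in-progress upload, for contrast
one_upload = empty_response.replace(
    "<MaxUploads>",
    "<Upload><Key>file1</Key><UploadId>b98c02f1</UploadId></Upload><MaxUploads>",
)

print(list_upload_ids(empty_response))  # → []
print(list_upload_ids(one_upload))      # → ['b98c02f1']
```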

minio/data catalog:

.minio.sys/
   - multipart/
      - 2b141aef48c4cbeefa61a0555e70184e9810d7febf7f52847cdc65e9ceec8cec/
         - b98c02f1-5f19-4f63-9a99-1508424cdb68/
            - 00001.6a8464c8216bfccc4c0334831c5a5d81
            - 00002.ddb2c74937a7aeff86be653545b98ed7
            - fs.json

cat fs.json:

{"version":"1.0.1","format":"fs","minio":{"release":"RELEASE.2018-02-09T22-40-05Z"},"meta":{"content-type":"application/octet-stream"}}

@benagricola

Trace log from my side (note the boto version for this log is 2.42.0, but the same occurs with 2.48.0); it looks very similar to the output from @rvolykh:

[REQUEST (objectAPIHandlers).HeadBucketHandler-fm] [152084879.309180] [2018-03-12 09:59:53 +0000]
HEAD /redacted-servername01/
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=79557209b9ab911825d2f4274e844ead20a2450ca221d7170bdfa94a5d2d2c3d
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.309180] [2018-03-12 09:59:53 +0000]
200 OK
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes
Vary: Origin
X-Amz-Request-Id: 151B23EB655407A2


[REQUEST (objectAPIHandlers).NewMultipartUploadHandler-fm] [152084879.309901] [2018-03-12 09:59:53 +0000]
POST /redacted-servername01/duplicity-full.20180311T033517Z.vol897.difftar.gpg?uploads
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Content-Length: 0
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-storage-class,Signature=eea60f9df21d05c5f23c36338f159d4e016349516bfd970ac881773bb6c655a5
Content-Type: application/octet-stream
X-Amz-Date: 20180312T095953Z
X-Amz-Storage-Class: STANDARD
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.309901] [2018-03-12 09:59:53 +0000]
200 OK
X-Amz-Request-Id: 151B23EB68F5AD56
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes
Content-Type: application/xml
Vary: Origin

<?xml version="1.0" encoding="UTF-8"?>
<InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>redacted-servername01</Bucket><Key>duplicity-full.20180311T033517Z.vol897.difftar.gpg</Key><UploadId>e3633465-8ba6-4eee-95e2-f847919dd045</UploadId></InitiateMultipartUploadResult>

[REQUEST (objectAPIHandlers).HeadBucketHandler-fm] [152084879.315931] [2018-03-12 09:59:53 +0000]
HEAD /redacted-servername01/
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=79557209b9ab911825d2f4274e844ead20a2450ca221d7170bdfa94a5d2d2c3d
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.315931] [2018-03-12 09:59:53 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151B23EB6959477E
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes


[REQUEST (objectAPIHandlers).HeadBucketHandler-fm] [152084879.315987] [2018-03-12 09:59:53 +0000]
HEAD /redacted-servername01/
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=79557209b9ab911825d2f4274e844ead20a2450ca221d7170bdfa94a5d2d2c3d
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.315987] [2018-03-12 09:59:53 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151B23EB696209AE
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes


[REQUEST (objectAPIHandlers).HeadBucketHandler-fm] [152084879.315975] [2018-03-12 09:59:53 +0000]
HEAD /redacted-servername01/
Host: redacted.hostname.tld
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=79557209b9ab911825d2f4274e844ead20a2450ca221d7170bdfa94a5d2d2c3d
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855


[RESPONSE] [152084879.315975] [2018-03-12 09:59:53 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151B23EB6962E74C
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes


[REQUEST (objectAPIHandlers).ListMultipartUploadsHandler-fm] [152084879.316232] [2018-03-12 09:59:53 +0000]
GET /redacted-servername01/?uploads
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=cf3159c304308799c1d3fce6ce70b075536b1281baf6ed545f7d2b286150f4b5
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.316232] [2018-03-12 09:59:53 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151B23EB698B5D60
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<ListMultipartUploadsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>redacted-servername01</Bucket><KeyMarker></KeyMarker><UploadIdMarker></UploadIdMarker><NextKeyMarker></NextKeyMarker><NextUploadIdMarker></NextUploadIdMarker><Delimiter></Delimiter><Prefix></Prefix><MaxUploads>1000</MaxUploads><IsTruncated>false</IsTruncated></ListMultipartUploadsResult>

[REQUEST (objectAPIHandlers).ListMultipartUploadsHandler-fm] [152084879.316430] [2018-03-12 09:59:53 +0000]
GET /redacted-servername01/?uploads
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=cf3159c304308799c1d3fce6ce70b075536b1281baf6ed545f7d2b286150f4b5
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.316430] [2018-03-12 09:59:53 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151B23EB69A5BBFA
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<ListMultipartUploadsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>redacted-servername01</Bucket><KeyMarker></KeyMarker><UploadIdMarker></UploadIdMarker><NextKeyMarker></NextKeyMarker><NextUploadIdMarker></NextUploadIdMarker><Delimiter></Delimiter><Prefix></Prefix><MaxUploads>1000</MaxUploads><IsTruncated>false</IsTruncated></ListMultipartUploadsResult>

[REQUEST (objectAPIHandlers).ListMultipartUploadsHandler-fm] [152084879.316586] [2018-03-12 09:59:53 +0000]
GET /redacted-servername01/?uploads
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=cf3159c304308799c1d3fce6ce70b075536b1281baf6ed545f7d2b286150f4b5
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.316586] [2018-03-12 09:59:53 +0000]
200 OK
X-Amz-Request-Id: 151B23EB69BD778F
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes
Content-Type: application/xml
Vary: Origin

<?xml version="1.0" encoding="UTF-8"?>
<ListMultipartUploadsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>redacted-servername01</Bucket><KeyMarker></KeyMarker><UploadIdMarker></UploadIdMarker><NextKeyMarker></NextKeyMarker><NextUploadIdMarker></NextUploadIdMarker><Delimiter></Delimiter><Prefix></Prefix><MaxUploads>1000</MaxUploads><IsTruncated>false</IsTruncated></ListMultipartUploadsResult>

[REQUEST (objectAPIHandlers).ListObjectPartsHandler-fm] [152084879.316970] [2018-03-12 09:59:53 +0000]
GET /redacted-servername01/duplicity-full.20180311T033517Z.vol897.difftar.gpg?uploadId=e3633465-8ba6-4eee-95e2-f847919dd045
Host: redacted.hostname.tld
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=2a9d6bfe19d0891467fbf66f3ca898865d955bf66379e04bea0796e4d18710cf
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1


[RESPONSE] [152084879.316970] [2018-03-12 09:59:53 +0000]
200 OK
Vary: Origin
X-Amz-Request-Id: 151B23EB6A34B6C1
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<ListPartsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>redacted-servername01</Bucket><Key>duplicity-full.20180311T033517Z.vol897.difftar.gpg</Key><UploadId>e3633465-8ba6-4eee-95e2-f847919dd045</UploadId><Initiator><ID>02d6176db174dc93cb1b899f7c6078f08654445fe8cf1b6ce98d8855f66bdbf4</ID><DisplayName></DisplayName></Initiator><Owner><ID>02d6176db174dc93cb1b899f7c6078f08654445fe8cf1b6ce98d8855f66bdbf4</ID><DisplayName></DisplayName></Owner><StorageClass>STANDARD</StorageClass><PartNumberMarker>0</PartNumberMarker><NextPartNumberMarker>0</NextPartNumberMarker><MaxParts>1000</MaxParts><IsTruncated>false</IsTruncated></ListPartsResult>

[REQUEST (objectAPIHandlers).AbortMultipartUploadHandler-fm] [152084879.317734] [2018-03-12 09:59:53 +0000]
DELETE /redacted-servername01/duplicity-full.20180311T033517Z.vol897.difftar.gpg?uploadId=e3633465-8ba6-4eee-95e2-f847919dd045
Host: redacted.hostname.tld
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20180312T095953Z
X-Forwarded-For: 127.0.0.1
User-Agent: Boto/2.42.0 Python/2.6.6 Linux/2.6.32-696.18.7.el6.x86_64
Accept-Encoding: identity
Authorization: AWS4-HMAC-SHA256 Credential=REDACTEDCREDENTIAL/20180312/us-east-1/s3/aws4_request,SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date,Signature=c35d895d2922fa9acecae980da8d669dd2135f16b6ed9333063c9dbaa67f9f4a


[RESPONSE] [152084879.317734] [2018-03-12 09:59:53 +0000]
204 No Content
Vary: Origin
X-Amz-Request-Id: 151B23EB6AC5C490
Server: Minio/RELEASE.2018-02-09T22-40-05Z (linux; amd64)
Accept-Ranges: bytes

@krishnasrinivas
Contributor

@rvolykh can you give me a sample script to reproduce the problem? The script you gave me (pasted below) works fine for me:

import os.path
import boto.s3
from boto.s3.connection import S3Connection

conn = S3Connection("minio", "minio123", is_secure=False, port=9000, host="localhost", calling_format='boto.s3.connection.OrdinaryCallingFormat')
bucket = conn.get_bucket('test')
mp = bucket.initiate_multipart_upload('file1')
with open('/tmp/largefile.zip', 'rb') as f:
    part1 = mp.upload_part_from_file(f, 1, size=10*1024*1024)
    part2 = mp.upload_part_from_file(f, 2, size=10*1024*1024)
print mp.get_all_parts()
mp.complete_upload()
krishna@escape:~/dev/py-scripts$ python mp.py 
[<Part 1>, <Part 2>]
krishna@escape:~/dev/py-scripts$

@krishnasrinivas
Contributor

@benagricola if you notice the trace log, there are no "put object part" requests, hence mp.get_all_parts() returns 0 entries. Can you give me instructions on what commands to run to see the duplicity error? (I have not used duplicity)

@rvolykh
Author

rvolykh commented Mar 15, 2018

@krishnasrinivas
yes, just change your script a bit; I don't know why you replaced my example above :)

In your script, replace print mp.get_all_parts() with
print bucket.list_multipart_uploads()
The problem is in the bucket method list_multipart_uploads, not in the object method get_all_parts.

@benagricola

Hi @krishnasrinivas,

Correct: there are no PUT object part requests, because Duplicity calls bucket.list_multipart_uploads() prior to attempting to upload any chunks.

So the process is:

  • Create multipart upload
  • Spawn x number of workers to upload each chunk
  • Each worker checks multipart upload is in list_multipart_uploads (i.e. still a valid upload)
  • Upload file chunk and exit subprocess

But because list_multipart_uploads returns nothing, each worker simply returns and reports the chunk as uploaded successfully even though nothing has happened.

You can see the multipart boto-based code in duplicity with the relevant worker code here:

https://github.com/henrysher/duplicity/blob/master/duplicity/backends/_boto_multi.py#L203
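The failure mode described above can be sketched in a few lines (the types and names here are illustrative stand-ins, not duplicity's actual code): when the server's listing is always empty, the membership check never passes and the worker silently skips the upload.

```python
from collections import namedtuple

# Minimal stand-in for a boto MultiPartUpload object
MultiPartUpload = namedtuple("MultiPartUpload", ["id"])

def worker_should_upload(listed_uploads, multipart_id):
    # duplicity-style validity check: only proceed if our upload id
    # still appears in the bucket's multipart-upload listing
    return any(mp.id == multipart_id for mp in listed_uploads)

# Server that never lists uploads (the minio behavior in this issue):
print(worker_should_upload([], "e3633465-8ba6-4eee-95e2-f847919dd045"))  # → False
# Server that does list the upload:
print(worker_should_upload([MultiPartUpload("abc")], "abc"))             # → True
```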

@krishnasrinivas
Contributor

@krishnasrinivas
yes, just change your script a bit; I don't know why you replaced my example above :)

oops, sorry about that. Yes, I see you are trying to do list_multipart_uploads.

@krishnasrinivas oh, I see. A bit complicated, but Minio rocks anyway :)
We'll probably try to replace the listMultipartUploads call and will let you know if there are any pitfalls.

ok great 👍

@krishnasrinivas
Contributor

@benagricola we made a change to simplify our backend format and the code. As a result, we no longer support ListMultipartUploads when the prefix is given as "".

The reason ListMultipartUploads exists in AWS S3 is so that clients can list incomplete multipart uploads and remove them, so that AWS does not bill for them. In our case we auto-purge old multipart uploads and do not support ListMultipartUploads.

In duplicity if you see:

            for mp in bucket.list_multipart_uploads():
                if mp.id == multipart_id:

there is no advantage in doing list_multipart_uploads() to check for the id, because even if you did not, the subsequent putObjectPart() would fail anyway if the mp.id did not exist.

The right way for the application to behave is:

  1. create a new mp.id
  2. upload parts
  3. call complete-multipart
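The three steps above can be sketched with a stub upload object (FakeMultipartUpload and do_multipart_upload are illustrative names, not the boto API; in real code mp would come from bucket.initiate_multipart_upload as in the scripts earlier in this thread). The point is that no listing call appears anywhere:

```python
import io

class FakeMultipartUpload:
    """Stub standing in for the object returned by initiate_multipart_upload."""
    def __init__(self):
        self.parts, self.completed = [], False

    def upload_part_from_file(self, f, part_num, size):
        data = f.read(size)
        if data:
            self.parts.append((part_num, data))
        return data  # empty bytes once the file is exhausted

    def complete_upload(self):
        self.completed = True

def do_multipart_upload(mp, fileobj, part_size):
    # 1. mp is a freshly initiated upload (a new upload id)
    # 2. upload parts in order until the file is exhausted
    part_num = 1
    while mp.upload_part_from_file(fileobj, part_num, size=part_size):
        part_num += 1
    # 3. call complete-multipart; no list_multipart_uploads needed
    mp.complete_upload()
    return mp

mp = do_multipart_upload(FakeMultipartUpload(), io.BytesIO(b"x" * 25), part_size=10)
print(len(mp.parts), mp.completed)  # → 3 True
```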

@benagricola because of this behavior by duplicity, it won't work with minio. Have you been using an old minio version for your deployment?

@benagricola

@krishnasrinivas yep, have been using duplicity with an older version of minio but recently upgraded.

@krishnasrinivas
Contributor

@benagricola github.com/minio/mc now supports encryption https://docs.minio.io/docs/minio-client-complete-guide. https://github.com/restic/restic is another alternative.

@harshavardhana
Member

@krishnasrinivas should we close this?

@harshavardhana
Member

@krishnasrinivas should we close this?

Closing this issue as we are not going to bring back the older behavior of ListMultipartUploads for now.

@meinemitternacht

meinemitternacht commented Aug 29, 2018

@krishnasrinivas @harshavardhana you mentioned in a previous post that minio will auto-purge old multipart uploads. I looked through some of the documentation but did not see a reference to this behavior. How often are they purged?

Where I work, we are using minio for some internal S3-compatible testing, and it would have made things much easier if listMultipartUploads with a less restrictive prefix (e.g. "folder1" instead of "folder1/file") were supported. We are uploading multi-terabyte files, and when simulating failures, the .minio.sys directory fills up the disk quite quickly.

Additionally, we found that when uploading a 5 TB file, minio requires twice the file size in free space to concatenate the file while it is being uploaded. Not really a problem, but again, I did not see that in the documentation.

@krishnasrinivas
Contributor

@meinemitternacht multipart uploads older than 2 weeks get purged.

We are uploading multi-terabyte files, and when simulating failures, the .minio.sys directory fills up the disk quite quickly.

Can you give more info on what kind of failures are being simulated? On any failure we clean up the tmp files, hence the .minio.sys directory should not get filled up.

@meinemitternacht

Can you give more info on what kind of failures are being simulated? On any failure we clean up the tmp files, hence the .minio.sys directory should not get filled up.

We are using libs3 to upload files by piping data to the program. If the upload is aborted for some reason (or there is a network connectivity issue), libs3 does not issue an "abortMultipartUpload" API call. This is not the fault of minio, as the client should be issuing that command. It is just rather annoying for us to have to perform "listMultipartUploads" for each key that was uploaded in order to delete the applicable temporary files. One call with a common prefix would be much more efficient.
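The per-key cleanup described above could look roughly like this (StubClient, abort_stale_uploads, and their methods are hypothetical stand-ins for a real S3 client, used only to make the loop concrete):

```python
def abort_stale_uploads(client, bucket, keys):
    """Abort every incomplete multipart upload, one listing call per key."""
    aborted = []
    for key in keys:
        for upload_id in client.list_multipart_uploads(bucket, key):
            client.abort_multipart_upload(bucket, key, upload_id)
            aborted.append((key, upload_id))
    return aborted

class StubClient:
    """Hypothetical in-memory client, just enough to exercise the loop."""
    def __init__(self, uploads):
        self.uploads = uploads  # {(bucket, key): [upload_id, ...]}

    def list_multipart_uploads(self, bucket, key):
        return list(self.uploads.get((bucket, key), []))

    def abort_multipart_upload(self, bucket, key, upload_id):
        self.uploads[(bucket, key)].remove(upload_id)

client = StubClient({("b", "k1"): ["u1", "u2"], ("b", "k2"): []})
print(abort_stale_uploads(client, "b", ["k1", "k2"]))  # → [('k1', 'u1'), ('k1', 'u2')]
```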

I certainly don't want to persuade you to change this behavior, just wanted to convey our particular use case. Though, seeing as how other S3-compatible providers do support listing by a prefix, this could positively improve compatibility across the board. Minio is just one provider that we were testing, so it isn't critical that these issues are addressed.

@harshavardhana
Member

We are using libs3 to upload files by piping data to the program. If the upload is aborted for some reason (or there is a network connectivity issue), libs3 does not issue an "abortMultipartUpload" API call. This is not the fault of minio, as the client should be issuing that command. It is just rather annoying for us to have to perform "listMultipartUploads" for each key that was uploaded in order to delete the applicable temporary files. One call with a common prefix would be much more efficient.

I certainly don't want to persuade you to change this behavior, just wanted to convey our particular use case. Though, seeing as how other S3-compatible providers do support listing by a prefix, this could positively improve compatibility across the board. Minio is just one provider that we were testing, so it isn't critical that these issues are addressed.

you can listMultipartUploads if you know which object it failed for; we simply don't support hierarchical listing. Historically we supported all of this in various combinations, but we moved to a simpler, bug-free backend implementation which dropped lesser-used features.

If you look at the aws-sdk upload managers, they abort by default upon error, so libs3 should do the same here. Why not use aws-sdk-c++ here?

@meinemitternacht

we moved to a simpler, bug-free backend implementation which dropped lesser-used features.

That's perfectly fine, I was mainly curious about how long it would take for temp files to be deleted. And it seems that we should be handling the deletion of those files ourselves since they could potentially be up to ~10 TB per object uploaded (counting the temporary concatenation file). Having that data hang around for two weeks is wasteful.

If you look at aws-sdk upload managers they abort by default upon error, so libs3 should do the same here, why not use aws-sdk-c++ here?

Indeed, I agree that behavior should be present in libs3. We looked at using aws-sdk-c++, but it was much simpler to adapt libs3 for our environment (embedded device, controlling program written in C) than it was to integrate the necessary utilities for building that SDK. In the future we may be able to integrate it.

@harshavardhana
Member

That's perfectly fine, I was mainly curious about how long it would take for temp files to be deleted. And it seems that we should be handling the deletion of those files ourselves since they could potentially be up to ~10 TB per object uploaded (counting the temporary concatenation file). Having that data hang around for two weeks is wasteful.

That may be true, but we are sort of expecting that you don't have lots of timeouts when uploading large objects. But again, if you know the object and key, then we do list the upload ids and you can forcibly abort them.

@lock

lock bot commented Apr 25, 2020

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Apr 25, 2020