Storage: small file uploads are randomly returning md5 errors #2604

Closed
gcochard opened this issue Sep 12, 2017 · 6 comments
Labels
api: storage Issues related to the Cloud Storage API.

Comments

@gcochard
Contributor

gcochard commented Sep 12, 2017

I am getting intermittent errors with codes FILE_NO_UPLOAD_DELETE and FILE_NO_UPLOAD when uploading a large number of files in parallel.

Environment details

  • OS: Mac OS 10.12.6
  • Node.js version: 8.3.0
  • npm version: 5.4.1
  • google-cloud-node version: 0.56.0
  • google-cloud/storage version: 1.2.0

Steps to reproduce

This gist is a complete, self-contained example
https://gist.github.com/gcochard/5dd10b16911a57f3bad2b71bbbab833e

I have verified that the MD5sum of the file is correct before uploading, and that it is incorrect after uploading. It appears that the MD5sum returned is of the file minus the last 3 bytes. I will investigate further and see if this is consistent behavior.
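The "file minus the last 3 bytes" hypothesis above can be checked mechanically. The following is a minimal sketch, not code from the gist: `md5Base64` and `findTruncation` are hypothetical helper names, and it assumes the checksum being compared is a base64-encoded MD5 like the values printed in the error later in this thread.

```javascript
const crypto = require('crypto');

// Base64-encoded MD5 of a Buffer, matching the format of the checksums
// printed in the error messages in this thread.
function md5Base64(buffer) {
  return crypto.createHash('md5').update(buffer).digest('base64');
}

// Returns how many trailing bytes must be dropped from `original` for its
// MD5 to equal `actualMd5`, or -1 if no prefix within `maxTrim` bytes matches.
function findTruncation(original, actualMd5, maxTrim = 16) {
  const limit = Math.min(maxTrim, original.length);
  for (let trimmed = 0; trimmed <= limit; trimmed++) {
    const prefix = original.subarray(0, original.length - trimmed);
    if (md5Base64(prefix) === actualMd5) {
      return trimmed;
    }
  }
  return -1;
}
```

Passing the local file contents and the "actual" MD5 from a failed upload would return 3 if the hypothesis holds, 0 if the upload was in fact complete, and -1 if the mismatch has some other cause.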

@stephenplusplus
Contributor

Can you try disabling resumable uploads?

- .pipe(targetFile.createWriteStream({ validation: 'md5' }))
+ .pipe(targetFile.createWriteStream({ validation: 'md5', resumable: false }))
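For context, here is a minimal sketch of how that change might look as a small promise-based helper. The names `uploadWithoutResumable` and `SIMPLE_UPLOAD_OPTIONS` are hypothetical; the only thing assumed about `targetFile` is that it exposes `createWriteStream(options)` accepting the `validation` and `resumable` options shown in the diff above.

```javascript
const { pipeline } = require('stream');

// Options from the suggestion above: keep MD5 validation, skip the resumable path.
const SIMPLE_UPLOAD_OPTIONS = Object.freeze({ validation: 'md5', resumable: false });

// `source` is any readable stream; `targetFile` is assumed to expose
// createWriteStream(options), as in the diff above.
function uploadWithoutResumable(source, targetFile) {
  return new Promise((resolve, reject) => {
    const destination = targetFile.createWriteStream({ ...SIMPLE_UPLOAD_OPTIONS });
    // pipeline forwards errors from either stream and calls back once the
    // destination has finished.
    pipeline(source, destination, (err) => (err ? reject(err) : resolve()));
  });
}
```

Using `pipeline` rather than a bare `.pipe()` is a design choice of this sketch: it surfaces errors from both the source and the destination stream through one promise.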

@stephenplusplus stephenplusplus added the api: storage Issues related to the Cloud Storage API. label Sep 12, 2017
@gcochard
Contributor Author

With resumable: false, it does not reproduce within 10k files uploaded. I will let it run longer to see if it reproduces.

Forgot the stack trace (note that I modified the code to print the MD5 sums):

{ Error: The uploaded data did not match the data from the server. As a precaution, we attempted to delete the file, but it was not successful. To be sure the content is the same, you should try removing the file manually, then uploading the file again.
                                            
The delete attempt failed with this message:
                                                                                       
  Not Found MD5 expected: x1YQGmcVcup+gG5jHr2P5Q== MD5 actual: +XikEmgOezNtunVxjBFjPQ==                                  
    at /Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/@google-cloud/storage/src/file.js:952:21                  
    at Object.handleResp (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/@google-cloud/common/src/util.js:135:3)             
    at /Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/@google-cloud/common/src/util.js:465:12                         
    at Request.onResponse [as _callback] (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/retry-request/index.js:160:7)
    at Request.self.callback (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/request/request.js:188:22)                    
    at emitTwo (events.js:125:13)                                                                                                      
    at Request.emit (events.js:213:7)                                                                                            
    at Request.<anonymous> (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/request/request.js:1171:10)
    at emitOne (events.js:115:13)                                                                                                     
    at Request.emit (events.js:210:7) 
  code: 'FILE_NO_UPLOAD_DELETE',          
  errors:        
   [ { ApiError                                                                                                                                         
         at Object.parseHttpRespBody (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/@google-cloud/common/src/util.js:192:30)
         at Object.handleResp (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/@google-cloud/common/src/util.js:132:18)
         at /Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/@google-cloud/common/src/util.js:465:12                         
         at Request.onResponse [as _callback] (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/retry-request/index.js:160:7)
         at Request.self.callback (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/request/request.js:188:22)
         at emitTwo (events.js:125:13)                                                                                          
         at Request.emit (events.js:213:7)                                                                                            
         at Request.<anonymous> (/Users/gcochard/gcs-upload-timeout/node_modules/google-cloud/node_modules/request/request.js:1171:10)
         at emitOne (events.js:115:13)                                 
         at Request.emit (events.js:210:7)
       code: 404,      
       errors: [Array],    
       response: undefined,                                                 
       message: 'Not Found' } ] }                                                                                                                                                     
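When collecting many such failures, it can help to classify them and pull out the two checksums programmatically. The sketch below is illustrative only: `isIntegrityError` and `extractMd5Pair` are hypothetical helpers, and the `MD5 expected: ... MD5 actual: ...` text exists only because the code was locally modified to print the sums, so the pattern is an assumption about that modified format rather than the library's stock message.

```javascript
// Error codes reported in this issue for a checksum mismatch after upload.
const INTEGRITY_ERROR_CODES = new Set(['FILE_NO_UPLOAD', 'FILE_NO_UPLOAD_DELETE']);

// True when `err` carries one of the codes above.
function isIntegrityError(err) {
  return Boolean(err) && INTEGRITY_ERROR_CODES.has(err.code);
}

// Pulls both checksums out of a message in the locally modified format
// "MD5 expected: <base64> MD5 actual: <base64>". Returns null if absent.
function extractMd5Pair(message) {
  const match = /MD5 expected: (\S+) MD5 actual: (\S+)/.exec(message);
  return match ? { expected: match[1], actual: match[2] } : null;
}
```

Grouping failures by their `actual` value makes it easy to see whether every failed upload ends up with the same server-side checksum.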

@gcochard
Contributor Author

gcochard commented Sep 12, 2017

Ran with 250k files and still no repro. Seems like resumable uploads are the source of the issue.
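A run like this can be approximated with a small bounded-concurrency harness that tallies failures by error code. This is a rough sketch and not the code from the gist: `tallyUploadFailures` is a hypothetical name, and `uploadOne(index)` stands for whatever caller-supplied function performs a single upload and returns a promise.

```javascript
// Runs `total` uploads with at most `concurrency` in flight and tallies
// failures by error code. `uploadOne(index)` is a hypothetical caller-supplied
// function returning a promise for a single upload.
async function tallyUploadFailures(total, concurrency, uploadOne) {
  const failures = new Map();
  let next = 0;

  // Each worker keeps claiming the next index until none are left.
  async function worker() {
    while (next < total) {
      const index = next++;
      try {
        await uploadOne(index);
      } catch (err) {
        const code = (err && err.code) || 'UNKNOWN';
        failures.set(code, (failures.get(code) || 0) + 1);
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, total));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return failures;
}
```

An empty map after the run corresponds to "no repro"; any FILE_NO_UPLOAD or FILE_NO_UPLOAD_DELETE entries give the failure count for that configuration.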

@gcochard
Contributor Author

gcochard commented Sep 12, 2017

Reproduced with fast-crc32c installed, so that is not the cause. Didn't expect it to be though.

When I reproduce it, it seems the second chunk of the file is not uploaded: the actual MD5 is always the same (that of the base file alone), even though I add a counter to the end of each chunk: https://gist.github.com/gcochard/5dd10b16911a57f3bad2b71bbbab833e#file-poc-gcs-upload-bug-js-L38

Is there an issue with the resumable upload module that would cause this? Am I doing it wrong?

@gcochard
Contributor Author

gcochard commented Sep 12, 2017

Digging further into this, the gcs-resumable-upload module seems to be the culprit. I am trying to capture some context from that module while reproducing the error, but it seems to disappear when I modify the module locally. Will report back more tomorrow.

@jbdemonte

Just to add some feedback regarding resumable: false.
We were facing errors while uploading small files (<200 KB), such as:

2019-05-13T08:42:23.817Z - error: [9e86d100-0a3a-413b-a4d8-8e8f58c8b37c] Error: The uploaded data did not match the data from the server. As a precaution, the file has been deleted. To be sure the content is the same, you should try uploading the file again.
    at delete (/var/www/node/prod/node_modules/@google-cloud/storage/build/src/file.js:1316:35)
    at Util.handleResp (/var/www/node/prod/node_modules/@google-cloud/common/build/src/util.js:142:9)
    at retryRequest (/var/www/node/prod/node_modules/@google-cloud/common/build/src/util.js:423:22)
    at onResponse (/var/www/node/prod/node_modules/retry-request/index.js:200:7)
    at /var/www/node/prod/node_modules/teeny-request/build/src/index.js:208:17
    at bound (domain.js:396:14)

After a lot of digging, we finally ended up on this issue and tried the same resumable: false.

Here are our results:

  • no problems at all across around 1.5 million uploads
  • 7 times faster
  • CPU usage 60% => 2%

henrik242 added a commit to finn-no/cdn-uploader that referenced this issue Jun 14, 2021
Some uploads randomly return checksum errors. These can be mitigated using
resumable: false (optionally validation: false)

See
- googleapis/google-cloud-node#2604
- firebase/functions-samples#140

Example error:

```
$ ../../node_modules/@finn-no/cdn-uploader/index.js --credentials ${CDN_UPLOADER_CREDENTIALS} --project-id foo-storage --bucket-name foo-assets --app-prefix foo-web/_next build/next

/foo/node_modules/@google-cloud/storage/src/file.js:1093
        const error = new Error(message);
                      ^
Error: The uploaded data did not match the data from the server. As a precaution, the file has been deleted. To be sure the content is the same, you should try uploading the file again.
    at /foo/node_modules/@google-cloud/storage/src/file.js:1093:23
    at Object.handleResp (/foo/node_modules/@google-cloud/common/src/util.js:134:3)
    at /foo/node_modules/@google-cloud/common/src/util.js:496:12
    at Request.onResponse [as _callback] (/foo/node_modules/retry-request/index.js:198:7)
    at Request.self.callback (/foo/node_modules/request/request.js:185:22)
```