Bugfix/zenko 1197/s3 data leak #578

Merged: 2 commits, Oct 15, 2018

vrancurel

Fixes a leak that occurs when cloudserver abruptly closes the s3-data socket. The root cause of this socket hang-up is unclear for the moment, but it is probably due to overload.

The main source of the leak is that we don't listen for the 'close' event on the dataStream. When we receive this event, we have to destroy the fileStream and delete the file at filePath ourselves, because the pipe does not do it for us. Without this, the file descriptor is never closed and the incomplete file is never deleted, which leaks disk space and VFS cache; this is critical in, for example, Kubernetes pods.
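The cleanup described above could look roughly like the following. This is a minimal sketch, not the code from this pull request: `cleanupOnClose`, the `done` callback, the injectable `unlink` parameter, and the `ended` guard are all assumptions made for illustration.

```javascript
const fs = require('fs');

// Hypothetical helper (not the actual DataFileStore code).
// When the incoming stream emits 'close' before it has ended, destroy
// the write stream and remove the partially written file.
function cleanupOnClose(dataStream, fileStream, filePath, done,
    unlink = fs.unlink) {
    // Assumption of this sketch: remember a normal 'end' so that a
    // later 'close' is not mistaken for an aborted upload.
    let ended = false;
    dataStream.once('end', () => {
        ended = true;
    });
    dataStream.once('close', () => {
        if (ended) {
            return;
        }
        // The pipe does not do this for us: release the file descriptor.
        fileStream.destroy();
        // Remove the incomplete file so it stops using disk space.
        unlink(filePath, err => {
            // ENOENT only means the file was never created.
            if (err && err.code !== 'ENOENT') {
                return done(err);
            }
            return done(new Error('socket closed while streaming'));
        });
    });
}
```

The `unlink` parameter defaults to `fs.unlink` and exists only so the helper can be exercised without touching the disk. In real code it would be called right after the dataStream is piped into the fileStream.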

Another problem is that we don't destroy the socket when we receive a clientError event in lib/network/http/server.js. The socket is not necessarily closed automatically by the underlying layers: upon receiving a RST the underlying socket seems to be closed, but in some cases of a normal FIN/ACK termination handshake it is not. If we don't explicitly destroy the socket, the 'close' event is sometimes never emitted on the dataStream.
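As an illustration of the second point, a clientError handler that always destroys the socket might look like this. It is a sketch under assumptions: `destroyOnClientError` and the `log` parameter are hypothetical names, and the real handler in lib/network/http/server.js is not shown in this thread.

```javascript
const http = require('http');

// Hypothetical helper (not the actual lib/network/http/server.js code).
function destroyOnClientError(server, log = console) {
    server.on('clientError', (err, socket) => {
        log.error('client error', { code: err.code, message: err.message });
        // Once a 'clientError' listener is registered, Node.js leaves the
        // socket to the listener, so destroy it explicitly. Destroying it
        // lets 'close' propagate to streams built on this socket.
        if (!socket.destroyed) {
            socket.destroy();
        }
    });
    return server;
}

// Usage: wrap a server before calling listen().
const server = destroyOnClientError(
    http.createServer((req, res) => res.end('ok')));
```

The `destroyed` check only avoids a redundant call when the socket has already been torn down, for example after a RST.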

bert-e commented Oct 4, 2018

Hello vrancurel,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get
information on this process.

Status report is not available.

bert-e commented Oct 4, 2018

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • 2 peers

Peer approvals must include at least 1 approval from the following list:

// this means the underlying socket has been closed
log.debug('Client closed socket while streaming',
{ method: 'put', key, filePath,
error: 'socket closed' });
Contributor

error should be an error object with a message attribute, so that it does not break log indexing. I suggest either wrapping it in an error object or naming the attribute errorMessage.

Contributor Author

Is it fine if I remove the error: field? There is in fact no error here.

Contributor

That works too :)

@@ -199,6 +199,16 @@ class DataFileStore {
return cbOnce(errors.InternalError.customizeDescription(
`read stream error: ${err.code}`));
});
dataStream.on('close', () => {
Contributor

Does the close event never ever trigger in a legitimate end of stream?

Contributor Author

No, only when close() is called while the stream is still being processed.

bert-e commented Oct 5, 2018

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/6.4
  • development/7.4

Follow integration pull requests if you would like to be notified of
build statuses by email.

bert-e commented Oct 5, 2018

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • 2 peers

Peer approvals must include at least 1 approval from the following list:

  When receiving this callback, sometimes the socket is already
  closed (e.g. upon RST) but sometimes we have to close it ourselves.
  When the underlying socket of the dataStream is closed, this
  is not considered a stream error, so we have to hook the
  event and do the cleanup ourselves.

bert-e commented Oct 15, 2018

History mismatch

Merge commit #4e522a4d78a0efd75ac1c53a78f66119c9d9a673 on the integration branch
w/8.1/bugfix/ZENKO-1197/s3-data-leak is merging a branch which is neither the current
branch bugfix/ZENKO-1197/s3-data-leak nor the development branch
development/8.1.

It is likely due to a rebase of the branch bugfix/ZENKO-1197/s3-data-leak and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

@rahulreddy

@bert-e reset

bert-e commented Oct 15, 2018

Reset complete

I have successfully deleted this pull request's integration branches.

bert-e commented Oct 15, 2018

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/6.4
  • development/7.4

Follow integration pull requests if you would like to be notified of
build statuses by email.

bert-e commented Oct 15, 2018

History mismatch

Merge commit #4749bb7a6cf62ff668b39ad6dd9656f484128b83 on the integration branch
w/8.1/bugfix/ZENKO-1197/s3-data-leak is merging a branch which is neither the current
branch bugfix/ZENKO-1197/s3-data-leak nor the development branch
development/8.1.

It is likely due to a rebase of the branch bugfix/ZENKO-1197/s3-data-leak and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

@vrancurel vrancurel force-pushed the bugfix/ZENKO-1197/s3-data-leak branch from 4f70a74 to 3dee6e2 Compare October 15, 2018 18:30
@rahulreddy

@bert-e reset

bert-e commented Oct 15, 2018

Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: wait

bert-e commented Oct 15, 2018

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/6.4
  • development/7.4

Follow integration pull requests if you would like to be notified of
build statuses by email.

bert-e commented Oct 15, 2018

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/8.0

  • ✔️ development/8.1

The following branches will NOT be impacted:

  • development/6.4
  • development/7.4

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

bert-e commented Oct 15, 2018

I have successfully merged the changeset of this pull request
into targeted development branches:

  • ✔️ development/8.0

  • ✔️ development/8.1

The following branches have NOT changed:

  • development/6.4
  • development/7.4

Please check the status of the associated issue ZENKO-1197.

Goodbye vrancurel.

@bert-e bert-e merged commit 3dee6e2 into development/8.0 Oct 15, 2018
@rahulreddy rahulreddy deleted the bugfix/ZENKO-1197/s3-data-leak branch October 15, 2018 18:34