Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open files after downloading image #181

Closed
8acker opened this issue Aug 1, 2016 · 10 comments
Closed

Open files after downloading image #181

8acker opened this issue Aug 1, 2016 · 10 comments

Comments

@8acker
Copy link

8acker commented Aug 1, 2016

Hi,

we are facing a problem using needle. We use needle to download several images within a web application. After downloading the images the file connections to local target path remain open.

Running a webservice who downloads files for a couple of time will produce an error too many open files which has side effects to the health of the service.

Dependencies:

$ npm ls
npm info using [email protected]
npm info using [email protected]
├─┬ [email protected]
└─┬ [email protected]
  └── [email protected]

You can reproduce this problem using the following code snippet:
And call the following resource http://localhost:3300/download once or several times:

var os = require('os');
var path = require('path');
var needle = require('needle');
var app = require('express')();

var image_link = 'https://assets-cdn.github.com/images/modules/logos_page/GitHub-Logo.png';

var needle_get = function(url, cb) {
    var output_path = path.join(os.tmpdir(), path.basename(url));
    needle.get(url, {output: output_path}, function(error) {
        console.log(error || 'image successfully downloaded to ' + output_path);
        cb(error);
    });
};

app.get('/download', function(req, res) {
    needle_get(image_link, function(error) {
        res.status(error ? 500 : 200);
        res.end();
    });
});

app.listen(3300, function () {
    console.log('started on ' + 3300);
});

The output of lsof | grep GitHub-Logo.png after the file was successfully downloaded:

node       6947  6980 username   15w      REG                8,1       6088   13505667 /tmp/GitHub-Logo.png

We expect the read-/writestream of the file should be closed afterwards.

Regards

@frankred
Copy link

frankred commented Aug 1, 2016

Could reproduce this behavior on windows10. For lsof I used handle -p node.exe which can be downloaded here: https://technet.microsoft.com/en-us/sysinternals/bb896655.aspx

C:\Windows\system32>handle -p node.exe
...
node.exe pid: 46872 UNITED\froth
   14: File  (RW-)   C:\Users\froth\Desktop\needle_issue
...
  2F4: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png
node -v  
v4.4.3

@frankred
Copy link

frankred commented Aug 1, 2016

Shouldn't the fileWriteStream be closed within line ?https://github.com/tomas/needle/blob/master/lib/needle.js#L563 with

out.on('end', function() {
    // NEW: Close file here
    file && file.end();

    // we want to be able to access to the raw data later, so keep a reference.
    resp.raw = Buffer.concat(resp.raw);
    ...

@tomas
Copy link
Owner

tomas commented Aug 1, 2016

Thanks @tbouchnafa for the report and @frankred for the debugging! Does the fix solve the issue?

@frankred
Copy link

frankred commented Aug 1, 2016

For me yes, but the needle implementation is too complex for me to understand completely. If this new line does not produce any side effects to other use cases then this is a solution. What do you mean @tomas. Best regards!

@tomas
Copy link
Owner

tomas commented Aug 1, 2016

Great. I'll do some tests and get it merged soon.

@tomas
Copy link
Owner

tomas commented Aug 1, 2016

Hm, according to the official docs:

By default, stream.end() is called on the destination Writable stream when the source Readable stream emits 'end', so that the destination is no longer writable. To disable this default behavior, the end option can be passed as false, causing the destination stream to remain open (...) One important caveat is that if the Readable stream emits an error during processing, the Writable destination is not closed automatically. If an error occurs, it will be necessary to manually close each stream in order to prevent memory leaks.

Which means that unless there's some error on the stream object, it shouldn't be necessary to call file.end() by hand. I'm guessing the error occurs because you're creating a WritableStream on top of an existing file description (same file name). @tbouchnafa can you try using a random filename as the output path and see what happens? Something like:

 var output_path = path.join(os.tmpdir(), path.basename(url) + Math.random().toString());

@frankred
Copy link

frankred commented Aug 2, 2016

Not sure what your intention is but I get the following. Same problem. In my opinion the example do not create a seperate file descriptor?!

C:\Users\froth\Desktop\needle_issue>node index.js
started on 3300
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.5705635182093829
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.36734199221245944
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.2987525318749249
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.5965731497853994
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.29114395240321755
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.3767805772367865
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.2435523122549057
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.2295996944885701
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.7369571509771049
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.339010690106079
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.45418236358091235
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.9200152296107262
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.9670525107067078
image successfully downloaded to C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.0026146695017814636
C:\Windows\system32>handle -p node.exe

node.exe pid: 46184 UNITED\froth
   14: File  (RW-)   C:\Users\froth\Desktop\needle_issue
   68: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.5705635182093829
  100: Section       \Sessions\1\BaseNamedObjects\node-debug-handler-46184
  224: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.36734199221245944
  2DC: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.5965731497853994
  2E0: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.3767805772367865
  2E8: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.2987525318749249
  2EC: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.29114395240321755
  2F4: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.2435523122549057
  2F8: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.339010690106079
  2FC: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.2295996944885701
  300: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.7369571509771049
  304: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.45418236358091235
  308: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.9200152296107262
  30C: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.0026146695017814636
  310: File  (RWD)   C:\Users\froth\AppData\Local\Temp\GitHub-Logo.png0.9670525107067078

Maybe the problem is this special PassThrough mode:

var out      = new stream.PassThrough({ objectMode: false }),

@frankred
Copy link

frankred commented Aug 2, 2016

Minified example:

var os = require('os');
var path = require('path');
var needle = require('needle');

var url = 'https://assets-cdn.github.com/images/modules/logos_page/GitHub-Logo.png';
var output_path = path.join(os.tmpdir(), path.basename(url) + Math.random().toString());

needle.get(url, {output: output_path}, function (error) {
    console.log(error || 'image successfully downloaded to ' + output_path);
});

setTimeout(function () {
}, 99999999);

@8acker
Copy link
Author

8acker commented Aug 2, 2016

The same here on Linux, still doesn't work.

the connection to the file remains open after the download, adding the suggestion of @frankred fixed it, but I am also not sure if this will have any side effects in Line#563

file && file.end() 

@tomas
Copy link
Owner

tomas commented Aug 8, 2016

Just fixed and pushed new version to npm. Thanks guys!

@tomas tomas closed this as completed Aug 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants