Spawning process for every request #313

kessler · 2017-03-15T00:07:22Z

This issue is about this code section:

  getClientUserAgentSeeded: function(seed, cb) {
    exec('uname -a', function(err, uname) {
      var userAgent = {};
      for (var field in seed) {
        userAgent[field] = encodeURIComponent(seed[field]);
      }

      // URI-encode in case there are unusual characters in the system's uname.
      userAgent.uname = encodeURIComponent(uname) || 'UNKNOWN';

      cb(JSON.stringify(userAgent));
    });
  },

from: https://github.com/stripe/stripe-node/blob/master/lib/stripe.js#L164

~~This code which is run for EVERY request, spawns a process!~~

First and foremost, sniffing out details of the operating system and sending them to yourself without explicit notice is just wrong! This is a real breach of confidence.
~~In high volume situations on stressed machines this can cause errors and memory leaks.~~
Silencing errors is a very bad habit.

brandur-stripe · 2017-03-15T00:30:54Z

Hi there,

> This code which is run for EVERY request, spawns a process!

Check the code here. The results of the shell out get cached on a global and re-used.

> First and foremost, sniffing out details of the operating system and sending them to yourself without explicit notice is just wrong! This is a real breach of confidence.

I'm sorry to hear that you think so :/ If it helps, I can assure you that this information isn't collected for any kind of malicious intent. The majority of the time it's only used in aggregate so that we can get a feel for the types of requests that are coming in and to get an idea of what kinds of platforms we need to support. Otherwise, it's used occasionally to help troubleshoot particular users who are having trouble integrating, but not for much else.

> In high volume situations on stressed machines this can cause errors and memory leaks.

As mentioned above, this call is cached. Even if it weren't, it's worth keeping in mind that although shelling out is slow, it's still roughly two orders of magnitude faster than the network call that's about to happen. Also, you'd hopefully only run into a memory leak if there were a bug in the package; the memory used to spawn the child process will be appropriately reclaimed.

> Silencing errors is a very bad habit.

Yes, this should probably be logged or something. I think the rationale was that this type of problem isn't desirable, but also not worth stopping the program over, so we blow by it.

kessler · 2017-03-15T00:48:17Z

Thanks for the swift response.

You're right, I missed that if clause there. However, we did run into a situation in production where stripe spawned lots of these processes; it was only resolved after we commented out the exec directive. I will try to recreate a distilled version of this so it can be shared.

In addition, I really think you should notify your users that you are sending uname results with requests. These can contain private IP addresses and other sensitive information that they might not want to give out.

Another thing: if the exec operation is expected to fail without serious consequences, it should also be wrapped in try/catch, as some of the errors of exec can be thrown rather than returned in the callback.
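To make the try/catch suggestion concrete, here is a minimal sketch. `safeExec` and its injectable `execFn` parameter are hypothetical names used only for illustration, not part of stripe-node. It assumes the goal is that both a failure reported through the callback and a synchronously thrown error end up on the same path:

```javascript
const { exec } = require('child_process');

// Hypothetical helper (not stripe-node code): runs `command` and always
// reports the outcome through `cb`, whether exec fails via its callback
// or throws synchronously while trying to spawn.
function safeExec(command, cb, execFn = exec) {
  try {
    execFn(command, (err, stdout) => {
      cb(err || null, err ? null : stdout);
    });
  } catch (err) {
    // Defer so the callback is never invoked before safeExec returns.
    process.nextTick(cb, err, null);
  }
}

// Usage: fall back to 'UNKNOWN' on any kind of failure.
safeExec('uname -a', (err, uname) => {
  console.log(err ? 'UNKNOWN' : encodeURIComponent(uname));
});
```

Passing `execFn` as a parameter is only there so the failure path can be exercised without actually exhausting memory.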

brandur-stripe · 2017-03-17T20:44:48Z

> You're right, I missed that if clause there. However, we did run into a situation in production where stripe spawned lots of these processes; it was only resolved after we commented out the exec directive. I will try to recreate a distilled version of this so it can be shared.

Oh crazy! Did you happen to notice whether they were zombies? It's possible that something isn't working as expected on our exec invocation. I checked the docs and it seems to look okay, but I guess you never know. Any luck with the repro case?

> In addition, I really think you should notify your users that you are sending uname results with requests. These can contain private IP addresses and other sensitive information that they might not want to give out.

Cool. We should optimize in general for the much more common case, which is that there really isn't any sensitive information in uname -a (at least as far as I can think of). Even if an IP was embedded in the hostname, could it ever be considered harmful to send it out? We've already got your more valuable public IP based on you connecting to our servers.

The only trouble is that there really isn't one place that users are likely to see the notice, but I'll think about it.
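For context on the hostname concern: `uname -a` output includes the node name (the machine's network hostname) alongside the kernel name, release, and architecture. As a rough sketch of one alternative, and not something the library is stated to do, similar platform details can be read from Node's built-in `os` module without a hostname and without spawning a process. `describePlatform` is a hypothetical name:

```javascript
const os = require('os');

// Hypothetical alternative to shelling out to `uname -a`:
// kernel name, release, and CPU architecture only, no hostname.
function describePlatform() {
  return [os.type(), os.release(), os.arch()].join(' ');
}

console.log(describePlatform());
```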

> Another thing: if the exec operation is expected to fail without serious consequences, it should also be wrapped in try/catch, as some of the errors of exec can be thrown rather than returned in the callback.

I was just reading the docs for exec and couldn't find any reference to thrown exceptions. Could you provide a little more detail here?

kessler · 2017-03-29T15:30:42Z

@brandur-stripe

Haven't gotten to write the distilled version of the problem yet.
Regarding the user notice, I think the README.md is a good place to do it.
About the exception: it is not documented, but here is an example of such an exception: "Repeatedly getting Error: spawn ENOMEM" (remy/nodemon#545). I think this still applies in the latest versions of node.

brandur-stripe · 2017-03-29T16:21:32Z

Thanks for continuing to look.

> About the exception: it is not documented, but here is an example of such an exception: remy/nodemon#545. I think this still applies in the latest versions of node.

Ah, interesting. I guess the only thing is that in the case of being out of memory, a thrown exception might not be that bad. Even if your program is able to tolerate it, you're almost certainly just going to be seeing more problems right away until something finally cracks. I'd certainly take a look at a pull for that, but I don't think it's a huge priority to patch.

kessler · 2017-04-14T18:39:54Z

I agree it's probably not a priority. However, I followed the code path down to uv_spawn (and stopped there for now), and I'm not sure it's the only possible exception.

The thing is that if an exception is thrown, the data won't get cached, causing the process to be spawned again and again.
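A minimal sketch of one way to avoid that respawn loop, assuming it is acceptable to cache the 'UNKNOWN' fallback for the lifetime of the process. `cachedUname`, `getUname`, and the injectable `execFn` are hypothetical names, not the library's actual code:

```javascript
const { exec } = require('child_process');

// Hypothetical module-level cache (illustrative only).
let cachedUname = null;

function getUname(cb, execFn = exec) {
  if (cachedUname !== null) {
    // Already resolved once, successfully or not: never spawn again.
    return process.nextTick(cb, cachedUname);
  }
  try {
    execFn('uname -a', (err, stdout) => {
      // Cache the fallback as well, so a failing exec is not retried.
      cachedUname = err || !stdout ? 'UNKNOWN' : encodeURIComponent(stdout);
      cb(cachedUname);
    });
  } catch (err) {
    // A synchronous throw also fills the cache before calling back.
    cachedUname = 'UNKNOWN';
    process.nextTick(cb, cachedUname);
  }
}
```

The trade-off is that a transient failure is never retried; whether that is preferable to repeated spawn attempts depends on the deployment.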

deontologician-stripe · 2017-09-25T23:19:13Z

Closing due to age

richardscarrott · 2021-07-27T17:26:25Z

Hi @brandonl-stripe, I've just been profiling our node server and noticed a lot of CPU activity in the stripe SDK, which I think is related to the fact that a new process is spawned for every request.

It only looks to be problematic when running from cold as the second run shows much less activity. Here's the code:

app.get('/stripe-test/', async (req, res, next) => {
  try {
    // Get the Stripe payment intent ids from the DB:
    const docs = await db
      .collection('orders')
      .find(
        { 'payment.gateway': 'STRIPE' },
        { projection: { 'payment.reference': 1 } }
      )
      .limit(50)
      .toArray();
    const paymentIntentIds = docs.map(doc => doc.payment.reference);

    // NOTE: There's no batch endpoint for paymentIntents so having to perform 50 requests in parallel:
    const paymentIntents = await Promise.all(
      paymentIntentIds.map(id => {
        return stripe.paymentIntents.retrieve(id, {
          expand: [
            'payment_method',
            'charges.data.balance_transaction',
          ]
        });
      })
    );
    res.json(paymentIntents);
  } catch (ex) {
    res.status(500).send(ex);
  }
});

From cold: [profiler screenshot]

2nd run: [profiler screenshot]

Additionally, I noticed it's way less noisy if I fetch one on its own and then the remaining 49, e.g.

    // ...
    // I guess this pre-caches the 'client user agent'?
    const firstPaymentIntent = await stripe.paymentIntents.retrieve(paymentIntentIds[0]);
    const paymentIntents = await Promise.all(
      paymentIntentIds.slice(1).map(id => {
        return stripe.paymentIntents.retrieve(id, {
          expand: [
            'payment_method',
            'charges.data.balance_transaction',
          ]
        });
      })
    );
    res.json([firstPaymentIntent, ...paymentIntents]);
    // ...
});

tbh, I've no idea why it's sparking up a new process in the first place but I'm guessing it didn't intend to do so for every parallel request?

I'm testing with an older version of the SDK (7.5.3) -- I wonder if this is a known issue / has it been fixed in a newer version?
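If the cache is only filled when the first exec callback fires (an assumption based on the behaviour described above, not a confirmed reading of the SDK), then every request started before that point would spawn its own process. One common way to collapse those into a single spawn is to cache the in-flight promise rather than the result. A rough sketch, with `unamePromise`, `getUnameOnce`, and the injectable `execFn` all hypothetical:

```javascript
const { exec } = require('child_process');

// Hypothetical shared promise: created by the first caller and
// reused by everyone else, including callers that arrive mid-flight.
let unamePromise = null;

function getUnameOnce(execFn = exec) {
  if (unamePromise === null) {
    unamePromise = new Promise((resolve) => {
      try {
        execFn('uname -a', (err, stdout) => {
          resolve(err || !stdout ? 'UNKNOWN' : encodeURIComponent(stdout));
        });
      } catch (err) {
        // Synchronous spawn failures resolve to the same fallback.
        resolve('UNKNOWN');
      }
    });
  }
  return unamePromise;
}
```

With this shape, 50 calls issued in the same tick share one child process, which matches the effect of the "fetch one first" workaround without needing the extra sequential request.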

richardscarrott · 2021-07-27T17:30:26Z

Scanning through the code, it looks to still be an issue in the latest version:

stripe-node/lib/stripe.js, line 409 in 40eaaab:

getClientUserAgent(cb) {

deontologician-stripe closed this as completed Sep 25, 2017

richardscarrott mentioned this issue Jul 27, 2021

Spawning process for every parallel request results in high CPU usage #1202

Closed
