
Random timeout while invoking long running lambda #2558

Closed
coxom opened this issue Feb 28, 2019 · 10 comments
Labels
closing-soon: This issue will automatically close in 4 days unless further comments are made.
guidance: Question that needs advice or information.

Comments

coxom commented Feb 28, 2019

Hi,

We are having a random timeout problem in our Amazon CodeBuild project, which runs a Node script that in turn uses aws-sdk (2.344.0) to invoke a Lambda function that takes ~6 minutes.

Our CodeBuild build will randomly time out after an hour, even though the Lambda logs say it ran successfully:

20:01:43 START RequestId: 1173c17d-9a57-4281-b2c1-db5c118d0cda Version: $LATEST
20:07:43 END RequestId: 1173c17d-9a57-4281-b2c1-db5c118d0cda
20:07:43 REPORT RequestId: 1173c17d-9a57-4281-b2c1-db5c118d0cda	Duration: 359995.24 ms	Billed Duration: 360000 ms Memory Size: 3006 MB	Max Memory Used: 834 MB
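
For context, here is a minimal sketch of the kind of call involved (the real build script is not shown in this issue; the function name and payload below are placeholders). A synchronous Lambda.invoke holds a single HTTPS connection open for the entire ~6 minute run:

```js
// Hypothetical reproduction sketch -- not the reporter's actual script.
const AWS = require('aws-sdk'); // aws-sdk v2, e.g. 2.344.0

const lambda = new AWS.Lambda({ region: 'eu-west-1' });

lambda.invoke({
  FunctionName: 'import-products',       // placeholder function name
  InvocationType: 'RequestResponse',     // synchronous: the socket stays open until the function returns
  Payload: JSON.stringify({ source: 'xdms' })
}).promise()
  .then(res => console.log('Lambda finished with status', res.StatusCode))
  .catch(err => { console.error(err); process.exit(1); });
```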

A few days ago we adjusted the httpOptions timeout to 60000 ms.
Today, for the first time, we actually got an error output:

10:01:45 Importing products from xdms
10:41:46 { 
TimeoutError: Connection timed out after 600000ms
 at ClientRequest.<anonymous> (/.../aws-sdk/lib/http/node.js:83:34)
 at Object.onceWrapper (events.js:273:13)
 at ClientRequest.emit (events.js:182:13)
 at ClientRequest.EventEmitter.emit (domain.js:442:20)
 at TLSSocket.emitRequestTimeout (_http_client.js:661:40)
 at Object.onceWrapper (events.js:273:13)
 at TLSSocket.emit (events.js:182:13)
 at TLSSocket.EventEmitter.emit (domain.js:442:20)
 at TLSSocket.Socket._onTimeout (net.js:449:8)
 at ontimeout (timers.js:436:11)
 at tryOnTimeout (timers.js:300:5)
 at listOnTimeout (timers.js:263:5)
 at Timer.processTimers (timers.js:223:10)
 message: 'Connection timed out after 600000ms',
 code: 'TimeoutError',
 time: 2019-02-28T10:41:46.158Z,
 region: 'eu-west-1',
 hostname: 'lambda.eu-west-1.amazonaws.com',
 retryable: true }

On the Lambda side, the logs show that each call completed in less than 10 minutes:

10:01:46 START RequestId: 18cab60a-7208-46b2-9297-9bbdc2766ceb Version: $LATEST
10:08:14 END RequestId: 18cab60a-7208-46b2-9297-9bbdc2766ceb
10:08:14 REPORT RequestId: 18cab60a-7208-46b2-9297-9bbdc2766ceb	Duration: 388255.27 ms	Billed Duration: 388300 ms Memory Size: 3006 MB	Max Memory Used: 829 MB
10:11:45 START RequestId: fa4fd327-500f-4503-a783-b337f579820f Version: $LATEST
10:17:53 END RequestId: fa4fd327-500f-4503-a783-b337f579820f
10:17:53  REPORT RequestId: fa4fd327-500f-4503-a783-b337f579820f	Duration: 367544.50 ms	Billed Duration: 367600 ms Memory Size: 3006 MB	Max Memory Used: 829 MB
10:21:45 START RequestId: f3213fe9-bc25-4796-bc12-dc081ff610e1 Version: $LATEST
10:27:45 END RequestId: f3213fe9-bc25-4796-bc12-dc081ff610e1
10:27:45 REPORT RequestId: f3213fe9-bc25-4796-bc12-dc081ff610e1	Duration: 360218.76 ms	Billed Duration: 360300 ms Memory Size: 3006 MB	Max Memory Used: 829 MB
10:31:46 START RequestId: 57dd339c-438d-4523-a60b-4d51bcc24f87 Version: $LATEST
10:37:50 END RequestId: 57dd339c-438d-4523-a60b-4d51bcc24f87
10:37:50 REPORT RequestId: 57dd339c-438d-4523-a60b-4d51bcc24f87	Duration: 364504.07 ms	Billed Duration: 364600 ms Memory Size: 3006 MB	Max Memory Used: 829 MB

It looks like the SDK retried 4 times and every attempt timed out?

Any tips on where we can look to debug this issue? It seems to sit between the aws-sdk and the Lambda service.
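
A hedged side note for anyone hitting the same symptom: the four START entries roughly ten minutes apart in the Lambda log above are consistent with the SDK's automatic retries re-invoking the function after each client-side timeout (the error above is marked retryable: true). Below is a minimal sketch, not taken from the reporter's script, of how retries can be switched off while investigating, using aws-sdk v2's standard maxRetries and httpOptions settings:

```js
// Hypothetical debugging configuration -- values are illustrative.
const AWS = require('aws-sdk');

const lambda = new AWS.Lambda({
  region: 'eu-west-1',
  maxRetries: 0,            // fail on the first timeout instead of silently re-invoking
  httpOptions: {
    connectTimeout: 5000,   // ms allowed to establish the TCP connection
    timeout: 900000         // ms of socket inactivity before a TimeoutError is raised
  }
});
```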

AllanZhengYP added the investigating label on Mar 4, 2019
AllanZhengYP (Contributor) commented:

Hey @coxom,

I suspect this has something to do with the keep-alive settings. The connection might be shut down by an intermediate proxy within CodeBuild or Lambda. Can you try enabling socket keep-alive to see whether that mitigates the issue?

See: https://nodejs.org/api/net.html#net_socket_setkeepalive_enable_initialdelay
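
For anyone following along, here is a minimal sketch of that suggestion as it is usually applied in aws-sdk v2 (values are illustrative): the SDK accepts a custom Node.js https.Agent through httpOptions.agent, and an agent created with keepAlive: true keeps sockets around for reuse and sends TCP keep-alive probes on idle connections.

```js
// Sketch of the keep-alive suggestion above, applied via httpOptions.agent (see also #671).
const AWS = require('aws-sdk');
const https = require('https');

const keepAliveAgent = new https.Agent({
  keepAlive: true,        // reuse sockets and send TCP keep-alive probes while idle
  keepAliveMsecs: 1000    // initial delay before the first keep-alive probe
});

const lambda = new AWS.Lambda({
  region: 'eu-west-1',
  httpOptions: {
    agent: keepAliveAgent,
    timeout: 900000       // socket inactivity limit in ms; keep it above the Lambda's runtime
  }
});
```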

AllanZhengYP added the guidance label and removed the investigating label on Mar 4, 2019

coxom commented Mar 5, 2019

@AllanFly120 thank you for the help.

I have added keepAlive through the agent property in httpOptions, as suggested in
#671. Hope it solves the issue.

Will give feedback if it gets solved.

marcus-hiles commented:

Great @coxom


coxom commented Mar 19, 2019

Hi,

Unfortunately this hasn't solved the issue. We continue to have random timeouts.

alexandrusavin commented:

@coxom Did you manage to find a solution to your problem? We have a similar problem and we're struggling to find a solution.


coxom commented Sep 16, 2019

> @coxom Did you manage to find a solution to your problem? We have a similar problem and we're struggling to find a solution.

Unfortunately, no. We just moved to an ECS container to run the long-running code.

ajredniwja (Contributor) commented:

Hey @alexandrusavin @coxom, is this still an issue with the latest version of the SDK?

ajredniwja added the response-requested label on Jan 22, 2020

coxom commented Feb 6, 2020

@ajredniwja We are using an ECS container now and no longer have any code that invokes a long-running Lambda. Was there a change in the latest version of the SDK that would help this scenario?

ajredniwja (Contributor) commented:

@coxom I am not sure whether there were changes for this particular scenario, but new logic or features may have been added; you can have a look at the changelog.

ajredniwja added the closing-soon label and removed the response-requested label on Feb 6, 2020

lock bot commented Feb 28, 2020

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

lock bot locked as resolved and limited conversation to collaborators on Feb 28, 2020