bug stuck script #25

Closed
owmo-dev opened this issue Aug 22, 2024 · 8 comments

Comments

@owmo-dev
Member

Every now and then, jobs seem to get stuck on executing scripts. Restarting the workers doesn't help. Running a "Reset" command in Temporal re-submits the jobs, and the post tasks then execute without issue. I suspect there's something wrong with how I'm scheduling the work, but I will need to investigate and try to find a reproducible scenario (difficult, as it typically only occurs when there are a lot of frames to render).

@owmo-dev
Member Author

This thread may be helpful in diagnosing the problem (it sounds similar: activities scheduled but not started):

https://community.temporal.io/t/activity-scheduled-but-not-started-need-help/4313/5

@owmo-dev
Member Author

owmo-dev commented Aug 24, 2024

I noticed the following error in the script worker:

2024-08-24T02:13:20.657120Z WARN temporal_sdk_core::worker::activities: Network error while completing activity error=Status { code: Cancelled, message: "operation was canceled", source: Some(tonic::transport::Error(Transport, hyper::Error(Canceled, "connection closed"))) }

This thread may offer useful advice to investigate:

https://community.temporal.io/t/activity-timeout-and-temporal-server-connectivity-issue/8869/2

@owmo-dev
Member Author

Seems like the heartbeat fixed that issue, but after updating I now have a new issue to contend with...

2024-08-27T17:42:04.546Z [INFO] Worker state changed { sdkComponent: 'worker', taskQueue: 'render', state: 'FAILED' }
RangeError: "length" is outside of buffer bounds
    at Buffer.proto.utf8Write (node:internal/buffer:1066:13)
    at Op.writeStringBuffer [as fn] (/Users/owmo/dev/combomash-orchestrator/node_modules/protobufjs/src/writer_buffer.js:61:13)
    at BufferWriter.finish (/Users/owmo/dev/combomash-orchestrator/node_modules/protobufjs/src/writer.js:453:14)
    at Worker.handleActivation (/Users/owmo/dev/combomash-orchestrator/node_modules/@temporalio/worker/src/worker.ts:1164:10) {
  code: 'ERR_BUFFER_OUT_OF_BOUNDS'
}

This happens when running a sequence. My best guess is that it's too much information for Temporal's memory limit...
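For reference, the heartbeat change mentioned above could look roughly like the sketch below. It is not the project's actual code: `renderFrames`, `RenderFrame`, and `Heartbeat` are hypothetical names, and the per-frame loop is an assumption about how the work is structured. The heartbeat callback is injected so the sketch runs without the SDK; in a real Temporal TypeScript activity, the `heartbeat` function exported by `@temporalio/activity` could be passed in.

```typescript
// Hypothetical names throughout: renderFrames, RenderFrame, and Heartbeat are
// illustrative and are not taken from the project discussed in this issue.

// Shape of a heartbeat callback. In a Temporal TypeScript activity, the
// `heartbeat` function exported by `@temporalio/activity` fits this shape.
type Heartbeat = (details?: unknown) => void;

// Shape of whatever does the per-frame work (assumed here to be async).
type RenderFrame = (frame: number) => Promise<void>;

// Render frames one at a time and report progress after each frame, so a
// long-running activity keeps signalling that it is still alive.
async function renderFrames(
  frames: number[],
  renderFrame: RenderFrame,
  heartbeat: Heartbeat,
): Promise<number> {
  let completed = 0;
  for (const frame of frames) {
    await renderFrame(frame);
    completed += 1;
    // Send the last finished frame and running count as heartbeat details.
    heartbeat({ lastFrame: frame, completed });
  }
  // Number of frames rendered.
  return completed;
}
```

Heartbeats are only checked by the server when the activity is scheduled with a `heartbeatTimeout` in its activity options, so that option would likely need to be set as well.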

@owmo-dev
Member Author

It looks like that is a Node bug, which is reportedly fixed in today's release, 22.8.0:

nodejs/node#54518

nodejs/node#54524

@owmo-dev
Member Author

Downgraded to node@20 and everything is working. I'll install the update tomorrow and verify that it still works.
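A small startup guard can make this kind of runtime regression easier to spot. The sketch below is illustrative only: `parseVersion` and `isSuspectNodeVersion` are hypothetical helpers, and the rule it encodes (treat every 22.x release older than 22.8.0 as suspect) is an assumption drawn from the versions mentioned in this thread, not a statement of which releases are actually affected.

```typescript
// Hypothetical helpers; not part of any library.

// Parse "22.7.0" or "v22.7.0" into [major, minor, patch].
// Returns null when the string does not start with a three-part version.
function parseVersion(version: string): [number, number, number] | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (match === null) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// ASSUMPTION: any 22.x release older than 22.8.0 is flagged. This likely
// over-approximates the set of affected releases; narrow it if the exact
// range is known.
function isSuspectNodeVersion(version: string): boolean {
  const parsed = parseVersion(version);
  if (parsed === null) return false; // unknown format: do not flag
  const [major, minor] = parsed;
  return major === 22 && minor < 8;
}
```

A worker entry point could call `isSuspectNodeVersion(process.versions.node)` before starting and log a warning or exit early when it returns `true`.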

@owmo-dev
Member Author

owmo-dev commented Aug 28, 2024

@owmo-dev
Member Author

Installed [email protected] in the package for now to ensure consistency of operation

@owmo-dev
Member Author

The latest installation is working without issue. I'm going to continue using the latest Node build on the machine for now, and will fall back to the local install for stability if this occurs again. Closing the issue.
