Runner edge cases improvements #1030

Merged
merged 21 commits into from
Jun 19, 2022

DavidGOrtega
Contributor

@DavidGOrtega DavidGOrtega commented May 29, 2022

The purpose is to remove the maintenance burden of unwanted "driver" edge cases and to refactor error handling.

  • Handles --tfResource inside runLocal, where it belongs
  • Removes the unnecessary try/catch around preparing the workdir folder, which the recursive param makes redundant
  • Removes the GH special handler that checked jobs by relying on log parsing
  • Refactors the runner timer logic so that it lives only within the setTimeout
  • Fixes error handling in shutdown: the reason is now effective for graceful termination, as opposed to an unhandled error; the latter is added for non-zero termination and/or lost communication with the underlying runner
  • Minor: reorders the params, grouping them as runner, cloud and hidden.
  • closes runner -single restarts the workflows  #1025
  • closes Oddities in runner log parsing/detecting events #1037 by refactoring parseRunnerLog.
    • GH was failing due to exceptional message parsing.
    • The toString('utf8') catch is removed, so the runner dies if an error happens during parsing
    • Handles multiple logs thrown by the runner with a basic regex rules engine
  • Adds unhandledRejection and uncaughtException termination handlers
  • closes Failed parsing log: Unexpected token { in JSON at position 153 #671
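
To illustrate the "basic regex rules engine" item above, here is a minimal sketch, not the code in this PR. The `rules` table, the `matchRule` helper and the `toEvent` field are hypothetical names. The statuses (`ready`, `job_started`, `job_ended`) mirror the `runner status` lines CML prints, and the patterns come from GitLab runner messages, except `Job succeeded`, which is assumed. Which message maps to which status is also an assumption.

```javascript
// Hypothetical rules table: each rule pairs a regex with a function that
// builds a status event. The message-to-status mapping is an assumption.
const rules = [
  {
    pattern: /Starting runner for/,
    toEvent: () => ({ status: 'ready' })
  },
  {
    pattern: /Checking for jobs\.\.\. received/,
    toEvent: () => ({ status: 'job_started' })
  },
  {
    pattern: /Job failed/,
    toEvent: () => ({ status: 'job_ended', success: false })
  },
  {
    // Assumed success message; adjust to whatever the runner really emits.
    pattern: /Job succeeded/,
    toEvent: () => ({ status: 'job_ended', success: true })
  }
];

// Returns the event for the first rule whose pattern matches the message,
// or undefined when no rule applies (the log line is simply ignored).
function matchRule(message) {
  for (const { pattern, toEvent } of rules) {
    if (pattern.test(message)) return toEvent();
  }
  return undefined;
}

console.log(matchRule('Checking for jobs... received'));
console.log(matchRule('Runtime platform'));
```

Keeping the patterns in a table means a new runner message needs one extra entry rather than another driver-specific branch.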

@DavidGOrtega DavidGOrtega marked this pull request as draft May 29, 2022 12:29
@DavidGOrtega DavidGOrtega self-assigned this May 29, 2022
@DavidGOrtega DavidGOrtega added technical-debt Refactoring, linting & tidying p0-critical Max priority (ASAP) cml-runner Subcommand labels May 29, 2022
@DavidGOrtega
Contributor Author

GL log parsing can fail because sometimes two log entries arrive together in the same chunk, hence JSON.parse fails

info: Preparing workdir /Users/davidgortega/.cml/cml-yrd0icvfjg...
info: Launching gitlab runner
warn: SpotNotifier can not be started.
{"arch":"amd64","level":"info","msg":"Runtime platform","os":"darwin","pid":75487,"revision":"febb2a09","time":"2022-06-04T13:20:38+02:00","version":"15.0.0"}

{"level":"info","msg":"Starting runner for https://gitlab.com with token f9z4UpXQ ...","time":"2022-06-04T13:20:38+02:00"}

info: runner status {"date":"2022-06-04T11:20:38.352Z","repo":"https://gitlab.com/DavidGOrtega/fashion-mnist","status":"ready"}
{"job":2546676556,"level":"info","msg":"Checking for jobs... received","repo_url":"https://gitlab.com/DavidGOrtega/fashion-mnist.git","runner":"f9z4UpXQ","time":"2022-06-04T13:21:39+02:00"}

info: runner status {"date":"2022-06-04T11:21:39.654Z","job":2546676556,"repo":"https://gitlab.com/DavidGOrtega/fashion-mnist","status":"job_started"}
{"job":2546676556,"level":"warning","msg":"Failed to pull image with policy \"always\": failed to register layer: Error processing tar file(exit status 1): write /usr/lib/go-1.18/pkg/linux_amd64/cmd/go/internal/load.a: no space left on device (manager.go:203:77s)","project":27939020,"runner":"f9z4UpXQ","time":"2022-06-04T13:22:57+02:00"}

{"duration_s":77.934181805,"job":2546676556,"level":"warning","msg":"Job failed: failed to pull image \"iterativeai/cml:0-dvc2-base1\" with specified policies [always]: failed to register layer: Error processing tar file(exit status 1): write /usr/lib/go-1.18/pkg/linux_amd64/cmd/go/internal/load.a: no space left on device (manager.go:203:77s)\n","project":27939020,"runner":"f9z4UpXQ","time":"2022-06-04T13:22:57+02:00"}

info: runner status {"date":"2022-06-04T11:22:57.591Z","job":2546676556,"repo":"https://gitlab.com/DavidGOrtega/fashion-mnist","status":"job_ended","success":false}
info: runner status {"reason":"single job","status":"terminated"}
info: Unregistering runner cml-yrd0icvfjg...
{"error":"failed to pull image \"iterativeai/cml:0-dvc2-base1\" with specified policies [always]: failed to register layer: Error processing tar file(exit status 1): write /usr/lib/go-1.18/pkg/linux_amd64/cmd/go/internal/load.a: no space left on device (manager.go:203:77s)","level":"error","msg":"Failed to process build","time":"2022-06-04T13:22:59+02:00"}
{"level":"info","msg":"This runner has processed its build limit, so now exiting","time":"2022-06-04T13:22:59+02:00"}

error: unhandledRejection: Unexpected token { in JSON at position 359
SyntaxError: Unexpected token { in JSON at position 359
    at JSON.parse (<anonymous>)
    at Gitlab.runnerParseLog (/Users/davidgortega/Documents/projects/@iterative/cml/src/drivers/gitlab.js:250:53)
    at CML.parseRunnerLog (/Users/davidgortega/Documents/projects/@iterative/cml/src/cml.js:274:27)
    at Socket.dataHandler (/Users/davidgortega/Documents/projects/@iterative/cml/bin/cml/runner.js:244:27)
    at Socket.emit (events.js:314:20)
    at addChunk (_stream_readable.js:303:12)
    at readableAddChunk (_stream_readable.js:279:9)
    at Socket.Readable.push (_stream_readable.js:218:10)
    at Pipe.onStreamRead (internal/stream_base_commons.js:188:23) {"date":"Sat Jun 04 2022 13:22:59 GMT+0200 (Central European Summer Time)","error":{},"exception":true,"os":{"loadavg":[3.71484375,3.4228515625,3.07763671875],"uptime":1200249},"process":{"argv":["/usr/local/Cellar/node/14.8.0/bin/node","/Users/davidgortega/Documents/projects/@iterative/cml/bin/cml.js","runner","--labels","cml","--single","--idle-timeout","500","--repo","https://gitlab.com/DavidGOrtega/fashion-mnist","--token","kZeyQWegQyAHRcVKB2zo"],"cwd":"/Users/davidgortega/Documents/projects/@iterative/cml","execPath":"/usr/local/Cellar/node/14.8.0/bin/node","gid":20,"memoryUsage":{"arrayBuffers":1550730,"external":19824409,"heapTotal":30584832,"heapUsed":25139296,"rss":66179072},"pid":75380,"uid":501,"version":"v14.8.0"},"stack":"SyntaxError: Unexpected token { in JSON at position 359\n    at JSON.parse (<anonymous>)\n    at Gitlab.runnerParseLog (/Users/davidgortega/Documents/projects/@iterative/cml/src/drivers/gitlab.js:250:53)\n    at CML.parseRunnerLog (/Users/davidgortega/Documents/projects/@iterative/cml/src/cml.js:274:27)\n    at Socket.dataHandler (/Users/davidgortega/Documents/projects/@iterative/cml/bin/cml/runner.js:244:27)\n    at Socket.emit (events.js:314:20)\n    at addChunk (_stream_readable.js:303:12)\n    at readableAddChunk (_stream_readable.js:279:9)\n    at Socket.Readable.push (_stream_readable.js:218:10)\n    at Pipe.onStreamRead 
(internal/stream_base_commons.js:188:23)","trace":[{"column":null,"file":null,"function":"JSON.parse","line":null,"method":"parse","native":false},{"column":53,"file":"/Users/davidgortega/Documents/projects/@iterative/cml/src/drivers/gitlab.js","function":"Gitlab.runnerParseLog","line":250,"method":"runnerParseLog","native":false},{"column":27,"file":"/Users/davidgortega/Documents/projects/@iterative/cml/src/cml.js","function":"CML.parseRunnerLog","line":274,"method":"parseRunnerLog","native":false},{"column":27,"file":"/Users/davidgortega/Documents/projects/@iterative/cml/bin/cml/runner.js","function":"Socket.dataHandler","line":244,"method":"dataHandler","native":false},{"column":20,"file":"events.js","function":"Socket.emit","line":314,"method":"emit","native":false},{"column":12,"file":"_stream_readable.js","function":"addChunk","line":303,"method":null,"native":false},{"column":9,"file":"_stream_readable.js","function":"readableAddChunk","line":279,"method":null,"native":false},{"column":10,"file":"_stream_readable.js","function":"Socket.Readable.push","line":218,"method":"push","native":false},{"column":23,"file":"internal/stream_base_commons.js","function":"Pipe.onStreamRead","line":188,"method":"onStreamRead","native":false}]}
info:   Success
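
The trace above fails at "position 359" because the chunk handed to JSON.parse held two JSON objects (the "Failed to process build" entry followed by the "build limit" entry). A minimal sketch of one way to tolerate that, assuming entries are newline-delimited as in the log above; `parseLogChunk` is a hypothetical helper and not the function changed in this PR:

```javascript
// Hypothetical helper: split a raw stdout chunk into lines and parse each
// line on its own, so two JSON objects in one chunk do not break JSON.parse.
function parseLogChunk(chunk) {
  const entries = [];
  for (const line of chunk.toString('utf8').split('\n')) {
    const text = line.trim();
    if (!text) continue; // skip blank lines between log entries
    try {
      entries.push(JSON.parse(text));
    } catch (err) {
      // Not JSON (or a partial line): keep the raw text instead of throwing.
      entries.push({ raw: text });
    }
  }
  return entries;
}

// Two entries arriving together, shortened from the log above.
const chunk =
  '{"level":"error","msg":"Failed to process build"}\n' +
  '{"level":"info","msg":"This runner has processed its build limit, so now exiting"}\n';

console.log(parseLogChunk(chunk).map((entry) => entry.msg));
```

This sketch does not handle an entry that is split across two chunks; that case would need buffering of the trailing partial line between `data` events.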

@DavidGOrtega
Contributor Author

This is very hard to reproduce.

@DavidGOrtega DavidGOrtega marked this pull request as ready for review June 5, 2022 18:02
@DavidGOrtega DavidGOrtega requested a review from a team June 5, 2022 18:13
@dacbd
Contributor

dacbd commented Jun 6, 2022

Seems to be fine with the basic Actions tests I've been able to compose so far:
image

Member

@0x2b3bfa0 0x2b3bfa0 left a comment

Nitpicks 💅🏼

Co-authored-by: Daniel Barnes <[email protected]>
Contributor

@dacbd dacbd left a comment

I am satisfied.

@DavidGOrtega
Contributor Author

🤞 any luck here @iterative/cml

@DavidGOrtega DavidGOrtega mentioned this pull request Jun 19, 2022
2 tasks
Member

@0x2b3bfa0 0x2b3bfa0 left a comment

Expedited static review: looks mergeable to me

@0x2b3bfa0 0x2b3bfa0 merged commit 987ada4 into master Jun 19, 2022
@0x2b3bfa0 0x2b3bfa0 deleted the runner-no-special-cases branch June 19, 2022 22:51
Labels
ci-github cml-runner Subcommand p0-critical Max priority (ASAP) technical-debt Refactoring, linting & tidying
4 participants