ScramblePart1 139 error code #652

Open
MattWellie opened this issue Mar 6, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@MattWellie
Contributor

Bug Report

Affected module(s) or script(s)

wdl/GatherSampleEvidence

Affected version(s)

This codebase as of this commit

Description

I've seen a couple of failing jobs recently where ScramblePart1 terminates with exit code 139. As far as I can tell this corresponds to a segmentation fault (128 + 11, i.e. SIGSEGV), where the OS/Hypervisor kills the process when the tool tries to access memory it has no permission to use. This is always coupled with a Wham failure. Is the 139 error likely to be meaningful, or is it potentially a kill signal sent because a separate part of the workflow had failed and all jobs needed to be stopped? N.B. these samples had been running for an entire week on this variant calling stage, which typically takes only a few hours, though the trace below is from a re-run that picked up the prior run's results, e.g.

Jobs:
  [#] LocalizeReads (26s)
    Call caching: true
  [!] Whamg (5h:19m:10s)
    stdout: None
    stderr: None
    rc: None
    error: Workflow failed, caused by: Task Whamg.RunWhamgOnCram:NA:2 failed. Job exit code 137. (...)
  [#] CollectCounts (18s)
    Call caching: true
  [#] Manta (33s)
  [#] CollectSVEvidence (26s)
  [!] Scramble (1h:0m:57s)
    stdout: None
    stderr: None
    rc: None
    error: Workflow failed, caused by: Job Scramble.ScramblePart1:NA:2 exited with return code 139 which has not been declared as a valid return code
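
For reference, here is a minimal sketch of how the two return codes in the trace decode. It assumes the usual shell convention that a code above 128 means 128 plus the number of the signal that terminated the process, and that the jobs ran on Linux. `describe_exit_code` is a hypothetical helper written for illustration and is not part of this codebase or the workflow engine.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a job exit code using the shell's 128 + N signal convention.

    Hypothetical helper for illustration only.
    """
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited normally with status {code}"


# The two codes reported in the trace above.
for rc in (137, 139):
    print(rc, describe_exit_code(rc))
```

Under that convention, 137 on the Whamg task is SIGKILL and 139 on ScramblePart1 is SIGSEGV. A SIGKILL is often what a container receives when it exceeds its memory limit, although the trace alone does not confirm that here. A SIGSEGV originates inside the process itself, so it would not normally be the signal used to stop sibling jobs after another task fails.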
@mwalker174
Collaborator

Hi @MattWellie, that is unusual and not something we've run into before. Is it possible this is an issue with the CRAM? I would maybe try recompressing with the latest samtools and see if that resolves the issue.
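
One possible way to do the recompression is sketched below. It assumes `samtools` is on the `PATH` and that the reference FASTA used to encode the CRAM is available locally. The file names and the `recompress_cram_command` helper are placeholders made up for this example. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def recompress_cram_command(in_cram: str, out_cram: str, reference_fasta: str) -> list[str]:
    """Build a samtools command that rewrites a CRAM as a fresh CRAM.

    Hypothetical helper; all paths are supplied by the caller.
    """
    return [
        "samtools", "view",
        "-C",                   # write CRAM output
        "-T", reference_fasta,  # reference FASTA used for CRAM encoding
        "-o", out_cram,         # output file
        in_cram,
    ]


# Placeholder file names, not taken from the failing run.
cmd = recompress_cram_command("sample.cram", "sample.recompressed.cram", "reference.fasta")
print(shlex.join(cmd))
```

The printed command can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. The rewritten CRAM needs indexing (for example with `samtools index`) before it is fed back into the workflow.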

@mwalker174 mwalker174 added the bug Something isn't working label Jun 12, 2024