[warning] Only 1920342 fragments were mapped, but the number of burn-in fragments was set to 5000000. #677

Closed
aedavids opened this issue Jun 22, 2021 · 2 comments

Comments

@aedavids

Our lab makes heavy use of Salmon. It's a great tool, and we use it almost daily.

Is the bug primarily related to salmon (bulk mode) or alevin (single-cell mode)?
Salmon

Describe the bug
While digging through the log files to figure out why some of our biological samples have low mapping rates, I discovered a warning.

[2021-06-22 12:39:41.282] [jointLog] [warning] Only 1920342 fragments were mapped, but the number of burn-in fragments was set to 5000000.
The effective lengths have been computed using the observed mappings.

[2021-06-22 12:39:41.282] [jointLog] [info] Mapping rate = 55.5444%

About half of our samples have mapping rates over 90%.

Any idea what this warning means?
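To see which samples carry this warning and how it lines up with the mapping rate, the per-sample logs can be scanned in one pass. The following is a minimal sketch, not part of the original post: the function name `summarizeSalmonLogs` is made up for illustration, and it assumes each sample was written to `<root>/<sample>/` with salmon's log at `logs/salmon_quant.log` inside it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one tab-separated line per sample with
#   <sample>  <mapping rate in percent>  <burn-in warning seen: yes/no>
# Assumed layout: <root>/<sample>/logs/salmon_quant.log
summarizeSalmonLogs() {
    local root="$1"
    local log sample rate burnin
    for log in "$root"/*/logs/salmon_quant.log; do
        # Skip the unexpanded glob pattern when no log files exist.
        [ -f "$log" ] || continue
        # The sample name is the directory two levels above the log file.
        sample="$(basename "$(dirname "$(dirname "$log")")")"
        # Take the last "Mapping rate = NN.NN%" entry, without the % sign.
        rate="$(sed -n 's/.*Mapping rate = \([0-9.]*\)%.*/\1/p' "$log" | tail -n 1)"
        if grep -q 'number of burn-in fragments was set to' "$log"; then
            burnin="yes"
        else
            burnin="no"
        fi
        printf '%s\t%s\t%s\n' "$sample" "${rate:-NA}" "$burnin"
    done
}
```

For example, `summarizeSalmonLogs /path/to/results | sort -k2,2n` would list the samples from lowest to highest mapping rate (the path is a placeholder).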

To Reproduce
Steps and data to reproduce the behavior:

salmon 1.4.0
Linux mustard 3.10.0-862.6.3.el7.x86_64 #1 SMP Tue Jun 26 16:32:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)

I think that when I installed salmon I could not install the 1.5.x version; I forget why.

function runSalmon() {
    # runs salmon on one sample and outputs to that directory
    salmonIndexDir="$1"
    rightReads="$2"
    leftReads="$3"
    outputDir="$4"

    # set -x # turn debug on
    # set +x # turn debug off

    if [[ ! -f "$outputDir"/quant.sf ]]; then

        mkdir -p "$outputDir"

        # printf "##############\n"
        # printf "warning --minAssignedFrags is set to $minNumFrags to enable test data set\n"
        #         minNumFrags=1
        #            --minAssignedFrags=$minNumFrags \
        # printf "##############\n"

        #if [[ -f "$inputDir"/output_single_end.fq.gz ]]; then

            numThr=12
            # NOTE: -1 takes the mate-1 file and -2 the mate-2 file;
            # check that the caller passes the read files in that order.
            salmon quant \
                   -i "$salmonIndexDir" \
                   --libType A \
                   -1 "${rightReads}" \
                   -2 "${leftReads}" \
                   -p "$numThr" \
                   --recoverOrphans \
                   --validateMappings \
                   --gcBias \
                   --seqBias \
                   --rangeFactorizationBins 4 \
                   --writeUnmappedNames \
                   --output "${outputDir}"

            salmonRet=$?
            if [ "$salmonRet" -ne 0 ]; then
                echo "ERROR salmon $rightReads returned exit status $salmonRet" >&2
                # hand the failure back to the caller, which decides whether to skip the sample
                return "$salmonRet"
            fi

        #fi
    else
        echo "[INFO] skipping ${outputDir}/quant.sf it already exists"
    fi
}
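A function like this is usually called once per sample from a loop, and its exit status lets that loop decide whether to skip a failed sample or stop. The sketch below only illustrates the calling pattern: `runSalmonStub` and `processSamples` are hypothetical names, and the stub stands in for the real function so the example runs without salmon installed.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real per-sample function:
# it "fails" for any sample whose name starts with "bad".
runSalmonStub() {
    case "$1" in
        bad*) return 1 ;;
        *)    return 0 ;;
    esac
}

# Run every sample, skip the ones that fail, and report overall status.
processSamples() {
    local sample anyFailed=0
    for sample in "$@"; do
        if ! runSalmonStub "$sample"; then
            echo "ERROR: $sample failed, moving on to the next sample" >&2
            anyFailed=1
            continue   # skipping happens here, inside the loop itself
        fi
        echo "done: $sample"
    done
    return "$anyFailed"
}
```

Calling `processSamples sampleA bad1 sampleB` would process the two good samples, log one error to stderr, and return a non-zero status so a wrapper script can tell that something was skipped.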

Specifically, please provide at least the following information:

  • Which version of salmon was used?
    salmon 1.4.0

  • How was salmon installed (compiled, downloaded executable, through bioconda)?
    compiled locally salmon-1.4.0_linux_x86_64.tar.gz

  • Which reference (e.g. transcriptome) was used?
    a custom human reference with additional annotations

  • Which read files were used?
    paired reads

  • Which program options were used?
    see above

Expected behavior
I think this is a potential documentation issue?

Desktop (please complete the following information):

  • OS: CentOS Linux release 7.5.1804 (Core)
  • Version:
    $ lsb_release -a
    bash: lsb_release: command not found...
    (base) [aedavids@mustard bin]$ uname -a
    Linux mustard 3.10.0-862.6.3.el7.x86_64 #1 SMP Tue Jun 26 16:32:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


@rob-p
Collaborator

rob-p commented Jun 22, 2021

Hi @aedavids,

Thanks for the kind words about salmon!

Apart from the low mapping rate itself, the warning is nothing to be overly concerned about. It means that, during its online phase, salmon estimates certain auxiliary model probabilities from the first X aligned fragments in the dataset (X = 5M by default). If fewer than 5M aligned fragments are observed, salmon simply estimates those probabilities from the alignments it did see, and issues this warning to let you know. This shouldn't be a problem in itself, though you may want to check whether anything about the sample might explain a mapping rate considerably lower than the others.
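The two numbers in the log also give a rough idea of how large the library is. The sketch below assumes the reported mapping rate is simply mapped fragments divided by all processed fragments; that definition, the helper name `estimateTotalFragments`, and the variable names are assumptions made for illustration. Only the figures come from the log lines quoted in the issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the number of processed fragments from
# the mapped count ($1) and the mapping rate in percent ($2), assuming
# rate = 100 * mapped / processed.
estimateTotalFragments() {
    awk -v mapped="$1" -v rate="$2" \
        'BEGIN { printf "%.0f\n", mapped / (rate / 100) }'
}

mapped=1920342     # from the warning line
rate=55.5444       # from the "Mapping rate" line
burnin=5000000     # burn-in size named in the warning

total="$(estimateTotalFragments "$mapped" "$rate")"
echo "estimated processed fragments: $total"

if [ "$total" -lt "$burnin" ]; then
    echo "fewer fragments than the burn-in size, so the warning would appear at any mapping rate"
else
    echo "enough fragments in total; the warning is driven by the mapping rate"
fi
```

Under that assumption this sample has roughly 3.46 million fragments in total, which is below the 5M burn-in. In that case the warning reflects the library size as much as the 55% mapping rate, and the mapping rate is the part worth investigating.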

Best,
Rob

@aedavids
Author

aedavids commented Jun 23, 2021 via email

rob-p closed this as completed Jun 30, 2021