Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: subscript contains invalid names #34

Open
suyanxun opened this issue Sep 22, 2019 · 9 comments
Open

Error: subscript contains invalid names #34

suyanxun opened this issue Sep 22, 2019 · 9 comments

Comments

@suyanxun
Copy link

suyanxun commented Sep 22, 2019

image

What's wrong when I run the calcGexpCnvBoundaries using example data like https://jef.works/HoneyBADGER/Getting_Started.html?

@JEFworks
Copy link
Collaborator

Hi,

Please check that the biomart object you are using is indeed properly loaded. You can double check this by accessing hb$genes in the tutorial.

Please check that the gene names consistent with the gene names used in your expression matrix.

Best,
Jean

@kylandra
Copy link

Dear Jean,
I had the same problem running the tutorial, as suyanxun reported.

> hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs
ERROR: Error: subscript contains invalid names
NULL

As you suggested, I checked the hb$genes slot and the gene names in the matrix of the example and everything seems fine.

> print(gexp[1:5,1:5])
      MGH31_A02  MGH31_A03 MGH31_A04  MGH31_A05  MGH31_A06
AAAS   3.897660 -4.5248935 -4.183228  2.7093435  3.5550152
AADAT  2.166325  1.5769039  3.300534  2.1134482  0.8006273
AAGAB -4.193448  0.4587702 -4.232739 -2.6151893 -2.4645792
AAK1   2.360924  2.6094029  2.830466  2.1001597  2.1811283
AAMP   3.484948  2.2440590  3.007798  0.4639664  1.8083660

> head(hb$genes)
GRanges object with 6 ranges and 0 metadata columns:
        seqnames              ranges strand
           <Rle>           <IRanges>  <Rle>
   AAAS    chr12   53307456-53324864      *
  AADAT     chr4 170060222-170091699      *
  AAGAB    chr15   67200667-67255195      *
   AAK1     chr2   69457997-69674349      *
   AAMP     chr2 218264123-218270257      *
   AAR2    chr20   36236459-36270918      *
  -------
  seqinfo: 69 sequences from an unspecified genome; no seqlengths

how can we solve this issue in the tutorial? I was just checking each step of the pipeline before perform the analysis on my own datatset.

Thanks

@JEFworks
Copy link
Collaborator

JEFworks commented Jan 10, 2020 via email

@kylandra
Copy link

Hi Jean,

the version of biomart I'm using is biomaRt_2.42.0, if this can help you.

The first steps worked with no problem:

> hb <- new('HoneyBADGER', name='MGH31')
> hb$setGexpMats(gexp, ref, mart.obj, filter=FALSE, scale=FALSE, verbose=TRUE)
Initializing expression matrices ... 
Normalizing gene expression for 6082 genes and 75 cells ... 
Cache found
Done setting initial expression matrices! 
> hb$plotGexpProfile()
> hb$setMvFit(verbose=TRUE)
Modeling expected variance ... Done!
> hb$setGexpDev(verbose=TRUE)
Optimal deviance: 1.015797

Then I get the error:

hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs
ERROR: Error: subscript contains invalid names
NULL

But if I double check what CNVs were identified, I obtain this result:

> bgf <- hb$bound.genes.final
> genes <- hb$genes
> regions.genes <- range(genes[unlist(bgf)])
> print(regions.genes)
GRanges object with 5 ranges and 0 metadata columns:
      seqnames              ranges strand
         <Rle>           <IRanges>  <Rle>
  [1]    chr11      167784-2397419      *
  [2]     chr7    816615-158956747      *
  [3]     chr9 137241287-138124624      *
  [4]    chr10    274190-133373354      *
  [5]     chr5  36876769-179634784      *
  -------
  seqinfo: 69 sequences from an unspecified genome; no seqlengths

So is it working?

I continued with the tutorial, and I got this error:

> hb$retestIdentifiedCnvs(retestBoundGenes = TRUE, retestBoundSnps = FALSE, verbose=FALSE)
> results <- hb$summarizeResults(geneBased=TRUE, alleleBased=FALSE)
> print(head(results[,1:5]))
> trees <- hb$visualizeResults(geneBased=TRUE, alleleBased=FALSE, details=TRUE, margins=c(25,15))
> hc <- trees$hc
> order <- hc$labels[hc$order]
> hb$plotGexpProfile(cellOrder=order)
Error in d[, cellOrder] : index out of bounds

and checking order vector I obtain:

> head(order)
[1] "del.gexp.MGH31_E09" "del.gexp.MGH31_A08" "del.gexp.MGH31_H07" "del.gexp.MGH31_G04"
[5] "del.gexp.MGH31_F10" "del.gexp.MGH31_F01"

what is going on?

thanks
Oriana

@aureliebousard
Copy link

Dear Jean,

I have exactly the same problem. Did you finally solve this issue?

Thanks

Aurélie

@garci1294
Copy link

garci1294 commented May 14, 2020

@kylandra @aureliebousard @JEFworks

Hi everyone!

In regards to the index out of bounds error, I tried the following:
Error in d[, cellOrder] : index out of bounds

I'm thinking the cellOrder is looking for integers, not strings. So I removed hc$labels.
order <- hc$labels[hc$order] changed to order = hc$order

The following code gave me the correct graph similar to the getting started tutorial.

> hb$setMvFit(verbose=TRUE)
> hb$setGexpDev(verbose=TRUE)
> hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
> order <- hc$order
> hb$plotGexpProfile(cellOrder=hc$order)

I hope this helps. This allowed me to finish the getting started tutorial.

Jesus

@tinyi
Copy link

tinyi commented May 31, 2020

I have the same issue. If continue, it halts at :

results <- hb$summarizeResults(geneBased=TRUE, alleleBased=FALSE)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 5, 4

@newihi001
Copy link

@tinyi
I had the same error, I found that the error of differing number of rows: 5, 4 is caused by mismatching row number of rgs and amp.gexp.prob. May need to get the same rows to bind them.
Good luck!

Qiu

@rpiskol
Copy link

rpiskol commented Jul 15, 2020

@suyanxun @aureliebousard @kylandra - you may have to specify a different host for the biomart.
The following is taken from the setGexpMats help and works fine.

mart.obj <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = 'hsapiens_gene_ensembl', 
host = "jul2015.archive.ensembl.org")

R

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants