Error: subscript contains invalid names #34

suyanxun · 2019-09-22T11:26:41Z

What's wrong when I run the calcGexpCnvBoundaries using example data like https://jef.works/HoneyBADGER/Getting_Started.html?

JEFworks · 2019-09-25T01:03:55Z

Hi,

Please check that the biomart object you are using is indeed properly loaded. You can double check this by accessing hb$genes in the tutorial.

Please check that the gene names consistent with the gene names used in your expression matrix.

Best,
Jean

kylandra · 2020-01-10T13:21:28Z

Dear Jean,
I had the same problem running the tutorial, as suyanxun reported.

> hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs
ERROR: Error: subscript contains invalid names
NULL

As you suggested, I checked the hb$genes slot and the gene names in the matrix of the example and everything seems fine.

> print(gexp[1:5,1:5])
      MGH31_A02  MGH31_A03 MGH31_A04  MGH31_A05  MGH31_A06
AAAS   3.897660 -4.5248935 -4.183228  2.7093435  3.5550152
AADAT  2.166325  1.5769039  3.300534  2.1134482  0.8006273
AAGAB -4.193448  0.4587702 -4.232739 -2.6151893 -2.4645792
AAK1   2.360924  2.6094029  2.830466  2.1001597  2.1811283
AAMP   3.484948  2.2440590  3.007798  0.4639664  1.8083660

> head(hb$genes)
GRanges object with 6 ranges and 0 metadata columns:
        seqnames              ranges strand
           <Rle>           <IRanges>  <Rle>
   AAAS    chr12   53307456-53324864      *
  AADAT     chr4 170060222-170091699      *
  AAGAB    chr15   67200667-67255195      *
   AAK1     chr2   69457997-69674349      *
   AAMP     chr2 218264123-218270257      *
   AAR2    chr20   36236459-36270918      *
  -------
  seqinfo: 69 sequences from an unspecified genome; no seqlengths

how can we solve this issue in the tutorial? I was just checking each step of the pipeline before perform the analysis on my own datatset.

Thanks

JEFworks · 2020-01-10T14:46:27Z

Hi, Thanks for the bug report. This may have to do with biomart or a related dependency being updated. To help me with the debugging, could you please let me know if the previous steps in the tutorial: ``` hb <- new('HoneyBADGER', name='MGH31') hb$setGexpMats(gexp, ref, mart.obj, filter=FALSE, scale=FALSE, verbose=TRUE) hb$plotGexpProfile() ## initial visualization hb$setMvFit(verbose=TRUE) ## model variance hb$setGexpDev(verbose=TRUE) ## model necessary expression deviation to identify CNVs ``` All worked properly? (e.g. without error or warning) Also, after running `calcGexpCnvBoundaries`, is there any output provided by: ``` ## double check what CNVs were identified bgf <- hb$bound.genes.final genes <- hb$genes regions.genes <- range(genes[unlist(bgf)]) print(regions.genes) ``` Or is the output null? Thanks for the help, Jean On Jan 10, 2020, at 8:21 AM, kylandra <[email protected]<mailto:[email protected]>> wrote: Dear Jean, I had the same problem running the tutorial, as suyanxun reported.

hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)

Loading required package: rjags Loading required package: coda Linked to JAGS 4.3.0 Loaded modules: basemod,bugs ERROR: Error: subscript contains invalid names NULL As you suggested, I checked the hb$genes slot and the gene names in the matrix of the example and everything seems fine.

print(gexp[1:5,1:5])

MGH31_A02 MGH31_A03 MGH31_A04 MGH31_A05 MGH31_A06 AAAS 3.897660 -4.5248935 -4.183228 2.7093435 3.5550152 AADAT 2.166325 1.5769039 3.300534 2.1134482 0.8006273 AAGAB -4.193448 0.4587702 -4.232739 -2.6151893 -2.4645792 AAK1 2.360924 2.6094029 2.830466 2.1001597 2.1811283 AAMP 3.484948 2.2440590 3.007798 0.4639664 1.8083660

head(hb$genes)

GRanges object with 6 ranges and 0 metadata columns: seqnames ranges strand <Rle> <IRanges> <Rle> AAAS chr12 53307456-53324864 * AADAT chr4 170060222-170091699 * AAGAB chr15 67200667-67255195 * AAK1 chr2 69457997-69674349 * AAMP chr2 218264123-218270257 * AAR2 chr20 36236459-36270918 * ------- seqinfo: 69 sequences from an unspecified genome; no seqlengths how can we solve this issue in the tutorial? I was just checking each step of the pipeline before perform the analysis on my own datatset. Thanks — You are receiving this because you commented. Reply to this email directly, view it on GitHub<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JEFworks_HoneyBADGER_issues_34-3Femail-5Fsource-3Dnotifications-26email-5Ftoken-3DACEPMSBJ3K4GC4SASV3U3ZTQ5BY5RA5CNFSM4IZCG34KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIT4R6Q-23issuecomment-2D573032698&d=DwMCaQ&c=WO-RGvefibhHBZq3fL85hQ&r=2gb0vmLv11Vi98WTAqlCXyDkhi11d9lKeGWDXEU-qNw&m=sacPl6ehGsUf0W5tx7MFmwWS2ZNsqqYsq0zGdb6UUzk&s=SMKV3AkrFdkOYuy4gsVmOSPCQLQbB81Gjhwj0prZ7yU&e=>, or unsubscribe<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ACEPMSB7VFCPCQSTSHFMKG3Q5BY5RANCNFSM4IZCG34A&d=DwMCaQ&c=WO-RGvefibhHBZq3fL85hQ&r=2gb0vmLv11Vi98WTAqlCXyDkhi11d9lKeGWDXEU-qNw&m=sacPl6ehGsUf0W5tx7MFmwWS2ZNsqqYsq0zGdb6UUzk&s=bU5ielDim2m8jcN1zBjn8YTmpQ_2bXB4zBiC8GwE7xU&e=>.

kylandra · 2020-01-10T16:04:59Z

Hi Jean,

the version of biomart I'm using is biomaRt_2.42.0, if this can help you.

The first steps worked with no problem:

> hb <- new('HoneyBADGER', name='MGH31')
> hb$setGexpMats(gexp, ref, mart.obj, filter=FALSE, scale=FALSE, verbose=TRUE)
Initializing expression matrices ... 
Normalizing gene expression for 6082 genes and 75 cells ... 
Cache found
Done setting initial expression matrices! 
> hb$plotGexpProfile()
> hb$setMvFit(verbose=TRUE)
Modeling expected variance ... Done!
> hb$setGexpDev(verbose=TRUE)
Optimal deviance: 1.015797

Then I get the error:

hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs
ERROR: Error: subscript contains invalid names
NULL

But if I double check what CNVs were identified, I obtain this result:

> bgf <- hb$bound.genes.final
> genes <- hb$genes
> regions.genes <- range(genes[unlist(bgf)])
> print(regions.genes)
GRanges object with 5 ranges and 0 metadata columns:
      seqnames              ranges strand
         <Rle>           <IRanges>  <Rle>
  [1]    chr11      167784-2397419      *
  [2]     chr7    816615-158956747      *
  [3]     chr9 137241287-138124624      *
  [4]    chr10    274190-133373354      *
  [5]     chr5  36876769-179634784      *
  -------
  seqinfo: 69 sequences from an unspecified genome; no seqlengths

So is it working?

I continued with the tutorial, and I got this error:

> hb$retestIdentifiedCnvs(retestBoundGenes = TRUE, retestBoundSnps = FALSE, verbose=FALSE)
> results <- hb$summarizeResults(geneBased=TRUE, alleleBased=FALSE)
> print(head(results[,1:5]))
> trees <- hb$visualizeResults(geneBased=TRUE, alleleBased=FALSE, details=TRUE, margins=c(25,15))
> hc <- trees$hc
> order <- hc$labels[hc$order]
> hb$plotGexpProfile(cellOrder=order)
Error in d[, cellOrder] : index out of bounds

and checking order vector I obtain:

> head(order)
[1] "del.gexp.MGH31_E09" "del.gexp.MGH31_A08" "del.gexp.MGH31_H07" "del.gexp.MGH31_G04"
[5] "del.gexp.MGH31_F10" "del.gexp.MGH31_F01"

what is going on?

thanks
Oriana

aureliebousard · 2020-04-22T12:46:27Z

Dear Jean,

I have exactly the same problem. Did you finally solve this issue?

Thanks

Aurélie

garci1294 · 2020-05-14T19:22:03Z

@kylandra @aureliebousard @JEFworks

Hi everyone!

In regards to the index out of bounds error, I tried the following:
Error in d[, cellOrder] : index out of bounds

I'm thinking the cellOrder is looking for integers, not strings. So I removed hc$labels.
order <- hc$labels[hc$order] changed to order = hc$order

The following code gave me the correct graph similar to the getting started tutorial.

> hb$setMvFit(verbose=TRUE)
> hb$setGexpDev(verbose=TRUE)
> hb$calcGexpCnvBoundaries(init=TRUE, verbose=FALSE)
> order <- hc$order
> hb$plotGexpProfile(cellOrder=hc$order)

I hope this helps. This allowed me to finish the getting started tutorial.

Jesus

tinyi · 2020-05-31T22:05:09Z

I have the same issue. If continue, it halts at :

results <- hb$summarizeResults(geneBased=TRUE, alleleBased=FALSE)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 5, 4

newihi001 · 2020-06-10T16:55:00Z

@tinyi
I had the same error, I found that the error of differing number of rows: 5, 4 is caused by mismatching row number of rgs and amp.gexp.prob. May need to get the same rows to bind them.
Good luck!

Qiu

rpiskol · 2020-07-15T22:55:21Z

@suyanxun @aureliebousard @kylandra - you may have to specify a different host for the biomart.
The following is taken from the setGexpMats help and works fine.

mart.obj <- useMart(biomart = "ENSEMBL_MART_ENSEMBL", dataset = 'hsapiens_gene_ensembl', 
host = "jul2015.archive.ensembl.org")

R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error: subscript contains invalid names #34

Error: subscript contains invalid names #34

suyanxun commented Sep 22, 2019 •

edited

Loading

JEFworks commented Sep 25, 2019

kylandra commented Jan 10, 2020

JEFworks commented Jan 10, 2020 via email

kylandra commented Jan 10, 2020

aureliebousard commented Apr 22, 2020

garci1294 commented May 14, 2020 •

edited

Loading

tinyi commented May 31, 2020

newihi001 commented Jun 10, 2020

rpiskol commented Jul 15, 2020

Error: subscript contains invalid names #34

Error: subscript contains invalid names #34

Comments

suyanxun commented Sep 22, 2019 • edited Loading

JEFworks commented Sep 25, 2019

kylandra commented Jan 10, 2020

JEFworks commented Jan 10, 2020 via email

kylandra commented Jan 10, 2020

aureliebousard commented Apr 22, 2020

garci1294 commented May 14, 2020 • edited Loading

tinyi commented May 31, 2020

newihi001 commented Jun 10, 2020

rpiskol commented Jul 15, 2020

suyanxun commented Sep 22, 2019 •

edited

Loading

garci1294 commented May 14, 2020 •

edited

Loading