merge_mafs error #722

Closed
guillaume-rs opened this issue Jun 1, 2021 · 2 comments

Comments

@guillaume-rs

Hi,

I'm trying to run merge_mafs on TCGA MAF files, but I get the error below.

Is there a mismatch in the columns between the MAF files, or is there a parameter I can use to ignore some columns?

Thank you

list_path_LAML_MAF = grep(list.files(path = "/path/gdac.broadinstitute.org_LAML.Mutation_Packager_Oncotated_Calls.Level_3.2016012800.0.0", full.names = TRUE), pattern='*MANIFEST*', invert=TRUE, value=TRUE)

test = merge_mafs(list_path_LAML_MAF, verbose = TRUE)
Merging 197 MAF files
Error in data.table::rbindlist(l = maf, fill = TRUE, idcol = "Source_MAF",  :
  Class attribute on column 168 of item 89 does not match with column 168 of item 1.
sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /path/.conda/envs/owkin/lib/libopenblasp-r0.3.15.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] maftools_2.8.0

loaded via a namespace (and not attached):
[1] compiler_4.1.0     Matrix_1.3-3       RColorBrewer_1.1-2 survival_3.2-11
[5] splines_4.1.0      grid_4.1.0         data.table_1.14.0  lattice_0.20-44
@PoisonAlien
Owner

Hi,

This looks like a known data.table issue: rbindlist fails when the same column has a different class attribute in different tables, as the error reports for column 168 of item 89.
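
To see which columns are responsible, the column classes can be compared across tables before merging. The following is a minimal sketch in base R, not part of maftools or data.table: `find_class_mismatches`, the toy tables, and the column `Some_Annotation` are hypothetical names made up for illustration.

```r
# Hypothetical helper (not a maftools or data.table function): list the
# columns whose class differs between the tables in `tables`.
find_class_mismatches <- function(tables) {
  # One named character vector per table: column name -> class label
  classes <- lapply(tables, function(tbl) {
    vapply(tbl, function(col) paste(class(col), collapse = "/"), character(1))
  })
  all_cols <- unique(unlist(lapply(classes, names)))
  mismatches <- list()
  for (col in all_cols) {
    # Collect the class label from every table that contains this column
    seen <- unique(unlist(lapply(classes, function(cl) {
      if (col %in% names(cl)) cl[[col]] else NULL
    })))
    if (length(seen) > 1) mismatches[[col]] <- seen
  }
  mismatches
}

# Toy tables standing in for two MAF files (made-up values)
maf_a <- data.frame(Hugo_Symbol = "TP53", Some_Annotation = 10L,
                    stringsAsFactors = FALSE)
maf_b <- data.frame(Hugo_Symbol = "FLT3", Some_Annotation = "10",
                    stringsAsFactors = FALSE)

mismatches <- find_class_mismatches(list(maf_a, maf_b))
print(mismatches)
```

Tables read with `data.table::fread` are also data frames, so a list of them could be passed to the helper in the same way.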

For now, a quick fix would be to merge the files manually, keeping only the necessary columns. Below is a possible solution:

# only use following columns. Add other column names if you would like; for example, `HGVSp`
required.fields = c(
  'Hugo_Symbol',
  'Chromosome',
  'Start_Position',
  'End_Position',
  'Reference_Allele',
  'Tumor_Seq_Allele2',
  'Variant_Classification',
  'Variant_Type',
  'Tumor_Sample_Barcode'
)

mafs = lapply(X = list_path_LAML_MAF, function(x){
  data.table::fread(file = x, sep = "\t", select = required.fields, skip = "Hugo_Symbol")
})

mafs = data.table::rbindlist(l = mafs, use.names = TRUE, fill = TRUE, idcol = "Source_MAF")

mafs = maftools::read.maf(maf = mafs)

I haven't tested the above code, but I expect it to work. Let me know if this helps.
Meanwhile, I will look for an alternative solution.
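
If columns outside the required set are needed, another option might be to convert the conflicting columns to a common type before binding, instead of dropping them. The sketch below is a base R illustration under that assumption: `coerce_to_character`, the toy tables, and the column `Some_Annotation` are hypothetical, and converting to character is just one possible choice of common type.

```r
# Hypothetical helper: convert the named columns to character in every table,
# skipping tables that lack a given column.
coerce_to_character <- function(tables, cols) {
  lapply(tables, function(tbl) {
    for (col in intersect(cols, names(tbl))) {
      tbl[[col]] <- as.character(tbl[[col]])
    }
    tbl
  })
}

# Toy tables standing in for two MAF files whose shared column differs in class
maf_a <- data.frame(Hugo_Symbol = "TP53", Some_Annotation = 10L,
                    stringsAsFactors = FALSE)
maf_b <- data.frame(Hugo_Symbol = "FLT3",
                    Some_Annotation = as.Date("2021-06-01"),
                    stringsAsFactors = FALSE)

harmonized <- coerce_to_character(list(maf_a, maf_b), cols = "Some_Annotation")

# With matching classes the tables bind cleanly (base rbind used here)
merged <- do.call(rbind, harmonized)
print(merged)
```

With real MAF tables, the harmonized list would go through `data.table::rbindlist` and `maftools::read.maf` as in the code above. For data.table objects, `data.table::set()` would be the more idiomatic way to replace a column.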

@guillaume-rs
Author

Thank you for your quick feedback.
It works fine with the column filtering step!
