Hello,
I was trying to run STARsolo on a poorly annotated species, Macaca fascicularis. The GTF file was downloaded from Ensembl (v95). For genes that have no gene name annotated, STARsolo used the biotype as the gene name.
For example, gene "ENSMFAG00000010714" only has gene_id annotated in the GTF file, with no gene_name attribute.
(from GTF:)
1 ensembl gene 1174457 1175395 . - . gene_id "ENSMFAG00000010714"; gene_version "1"; gene_source "ensembl"; gene_biotype "protein_coding";
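To make the missing attribute explicit, here is a minimal Python sketch that parses the attribute column of the record above. It assumes the GTF columns are tab-separated, and the helper name `parse_attributes` is made up for illustration; it is not part of STAR.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def parse_attributes(attr_field):
    """Return the GTF attribute column as a dict of key -> value."""
    return dict(ATTR_RE.findall(attr_field))

# The record from the GTF, assuming tab-separated columns.
line = ('1\tensembl\tgene\t1174457\t1175395\t.\t-\t.\t'
        'gene_id "ENSMFAG00000010714"; gene_version "1"; '
        'gene_source "ensembl"; gene_biotype "protein_coding";')

attrs = parse_attributes(line.split('\t')[8])
print('gene_name' in attrs)    # False: the record has no gene_name
print(attrs['gene_biotype'])   # protein_coding
```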
In the genes.tsv file generated by STARSolo, this gene was shown as:
ENSMFAG00000010714 protein_coding
And "protein_coding" was used as the gene name (or row name) in the count matrix (from matrix.mtx).
Please help fix this issue for poorly annotated species. It would also be nice to offer an option to select gene_id or gene_name as the row name in the count matrix. Thanks a lot.
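Until this is addressed, one possible workaround is to patch the GTF so that every record without a gene_name gets its gene_id as the name. The sketch below is only an assumption about how that could be done; `fill_gene_name` and `patch_gtf` are hypothetical helpers, not STAR options, and the patched file would presumably have to be the one supplied when the genome index is generated.

```python
import re

GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')

def fill_gene_name(gtf_line):
    """Append gene_name "<gene_id>" to a GTF record that lacks gene_name."""
    stripped = gtf_line.rstrip()
    # Leave comment lines and records that already carry a gene_name alone.
    if stripped.startswith("#") or 'gene_name "' in stripped:
        return gtf_line
    match = GENE_ID_RE.search(stripped)
    if match is None:
        return gtf_line
    if not stripped.endswith(";"):
        stripped += ";"
    newline = "\n" if gtf_line.endswith("\n") else ""
    return f'{stripped} gene_name "{match.group(1)}";{newline}'

def patch_gtf(in_path, out_path):
    """Write a copy of the GTF with gene_name filled in where it is missing."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(fill_gene_name(line))
```

With such a patch, the gene above would carry gene_name "ENSMFAG00000010714" instead of having no name at all.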
Best,
Yang