Possible novel rhabdovirus whose anomalous structural genes will g-"rhabdo" your attention

written by: James Shi

Apiovirus anomala

Based on current data, we believe this virus is most closely related to Apis rhabdovirus. A blastx search of an assembly of the candidate viral genome returns hits to Apis rhabdovirus 1 with 13% query coverage and 27.76% identity. However, the candidate virus remains unclassified beyond the family level, as is the case for many rhabdoviruses because of the family's high diversity. Thus, there is inherently some uncertainty in assigning a binomial name to this virus. I chose a genus name that echoes "Apis", which refers to bees, a strongly suspected host for this virus. The species epithet anomala refers to the unexpected genomic organization of this virus.

Reports of new virus

Abstract

We present a novel virus in the Rhabdoviridae family, identified in raw reads from a transcriptome-wide RNA-seq run of Hydra vulgaris. Initial analysis suggests the virus is most closely related to unclassified Apis rhabdoviruses, which have been reported in bees and other pollinators. The candidate virus presents a highly unusual gene structure with only 3 ORFs, two of which remain unidentified. This departs from the 5 conserved genes expected in viruses of this family. Additionally, the L protein, responsible for replicating the genome, is substantially larger than most reported L protein sizes in related viruses. The anomalous genomic structure of this virus warrants future investigation of alternative expression mechanisms such as frameshifting; it differs from common features of the rhabdoviruses, even though its L protein sequence identity is higher than values reported between known rhabdovirus relatives.

Results

In this report, I detail the use of transcriptome RNA read data to characterize a potentially novel virus. Using the barcode ID provided for my virus, I looked up its barcode peptide sequence in the tabulated run observations. I then used the peptide sequence to find the nodes (contigs) corresponding to that barcode. After noting the matching nodes, I extracted and saved the corresponding contig sequences from the SRR922615 transcriptome data file.
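The barcode lookup described above can be sketched as a six-frame translation followed by a substring search. This is only an illustration under stated assumptions: the contig names and sequences and the barcode `MAK` are toy values, and the helper names (`six_frame_translations`, `contigs_with_barcode`) are hypothetical, not part of the pipeline actually used.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    # Unknown codons (for example, ones containing N) become "X".
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    seq = seq.upper().replace("U", "T")
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            yield f"{strand}{offset + 1}", translate(s[offset:])


def contigs_with_barcode(contigs, barcode):
    """Return names of contigs whose translation (any frame) contains the barcode."""
    return [
        name
        for name, seq in contigs.items()
        if any(barcode in protein for _, protein in six_frame_translations(seq))
    ]


# Toy data only (hypothetical contigs, hypothetical barcode peptide).
toy_contigs = {
    "toy_fwd": "CCATGGCTAAAGG",   # encodes MAK in forward frame 3
    "toy_rev": "TTTAGCCAT",       # encodes MAK on the reverse strand
    "toy_none": "TTTTTTTTT",      # no match
}
print(contigs_with_barcode(toy_contigs, "MAK"))  # ['toy_fwd', 'toy_rev']
```

Searching all six frames matters here because an assembled contig from a negative-sense RNA virus may be reported in either orientation.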

Of these contigs, NODE_22, the viral sequence with the second-highest coverage, was searched against the NR database using blastx. NR was chosen to maximize comprehensiveness and sensitivity, since it is large and draws on a variety of sources. Additionally, blastx is less likely to miss hits because protein sequences are more conserved than nucleotide sequences: silent mutations, for example, change the nucleotide sequence without changing the encoded protein. The highest-coverage contig, NODE_12, would normally be the most logical starting point, but because of CPU limits on my laptop and the large size of the contig, I could not search it without repeated errors. With this limitation in mind, the blastx taxonomy report for NODE_22 showed 116 hits from the Riboviria clade, 110 of them from the family Rhabdoviridae. This strongly suggests that the candidate virus belongs to this family.
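The point about silent mutations can be made concrete with a toy example. The two sequences below are invented for illustration; they use different synonymous codons for the same three residues, so a nucleotide comparison sees them as mostly different while a protein comparison sees them as identical.

```python
# Only the codons needed for this toy example (standard genetic code).
CODONS = {
    "CTG": "L", "TTA": "L",   # leucine
    "TCT": "S", "AGC": "S",   # serine
    "CGT": "R", "AGA": "R",   # arginine
}


def translate(seq):
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


def fraction_identical(a, b):
    # Position-by-position identity for two equal-length, ungapped sequences.
    return sum(x == y for x, y in zip(a, b)) / len(a)


seq_a = "CTGTCTCGT"  # hypothetical sequence
seq_b = "TTAAGCAGA"  # hypothetical sequence with synonymous codons

print(translate(seq_a), translate(seq_b))         # LSR LSR
print(round(fraction_identical(seq_a, seq_b), 2))  # 0.22
```

At this level of divergence a nucleotide search would struggle to find the match, whereas the translated comparison still recovers it.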

A total of 7 contigs matched the barcode peptide sequence, and all of them were run through blastx to assess consistency. As noted above, the top-coverage contig NODE_12 could not be assessed because of machine limits, and NODE_22 returned hits to rhabdovirus RdRps. NODE_30 also gave many rhabdovirus RdRp hits, and NODE_1020 gave hits from insect-associated viruses. The remaining nodes, NODE_82, NODE_83, and NODE_2051, gave hits to eukaryotic proteins such as telomerase and polyamine ATPase. Because the two highest-coverage testable nodes, NODE_22 and NODE_30, both gave rhabdovirus RdRp hits, and the coverage of the other testable nodes lags significantly behind them, I decided to set aside the hits from the other nodes and focus on rhabdovirus relations.
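The coverage-based triage above can be sketched as follows, assuming the contig headers follow the SPAdes-style pattern `NODE_<id>_length_<n>_cov_<x>`. That header format is an assumption about the assembly file, and the length and coverage numbers below are placeholders chosen only to reproduce the ordering described in the text; they are not the real values.

```python
import re
from typing import NamedTuple

HEADER = re.compile(r"^>?NODE_(\d+)_length_(\d+)_cov_([0-9.]+)")


class Contig(NamedTuple):
    node: int
    length: int
    coverage: float


def parse_header(header):
    match = HEADER.match(header)
    if match is None:
        raise ValueError(f"unrecognized header: {header!r}")
    return Contig(int(match.group(1)), int(match.group(2)), float(match.group(3)))


def rank_by_coverage(headers):
    """Sort contigs from highest to lowest coverage."""
    return sorted((parse_header(h) for h in headers), key=lambda c: c.coverage, reverse=True)


# Placeholder headers: node IDs come from the text, numbers are invented.
placeholder_headers = [
    ">NODE_82_length_1200_cov_5.0",
    ">NODE_22_length_9000_cov_200.0",
    ">NODE_12_length_15000_cov_300.0",
    ">NODE_30_length_5000_cov_150.0",
]

for contig in rank_by_coverage(placeholder_headers):
    print(f"NODE_{contig.node}\tlength={contig.length}\tcov={contig.coverage}")
```

Ranking this way makes explicit which contigs dominate the read support before deciding which BLAST results to trust.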

A search of the NCBI database for SRR922615, the run accession from which the contigs were assembled, revealed that the dataset containing the viral RNA was generated by a transcriptome-wide RNA-seq run of Hydra vulgaris using Illumina technology. Hydra vulgaris is a species of cnidarian known to have symbiotic relationships with microbes. The data were submitted by the Vanderbilt University Medical Center and released for public access on 2014-05-01. There is no publication associated with these data.

Many high-level features of the candidate virus can be inferred from its apparent relationship to the rhabdovirus family. For one, rhabdoviruses are an incredibly diverse family, and consequently they include viruses that target a variety of hosts (Walker et al. 2011). Rhabdoviruses have been found to infect vertebrates, invertebrates, and plants, and they are one of only three families that can infect all three host types (Walker et al. 2011). Rhabdoviruses can thrive in many kinds of environments, from marine to terrestrial (Walker et al. 2011). Rhabdoviruses have been reported on every inhabited continent other than Australia and are not localized to any specific region (Levin et al. 2017). This is likely due to their large host range. Many rhabdoviruses are transmitted by arthropods such as insects, so their life cycle involves replicating in the arthropod vector before being passed to other organisms (Longdon et al. 2015). This is especially the case for plant rhabdoviruses, although a vector is not strictly necessary, since mechanisms like vegetative growth can also spread the virus (Gaafar et al. 2019).

One notable human pathogen from this family is the rabies virus, a member of the genus Lyssavirus, which is very dangerous. Unlike many plant rhabdoviruses, animal rhabdoviruses do not necessarily rely on arthropod vectors. For example, rabies spreads almost exclusively through bites from infected mammals (Shepherd et al. 2023).

Virus Genome

Visual representation of the ORFs along the candidate viral genome contig, based on NCBI ORF Finder and the BDGP promoter predictor. Three ORFs were identified and labelled according to their BLASTp hits (or as unknown where there were no hits). The highest-scoring 3' leader promoter sequence predicted by the BDGP online tool upstream of the predicted ORFs is also labelled.

Other (bonus) sections

AlphaFold2 Prediction

AlphaFold2 Predicted Structure with Highest Ranked Confidence of Candidate Viral RdRp. The A motif is blue, B motif is green, and C motif is red.

Discussion

An initial blastx search of an RNA-seq assembly of the candidate virus against the NR protein database revealed over 100 hits in the rhabdovirus family. Possibly due to the large size of the query sequence, percent identity was low (averaging roughly 25%), suggesting the evolutionary relationship may be somewhat distant. However, given that no hits from other families were found, the rhabdovirus family remains the most likely classification, or at the very least the candidate falls within the same order, Mononegavirales. An identity of this size may still indicate a fairly close relationship, as some groups have reported protein sequence identity as low as 19% between the L proteins of two rhabdoviruses (Liang 2020).
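For readers unfamiliar with the two BLAST statistics quoted in this report, the sketch below shows how percent identity and query coverage are defined for a single alignment. The aligned strings and coordinates are toy values, the function names are hypothetical, and the coverage function is a simplification: it treats one aligned segment, whereas BLAST's reported query cover combines all aligned segments for a hit.

```python
def percent_identity(aligned_query, aligned_subject):
    """Identical columns divided by alignment length (gaps count as mismatches)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(q == s and q != "-" for q, s in zip(aligned_query, aligned_subject))
    return 100.0 * matches / len(aligned_query)


def query_coverage(q_start, q_end, query_length):
    """Percent of the query spanned by one aligned segment (1-based, inclusive)."""
    span = abs(q_end - q_start) + 1
    return 100.0 * span / query_length


# Toy alignment: 4 identical columns out of 7.
print(round(percent_identity("MKT-LLV", "MRTALIV"), 2))  # 57.14
# Toy coordinates: a 300 nt span of a 1000 nt query.
print(query_coverage(101, 400, 1000))                    # 30.0
```

Low query coverage combined with modest identity, as seen here, is consistent with only a conserved region such as the polymerase core aligning to known proteins.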

Assuming a relatively complete assembly, the genome size of the virus fits expectations, as a size of about 15 kb falls within the 11-16 kb range typical of rhabdoviruses. Surprisingly, ORF Finder predicts only 3 ORFs within the viral genome: two with no matches in BLAST, and one large ORF matching the L protein with RdRp activity. However, the product of this ORF reaches around 300 kDa in size, exceeding most reported L protein sizes in rhabdoviruses (Riedel et al. 2020).
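As a rough sanity check on the 300 kDa figure, protein mass can be related to ORF length using the common rule of thumb of about 110 Da per amino acid residue. That average is an approximation and an assumption here, not a value taken from the analysis above; an exact mass would require the actual sequence.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def residues_for_mass(mass_kda, avg_residue_da=AVG_RESIDUE_DA):
    """Approximate number of residues for a protein of the given mass."""
    return mass_kda * 1000.0 / avg_residue_da


def orf_length_nt(residues):
    """ORF length in nucleotides, including the stop codon."""
    return 3 * (residues + 1)


def mass_kda_from_orf(orf_nt, avg_residue_da=AVG_RESIDUE_DA):
    """Approximate protein mass for an ORF of the given length (stop codon included)."""
    residues = orf_nt // 3 - 1
    return residues * avg_residue_da / 1000.0


residues = round(residues_for_mass(300))
print(residues)                                   # 2727
print(orf_length_nt(residues))                    # 8184
print(mass_kda_from_orf(orf_length_nt(residues)))
```

Under this approximation, a 300 kDa product corresponds to an ORF of roughly 8.2 kb, a bit over half of a 15 kb genome.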

Normally, we would expect 5 ORFs corresponding to the 5 conserved proteins common to the Mononegavirales order (Liang 2020). Many small ORFs were also predicted within the large L protein ORF, but they failed to match any known proteins. It is possible that the ORF Finder software failed to predict gene products created by frameshift sites, or perhaps by other mechanisms such as proteolytic cleavage of immature polyproteins. It is also surprising that there were 2 unidentified ORFs, since most proteins of this family are quite conserved.
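To illustrate why a simple ORF scan can miss frameshifted products, here is a minimal six-frame ORF scanner. It is a sketch of the general idea only, not NCBI ORF Finder's actual algorithm: it assumes ATG starts, the standard stop codons, and a hypothetical `min_codons` threshold, and the input sequence is a toy example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end) for each ATG-to-stop ORF.

    Coordinates are 0-based, end-exclusive, and relative to the strand that
    was scanned. min_codons counts codons before the stop codon.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame + 1, start, i + 3))
                    start = None
    return orfs


toy_sequence = "CCATGAAACCCGGGTAACC"  # ATG AAA CCC GGG TAA in forward frame 3
print(find_orfs(toy_sequence))  # [('+', 3, 2, 17)]
```

Because each reading frame is scanned independently, a protein whose translation switches frame partway through would show up only as two shorter, separate ORFs, or not at all if either piece falls below the length threshold.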

Many of the BLAST hits mapped to the Apis rhabdovirus species, which could be a clue that the virus uses bees as a host, as the Apis virus does (Levin et al. 2017). For now, there is not enough data to computationally assess other features we would expect, such as additional hosts, transmission routes, or pathogenesis.

References

  1. Brown JC, Newcomb WW, Wertz GW. Helical virus structure: the case of the rhabdovirus bullet. Viruses [Internet]. 2010 Apr 12 [cited 2023 Dec 4];2(4):995–1001. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3185653/

  2. Walker PJ, Dietzgen RG, Joubert DA, Blasdell KR. Rhabdovirus accessory genes. Virus Res [Internet]. 2011 Dec [cited 2023 Dec 4];162(1):110–25. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7114375/

  3. Dietzgen RG, Kondo H, Goodin MM, Kurath G, Vasilakis N. The family Rhabdoviridae: mono- and bipartite negative-sense RNA viruses with diverse genome organization and common evolutionary origins. Virus Res [Internet]. 2017 Jan 2 [cited 2023 Dec 4];227:158–70. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5124403/

  4. Riedel C, Hennrich AA, Conzelmann KK. Components and architecture of the rhabdovirus ribonucleoprotein complex. Viruses [Internet]. 2020 Aug 29 [cited 2023 Dec 4];12(9):959. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7552012/

  5. Kuzmin IV, Novella IS, Dietzgen RG, Padhi A, Rupprecht CE. The rhabdoviruses: Biodiversity, phylogenetics, and evolution. Infection, Genetics and Evolution [Internet]. 2009 Jul 1 [cited 2023 Dec 4];9(4):541–53. Available from: https://www.sciencedirect.com/science/article/pii/S1567134809000380

  6. Walker PJ, Firth C, Widen SG, Blasdell KR, Guzman H, Wood TG, et al. Evolution of genome size and complexity in the rhabdoviridae. PLoS Pathog [Internet]. 2015 Feb 13 [cited 2023 Dec 4];11(2):e1004664. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4334499/

  7. Ivanov I, Yabukarski F, Ruigrok RWH, Jamin M. Structural insights into the rhabdovirus transcription/replication complex. Virus Research [Internet]. 2011 Dec 1 [cited 2023 Dec 4];162(1):126–37. Available from: https://www.sciencedirect.com/science/article/pii/S0168170211003789

  8. Levin S, Galbraith D, Sela N, Erez T, Grozinger CM, Chejanovsky N. Presence of apis rhabdovirus-1 in populations of pollinators and their parasites from two continents. Front Microbiol. 2017;8:2482.

  9. Shepherd JG, Davis C, Streicker DG, Thomson EC. Emerging rhabdoviruses and human infection. Biology [Internet]. 2023 Jun [cited 2023 Dec 4];12(6):878. Available from: https://www.mdpi.com/2079-7737/12/6/878

  10. Longdon B, Murray GGR, Palmer WJ, Day JP, Parker DJ, Welch JJ, Obbard DJ, Jiggins FM. The evolution, diversity, and host associations of rhabdoviruses. Virus Evol [Internet]. 2015;1(1):vev014. Available from: https://doi.org/10.1093/ve/vev014

  11. Bdgp: neural network promoter prediction [Internet]. [cited 2023 Dec 4]. Available from: https://www.fruitfly.org/seq_tools/promoter.html

  12. Orffinder home - ncbi [Internet]. [cited 2023 Dec 4]. Available from: https://www.ncbi.nlm.nih.gov/orffinder/

  13. Liang B. Structures of the mononegavirales polymerases. J Virol [Internet]. 2020 Oct 27 [cited 2023 Dec 4];94(22):e00175-20. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7592205/

Viral Short Story

I can barely hear my thoughts over the sound of pounding on my door. I sit in my barricaded lab room writing this as dozens of flesh-eating corpses ram into the door. This all started when a research group in America discovered a new species of virus. It was **somewhat related to rabies and other rhabdovirus family members**. Although we were on guard for possible health risks, no one knew it would turn out like this. We found out from early animal testing that **the virus was able to infect a range of host animals** such as dogs, pigs, guinea pigs, and more. During one of the trials, a researcher was inadvertently bitten by a dog. The researcher was initially healthy and wandered into a public area, where they quickly became aggressive, lost all sense of higher-order thinking, and attacked other people. Further testing showed that the **virus was able to persist in bodily fluids like saliva, so actions like biting would certainly spread it**. Once it circulates in the body, the virus seeks to gain entry into nervous system neurons, explaining its ability to cause psychosis. Those affected were quarantined in hopes they would recover. Unfortunately, the virus was able to **evade our immune responses by halting programmed cell death and stopping T cell function**, thus silently propagating inside the host. When we realized it was a threat, my lab set out to find some sort of treatment for it. My colleagues found that this **virus shares a conserved membrane glycoprotein with existing viruses, which it uses to enter cells**.  I started experiments to design an antibody vaccine against the virus, but by then, the virus had crept further than we imagined. Soon, the infected were like a tsunami, crippling societies everywhere. Earlier, they had trampled past our security measures and killed most of the staff here. Only a thin metal door stands between me and the horde. I can only hope someone else finds these records and continues my work.