This is the repository for the Swedish Reference Genome Portal, a service facilitating access and discovery of genome data of non-model eukaryotic species studied in Sweden

ScilifelabDataCentre/genome-portal

Swedish Reference Genome Portal

This repository contains the source code for the Swedish Reference Genome Portal, which:

  • Showcases genome research performed in Sweden on non-model eukaryotic species.
  • Lowers the barrier of entry to access, visualise, and interpret genome data.
  • Encourages sharing of genomic annotations, even the seldom-published kind.
  • Strives to present FAIR data, available in public repositories.

Table of Contents

  1. Overview
  2. Cite this portal
  3. Contributing
  4. Funding
  5. Contact us
  6. Technical overview
  7. Credits

Overview

  • The Swedish Reference Genome Portal website is built using the Hugo static site generator.

  • The JBrowse2 genome browser is embedded within the website to visually explore genome datasets.

  • Primary data files are sourced from public repositories (such as ENA) and prepared for display in JBrowse2 by our Makefile recipes (essentially compressing and indexing).

  • The code for the Genome Portal is available under an MIT (open source) license.

  • The Genome Portal website is currently hosted by the KTH Royal Institute of Technology in Stockholm.

Cite this portal

See "Cite this repository" in the "About" section at the top right of this page.

Contributing

Two types of contributions are especially welcome:

  • Datasets for display in the portal: Consult our requirements for including a genome dataset in the portal, and contact us if you have any questions.

  • Source code and documentation: We welcome contributions, small and large, to our codebase and documentation. They will be published after review and approval by the Genome Portal team. Fork, open a PR, or contact us to discuss ideas!

Funding

This service is supported by SciLifeLab and the Knut and Alice Wallenberg Foundation through the Data-Driven Life Science (DDLS) program, as well as by the Swedish Foundation for Strategic Research (SSF).

Contact us

We welcome all questions and suggestions (including feature requests or bug reports).

Technical overview

This section contains high-level technical documentation about the source code.

Repository layout

  • The config/ directory contains information about data sources (tracks and assemblies) displayed in the genome browser.

    • Each species subdirectory includes:
      • config.yml : specifies the assembly and tracks to be displayed in JBrowse2.
      • config.json : starting point from which to generate a complete JBrowse2 configuration, based on config.yml. A common use is to define default browsing sessions.
  • Different make recipes prepare the material described in config/ for use by JBrowse2. The main operations are downloading data files, compressing them with bgzip, and indexing them with samtools.

  • The website content resides in the hugo directory.

    • Most importantly, each species gets:
      1. A content subdirectory in hugo/content/species/ (e.g. hugo/content/species/clupea_harengus)
      2. A data directory in hugo/data/ (taxonomic information and statistics)
      3. An assets directory in hugo/assets (data inventory)
  • The scripts folder contains executables to help:

    1. build and serve the website using Docker
    2. add a new species to the website content
    3. add new datasets to the portal
  • The tests folder contains tests and fixtures, mainly covering the data preparation scripts.

  • The docker folder contains two Dockerfiles:

    1. docker/data.dockerfile used for data preparation (everything that make needs)
    2. docker/hugo.dockerfile used to build and serve the website.
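
The data preparation summarised in the layout above (compress with bgzip, then index with samtools) can be sketched for a single FASTA file as follows. This is an illustrative assumption about what such a step looks like, not the repository's actual make recipe; the `run` and `prepare_fasta` helpers, the `DRY_RUN` variable, and the file name `genome.fa` are hypothetical. With `DRY_RUN=1` the commands are only printed, so the sketch can be tried without the tools installed.

```shell
# Hypothetical sketch; the real recipes are the repository's make recipes.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Compress a FASTA file with bgzip, then index the compressed file.
prepare_fasta() {
  fasta="$1"
  # bgzip replaces the input file with "$fasta.gz" (block-gzip compression).
  run bgzip "$fasta" || return 1
  # samtools faidx writes .fai and .gzi index files next to the .gz file.
  run samtools faidx "$fasta.gz"
}

# Example: show the commands that would run for a file named genome.fa.
DRY_RUN=1
prepare_fasta genome.fa
```

Block-gzip compression keeps the file randomly accessible, which is what allows a genome browser to fetch only the region being viewed rather than the whole file.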

Local development

The steps described below require Docker to be installed.
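
Before starting, it can help to confirm that Docker is actually available on your PATH. The following is a minimal sketch and not part of the repository's scripts; the helper name `require_cmd` is hypothetical.

```shell
# Hypothetical helper (not part of this repository): succeeds only if the
# given command can be found on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "Missing required command: $1" >&2
  return 1
}

# Example: report whether Docker is available before running the steps below.
if require_cmd docker; then
  echo "docker found"
else
  echo "docker not found; install Docker first"
fi
```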

1. Clone the repository

git clone git@github.com:ScilifelabDataCentre/genome-portal.git
cd genome-portal

2. Build and install the genomic data

# Build local image from `docker/data.dockerfile`
./scripts/dockerbuild data

# Run the dockermake script to build the assets and install them locally.
./scripts/dockermake

You may need to be patient: some files are tens of gigabytes. If only a subset of species is of interest, you can restrict the scope of the build:

./scripts/dockermake SPECIES=clupea_harengus,linum_tenue
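
If the species of interest come from a longer list, the comma-separated `SPECIES` value can be assembled rather than typed by hand. This is a small sketch under the assumption that `SPECIES` accepts exactly the comma-separated form shown above; the `join_species` helper is hypothetical and not part of the repository.

```shell
# Hypothetical helper: join its arguments with commas.
# The function body runs in a subshell so the IFS change does not leak out.
join_species() (
  IFS=,
  echo "$*"
)

# Example: print the command that would restrict the build to two species.
echo "./scripts/dockermake SPECIES=$(join_species clupea_harengus linum_tenue)"
```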

3. Run the web application container

To run the website locally, you have several options:

Using the latest development image

docker pull ghcr.io/scilifelabdatacentre/swg-hugo-site:dev
./scripts/dockerserve

Using a local build

./scripts/dockerbuild hugo
SWG_TAG=local ./scripts/dockerserve

Using the Hugo development server

This last method is useful when you want changes to the source to be immediately reflected in the web browser.

It requires the additional step of installing the JBrowse static bundle in hugo/static/browser:

./scripts/download_jbrowse v2.15.4 hugo/static/browser
./scripts/dockerserve --dev

Any of these methods will serve the website at http://localhost:8080/
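
Since a container can take a moment to start, a quick way to confirm that the site is being served is to poll the URL. The sketch below assumes `curl` is installed; the `wait_for_site` helper and its retry defaults are hypothetical and not part of the repository.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_site URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_site() {
  url="$1"
  attempts="${2:-10}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: treat HTTP errors as failure; -s: silent;
    # --max-time: give up on a single request after 5 seconds.
    if curl -fs --max-time 5 -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (run once a container or the dev server has been started):
# wait_for_site http://localhost:8080/ && echo "site is up"
```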

Credits

The Swedish Reference Genome Portal is developed and maintained by the DDLS Data Science Node in Evolution and Biodiversity (DSN-EB) team as part of the SciLifeLab Data Platform, operated by the SciLifeLab Data Centre. Members of the DSN-EB team are affiliated with SciLifeLab Data Centre and the National Bioinformatics Infrastructure Sweden (NBIS), based at Uppsala University and the Swedish Museum of Natural History.
