This repository contains the source code for the Swedish Reference Genome Portal, which:
- Showcases genome research performed in Sweden on non-model eukaryotic species.
- Lowers the barrier of entry to access, visualise, and interpret genome data.
- Encourages sharing of genomic annotations, even the seldom-published kind.
- Strives to present FAIR data, available in public repositories.

- The Swedish Reference Genome Portal website is built using the Hugo static site generator.
- The JBrowse2 genome browser is embedded within the website to visually explore genome datasets.
- Primary data files are available in public repositories (such as ENA), and are prepared for display in JBrowse2 by our `Makefile` recipes (essentially compressing and indexing).
- The code for the Genome Portal is available under an MIT (open source) license.
- The Genome Portal website is currently hosted by the KTH Royal Institute of Technology in Stockholm.
See 'Cite this repository' in the "About" section at the top right of this page.
Two types of contributions are especially welcome:
- Datasets for display in the portal: Consult our requirements for including a genome dataset in the portal, and contact us if you have any questions.
- Source code and documentation: We welcome contributions, small and large, to our codebase and documentation. They will be published after review and approval by the Genome Portal team. Fork, open a PR, or contact us to discuss ideas!
This service is supported by SciLifeLab and the Knut and Alice Wallenberg Foundation through the Data-Driven Life Science (DDLS) program, as well as by the Swedish Foundation for Strategic Research (SSF).
We welcome all questions and suggestions (including feature requests or bug reports).
- Email us at [email protected].
- Fill out our contact form on the website.
- Create an issue on GitHub.
This section contains high-level technical documentation about the source code.
- The `config/` directory contains information about data sources (tracks and assemblies) displayed in the genome browser.
  - Each species subdirectory includes:
    - `config.yml`: specifies the assembly and tracks to be displayed in JBrowse2.
    - `config.json`: starting point from which to generate a complete JBrowse2 configuration, based on `config.yml`. A common use is to define default browsing sessions.
- Different `make` recipes prepare the material described in `config/` for use by JBrowse2. The main operations are downloading data files, compressing using `bgzip` and indexing with `samtools`.
- The website content resides in the `hugo` directory.
  - Most importantly, each species gets:
    - A content subdirectory in `hugo/content/species/` (e.g. `hugo/content/species/clupea_harengus`)
    - A data directory in `hugo/data/` (taxonomic information and statistics)
    - An assets directory in `hugo/assets` (data inventory)
- The `scripts` folder contains executables to help:
  - build and serve the website using Docker
  - add a new species to the website content
  - add new datasets to the portal
- The `tests` folder contains tests and fixtures, mainly covering the data preparation scripts.
- The `docker` folder contains two Dockerfiles:
  - `docker/data.dockerfile`, used for data preparation (everything that `make` needs)
  - `docker/hugo.dockerfile`, used to build and serve the website.
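As a rough illustration of the download, compress and index operations mentioned above, the shell sketch below fetches an uncompressed FASTA file, compresses it with `bgzip` and indexes it with `samtools faidx`. It is not taken from the repository's Makefile: the function names, the `DRY_RUN` switch and the example URL are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical sketch of the download / compress / index steps.
# Not the portal's actual Makefile logic.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# prepare_fasta URL OUTDIR
# Assumes URL points to an uncompressed FASTA file.
prepare_fasta() {
    url=$1
    outdir=$2
    fasta="$outdir/$(basename "$url")"

    run mkdir -p "$outdir"
    run curl -fsSL -o "$fasta" "$url"   # download the primary data file
    run bgzip -f "$fasta"               # compress in place to $fasta.gz
    run samtools faidx "$fasta.gz"      # write the .fai and .gzi indexes
}
```

Calling `DRY_RUN=1 prepare_fasta https://example.org/genome.fna data` prints the four commands instead of running them, which can be a cheap way to review the plan before downloading large files.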
The steps described below require `docker` to be installed.
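One way to confirm this prerequisite before starting is a small check like the one below. `require_cmd` is a hypothetical helper written for this example and is not part of the repository's `scripts` folder.

```shell
#!/bin/sh
# require_cmd NAME...
# Return non-zero (and report on stderr) if any command is missing from PATH.
# Hypothetical helper, not part of the repository.
require_cmd() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "error: '$cmd' is required but was not found in PATH" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For instance, `require_cmd docker git || exit 1` at the top of a setup script would stop early with a readable message if either tool is absent.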
1. Clone the repository
git clone [email protected]:ScilifelabDataCentre/genome-portal.git
cd genome-portal
2. Build and install the genomic data
# Build local image from `docker/data.dockerfile`
./scripts/dockerbuild data
# Run the dockermake script to build the assets and install them locally.
./scripts/dockermake
Be patient: some files are tens of gigabytes in size. If only a subset of species is of interest, you can restrict the scope of the build:
./scripts/dockermake SPECIES=clupea_harengus,linum_tenue
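To avoid typos in a long species list, the `SPECIES=` argument could be assembled by a small helper such as the sketch below. It assumes that the names accepted by `SPECIES` match the species subdirectory names under `config/`; that assumption, and the `species_arg` function itself, are illustrative and not taken from the repository.

```shell
#!/bin/sh
# species_arg CONFIG_DIR NAME...
# Print "SPECIES=a,b" after checking that each NAME has a subdirectory
# in CONFIG_DIR. Hypothetical helper, not part of the repository.
species_arg() {
    config_dir=$1
    shift
    list=""
    for name in "$@"; do
        if [ ! -d "$config_dir/$name" ]; then
            echo "error: no such species directory: $config_dir/$name" >&2
            return 1
        fi
        # Append with a comma separator, except for the first name.
        list="${list:+$list,}$name"
    done
    if [ -z "$list" ]; then
        echo "error: no species given" >&2
        return 1
    fi
    echo "SPECIES=$list"
}
```

It could then be used as `./scripts/dockermake "$(species_arg config clupea_harengus linum_tenue)"`.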
3. Run the web application container
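Then, to run the website locally, you have three options:
# Option 1: pull the published development image and serve it
docker pull ghcr.io/scilifelabdatacentre/swg-hugo-site:dev
./scripts/dockerserve
# Option 2: build the website image locally and serve it
./scripts/dockerbuild hugo
SWG_TAG=local ./scripts/dockerserve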
The last option is suited to development, when you want to see changes to the source immediately reflected in the web browser. It requires the additional step of installing the JBrowse static bundle in `hugo/static/browser`:
./scripts/download_jbrowse v2.15.4 hugo/static/browser
./scripts/dockerserve --dev
Each of these methods serves the website at http://localhost:8080/
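If you want a script to wait until the local site responds, a polling loop along the following lines may help. This is a sketch only: `wait_for_site` and its `PROBE` and `DELAY` variables are hypothetical names introduced here, and the default probe assumes `curl` is installed.

```shell
#!/bin/sh
# wait_for_site URL [ATTEMPTS]
# Poll URL until it responds or ATTEMPTS (default 30) is exhausted.
# PROBE and DELAY are hypothetical knobs for this sketch:
#   PROBE  command used to test the URL (default: curl)
#   DELAY  seconds to sleep between attempts (default: 1)
wait_for_site() {
    url=$1
    attempts=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # curl -f makes HTTP error statuses count as failures.
        if ${PROBE:-curl -fsS -o /dev/null} "$url" 2>/dev/null; then
            echo "site is up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "${DELAY:-1}"
    done
    echo "site did not respond after $attempts attempts: $url" >&2
    return 1
}
```

After starting the server in another terminal, `wait_for_site http://localhost:8080/` returns as soon as the page can be fetched.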
The Swedish Reference Genome Portal is developed and maintained by the DDLS Data Science Node in Evolution and Biodiversity (DSN-EB) team as part of the SciLifeLab Data Platform, operated by the SciLifeLab Data Centre. Members of the DSN-EB team are affiliated with SciLifeLab Data Centre and the National Bioinformatics Infrastructure Sweden (NBIS), based at Uppsala University and the Swedish Museum of Natural History.