biomedical_id_resolver.js

A JavaScript library for batch-resolving biological IDs to their equivalent IDs.

Install

$ npm i biomedical_id_resolver

Usage

const resolve = require('biomedical_id_resolver');

// input should be an object, with semantic type as the key, and array of CURIEs as value
let input = {
    "Gene": ["NCBIGene:1017", "NCBIGene:1018", "HGNC:1177"],
    "SmallMolecule": ["CHEBI:15377"],
    "Disease": ["MONDO:0004976"],
    "Cell": ["CL:0002372"]
  };

(async () => {
  const resolver = new resolve();
  // resolve() returns a promise, so await it before logging
  console.log(await resolver.resolve(input));
  //=> {'NCBIGene:1017': {...}, 'NCBIGene:1018': {...}, 'HGNC:1177': {...}, 'CHEBI:15377': {...}, 'MONDO:0004976': {...}, 'CL:0002372': {...}}
})();
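
When the CURIEs come from a flat list rather than being pre-grouped, the input object described above can be assembled with a small helper. The following is a minimal sketch; `buildResolverInput` is a hypothetical name and is not part of biomedical_id_resolver.

```javascript
// Hypothetical helper (not part of biomedical_id_resolver): groups
// [semanticType, curie] pairs into the { semanticType: [curies] } input shape.
function buildResolverInput(pairs) {
  const input = {};
  for (const [semanticType, curie] of pairs) {
    if (!input[semanticType]) {
      input[semanticType] = [];
    }
    // Skip duplicates so each CURIE is listed once per semantic type.
    if (!input[semanticType].includes(curie)) {
      input[semanticType].push(curie);
    }
  }
  return input;
}

const input = buildResolverInput([
  ['Gene', 'NCBIGene:1017'],
  ['Gene', 'HGNC:1177'],
  ['Disease', 'MONDO:0004976'],
  ['Gene', 'NCBIGene:1017'], // duplicate, ignored
]);
console.log(input);
```

The resulting object can then be passed to the resolver in the same way as the hand-written input above.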

Output Schema

Output is a JavaScript object.
The root keys are the CURIEs (e.g. NCBIGene:1017) that were passed in as input.
Each value represents the resolved identifiers for that CURIE.
Each CURIE has 4 required fields:
- id: the primary ID (selected based on the prefix ranking listed under "Available Semantic Types & prefixes") and its label
- curies: an array in which each element is a resolved ID in CURIE format
- type: the semantic type of the identifier
- db_ids: the original IDs from the source databases, keyed by prefix; these may or may not be CURIEs
If an ID cannot be resolved by the package, its entry has an additional field called "flag" with the value "failed".

Example Output

{
  "NCBIGene:1017": {
    "id": {
      "label": "cyclin dependent kinase 2",
      "identifier": "NCBIGene:1017"
    },
    "db_ids": {
      "NCBIGene": [
        "1017"
      ],
      "ENSEMBL": [
        "ENSG00000123374"
      ],
      "HGNC": [
        "1771"
      ],
      "SYMBOL": [
        "CDK2"
      ],
      "UMLS": [
        "C1332733",
        "C0108855"
      ],
      "name": [
        "cyclin dependent kinase 2"
      ]
    },
    "type": "Gene",
    "curies": [
      "NCBIGene:1017",
      "ENSEMBL:ENSG00000123374",
      "HGNC:1771",
      "SYMBOL:CDK2",
      "UMLS:C1332733",
      "UMLS:C0108855"
    ]
  }
}
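
Because unresolved IDs are marked with "flag": "failed", callers will often want to separate them from the resolved entries. The sketch below shows one way to do that. `partitionByFlag` is a hypothetical helper, and the sample object is hand-written to mimic the documented shape (the field values of the failed entry are made up), so treat it as an illustration rather than real resolver output.

```javascript
// Hypothetical helper: splits resolver output into resolved entries and
// failed CURIEs, relying only on the documented "flag": "failed" marker.
function partitionByFlag(output) {
  const resolved = {};
  const failed = [];
  for (const [curie, entry] of Object.entries(output)) {
    if (entry && entry.flag === 'failed') {
      failed.push(curie);
    } else {
      resolved[curie] = entry;
    }
  }
  return { resolved, failed };
}

// Hand-written sample shaped like the documented output; not real data.
const sampleOutput = {
  'NCBIGene:1017': {
    id: { label: 'cyclin dependent kinase 2', identifier: 'NCBIGene:1017' },
    db_ids: { NCBIGene: ['1017'], SYMBOL: ['CDK2'] },
    type: 'Gene',
    curies: ['NCBIGene:1017', 'SYMBOL:CDK2'],
  },
  // Made-up entry for an ID that could not be resolved.
  'NCBIGene:0000': {
    id: { label: 'NCBIGene:0000', identifier: 'NCBIGene:0000' },
    db_ids: { NCBIGene: ['0000'] },
    type: 'Gene',
    curies: ['NCBIGene:0000'],
    flag: 'failed',
  },
};

const { resolved, failed } = partitionByFlag(sampleOutput);
console.log(Object.keys(resolved), failed);
```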

Query Using SRI node normalizer

Usage

const resolver = require('biomedical_id_resolver');

// input must be an object, with semantic type as the key, and array of CURIEs as value
let input = {
    "Gene": ["NCBIGene:1017", "NCBIGene:1018", "HGNC:1177"],
    "SmallMolecule": ["CHEBI:15377"],
    "Disease": ["MONDO:0004976"],
    "Cell": ["CL:0002372"]
};

(async () => {
  let res = await resolver.resolveSRI(input);
  console.log(res);
})();

Example Output

The output contains the id and equivalent_identifiers fields straight from the SRI node normalizer, along with the same fields as the base resolver for backwards compatibility. Note that each input CURIE maps to an array of entries.

{
  "NCBIGene:1017": [
    {
      "id": {
        "identifier": "NCBIGene:1017",
        "label": "CDK2"
      },
      "equivalent_identifiers": [
        {
          "identifier": "NCBIGene:1017",
          "label": "CDK2"
        },
        {
          "identifier": "ENSEMBL:ENSG00000123374"
        },
        {
          "identifier": "HGNC:1771",
          "label": "CDK2"
        },
        {
          "identifier": "OMIM:116953"
        },
        {
          "identifier": "UMLS:C1332733",
          "label": "CDK2 gene"
        }
      ],
      "type": [
        "biolink:Gene",
        "biolink:GeneOrGeneProduct",
        "biolink:BiologicalEntity",
        "biolink:NamedThing",
        "biolink:Entity",
        "biolink:MacromolecularMachineMixin"
      ],
      "primaryID": "NCBIGene:1017",
      "label": "CDK2",
      "attributes": {},
      "semanticType": "Gene",
      "semanticTypes": [
        "biolink:Gene",
        "biolink:GeneOrGeneProduct",
        "biolink:BiologicalEntity",
        "biolink:NamedThing",
        "biolink:Entity",
        "biolink:MacromolecularMachineMixin"
      ],
      "dbIDs": {
        "NCBIGene": [
          "1017"
        ],
        "ENSEMBL": [
          "ENSG00000123374"
        ],
        "HGNC": [
          "1771"
        ],
        "OMIM": [
          "116953"
        ],
        "UMLS": [
          "C1332733"
        ],
        "name": [
          "CDK2",
          "CDK2 gene"
        ]
      },
      "curies": [
        "NCBIGene:1017",
        "ENSEMBL:ENSG00000123374",
        "HGNC:1771",
        "OMIM:116953",
        "UMLS:C1332733"
      ]
    }
  ]
}
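
Since each key in the SRI output maps to an array of entries, a short reduction step is often useful before further processing. The following sketch pulls out the primaryID, label, and semanticType fields shown in the example above. `summarizeSRI` is a hypothetical helper, and the sample is a trimmed copy of the example output rather than a live response.

```javascript
// Hypothetical helper: reduces each SRI entry to a few of the fields
// shown in the example output (primaryID, label, semanticType).
function summarizeSRI(sriOutput) {
  const summary = {};
  for (const [curie, entries] of Object.entries(sriOutput)) {
    summary[curie] = entries.map((entry) => ({
      primaryID: entry.primaryID,
      label: entry.label,
      semanticType: entry.semanticType,
    }));
  }
  return summary;
}

// Trimmed copy of the example output above; not fetched from the service.
const sampleSRI = {
  'NCBIGene:1017': [
    {
      primaryID: 'NCBIGene:1017',
      label: 'CDK2',
      semanticType: 'Gene',
      curies: ['NCBIGene:1017', 'ENSEMBL:ENSG00000123374', 'HGNC:1771'],
    },
  ],
};

const sriSummary = summarizeSRI(sampleSRI);
console.log(JSON.stringify(sriSummary, null, 2));
```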

Available Semantic Types & prefixes

Gene, Transcript, Protein ID resolution is done through MyGene.info API

Gene
1. NCBIGene
2. ENSEMBL
3. HGNC
4. MGI
5. OMIM
6. UMLS
7. SYMBOL
8. UniProtKB
9. name
Transcript
1. ENSEMBL
2. SYMBOL
3. name
Protein
1. UniProtKB
2. ENSEMBL
3. UMLS
4. SYMBOL
5. name

Variant ID resolution is done through MyVariant.info API

SequenceVariant
1. CLINVAR
2. DBSNP
3. HGVS
4. MYVARIANT_HG19

SmallMolecule, Drug ID resolution is done through MyChem.info API

SmallMolecule
1. PUBCHEM.COMPOUND
2. CHEMBL.COMPOUND
3. UNII
4. CHEBI
5. DRUGBANK
6. MESH
7. CAS
8. HMDB
9. KEGG.COMPOUND
10. INCHI
11. INCHIKEY
12. UMLS
13. LINCS
14. name
Drug
1. RXCUI
2. NDC
3. DRUGBANK
4. PUBCHEM.COMPOUND
5. CHEMBL.COMPOUND
6. UNII
7. CHEBI
8. MESH
9. CAS
10. HMDB
11. KEGG.COMPOUND
12. INCHI
13. INCHIKEY
14. UMLS
15. LINCS
16. name

Disease, ClinicalFinding ID Resolution is done through MyDisease.info API

Disease
1. MONDO
2. DOID
3. OMIM
4. ORPHANET
5. EFO
6. UMLS
7. MESH
8. MEDDRA
9. NCIT
10. SNOMEDCT
11. HP
12. GARD
13. name
ClinicalFinding
1. LOINC
2. NCIT
3. EFO
4. name

Pathway ID Resolution is done through biothings.ncats.io/geneset API

Pathway
1. GO
2. REACT
3. KEGG
4. SMPDB
5. PHARMGKB.PATHWAYS
6. WIKIPATHWAYS
7. BIOCARTA
8. name

MolecularActivity ID Resolution is done through BioThings Gene Ontology Molecular Activity API

MolecularActivity
1. GO
2. REACT
3. RHEA
4. MetaCyc
5. KEGG.REACTION
6. name

CellularComponent ID Resolution is done through BioThings Gene Ontology Cellular Component API

CellularComponent
1. GO
2. MetaCyc
3. name

BiologicalProcess ID Resolution is done through BioThings Gene Ontology Biological Process API

BiologicalProcess
1. GO
2. REACT
3. MetaCyc
4. KEGG
5. name

AnatomicalEntity ID Resolution is done through BioThings UBERON API

AnatomicalEntity
1. UBERON
2. UMLS
3. MESH
4. NCIT
5. name

PhenotypicFeature ID Resolution is done through BioThings HPO API

PhenotypicFeature
1. HP
2. EFO
3. NCIT
4. UMLS
5. MEDDRA
6. MP
7. SNOMEDCT
8. MESH
9. name

Cell ID Resolution is done through BioThings Cell Ontology API

Cell
1. CL
2. NCIT
3. MESH
4. EFO
5. name
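
To illustrate how a prefix ranking can drive the choice of a primary ID, here is a minimal sketch. It assumes that the numbered order in the lists above is the ranking and that the IDs in db_ids are bare (non-CURIE) values. `pickPrimaryCurie` and `GENE_PREFIX_RANKING` are hypothetical names, and this is not the package's actual implementation.

```javascript
// Gene prefixes in the order listed above (assumed here to be the ranking).
const GENE_PREFIX_RANKING = [
  'NCBIGene', 'ENSEMBL', 'HGNC', 'MGI', 'OMIM',
  'UMLS', 'SYMBOL', 'UniProtKB', 'name',
];

// Hypothetical helper: returns a CURIE built from the highest-ranked prefix
// that has at least one ID in dbIds, or undefined if none is present.
function pickPrimaryCurie(dbIds, ranking) {
  for (const prefix of ranking) {
    const ids = dbIds[prefix];
    if (Array.isArray(ids) && ids.length > 0) {
      return `${prefix}:${ids[0]}`;
    }
  }
  return undefined;
}

// No NCBIGene ID is present, so ENSEMBL (ranked second) is chosen.
const primary = pickPrimaryCurie(
  { HGNC: ['1771'], ENSEMBL: ['ENSG00000123374'], SYMBOL: ['CDK2'] },
  GENE_PREFIX_RANKING
);
console.log(primary); // → ENSEMBL:ENSG00000123374
```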

Development

Install Node 12 or later. You can use the package manager of your choice. Tests need to pass in Node 12 and 14.
Clone this repository.
Run npm ci to install the dependencies.
Scripts are stored in the /src folder.
Add tests to the /__tests__ folder.
Run npm run release to bump the version and generate the changelog.
Run npx depcheck to check for unused packages in package.json.

CHANGELOG

See CHANGELOG.md
