Skip to content

Commit

Permalink
Release 2.9.6 (#272)
Browse files Browse the repository at this point in the history
* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parser.py

change to AnVIL

* Update test_parsers.py

update test

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Initial Kaniko build.

* Move version file definition.

* Quote env vars.

* Update env vars.

* Update env vars.

* Update env vars.

* env var changes.

* env var changes.

* env var changes.

* env var changes.

* Update DOCKER_IMAGE var.

* Update DOCKER_IMAGE var in kaniko cmd.

* Update kaniko destination line.

* Update kaniko destination line.

* Moree variable madness.

* Programatically remove quotes from version tag.

* dug dump concepts api created and tested (#229)

Co-authored-by: Nathan Braswell <[email protected]>

* Update _version.py (#234)

* Version changes + separate build and publish.

* Semantic versioning prep.

* Add develop and master versioning and tagging.

* Bump version.

* Revert version to dug format.

* Ncpi index fix (#232)

* Renamed anvil to ncpi

* Update ncpi datasets catalog

* Modified script to download NCPI datasets into platform subfolders

* Updated NCPI integration dataset

* Removed unused variable

* Removed ncpi top level folder to spread results among subfolders

* Change output dir to data instead of ncpi subdir

* Moved NCPI subdirs into main data folder for ingest as per Yaphet's request

Co-authored-by: Alex Waldrop <[email protected]>

* Add github creds env var.

* Fix version typo.

* Initial commit

* Reduce ephemeral storage limits and requests

* More parsers (#248)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* consolidate dbgap format parser in single file , adds crdc and kfdrc parsers

* adding tests

* bump version

* parser when versions of studies are > 9

* test for version

* fix long text issues, and encoding errors

* nltk initialization

* change nltk approach for sliding window

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Master to develop sync (#262)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Release/2.9.1 (#205)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Release/2.9.1

Renames SPARC datasets as SPARC instead of dbgap

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>

* Update _version.py (#206)

* Release/2.9.2 (#209)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parser.py

change to AnVIL

* Update test_parsers.py

update test

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>

* Release 2.9.3 (#244)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parser.py

change to AnVIL

* Update test_parsers.py

update test

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Initial Kaniko build.

* Move version file definition.

* Quote env vars.

* Update env vars.

* Update env vars.

* Update env vars.

* env var changes.

* env var changes.

* env var changes.

* env var changes.

* Update DOCKER_IMAGE var.

* Update DOCKER_IMAGE var in kaniko cmd.

* Update kaniko destination line.

* Update kaniko destination line.

* Moree variable madness.

* Programatically remove quotes from version tag.

* dug dump concepts api created and tested (#229)

Co-authored-by: Nathan Braswell <[email protected]>

* Update _version.py (#234)

* Version changes + separate build and publish.

* Semantic versioning prep.

* Add develop and master versioning and tagging.

* Ncpi index fix (#232)

* Renamed anvil to ncpi

* Update ncpi datasets catalog

* Modified script to download NCPI datasets into platform subfolders

* Updated NCPI integration dataset

* Removed unused variable

* Removed ncpi top level folder to spread results among subfolders

* Change output dir to data instead of ncpi subdir

* Moved NCPI subdirs into main data folder for ingest as per Yaphet's request

Co-authored-by: Alex Waldrop <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>
Co-authored-by: Charles Bennett <[email protected]>
Co-authored-by: Nathaniel Braswell <[email protected]>
Co-authored-by: Nathan Braswell <[email protected]>
Co-authored-by: cnbennett3 <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>

* Release/2.9.4 (#260)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parser.py

change to AnVIL

* Update test_parsers.py

update test

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Initial Kaniko build.

* Move version file definition.

* Quote env vars.

* Update env vars.

* Update env vars.

* Update env vars.

* env var changes.

* env var changes.

* env var changes.

* env var changes.

* Update DOCKER_IMAGE var.

* Update DOCKER_IMAGE var in kaniko cmd.

* Update kaniko destination line.

* Update kaniko destination line.

* Moree variable madness.

* Programatically remove quotes from version tag.

* dug dump concepts api created and tested (#229)

Co-authored-by: Nathan Braswell <[email protected]>

* Update _version.py (#234)

* Version changes + separate build and publish.

* Semantic versioning prep.

* Add develop and master versioning and tagging.

* Bump version.

* Revert version to dug format.

* Ncpi index fix (#232)

* Renamed anvil to ncpi

* Update ncpi datasets catalog

* Modified script to download NCPI datasets into platform subfolders

* Updated NCPI integration dataset

* Removed unused variable

* Removed ncpi top level folder to spread results among subfolders

* Change output dir to data instead of ncpi subdir

* Moved NCPI subdirs into main data folder for ingest as per Yaphet's request

Co-authored-by: Alex Waldrop <[email protected]>

* Add github creds env var.

* Fix version typo.

* Initial commit

* Reduce ephemeral storage limits and requests

* More parsers (#248)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* consolidate dbgap format parser in single file , adds crdc and kfdrc parsers

* adding tests

* bump version

* parser when versions of studies are > 9

* test for version

* fix long text issues, and encoding errors

* nltk initialization

* change nltk approach for sliding window

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* update version

* remove cruft from merge

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>
Co-authored-by: Charles Bennett <[email protected]>
Co-authored-by: Nathaniel Braswell <[email protected]>
Co-authored-by: Nathan Braswell <[email protected]>
Co-authored-by: cnbennett3 <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>
Co-authored-by: Hoid <[email protected]>

* version bump

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>
Co-authored-by: Charles Bennett <[email protected]>
Co-authored-by: Nathaniel Braswell <[email protected]>
Co-authored-by: Nathan Braswell <[email protected]>
Co-authored-by: cnbennett3 <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>
Co-authored-by: Hoid <[email protected]>

* Sprint (#264)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Release/2.9.1 (#205)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Release/2.9.1

Renames SPARC datasets as SPARC instead of dbgap

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>

* Update _version.py (#206)

* Release/2.9.2 (#209)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parser.py

change to AnVIL

* Update test_parsers.py

update test

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>

* Release 2.9.3 (#244)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parser.py

change to AnVIL

* Update test_parsers.py

update test

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Initial Kaniko build.

* Move version file definition.

* Quote env vars.

* Update env vars.

* Update env vars.

* Update env vars.

* env var changes.

* env var changes.

* env var changes.

* env var changes.

* Update DOCKER_IMAGE var.

* Update DOCKER_IMAGE var in kaniko cmd.

* Update kaniko destination line.

* Update kaniko destination line.

* Moree variable madness.

* Programatically remove quotes from version tag.

* dug dump concepts api created and tested (#229)

Co-authored-by: Nathan Braswell <[email protected]>

* Update _version.py (#234)

* Version changes + separate build and publish.

* Semantic versioning prep.

* Add develop and master versioning and tagging.

* Ncpi index fix (#232)

* Renamed anvil to ncpi

* Update ncpi datasets catalog

* Modified script to download NCPI datasets into platform subfolders

* Updated NCPI integration dataset

* Removed unused variable

* Removed ncpi top level folder to spread results among subfolders

* Change output dir to data instead of ncpi subdir

* Moved NCPI subdirs into main data folder for ingest as per Yaphet's request

Co-authored-by: Alex Waldrop <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>
Co-authored-by: Howard Lander <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>
Co-authored-by: Charles Bennett <[email protected]>
Co-authored-by: Nathaniel Braswell <[email protected]>
Co-authored-by: Nathan Braswell <[email protected]>
Co-authored-by: cnbennett3 <[email protected]>
Co-authored-by: Alex Waldrop <[email protected]>

* Release/2.9.4 (#260)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Dev version bump (#202)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Attribute mapping from node to dug element (#203)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* adding more config options for node extraction

* some refactoring

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* Changed DbGaP to SPARC in the scicrunch parser (#204)

* Anvil (#207)

* Added updated anvil dataset catalog

* Added script for downloading all anvil data dicts

* Added current anvil data dictionaries to data folder to be used for indexing

* Anvil parser (#208)

* Release/2.8.0 (#198)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Update _version.py

* Update _version.py

updating version for final push to master

* Update factory.py

Adding more comments

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>

* Release/v2.9.0 (#201)

* Bumping version

* support for extracting dug elements from graph (#197)

* support for extracting dug elements from graph

* adding flag for enabling dug element extraction from graph

* adding new config for node_to dug element parsing

* adding more parameters to crawler to able configuration to element extraction logic

* add tests

* add tests for crawler

Co-authored-by: Yaphetkg <[email protected]>

* Display es scores (#199)

* Include ES scores in variable results

* Round ES score to 6

* Update _version.py (#200)

* Update _version.py

Co-authored-by: Carl Schreep <[email protected]>
Co-authored-by: Yaphetkg <[email protected]>
Co-authored-by: Ginnie Hench <[email protected]>

* anvil parser

* bump number of files test

* Update dbgap_parser.py

* Update anvil_dbgap_parse…
  • Loading branch information
13 people authored Jan 24, 2023
1 parent 63f2b01 commit 1a714e1
Show file tree
Hide file tree
Showing 14 changed files with 212 additions and 37 deletions.
9 changes: 9 additions & 0 deletions .coveragerc
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
[run]
include =
src/**

omit =
**/__init__.py
**/_version.py
src/dug/config.py
src/dug/hookspecs.py
81 changes: 81 additions & 0 deletions .githooks/commit-msg
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
#!/bin/bash

# set -xe

script_name=`basename "$0"`

# Make sure stdin is open--it's sometimes open when we come in here
exec 0< /dev/tty

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

# Get user's original commit text.
orig_msg=$(awk '!/^#/{print}' $1);

# Test for commit type/breaking change info already in commit text.
# This can occur if the user entered command using "-m" and the
# pre-commmit hook already handled getting this info.

rgx="^(feat:|fix:|test:|doc:|Merge).*"
if grep -E $rgx <<< "$orig_msg" > /dev/null 2>&1; then
echo "$script_name: Commit message passes commmit-type check."
echo "--------------------------------------------------------------------------------"
echo "$orig_msg"
echo "--------------------------------------------------------------------------------"
exit 0
fi

echo
echo 'Please select the type of your commmit from the following list:'
echo
nl $SCRIPT_DIR/commit_types.txt | awk '{ print $1, $2 }'
echo
count=$(wc -l $SCRIPT_DIR/commit_types.txt | awk '{ print $1 }')
n=0
while true; do
read -p 'Select option: ' n
# If $n is an integer between one and $count...
if [[ "$n" -eq $n ]] &&
[[ "$n" -gt 0 ]] &&
[[ "$n" -le "$count" ]]; then
break
fi
done

commit_type="$(sed -n "${n}p" $SCRIPT_DIR/commit_types.txt | awk '{ print $2 }')"
echo
yesno="n"
if [ $commit_type == "feat" ] || [ $commit_type == "fix" ]; then
while true; do
read -p 'Is this a breaking change [y/n]: ' yn
yesno="$(tr [A-Z] [a-z] <<< "$yn")"
if [ "$yesno" == "y" ] || [ "$yesno" == "yes" ] || \
[ "$yesno" == "n" ] || [ "$yesno" == "no" ]; then
break
fi
done
fi

breaking_commit=0
if [ "$yesno" == "y" ] || [ "$yesno" == "yes" ]; then
breaking_commit=1
fi

NL=$'\n'
if [ "$breaking_commit" -eq 1 ]; then
# Explicitly state it's a breaking change
msg="$msg${NL}${NL}BREAKING CHANGE"
fi

# Prepend commit type to original commit text.
new_msg="${commit_type}: $orig_msg";
echo $new_msg > $1;

echo
echo "Updated message:"
echo
echo "--------------------------------------------------------------------------------"
echo "$new_msg"
echo "--------------------------------------------------------------------------------"
echo
exit 0
4 changes: 4 additions & 0 deletions .githooks/commit_types.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
feature feat
fix fix
test test
documentation doc
1 change: 1 addition & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ COPY --chown=$USER . dug/
WORKDIR $HOME/dug

RUN make install
RUN make install.dug

# Run it
ENTRYPOINT dug
49 changes: 19 additions & 30 deletions Jenkinsfile
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
library 'pipeline-utils@master'

CCV = ""

pipeline {
agent {
kubernetes {
Expand Down Expand Up @@ -56,30 +58,33 @@ spec:
GITHUB_CREDS = credentials("${env.GITHUB_CREDS_ID_STR}")
REGISTRY = "${env.REGISTRY}"
REG_OWNER="helxplatform"
REG_APP="dug"
REPO_NAME="dug"
COMMIT_HASH="${sh(script:"git rev-parse --short HEAD", returnStdout: true).trim()}"
VERSION_FILE="src/dug/_version.py"
VERSION="${sh(script:'awk \'{ print $3 }\' src/dug/_version.py | xargs', returnStdout: true).trim()}"
IMAGE_NAME="${REGISTRY}/${REG_OWNER}/${REG_APP}"
TAG1="$BRANCH_NAME"
TAG2="$COMMIT_HASH"
TAG3="$VERSION"
TAG4="latest"
IMAGE_NAME="${REGISTRY}/${REG_OWNER}/${REPO_NAME}"
}
stages {
stage('Build') {
steps {
script {
container(name: 'go', shell: '/bin/bash') {
if (BRANCH_NAME.equals("master")) {
CCV = go.ccv()
}
}
container(name: 'kaniko', shell: '/busybox/sh') {
kaniko.build("./Dockerfile", ["$IMAGE_NAME:$TAG1", "$IMAGE_NAME:$TAG2", "$IMAGE_NAME:$TAG3", "$IMAGE_NAME:$TAG4"])
def tagsToPush = ["$IMAGE_NAME:$BRANCH_NAME", "$IMAGE_NAME:$COMMIT_HASH"]
if (CCV != null && !CCV.trim().isEmpty() && BRANCH_NAME.equals("master")) {
tagsToPush.add("$IMAGE_NAME:$CCV")
tagsToPush.add("$IMAGE_NAME:latest")
} else if (BRANCH_NAME.equals("develop")) {
def now = new Date()
def currTimestamp = now.format("yyyy-MM-dd'T'HH.mm'Z'", TimeZone.getTimeZone('UTC'))
tagsToPush.add("$IMAGE_NAME:$currTimestamp")
}
kaniko.buildAndPush("./Dockerfile", tagsToPush)
}
}
}
post {
always {
archiveArtifacts artifacts: 'image.tar', onlyIfSuccessful: true
}
}
}
stage('Test') {
steps {
Expand All @@ -88,21 +93,5 @@ spec:
'''
}
}
stage('Publish') {
steps {
script {
container(name: 'crane', shell: '/busybox/sh') {
def imageTagsPushAlways = ["$IMAGE_NAME:$TAG1", "$IMAGE_NAME:$TAG2"]
def imageTagsPushForDevelopBranch = ["$IMAGE_NAME:$TAG3"]
def imageTagsPushForMasterBranch = ["$IMAGE_NAME:$TAG4"]
image.publish(
imageTagsPushAlways,
imageTagsPushForDevelopBranch,
imageTagsPushForMasterBranch
)
}
}
}
}
}
}
14 changes: 13 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ DOCKER_OWNER = helxplatform
DOCKER_APP = dug
DOCKER_TAG = ${VERSION}
DOCKER_IMAGE = ${DOCKER_OWNER}/${DOCKER_APP}:$(DOCKER_TAG)
export PYTHONPATH = $(shell echo ${PWD})/src

.DEFAULT_GOAL = help

Expand All @@ -15,6 +16,11 @@ DOCKER_IMAGE = ${DOCKER_OWNER}/${DOCKER_APP}:$(DOCKER_TAG)
help:
@grep -E '^#[a-zA-Z\.\-]+:.*$$' $(MAKEFILE_LIST) | tr -d '#' | awk 'BEGIN {FS = ": "}; {printf "\033[36m%-20s\033[0m %s\n", $$1, $$2}'

init:
git --version
echo "Please make sure your git version is greater than 2.9.0. If it's not, this command will fail."
git config --local core.hooksPath .githooks/

#clean: Remove old build artifacts and installed packages
clean:
rm -rf build
Expand All @@ -27,13 +33,19 @@ clean:
install:
${PYTHON} -m pip install --upgrade pip
${PYTHON} -m pip install -r requirements.txt

#install.dug: Install dug as a library to the current Python environment.
install.dug:
${PYTHON} -m pip install .

#test: Run all tests
test:
# ${PYTHON} -m flake8 src
${PYTHON} -m pytest --doctest-modules src
${PYTHON} -m pytest tests
coverage run -m pytest tests

coverage:
coverage report

#build: Build Docker image
build:
Expand Down
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ To achieve this, we annotate study metadata with terms from [biomedical ontologi

## Quickstart

NOTE: You must run `make init` once you've cloned the repo to enable the commit-msg git hook so that conventional commits will apply automatically.

To install Dug in your environment , run `make install`. Alternatively,

```shell
Expand Down
2 changes: 1 addition & 1 deletion src/dug/_version.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = "2.9.5"
__version__ = "2.9.6"
4 changes: 3 additions & 1 deletion src/dug/core/parsers/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
from .topmed_tag_parser import TOPMedTagParser
from .topmed_csv_parser import TOPMedCSVParser
from .sprint_parser import SPRINTParser
from .bacpac_parser import BACPACParser


logger = logging.getLogger('dug')
Expand All @@ -28,6 +29,7 @@ def define_parsers(parser_dict: Dict[str, Parser]):
parser_dict["crdc"] = CRDCDbGaPParser()
parser_dict["kfdrc"] = KFDRCDbGaPParser()
parser_dict["sprint"] = SPRINTParser()
parser_dict["bacpac"] = BACPACParser()


class ParserNotFoundException(Exception):
Expand All @@ -46,4 +48,4 @@ def get_parser(hook, parser_name) -> Parser:
err_msg = f"Cannot find parser of type '{parser_name}'\n" \
f"Supported parsers: {', '.join(available_parsers.keys())}"
logger.error(err_msg)
raise ParserNotFoundException(err_msg)
raise ParserNotFoundException(err_msg)
48 changes: 48 additions & 0 deletions src/dug/core/parsers/bacpac_parser.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
import logging
from typing import List
from xml.etree import ElementTree as ET

from dug import utils as utils
from ._base import DugElement, FileParser, Indexable, InputFile

logger = logging.getLogger('dug')


class BACPACParser(FileParser):
# Class for parsing BACPAC data dictionaries in dbGaP XML format into a set of Dug Elements.

@staticmethod
def parse_study_name_from_filename(filename: str):
# Parse the form name from the xml filename
return filename.split('/')[-1].replace('.xml', '')

def __call__(self, input_file: InputFile) -> List[Indexable]:
logger.debug(input_file)
tree = ET.parse(input_file)
root = tree.getroot()
study_id = root.attrib['study_id']

# Parse study name from file handle
study_name = self.parse_study_name_from_filename(str(input_file))

if study_name is None:
err_msg = f"Unable to parse BACPAC Form name from data dictionary: {input_file}!"
logger.error(err_msg)
raise IOError(err_msg)

elements = []
for variable in root.iter('variable'):
description = variable.find('description').text or ""
elem = DugElement(elem_id=f"{variable.attrib['id']}",
name=variable.find('name').text,
desc=description.lower(),
elem_type="BACPAC",
collection_id=f"{study_id}",
collection_name=study_name)

# Add to set of variables
logger.debug(elem)
elements.append(elem)

# You don't actually create any concepts
return elements
1 change: 1 addition & 0 deletions src/dug/core/parsers/dbgap_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -72,3 +72,4 @@ def _get_element_type(self):
class KFDRCDbGaPParser(DbGaPParser):
def _get_element_type(self):
return "Kids First"

3 changes: 1 addition & 2 deletions src/dug/core/parsers/sprint_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@


class SPRINTParser(FileParser):
# Class for parsers NIDA Data dictionary into a set of Dug Elements
# Class for parsers SPRINT Data dictionary into a set of Dug Elements

@staticmethod
def parse_study_name_from_filename(filename: str):
Expand Down Expand Up @@ -41,7 +41,6 @@ def __call__(self, input_file: InputFile) -> List[Indexable]:
collection_id=f"{study_id}",
collection_name=study_name)

# Create NIDA links as study/variable actions
# Add to set of variables
logger.debug(elem)
elements.append(elem)
Expand Down
17 changes: 17 additions & 0 deletions tests/integration/data/bacpac_baseline_do_measures.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
<data_table id="bacpac_baseline_do_measures" study_id="bacpac_baseline_do_measures" study_name="bacpac_baseline_do_measures">
<variable id="record_id.demographic_and_baseline_characteristic_core_data_elements">
<name>record_id.demographic_and_baseline_characteristic_core_data_elements</name>
<description>Record ID</description>
<type>nan</type>
</variable>
<variable id="dob">
<name>dob</name>
<description>Date of birth:</description>
<type>text</type>
</variable>
<variable id="age">
<name>age</name>
<description>Age:</description>
<type>text</type>
</variable>
</data_table>
14 changes: 12 additions & 2 deletions tests/integration/test_parsers.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
from dug.core.parsers import DbGaPParser, NIDAParser, TOPMedTagParser, SciCrunchParser, AnvilDbGaPParser,\
CRDCDbGaPParser, KFDRCDbGaPParser, SPRINTParser
CRDCDbGaPParser, KFDRCDbGaPParser, SPRINTParser, BACPACParser
from tests.integration.conftest import TEST_DATA_DIR

def test_dbgap_parse_study_name_from_filename():
Expand Down Expand Up @@ -107,4 +107,14 @@ def test_sprint_parser():

def test_sprint_parser_form_name():
filename = "/opt/***/share/data/dug/input_files/sprint/adolescent_sleep_wake_scale_short_form_aswssf.xml"
assert SPRINTParser.parse_study_name_from_filename(filename) == "adolescent_sleep_wake_scale_short_form_aswssf"
assert SPRINTParser.parse_study_name_from_filename(filename) == "adolescent_sleep_wake_scale_short_form_aswssf"

def test_bacpac_parser():
parser = BACPACParser()
parse_file = str(TEST_DATA_DIR / "bacpac_baseline_do_measures.xml")
elements = parser(parse_file)
assert len(elements) == 3
for element in elements:
assert element.type == "BACPAC"
element_names = [e.name for e in elements]
assert "record_id.demographic_and_baseline_characteristic_core_data_elements" in element_names

0 comments on commit 1a714e1

Please sign in to comment.