docanalysis flags

Shweata N. Hegde edited this page Feb 11, 2022 · 5 revisions

  • --run_pygetpapers -> queries EuropePMC via pygetpapers; --query, --hits and --project_name are required
  • --query -> query to pygetpapers
  • --hits -> number of papers to download from pygetpapers
  • --project_name -> name of CProject folder (if pygetpapers is already run) or
  • --section -> section of paper from which you want to extract information (entities and/or key phrases)
  • --dictionary -> ami dictionary containing terms to narrow down sentences to extract entities from (semi-supervised entity extraction)
  • --entity_extraction -> extracts specified entities chosen from a list of entities (CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART, GGP, SO, TAXON, CHEBI, GO, CL)
  • --key_phrase_extraction -> gives a list of key phrases extracted from either specific sections or sentences for each paper in the CProject (unsupervised key phrase extraction)
  • --make_ami_dict -> makes separate dictionaries from keyphrases and entities extracted. Merges duplicate entries into one
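
To make the relationship between the flags above concrete, here is a minimal argparse sketch. It is not the docanalysis source: the program name `docanalysis-sketch`, the helper names `build_parser` and `parse_flags`, and the choice of which flags are on/off switches versus value-taking are all assumptions made for illustration. Only the flag names, the entity labels, and the rule that --run_pygetpapers needs --query, --hits and --project_name come from the list above.

```python
import argparse

# Entity labels copied from the --entity_extraction entry above.
ENTITY_LABELS = [
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE", "LAW", "LOC",
    "MONEY", "NORP", "ORDINAL", "ORG", "PERCENT", "PERSON", "PRODUCT",
    "QUANTITY", "TIME", "WORK_OF_ART", "GGP", "SO", "TAXON", "CHEBI",
    "GO", "CL",
]


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="docanalysis-sketch")
    # Assumption: these three are boolean switches.
    parser.add_argument("--run_pygetpapers", action="store_true")
    parser.add_argument("--key_phrase_extraction", action="store_true")
    parser.add_argument("--make_ami_dict", action="store_true")
    # Assumption: these take a single value each.
    parser.add_argument("--query")
    parser.add_argument("--hits", type=int)
    parser.add_argument("--project_name")
    parser.add_argument("--section")
    parser.add_argument("--dictionary")
    # Assumption: one or more labels, each checked against the documented list.
    parser.add_argument("--entity_extraction", nargs="+", choices=ENTITY_LABELS)
    return parser


def parse_flags(argv):
    """Parse argv and enforce that --run_pygetpapers has its companion flags."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.run_pygetpapers:
        missing = [
            name for name in ("query", "hits", "project_name")
            if getattr(args, name) is None
        ]
        if missing:
            # parser.error prints a usage message and exits with status 2.
            parser.error(
                "--run_pygetpapers also needs: "
                + ", ".join("--" + name for name in missing)
            )
    return args
```

For example, `parse_flags(["--run_pygetpapers", "--query", "terpene", "--hits", "10", "--project_name", "demo"])` is accepted by this sketch, while `parse_flags(["--run_pygetpapers"])` alone exits with a usage error. The query and project name here are placeholder values.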

MIGHT NOT IMPLEMENT IMMEDIATELY

  • --output_format -> specify the output format (csv, json), defaults to csv
  • --demo -> runs a demo entity (GPE and ORG) and key phrase extraction