Refactoring `search`
Currently `ami search` is a monolithic command which carries out:
- norma-transformation if required
- word frequencies
- searches by dictionary
- specialist syntactic searches (`gene`, `species`, `acronyms`, `regex`, `identifiers`). These have all stopped working. They should be restored.
- datatables analysis which includes
  - bibliography
  - links to sources (e.g. EPMC)
  - analysis of `results/` folder
  - analysis of `words` folder
  - links to Wikipedia (not yet Wikidata)
- cooccurrence and frequency plots
This was all done through an `AMIArgProcessor` commandline which is now being gradually obsoleted.
An increasing number of transformations are needed, so these should be under `picocli` control. Maybe reinstate `norma` / `ami transform`.
This has been done but not customised (i.e. one cannot easily change stopwords or remove junk). Also, linking to Wikipedia is very crude and misleading.
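As a rough illustration of the kind of customisation that is missing, here is a minimal Python sketch of a word-frequency count with a replaceable stopword list and a "junk" filter. It is not how `ami` implements this: the function name `word_frequencies`, its parameters, and the default stopword list are all hypothetical, and it assumes the customisation in question concerns the word-frequency step.

```python
import re
from collections import Counter

# Hypothetical default list; NOT the stopword list that ami ships with.
DEFAULT_STOPWORDS = frozenset({"the", "of", "and", "in", "a", "to", "is"})


def word_frequencies(text, stopwords=DEFAULT_STOPWORDS,
                     junk_pattern=r"\d+|.", min_count=1):
    """Count lowercased words, skipping stopwords and junk tokens.

    A token is treated as junk when it matches junk_pattern in full;
    the default drops pure numbers and single characters.
    """
    junk = re.compile(junk_pattern)
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(
        t for t in tokens
        if t not in stopwords and not junk.fullmatch(t)
    )
    # Keep only words that occur at least min_count times.
    return {word: n for word, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    sample = "The Zika virus and the dengue virus in 2016"
    print(word_frequencies(sample))
    # Swap in a project-specific stopword list without touching the code.
    print(word_frequencies(sample, stopwords={"virus"}))
```

Passing the stopwords and the junk pattern as arguments, rather than hard-coding them, is the point of the sketch: a user could then supply a domain-specific list per project.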
The legacy search must be kept, but should have no triggering of `ami words` or `datatables` or `cooccurrence`. It should simply build the `results/` folders. The search includes the following:
- lowercasing
- stopwords
- stemming
- phrases ("trailing words")
These need mending
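The four steps listed above can be sketched as a small pipeline in which each step is an independent switch. This is a Python illustration under stated assumptions, not the `ami` code: all names (`normalise`, `naive_stem`, `find_phrase`, `search`) are hypothetical, the stemmer is a deliberately crude stand-in for a real one such as Porter, and "phrases" is interpreted here as matching a run of consecutive tokens.

```python
import re

# Illustrative stopword list only; not the one used by ami.
STOPWORDS = frozenset({"the", "of", "and", "in", "a"})


def naive_stem(token):
    """Crude suffix stripping; a placeholder for a real stemmer."""
    if token.endswith(("us", "ss")):  # leave "virus", "class" alone
        return token
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalise(text, lowercase=True, stopwords=STOPWORDS, stem=True):
    """Tokenise text, applying each optional step in turn."""
    if lowercase:
        text = text.lower()
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    tokens = [t for t in tokens if t.lower() not in stopwords]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return tokens


def find_phrase(tokens, phrase_tokens):
    """Return the start indices where phrase_tokens occurs consecutively."""
    n = len(phrase_tokens)
    return [i for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == phrase_tokens]


def search(text, term, **options):
    """Normalise document and term identically, then match the phrase."""
    doc = normalise(text, **options)
    query = normalise(term, **options)
    return find_phrase(doc, query) if query else []


if __name__ == "__main__":
    text = "Mosquitoes carrying the Zika virus"
    print(normalise(text))
    print(search(text, "zika viruses"))              # stemming on
    print(search(text, "zika viruses", stem=False))  # stemming off
```

Running the document and the search term through the same `normalise` call keeps the two consistent, so a step such as stemming can be switched off for one search without editing the dictionary.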
This has been partially done. It needs better control over Wikidata, icons, etc.
This has been partially done but not customised.