Feature/vocab guided query expansion #1544
Open
Depends on libris/definitions#508 and libris/lxlviewer#1194.
The main objective with this is to get rid of obviously bad hardcodings and make the whole query expansion machinery more consistent. Apart from some general cleanup/refactoring, the main improvements are:
- Alternative property paths are generated only from the `:integral` properties that are applicable to the queried type(s).
  Example query: `type:Work yearPublished:x`
  There are three `:integral` properties (`:translationOf`, `:exactMatch`, `:hasInstance`) applicable to `:Work` which, if we didn't know the domain of `:yearPublished`, would give us these alternative paths for `:yearPublished`:
  - `translationOf.yearPublished`
  - `exactMatch.yearPublished`
  - `hasInstance.yearPublished`

  We can however ignore both `:translationOf` and `:exactMatch`, since we know that `:yearPublished :domain :Instance` and `:Instance` is not in range of `:translationOf` and `:exactMatch`, i.e. we won't find `:yearPublished` on any resource that these properties point to. So the only path that will be checked is `hasInstance.yearPublished` (`@reverse.instanceOf.publication.year` when fully expanded).
- Each branch of a query is expanded based on its own type, e.g. `(type:Work x) OR (type:Instance p1:v1) OR (type:Agent p2:v2)`, where field boosting for `x` is based on the `:Work` type and expansion of `p1` and `p2` is based on `:Instance` and `:Agent` respectively.
- `librisxl/whelk-core/src/main/groovy/whelk/search2/querytree/Group.java`, line 110 in 8d3fa03:
  If, for example, the query is `type:(Electronic Instance)`, then `Instance` is removed, just like in the old search: `librisxl/whelk-core/src/main/groovy/whelk/search/ESQuery.groovy`, line 440 in bb09334.
- Use `:category :shorthand` in the definitions to know which properties should be expanded via `owl:propertyChainAxiom` (a16fcb3).
- `librisxl/whelk-core/src/main/groovy/whelk/search2/querytree/PropertyValue.java`, line 76 in 8d3fa03:
  We only did this for Work and Instance before (hardcoded). Eventually we'll need to make this optional.
- Mapping `hasInstanceType` to `Format` (Feature/update filter label mappings lxlviewer#1194) is enough to get the filter headers right, at least when searching for works, which is most important at this point. We can figure something out for instance search later.

The `Disambiguate` class was also a mess, so I decided to give it a proper makeover. At least it is less messy now.

Can't guarantee that this won't break anything. I planned to add more tests but ran out of time. It seems to work when I've tested it myself.
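
For reviewers who want the `:integral` pruning from the `type:Work yearPublished:x` example in a compact form, here is a minimal sketch. It is not the code in this PR: the class `IntegralPathSketch`, the record `Integral`, and the method `alternativePaths` are hypothetical names, the vocab is a hardcoded toy table, and the ranges given for `translationOf` and `exactMatch` are assumptions (the description only says that `:Instance` is not in their range). Subclass reasoning is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names exist in librisxl.
class IntegralPathSketch {
    // An :integral property, the types it applies to, and the types it points to.
    record Integral(String name, Set<String> domain, Set<String> range) {}

    // Toy vocab. The ranges of translationOf and exactMatch are assumed.
    static final List<Integral> INTEGRAL = List.of(
            new Integral("translationOf", Set.of("Work"), Set.of("Work")),
            new Integral("exactMatch", Set.of("Work"), Set.of("Work")),
            new Integral("hasInstance", Set.of("Work"), Set.of("Instance")));

    // Known domains of queried properties (property -> type).
    static final Map<String, String> PROPERTY_DOMAIN = Map.of("yearPublished", "Instance");

    // Returns the alternative paths for `property` when querying `queriedType`.
    static List<String> alternativePaths(String queriedType, String property) {
        String domain = PROPERTY_DOMAIN.get(property); // null if the domain is unknown
        List<String> paths = new ArrayList<>();
        for (Integral integral : INTEGRAL) {
            // Skip :integral properties that are not applicable to the queried type.
            if (!integral.domain().contains(queriedType)) {
                continue;
            }
            // If the domain is known, skip properties that cannot point to it.
            if (domain != null && !integral.range().contains(domain)) {
                continue;
            }
            paths.add(integral.name() + "." + property);
        }
        return paths;
    }

    public static void main(String[] args) {
        // Prints [hasInstance.yearPublished]
        System.out.println(alternativePaths("Work", "yearPublished"));
    }
}
```

With an unknown domain (a property missing from `PROPERTY_DOMAIN`), the sketch falls back to all three paths, which matches the "if we didn't know the domain" case described above.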