Zentity not honoring nesting #115
Comments
I'm running into the same issue. zentity version: 1.6.1
Commenting to follow this issue.
Update: Along the way, I found some past solutions for nested fields - in the |
I believe I've found the issue. I don't think it's a bug, because zentity is executing the queries correctly. What's needed is an enhancement for supporting inner_hits in nested queries. When I execute your resolution job with the _explanation parameter enabled, the hit includes this:
"_explanation": {
"resolvers": {
"street_zip": {
"attributes": [
"address.full_street",
"address.zip"
]
}
},
"matches": [
{
"attribute": "address.full_street",
"target_field": "Addresses.FullStreet.clean",
"target_value": [
"123 Main St",
"567 North St"
],
"input_value": "567 North St",
"input_matcher": "simple_nested_address_street",
"input_matcher_params": {}
},
{
"attribute": "address.zip",
"target_field": "Addresses.Zip",
"target_value": [
"02632",
"22201"
],
"input_value": "02632",
"input_matcher": "simple_nested_address_zip",
"input_matcher_params": {}
}
]
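}
We can see how the street_zip resolver matched: the target_value of each attribute holds the values from both nested objects, so "567 North St" and "02632" both count as matches even though they belong to different addresses. You can also have zentity return the queries it constructs.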
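To make the effect concrete, here is a minimal Python sketch of what happens when the fields of nested objects are read as flat arrays, compared with evaluating each nested object on its own. It is an illustration only, not zentity's code: the document and input values are copied from this issue, and the helper names (flat_values, matches_flat, matches_per_object) are hypothetical.

```python
# Document from this issue, reduced to the two fields used by the street_zip resolver.
doc = {
    "Addresses": [
        {"FullStreet": "123 Main St", "Zip": "02632"},
        {"FullStreet": "567 North St", "Zip": "22201"},
    ]
}

# Input attribute values from the resolution job.
wanted = {"FullStreet": "567 North St", "Zip": "02632"}


def flat_values(document, field):
    """Collect one field from every nested address, losing the object boundaries."""
    return [address[field] for address in document["Addresses"] if field in address]


def matches_flat(document, criteria):
    """Each value only has to appear somewhere in the document."""
    return all(value in flat_values(document, field) for field, value in criteria.items())


def matches_per_object(document, criteria):
    """All values have to appear together in a single nested address."""
    return any(
        all(address.get(field) == value for field, value in criteria.items())
        for address in document["Addresses"]
    )


print(flat_values(doc, "FullStreet"))   # ['123 Main St', '567 North St']
print(matches_flat(doc, wanted))        # True
print(matches_per_object(doc, wanted))  # False
```

The flat check corresponds to the behavior reported in this issue, and the per-object check corresponds to the behavior the reporter expected.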
Let's copy the query clause that was constructed in this resolution job for the address.full_street attribute and run it directly:
GET person_list/_search
{
"query": {
"bool": {
"_name": "address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0",
"filter": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"Addresses.FullStreet.clean": "567 North St"
}
}
}
}
}
}
}
Here's the response:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "person_list",
"_id": "1",
"_score": 0.0,
"_source": {
"Names": [
{
"FirstName": "Jane",
"MiddleName": "D",
"LastName": "Smith"
}
],
"Addresses": [
{
"FullStreet": "123 Main St",
"City": "Barnstable",
"State": "MA",
"Zip": "02632",
"Zip4": ""
},
{
"FullStreet": "567 North St",
"City": "Arlington",
"State": "VA",
"Zip": "22201"
}
]
},
"matched_queries": [
"address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0"
]
}
]
}
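}
When zentity receives this hit, it sees a single document that has two values for Addresses.FullStreet and no indication of which nested object matched. However, zentity could determine which nested object matched if we added inner_hits to the nested query:
GET person_list/_search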
{
"query": {
"bool": {
"_name": "address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0",
"filter": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"Addresses.FullStreet.clean": "567 North St"
}
},
"inner_hits": {}
}
}
}
}
}
Here's the response:
{
"took": 15,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "person_list",
"_id": "1",
"_score": 0.0,
"_source": {
"Names": [
{
"FirstName": "Jane",
"MiddleName": "D",
"LastName": "Smith"
}
],
"Addresses": [
{
"FullStreet": "123 Main St",
"City": "Barnstable",
"State": "MA",
"Zip": "02632",
"Zip4": ""
},
{
"FullStreet": "567 North St",
"City": "Arlington",
"State": "VA",
"Zip": "22201"
}
]
},
"matched_queries": [
"address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0"
],
"inner_hits": {
"Addresses": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.6931471,
"hits": [
{
"_index": "person_list",
"_id": "1",
"_nested": {
"field": "Addresses",
"offset": 1
},
"_score": 0.6931471,
"_source": {
"FullStreet": "567 North St",
"City": "Arlington",
"State": "VA",
"Zip": "22201"
}
}
]
}
}
}
}
]
}
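}
Notice the response lists only the street address of the nested object that matched, 567 North St, under inner_hits. Unfortunately, you can't safely add inner_hits to every nested clause as-is. By default, Elasticsearch names each inner_hits block after its nested path, and it rejects a search request in which two inner_hits blocks share a name.
You can see that by trying to run this query, which attempts to search with two nested clauses on the same path, each with an unnamed inner_hits block:
GET person_list/_search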
{
"query": {
"bool": {
"filter": [
{
"bool": {
"_name": "address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0",
"filter": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"Addresses.FullStreet.clean": "567 North St"
}
},
"inner_hits": {}
}
}
}
},
{
"bool": {
"_name": "address.zip:Addresses.Zip:simple_nested_addresses:MDI2MzI=:1",
"filter": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"Addresses.Zip": "02632"
}
},
"inner_hits": {}
}
}
}
}
]
}
}
}
What would safely work is if each inner_hits block had a unique name, such as the _name of its enclosing clause:
GET person_list/_search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"_name": "address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0",
"filter": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"Addresses.FullStreet.clean": "567 North St"
}
},
"inner_hits": {
"name": "address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0"
}
}
}
}
},
{
"bool": {
"_name": "address.zip:Addresses.Zip:simple_nested_addresses:MDI2MzI=:1",
"filter": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"Addresses.Zip": "02632"
}
},
"inner_hits": {
"name": "address.zip:Addresses.Zip:simple_nested_addresses:MDI2MzI=:1"
}
}
}
}
}
]
}
}
}
Here's the response:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "person_list",
"_id": "1",
"_score": 0.0,
"_source": {
"Names": [
{
"FirstName": "Jane",
"MiddleName": "D",
"LastName": "Smith"
}
],
"Addresses": [
{
"FullStreet": "123 Main St",
"City": "Barnstable",
"State": "MA",
"Zip": "02632",
"Zip4": ""
},
{
"FullStreet": "567 North St",
"City": "Arlington",
"State": "VA",
"Zip": "22201"
}
]
},
"matched_queries": [
"address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0",
"address.zip:Addresses.Zip:simple_nested_addresses:MDI2MzI=:1"
],
"inner_hits": {
"address.full_street:Addresses.FullStreet.clean:simple_nested_addresses:NTY3IE5vcnRoIFN0:0": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.6931471,
"hits": [
{
"_index": "person_list",
"_id": "1",
"_nested": {
"field": "Addresses",
"offset": 1
},
"_score": 0.6931471,
"_source": {
"FullStreet": "567 North St",
"City": "Arlington",
"State": "VA",
"Zip": "22201"
}
}
]
}
},
"address.zip:Addresses.Zip:simple_nested_addresses:MDI2MzI=:1": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.6931471,
"hits": [
{
"_index": "person_list",
"_id": "1",
"_nested": {
"field": "Addresses",
"offset": 0
},
"_score": 0.6931471,
"_source": {
"FullStreet": "123 Main St",
"City": "Barnstable",
"State": "MA",
"Zip": "02632",
"Zip4": ""
}
}
]
}
}
}
}
]
}
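}
Notice that there are two sets of inner_hits, one for each named clause. The street clause matched the nested object at offset 1, and the zip clause matched the one at offset 0, so zentity could tell that the two values came from different addresses. Unfortunately, this isn't possible in zentity yet. The matcher clause would need to support a variable reference to the clause_name, which is generated at query time. An implementation would look something like this:
{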
...,
"simple_nested_addresses": {
"clause": {
"nested": {
"path": "Addresses",
"query": {
"match": {
"{{ field }}": "{{ value }}"
}
},
"inner_hits": {
"name": "{{ clause_name }}" // or simply "_name" as it's referred to in the "bool" query
}
}
},
"quality": 0.95
},
...
}
Additionally, zentity would need to check the query results for any instances of inner_hits and use them to determine which nested objects matched. I think this describes the problem well, and I think this can be targeted for inclusion in zentity 1.9.0.
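As a rough illustration of the two pieces described above (naming each inner_hits block after its clause, then reading the matched offsets back from the hit), here is a minimal Python sketch. It is not how zentity implements anything. The function names are hypothetical, and the name format is only inferred from the _name values visible in the queries in this thread, where the fourth segment is the Base64-encoded input value.

```python
import base64


def clause_name(attribute, field, matcher, value, counter):
    """Build a name shaped like the _name values shown in the queries above."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attribute}:{field}:{matcher}:{encoded}:{counter}"


def nested_clause(path, field, value, name):
    """Render a nested match clause whose inner_hits block reuses the clause name."""
    return {
        "bool": {
            "_name": name,
            "filter": {
                "nested": {
                    "path": path,
                    "query": {"match": {field: value}},
                    "inner_hits": {"name": name},
                }
            },
        }
    }


def matched_offsets(hit):
    """Map each inner_hits name to the set of nested offsets that matched."""
    offsets = {}
    for name, block in hit.get("inner_hits", {}).items():
        offsets[name] = {inner["_nested"]["offset"] for inner in block["hits"]["hits"]}
    return offsets


def same_object_matched(hit, names):
    """True if at least one nested object matched every named clause."""
    offsets = matched_offsets(hit)
    common = None
    for name in names:
        found = offsets.get(name, set())
        common = found if common is None else common & found
    return bool(common)


street = clause_name("address.full_street", "Addresses.FullStreet.clean",
                     "simple_nested_addresses", "567 North St", 0)
zip_code = clause_name("address.zip", "Addresses.Zip",
                       "simple_nested_addresses", "02632", 1)

# Trimmed copy of the hit shown above: the street matched offset 1, the zip offset 0.
hit = {
    "inner_hits": {
        street: {"hits": {"hits": [{"_nested": {"field": "Addresses", "offset": 1}}]}},
        zip_code: {"hits": {"hits": [{"_nested": {"field": "Addresses", "offset": 0}}]}},
    }
}

print(street)
print(same_object_matched(hit, [street, zip_code]))  # False
```

One caveat for a real implementation: inner_hits returns only a limited number of nested hits per block by default (the size option), so a document with many matching nested objects would need a larger size for the offsets to be complete.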
Awesome, @davemoore- !!
This feature is being worked on in a branch. Given that nested objects often represent separate entities with multiple attributes (e.g. an address identified by a street, city, state, and zip), I may need to implement compound attributes (#28) before completing this feature. Both features will add quite a bit of complexity to the project. Starting with compound attributes makes sense because it could be applied either to "flat" indices or to indices with nested objects. Once zentity shows that it can handle the complexity of compound attributes with flat indices, I'll feel more comfortable introducing yet another layer of complexity with nested objects.
Environment
Describe the bug
Zentity does not appear to honor a nested index structure. I have an index definition with person data fields (names, addresses, phones, etc.). The addresses are nested in the index definition (e.g. Addresses.FullStreet, Addresses.City). The matcher in the model definition specifies that they are nested, and the indices section uses dot notation to map to the index definition (e.g. Addresses.FullStreet). When I search a street address, city, state, and zip, any one of the street addresses will match against any one of the cities, states, and zips, rather than a street address being grouped with its particular city, state, and zip. I also can't seem to get my firstname, lastname, state resolver to match, but I'm assuming I have a typo somewhere.
Expected behavior
I expected that each full address (street, city, state, zip) would be evaluated separately from other full addresses.
Steps to reproduce
{"Names":[{"FirstName":"Jane","MiddleName":"D","LastName":"Smith"}],"Addresses":[{"FullStreet":"123 Main St","City":"Barnstable","State":"MA","Zip":"02632","Zip4":""},{"FullStreet":"567 North St","City":"Arlington","State":"VA","Zip":"22201"}]}
{"index": {"_id": "2", "_index": "person_watch_list"}}
{
"attributes": {
"name.last_name": ["Smith"],
"name.first_name": ["Jane"],
"address.full_street":["567 North St"],
"address.city":["Barnstable"],
"address.state":["MA"],
"address.zip":["02632"]
}
}
Additional context
The search above matches with the street_city and street_zip resolvers. I would hope this would not match, and would only match if I changed address.full_street to "123 Main St". Thank you in advance for your help!