Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] AnnotatedTextFieldMapperTests testBlockLoaderFromRowStrideReaderWithSyntheticSource failing #115076

Closed
elasticsearchmachine opened this issue Oct 18, 2024 · 7 comments · Fixed by #115114
Assignees
Labels
medium-risk An open issue or test failure that is a medium risk to future releases :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine >test-failure Triaged test failures from CI

Comments

@elasticsearchmachine
Copy link
Collaborator

Build Scans:

Reproduction Line:

./gradlew ":plugins:mapper-annotated-text:test" --tests "org.elasticsearch.index.mapper.annotatedtext.AnnotatedTextFieldMapperTests.testBlockLoaderFromColumnReaderWithSyntheticSource" -Dtests.seed=C4C2E205B3F04149 -Dtests.locale=as-IN -Dtests.timezone=Japan -Druntime.java=22

Applicable branches:
main

Reproduces locally?:
N/A

Failure History:
See dashboard

Failure Message:

java.lang.AssertionError: expected null, but was:<BlockSourceReader.Bytes[Norms]>

Issue Reasons:

  • [main] 3 failures in test testBlockLoaderFromRowStrideReaderWithSyntheticSource (1.4% fail rate in 220 executions)
  • [main] 2 failures in step part1 (6.5% fail rate in 31 executions)
  • [main] 2 failures in pipeline elasticsearch-intake (6.5% fail rate in 31 executions)

Note:
This issue was created using new test triage automation. Please report issues or feedback to es-delivery.

@elasticsearchmachine elasticsearchmachine added :Search Foundations/Mapping Index mappings, including merging and defining field types >test-failure Triaged test failures from CI labels Oct 18, 2024
elasticsearchmachine added a commit that referenced this issue Oct 18, 2024
…apperTests testBlockLoaderFromRowStrideReaderWithSyntheticSource #115076
@elasticsearchmachine
Copy link
Collaborator Author

This has been muted on branch main

Mute Reasons:

  • [main] 3 failures in test testBlockLoaderFromRowStrideReaderWithSyntheticSource (1.4% fail rate in 220 executions)
  • [main] 2 failures in step part1 (6.5% fail rate in 31 executions)
  • [main] 2 failures in pipeline elasticsearch-intake (6.5% fail rate in 31 executions)

Build Scans:

@elasticsearchmachine elasticsearchmachine added needs:risk Requires assignment of a risk label (low, medium, blocker) Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch labels Oct 18, 2024
@elasticsearchmachine
Copy link
Collaborator Author

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@cbuescher
Copy link
Member

Most likely related to c62a96c, will see if this just needs a quick test adjustment.

@cbuescher
Copy link
Member

The expectation to get a null BlockLoader in https://github.com/elastic/elasticsearch/blob/main/test/framework/src/main/java/org/elasticsearch/index/mapper/MapperTestCase.java#L1384 seems to not hold for TextFieldMapperTests and AnnotatedTextFieldMapperTests anymore after the change above. I'm not sure though if those two test cases need adaption (i.e. I saw probably overriding loadBlockExpected in those test might be necessary, but I don't understand synthetic source well enough to know how atm) or the more general MapperTestCase needs to be more lenient.
@martijnvg since you added the change above, would you mind taking a look?

@martijnvg
Copy link
Member

@cbuescher This is looks related to the change you refer to. I will look into this.

@martijnvg martijnvg added :StorageEngine/Mapping The storage related side of mappings and removed :Search Foundations/Mapping Index mappings, including merging and defining field types labels Oct 18, 2024
@elasticsearchmachine elasticsearchmachine added Team:StorageEngine and removed Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch labels Oct 18, 2024
@elasticsearchmachine
Copy link
Collaborator Author

Pinging @elastic/es-storage-engine (Team:StorageEngine)

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Oct 18, 2024
Handle ignore_above in source BlockLoader correctly when synthetic source is enabled.
This done by loading that *.original stored field that keyword fields introduce if a value is longer than specified ignore_above threshold.

Closes elastic#115076
@martijnvg
Copy link
Member

The reasons why this failure and other failures (#115066, #115073, #115074, #115076) occurred:

  • Tests assumed that no block loader should be loaded and null values would be omitted if synthetic source was enabled and text fields have a keyword sub field.
  • The block loader unit tests were using source mode mapping attribute instead of index.mapping.source.mode index setting (BlockLoaderContext only knows about the latter).
  • The [CI] TextFieldMapperTests testBlockLoaderFromRowStrideReaderWithSyntheticSource failing #115066 failure uncovered a real issue. If text fields has a keyword sub field that ignore_above threshold then fields are missing from the synthesized source. The source block loader should also include the *._original field that a keyword field mapper stores. (this gets stored if a keyword field mapper exceeds ignore above).

#115114 addresses all the listed issues here.

@martijnvg martijnvg added the medium-risk An open issue or test failure that is a medium risk to future releases label Oct 18, 2024
@elasticsearchmachine elasticsearchmachine removed the needs:risk Requires assignment of a risk label (low, medium, blocker) label Oct 18, 2024
lkts pushed a commit to lkts/elasticsearch that referenced this issue Oct 18, 2024
…apperTests testBlockLoaderFromRowStrideReaderWithSyntheticSource elastic#115076
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Oct 23, 2024
…equired stored fields (elastic#115114)

If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise in case of synthetic source many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example when unmapped fields exist or field values that exceed configured ignore above will not appear is _source.

This happens when field types fallback to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded.

This change also reverts the production code changes introduced via elastic#114903. That change only ensured that _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is better fix, since it handles also other cases and the SourceLoader implementation indicates which stored fields are needed.

Closes elastic#115076
elasticsearchmachine pushed a commit that referenced this issue Oct 23, 2024
…equired stored fields (#115114) (#115390)

If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise in case of synthetic source many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example when unmapped fields exist or field values that exceed configured ignore above will not appear is _source.

This happens when field types fallback to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded.

This change also reverts the production code changes introduced via #114903. That change only ensured that _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is better fix, since it handles also other cases and the SourceLoader implementation indicates which stored fields are needed.

Closes #115076
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this issue Oct 23, 2024
…equired stored fields (elastic#115114)

If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise in case of synthetic source many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example when unmapped fields exist or field values that exceed configured ignore above will not appear is _source.

This happens when field types fallback to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded.

This change also reverts the production code changes introduced via elastic#114903. That change only ensured that _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is better fix, since it handles also other cases and the SourceLoader implementation indicates which stored fields are needed.

Closes elastic#115076
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this issue Oct 25, 2024
…apperTests testBlockLoaderFromRowStrideReaderWithSyntheticSource elastic#115076
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this issue Oct 25, 2024
…equired stored fields (elastic#115114)

If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise in case of synthetic source many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example when unmapped fields exist or field values that exceed configured ignore above will not appear is _source.

This happens when field types fallback to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded.

This change also reverts the production code changes introduced via elastic#114903. That change only ensured that _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is better fix, since it handles also other cases and the SourceLoader implementation indicates which stored fields are needed.

Closes elastic#115076
jfreden pushed a commit to jfreden/elasticsearch that referenced this issue Nov 4, 2024
…apperTests testBlockLoaderFromRowStrideReaderWithSyntheticSource elastic#115076
jfreden pushed a commit to jfreden/elasticsearch that referenced this issue Nov 4, 2024
…equired stored fields (elastic#115114)

If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise in case of synthetic source many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example when unmapped fields exist or field values that exceed configured ignore above will not appear is _source.

This happens when field types fallback to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded.

This change also reverts the production code changes introduced via elastic#114903. That change only ensured that _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is better fix, since it handles also other cases and the SourceLoader implementation indicates which stored fields are needed.

Closes elastic#115076
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
medium-risk An open issue or test failure that is a medium risk to future releases :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine >test-failure Triaged test failures from CI
Projects
None yet
3 participants