[ESQL] Remove Named Expcted Types map from testing infrastructure #111213

not-napoleon
Member

This removes the NAMED_EXPECTED_TYPES map from the testing infrastructure. It had become difficult to maintain, and it pushed the error message text farther away from the code actually testing it. This PR introduces some small functional interfaces that give scalar function tests finer-grained control over their expected error messages.

We could do more here, but I think this is a good step in the right direction.
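
To illustrate the change described above, here is a minimal, self-contained sketch contrasting the two approaches. The names `NAMED_EXPECTED_TYPES` and `PositionalErrorMessageSupplier` come from this PR; the method signature, the use of `String` in place of `DataType`, the map contents, and both helper methods are assumptions made for illustration and are not the actual Elasticsearch code.

```java
import java.util.Map;
import java.util.Set;

public class ExpectedTypeSketch {
    /** Assumed shape: (valid types for a position, position index) -> expected type text. */
    @FunctionalInterface
    interface PositionalErrorMessageSupplier {
        String apply(Set<String> validForPosition, int position);
    }

    // Before (assumed shape): one shared lookup table, kept far away from the tests using it.
    static final Map<Set<String>, String> NAMED_EXPECTED_TYPES = Map.of(
        Set.of("keyword", "text"), "string",
        Set.of("datetime"), "datetime"
    );

    // Hypothetical helper: every new combination of valid types needs a new map entry.
    static String expectedFromMap(Set<String> validForPosition) {
        String name = NAMED_EXPECTED_TYPES.get(validForPosition);
        if (name == null) {
            throw new IllegalStateException("no name registered for " + validForPosition);
        }
        return name;
    }

    // After: each test hands over its own supplier, so the text sits next to the test.
    static String expectedFromSupplier(PositionalErrorMessageSupplier supplier, Set<String> validForPosition, int position) {
        return supplier.apply(validForPosition, position);
    }

    public static void main(String[] args) {
        System.out.println(expectedFromMap(Set.of("text", "keyword")));
        System.out.println(expectedFromSupplier((v, p) -> p == 0 ? "string" : "datetime", Set.of("datetime"), 1));
    }
}
```

With the supplier, a test for a function whose valid types have no entry in the shared table no longer needs to touch shared infrastructure.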

@not-napoleon not-napoleon added >test Issues or PRs that are addressing/adding tests :Analytics/ES|QL AKA ESQL v8.16.0 labels Jul 23, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 23, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@@ -617,12 +625,23 @@ protected interface TypeErrorMessageSupplier {
String apply(boolean includeOrdinal, List<Set<DataType>> validPerPosition, List<DataType> types);
}

@FunctionalInterface
protected interface PositionalErrorMessageSupplier {
Member

Probably wants javadoc given the number of places we're using it.

@@ -74,10 +71,14 @@ public abstract class AbstractScalarFunctionTestCase extends AbstractFunctionTes
*/
protected static Iterable<Object[]> parameterSuppliersFromTypedDataWithDefaultChecks(
boolean entirelyNullPreservesType,
- List<TestCaseSupplier> suppliers
+ List<TestCaseSupplier> suppliers,
+ PositionalErrorMessageSupplier positionalErrorMessageSupplier
Member

👍 for giving this a name.

(v, p) -> switch (p) {
case 0 -> "string";
case 1 -> "datetime";
default -> "";
Member

should this throw on unknown?

Member Author

We get to the default if the code that figures out which position has the bad argument comes up with a number higher than the number of arguments, for example if it says "the fourth argument to + is bad". That should really never happen, and I didn't put a lot of thought into what to do if it did; I just made the switch expression happy.

Member

I tend to make them happy by throwing if I don't expect it. I suppose in this case it's not a big difference.
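
As a sketch of the alternative discussed in this thread, the `default` branch of the switch expression can throw instead of returning an empty string. The interface signature, the `String`-based type set, and the exception type and message below are assumptions for illustration; only the `(v, p) -> switch (p)` shape and the `"string"`/`"datetime"` cases come from the diff.

```java
import java.util.Set;

public class ThrowingDefaultSketch {
    /** Assumed shape of the supplier used by the tests in this PR. */
    @FunctionalInterface
    interface PositionalErrorMessageSupplier {
        String apply(Set<String> validForPosition, int position);
    }

    // Same cases as in the diff, but an unexpected position fails loudly
    // instead of silently producing an empty expectation.
    static final PositionalErrorMessageSupplier STRICT = (v, p) -> switch (p) {
        case 0 -> "string";
        case 1 -> "datetime";
        default -> throw new IllegalArgumentException("unexpected argument position: " + p);
    };

    public static void main(String[] args) {
        System.out.println(STRICT.apply(Set.of(), 0));
        System.out.println(STRICT.apply(Set.of(), 1));
    }
}
```

Either variant keeps the switch expression exhaustive; the throwing one points directly at a bug in the position calculation rather than surfacing it as a message mismatch.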

@@ -41,7 +41,7 @@ public static Iterable<Object[]> parameters() {
cartesianPoints(cases, "mv_first", "MvFirst", DataType.CARTESIAN_POINT, (size, values) -> equalTo(values.findFirst().get()));
geoShape(cases, "mv_first", "MvFirst", DataType.GEO_SHAPE, (size, values) -> equalTo(values.findFirst().get()));
cartesianShape(cases, "mv_first", "MvFirst", DataType.CARTESIAN_SHAPE, (size, values) -> equalTo(values.findFirst().get()));
- return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases);
+ return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases, (v, p) -> "");
Member

Should this throw? Maybe say something like all types are valid or something.

Member Author

I mean, this will fail if a type ends up not being valid. The function will generate an error message with the string "representable" here, which will not match the test's expected empty string. Throwing from here doesn't really make that any clearer in my opinion? But I'm open to discuss.
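
The argument above can be sketched as follows: if a type ever turns out to be invalid, the message produced by the function names some set of types, while the test built from `(v, p) -> ""` expects an empty type description, so the comparison fails. The message template and the `typeError` helper here are purely hypothetical stand-ins; the real wording is produced by each function's type resolution.

```java
public class EmptyExpectationSketch {
    // Hypothetical message template, used only to make the comparison concrete.
    static String typeError(int position, String expectedTypes, String foundType) {
        return "argument " + position + " must be [" + expectedTypes + "], found type [" + foundType + "]";
    }

    public static void main(String[] args) {
        // What the function under test would report if it rejected a type.
        String actual = typeError(0, "representable", "unsupported");
        // What a test using the supplier (v, p) -> "" would expect.
        String expected = typeError(0, "", "unsupported");
        // The two strings differ, so such a test would fail without any explicit throw.
        System.out.println(expected.equals(actual));
    }
}
```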

@@ -38,7 +38,7 @@ public static Iterable<Object[]> parameters() {
longs(cases, "mv_max", "MvMax", (size, values) -> equalTo(values.max().getAsLong()));
unsignedLongs(cases, "mv_max", "MvMax", (size, values) -> equalTo(values.reduce(BigInteger::max).get()));
dateTimes(cases, "mv_max", "MvMax", (size, values) -> equalTo(values.max().getAsLong()));
- return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases);
+ return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases, (v, p) -> "representableNonSpatial");
Member

Oh boy.

o,
v,
t,
(l, p) -> "datetime, double, integer, ip, keyword, long, text, unsigned_long or version"
Member

This might be better off returning a closure. But not sure.

Member Author

I poked around at it a bit, and I didn't think it looked all that much better. Please feel free to submit a follow-up that cleans it up if you want.

Member

🤘

Contributor

@ivancea ivancea left a comment

LGTM.

However, I feel like we're missing the possibility to standardize the error messages here. The current solution is far from ideal, but it somehow guides you into using a specific format. I wonder if we could move towards stricter generation, like using the @Param data to generate it.

Just thinking aloud. The only part I don't "like" here is that we're decentralizing the messages, which will make it more tedious to undo later, if we want to do that.

@@ -38,7 +38,7 @@ public static Iterable<Object[]> parameters() {
longs(cases, "mv_min", "MvMin", (size, values) -> equalTo(values.min().getAsLong()));
unsignedLongs(cases, "mv_min", "MvMin", (size, values) -> equalTo(values.reduce(BigInteger::min).get()));
dateTimes(cases, "mv_min", "MvMin", (size, values) -> equalTo(values.min().getAsLong()));
- return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases);
+ return parameterSuppliersFromTypedDataWithDefaultChecks(false, cases, (v, p) -> "representableNonSpatial");
Contributor

I think this message is wrong (?), shouldn't it be human-readable?

Member

It is wrong, but let's grab it in a follow up, I think.

Member Author

Yes, it should. But again, this is just putting the string into the test, not changing the string that the function currently sends on a type error. I do think it's worth spending some time to review our type errors and update them, but that's not the goal of this PR.

@nik9000
Member

nik9000 commented Jul 24, 2024

> However, I feel like we're missing the possibility to standardize the error messages here.

I think we can get that by making some "canned" error message suppliers or something like that. That way, places that aren't standard can use this, and places that are can provide the closure with a named method call. But I think this is a step in the right direction.
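
One way the "canned" suppliers idea could look is sketched below. None of these factory methods exist in the PR; `always`, `byPosition`, and `listValidTypes` are hypothetical names, and the interface signature and `String`-based type set are assumptions. The "a, b or c" output style mirrors the expected strings that appear in the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CannedSuppliersSketch {
    /** Assumed shape of the supplier used by the tests in this PR. */
    @FunctionalInterface
    interface PositionalErrorMessageSupplier {
        String apply(Set<String> validForPosition, int position);
    }

    /** Hypothetical: the same text regardless of position. */
    static PositionalErrorMessageSupplier always(String text) {
        return (v, p) -> text;
    }

    /** Hypothetical: one text per argument position; out-of-range positions throw. */
    static PositionalErrorMessageSupplier byPosition(String... texts) {
        return (v, p) -> {
            if (p < 0 || p >= texts.length) {
                throw new IllegalArgumentException("unexpected argument position: " + p);
            }
            return texts[p];
        };
    }

    /** Hypothetical: derive "a, b or c" from the valid types, sorted alphabetically. */
    static PositionalErrorMessageSupplier listValidTypes() {
        return (v, p) -> {
            List<String> names = new ArrayList<>(new TreeSet<>(v));
            if (names.size() <= 1) {
                return String.join("", names);
            }
            String head = String.join(", ", names.subList(0, names.size() - 1));
            return head + " or " + names.get(names.size() - 1);
        };
    }

    public static void main(String[] args) {
        System.out.println(always("representableNonSpatial").apply(Set.of(), 0));
        System.out.println(byPosition("string", "datetime").apply(Set.of(), 1));
        System.out.println(listValidTypes().apply(Set.of("long", "double", "integer"), 0));
    }
}
```

Tests with standard messages would call a named factory, while tests with unusual messages could still pass an ad hoc lambda.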

@not-napoleon
Member Author

> decentralizing the messages, which will make it more tedious to undo later, if we want to do that

Just to be clear, the code that's being removed was never involved in generating the actual error messages. It was just being used to guess the error message for the test to validate. The error messages were always generated within the type checking infrastructure, on each function. See, for example, MvMax#resolveFieldType

@not-napoleon not-napoleon added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Jul 24, 2024
@elasticsearchmachine elasticsearchmachine merged commit c5be248 into elastic:main Jul 24, 2024
14 of 15 checks passed
@not-napoleon not-napoleon deleted the esql-remove-named-expected-types branch July 24, 2024 16:39
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Jul 25, 2024