Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes. #38344

Closed

Conversation

SandishKumarHN
Copy link
Contributor

@SandishKumarHN SandishKumarHN commented Oct 22, 2022

This is the follow-up PR to #37972 and #38212

What changes were proposed in this pull request?

  1. Move spark-protobuf error classes to the spark error-classes framework(core/src/main/resources/error/error-classes.json).
  2. Support protobuf imports
  3. validate protobuf timestamp and duration types.

Why are the changes needed?

N/A

Does this PR introduce any user-facing change?

None

How was this patch tested?

Existing tests should cover the validation of this PR.

CC: @rangadi @mposdev21 @gengliangwang

@@ -62,14 +63,16 @@ object SchemaConverters {
case STRING => Some(StringType)
case BYTE_STRING => Some(BinaryType)
case ENUM => Some(StringType)
case MESSAGE if fd.getMessageType.getName == "Duration" =>
case MESSAGE if fd.getMessageType.getName == "Duration" &&
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi My assumption is that users should be able to use the Timestamp and Duration message types with different fields.

}

val descriptorProto: DescriptorProtos.FileDescriptorProto = fileDescriptorSet.getFile(0)
val descriptorProto: DescriptorProtos.FileDescriptorProto =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi This handles the protobuf import; please let me know if you know of a better approach.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from 90c61cc to c78ddca Compare October 22, 2022 04:47
@github-actions github-actions bot added the BUILD label Oct 22, 2022
@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from c78ddca to e6f3cab Compare October 22, 2022 04:50
@HyukjinKwon HyukjinKwon changed the title [SPARK-40777][SQL][SPARK-PROTOBUF] : Protobuf import support and move error-classes. [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes. Oct 24, 2022
@HyukjinKwon
Copy link
Member

cc @MaxGekk FYI

protobufType: String,
sqlType: String): Throwable = {
new AnalysisException(
errorClass = "_LEGACY_ERROR_TEMP_2251",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the prefix _LEGACY_ERROR_TEMP_ to convert existing exception to error classes but for new exception, please, assign appropriate names to error classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense made changes.

toFieldStr(catalystPath),
s"${protoType} ${protoType.toProto.getLabel} ${protoType.getJavaType}" +
s" ${protoType.getType}",
catalystType.sql)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass catalystType into cannotConvertProtobufTypeToSqlTypeError(), and use toSQLType to quote the type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -123,6 +123,7 @@
<configuration>
<protocArtifact>com.google.protobuf:protoc:${protobuf.version}</protocArtifact>
<protocVersion>${protobuf.version}</protocVersion>
<includeMavenTypes>direct</includeMavenTypes>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add comment here?
Is this for protos like `Timestamp' ? Looks like our code handles any Timestamp defined similar com.google.protobuf.Timestamp. Ok to not to include these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi this protoc maven option is equal to CLI "protoc --include_imports" and I have added validations to protobuf Timestamp and Duration types. and currently, maven protoc does not generate classes for google protobuf types. sbt is enabled by default.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. What would fail if you remove this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi just "messageClassName" param feature unit tests(Timestamp and Duration). The protoc plugin is only used for the test right?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because this removes existing 'Timestamp' message : https://github.com/apache/spark/pull/38344/files#diff-97aac63266f3c60eef9bd8dd1b76be3a5bd77fe4d17fa6fa370f5e0d9428a0a9L172

We could undo that change.

Copy link
Contributor Author

@SandishKumarHN SandishKumarHN Oct 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi how about supporting imports? I made this change of "Timestamp" and "Duration" to show a unit test for imports. even if we undo the above change, a unit test for import case needs plugin change

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to have import of our own proto files. Currently we don't have any. We can split some of them. I think I left a TODO about that in my PR. Many production proto files will have imports.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi how about splitting Timestamp and Duration into our own .proto files and then importing them in the functions_suite.proto file?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. That would be nice. That will mimic normal proto files organization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi made changes same.

@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from 7cea436 to a40c748 Compare October 24, 2022 06:24
@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from a40c748 to 26e471b Compare October 24, 2022 17:52
}

val descriptorProto: DescriptorProtos.FileDescriptorProto = fileDescriptorSet.getFile(0)
val descriptorProto: DescriptorProtos.FileDescriptorProto =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you some brief comments here?
Is the last file the right file?

val descriptorProto: DescriptorProtos.FileDescriptorProto =
fileDescriptorSet.getFileList.asScala.last

var fileDescriptorList = List[Descriptors.FileDescriptor]()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the import file list? What happens when the imported file imports other files? i.e. A imports B and B imports C.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi some changes were made to support nested imports. Reading protobuf descriptors from the bottom up, the last element in FileDescriptorSet is the initial FileDescriptorProto, from which we will continue to find more FileDescriptors recursively.

// google.protobuf package should default to corresponding Catalylist types.
case MESSAGE
if fd.getMessageType.getName == "Timestamp" &&
fd.getMessageType.getFields.size() == 2 &&
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could move all the conditions into braces (a & b & ...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

message Timestamp {
int64 seconds = 1;
int32 nanos = 2;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Fix end-of-line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

assert(e.getMessage === expectMsg)
assert(e.getCause.getMessage === expectedCauseMessage)
if (e.getCause != null) {
assert(e.getCause.getMessage === expectedCauseMessage)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean if e.getCause is null and expectedCauseMessage is not?

Copy link
Contributor Author

@SandishKumarHN SandishKumarHN Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made changes to support new error class framework.

@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from 34ac86a to 38f43b0 Compare October 25, 2022 04:10
new AnalysisException(
errorClass = "PROTOBUF_CLASS_LOAD_ERROR",
messageParameters = Map("protobufClassName" -> protobufClassName, "message" -> message),
cause = Some(cause.getCause))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getCause() can return null, correct. If so, could you wrap it by Option(cause.getCause)

new AnalysisException(errorClass = "INVALID_BYTE_STRING_ERROR", messageParameters = Map.empty)
}

def malformedRecordsDetectedInRecordParsingError(cause: Throwable): Throwable = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a compilation error not runtime (execution) one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is runtime, moved to ExecutionError class.

new AnalysisException(
errorClass = "PROTOBUF_DESCRIPTOR_PARSING_ERROR",
messageParameters = Map.empty(),
cause = Some(cause.getCause))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some -> Option?

new AnalysisException(
errorClass = "CANNOT_FIND_PROTOBUF_DESCRIPTOR_FILE_ERROR",
messageParameters = Map("filePath" -> filePath),
cause = Some(cause.getCause))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some -> Option

new AnalysisException(
errorClass = "PROTOBUF_DESCRIPTOR_ERROR",
messageParameters = Map.empty(),
cause = Some(cause.getCause))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some -> Option

messageParameters = Map(
"protobufType" -> protobufType,
"toType" -> toSQLType(sqlType)),
cause = Some(cause.getCause))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some -> Option

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

messageParameters = Map(
"protobufType" -> protobufType,
"toType" -> toSQLType(sqlType)),
cause = Some(cause.getCause))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some -> Option

new AnalysisException(
errorClass = "CATALYST_TYPE_TO_PROTOBUF_ENUM_TYPE_ERROR",
messageParameters = Map(
"sqlColumn" -> sqlColumn,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is a column id, wrap it by toSQLId()

s"schema is incompatible (sqlType = ${catalystType.sql}, " +
s"protoType = ${fieldDescriptor.getJavaType})")
throw QueryCompilationErrors.cannotConvertCatalystTypeToProtobufTypeError(
toFieldStr(catalystPath),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, don't quote it twice if you do that inside of cannotConvertCatalystTypeToProtobufTypeError() by toSQLId

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, will call toSQLType inside the error class, avoid using toFieldStr for sqlType

@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from 39b0981 to dd63be8 Compare October 28, 2022 16:32
@@ -23,6 +23,11 @@
],
"sqlState" : "42000"
},
"CANNOT_FIND_PROTOBUF_DESCRIPTOR_FILE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely remove _ERROR.
Would be nice if we could gravitate towards *_NOT_FOUND and *_ALREADY_EXIST style naming convention:
PROTOBUF_DESCRIPTOR_FILE_NOT_FOUND

@@ -65,6 +70,11 @@
],
"sqlState" : "22005"
},
"CATALYST_TYPE_TO_PROTOBUF_ENUM_TYPE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we call out Catalyst anywhere else. How is that relevant? Can we be generic?

@@ -65,6 +70,11 @@
],
"sqlState" : "22005"
},
"CATALYST_TYPE_TO_PROTOBUF_ENUM_TYPE_ERROR" : {
"message" : [
"Cannot convert SQL <sqlColumn> to Protobuf <protobufColumn> because <data> cannot be written since it's not defined in ENUM <enumString>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a because "squared"... I don't have enough context to propose a rephrase..
Also If we have SQL here, maybe we can do CATALYST -> SQL for the name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srielau there is no ENUM type in spark, so we convert the protobuf enum to spark string. so we are trying to throw an enum-specific error here.

@@ -517,6 +527,11 @@
"Invalid bucket file: <path>"
]
},
"INVALID_BYTE_STRING_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove _ERROR

@@ -517,6 +527,11 @@
"Invalid bucket file: <path>"
]
},
"INVALID_BYTE_STRING_ERROR" : {
"message" : [
"Invalid ByteString format"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we spell out the invalid format?
For extra points: Can we spell out a valid format?

"Protobuf type not yet supported: <protobufType>."
]
},
"PROTOBUF_TYPE_TO_CATALYST_TYPE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"PROTOBUF_TYPE_TO_CATALYST_TYPE_ERROR" : {
"CANNOT_CONVERT_PROTOBUF_TYPE_TO_SQL_TYPE" : {

Copy link
Contributor Author

@SandishKumarHN SandishKumarHN Oct 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this gives which protobuf message had failed to convert, it would be the protobuf root name since the protobuf message might have a long list of fields, just keeping the root name.

"Unable to convert <protobufType> of Protobuf to SQL type <toType>."
]
},
"PROTOBUF_TYPE_TO_SQL_TYPE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this different from eh one above? Just more info available?

Copy link
Contributor Author

@SandishKumarHN SandishKumarHN Oct 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this gives which protobuf field had failed to convert, this clearly describes the field name and field type.

@@ -760,6 +850,11 @@
],
"sqlState" : "22023"
},
"SQL_TYPE_TO_PROTOBUF_TYPE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"SQL_TYPE_TO_PROTOBUF_TYPE_ERROR" : {
"CANNOT_CONVERT_SQL_TYPE_TO_PROTOBUF_TYPE" : {

@@ -792,6 +887,21 @@
"Unable to acquire <requestedBytes> bytes of memory, got <receivedBytes>"
]
},
"UNABLE_TO_CONVERT_TO_PROTOBUF_TYPE" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above?

"Unable to convert SQL type <toType> to Protobuf type <protobufType>."
]
},
"UNABLE_TO_LOCATE_PROTOBUF_MESSAGE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"UNABLE_TO_LOCATE_PROTOBUF_MESSAGE_ERROR" : {
"PROTOBUF_MESSAGE_NOT_FOUND" : {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srielau fixed all the error class names

@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch 2 times, most recently from 0229009 to 80b17a1 Compare October 28, 2022 20:52
error class name changes, more details to error message
@SandishKumarHN SandishKumarHN force-pushed the SPARK-40777-ProtoErrorCls branch from 80b17a1 to e5140b0 Compare October 29, 2022 00:49
@@ -175,19 +175,26 @@ private[sql] object ProtobufUtils extends Logging {
.asInstanceOf[Descriptor]
}

// TODO: Revisit to ensure that messageName is searched through all imports
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is missing? Looks fairly complete to me.
Better to state the problem here.

}

descriptor match {
descriptorList.last match {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment on why we are picking the last one? Will be useful for future readers as well.

}
}

private def parseFileDescriptor(descFilePath: String): Descriptors.FileDescriptor = {
private def parseFileDescriptor(descFilePath: String): List[Descriptors.FileDescriptor] = {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to parseFileDescriptorSet (otherwise it sounds like it is parsing just one file descriptor).

def buildDescriptor(descFilePath: String, messageName: String): Descriptor = {
val descriptor = parseFileDescriptor(descFilePath).getMessageTypes.asScala.find { desc =>
desc.getName == messageName || desc.getFullName == messageName
val descriptorList = parseFileDescriptor(descFilePath).map(fileDescriptor => {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: use find() rather than map().filter().

(you can use findLast() if there is a reason to use the last match).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rangadi makes sense to use find and return, fixed.

Copy link

@rangadi rangadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on import functionality changes. Will let other reviewers to approve the error message changes.

desc.getName == messageName || desc.getFullName == messageName
}
val fileDescriptor = parseFileDescriptorSet(descFilePath)
.find(!_.getMessageTypes.asScala.find(desc =>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, we need to find() invocations and check is made twice. How about this? [Optional to do, only a suggestion]

 // Find the first message descriptor that matches the name.
 val descriptorOpt =  parseFileDescriptorSet(descFilePath)
     .flatMap { fileDesc => 
            fileDesc.getMessageTypes.asScala.find { desc => 
                  desc.getName == messageName || desc.getFullName == messageName
            }
     }.headOption

  descriptoOpt match {
     case Some(d) => d
     case None => throw QueryCompilationErrors.unableToLocateProtobufMessageError(messageName)
 } 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

val message = if (hasDots) "" else ". Ensure the class name includes package prefix."
new AnalysisException(
errorClass = "CANNOT_LOAD_PROTOBUF_CLASS",
messageParameters = Map("protobufClassName" -> protobufClassName, "message" -> message),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want text as a parameter. Aside to make this text depends on whether there is any dot seems to be quite subtle. Maybe it simply doesn't have enough dots?
I propose to move this text into the error message (unconditionally).

@@ -742,6 +832,11 @@
],
"sqlState" : "22023"
},
"SQL_TYPE_TO_PROTOBUF_ENUM_TYPE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename without _ERROR

"Type mismatch encountered for field: <field>"
]
},
"PROTOBUF_FIELD_TYPE_TO_SQL_TYPE_ERROR" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
"PROTOBUF_FIELD_TYPE_TO_SQL_TYPE_ERROR" : {
"CANNOT_CONVERT_PROTOBUF_FIELD_TYPE_TO_SQL_TYPE" : {

@@ -29,12 +44,22 @@
],
"sqlState" : "22007"
},
"CANNOT_LOAD_PROTOBUF_CLASS" : {
"message" : [
"Could not load Protobuf class with name <protobufClassName><message>"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Could not load Protobuf class with name <protobufClassName><message>"
"Could not load Protobuf class with name <protobufClassName>. Ensure the class name includes package prefix."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srielau fixed the suggested changes.

Copy link
Contributor

@srielau srielau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rangadi
Copy link

rangadi commented Nov 4, 2022

@MaxGekk are you still reviewing this?
@SandishKumarHN is there any more review to be addressed? If we are ready, I can ask @HeartSaVioR to merge this (before his weekend starts in Seoul :) ).

@SandishKumarHN
Copy link
Contributor Author

@MaxGekk are you still reviewing this? @SandishKumarHN is there any more review to be addressed? If we are ready, I can ask @HeartSaVioR to merge this (before his weekend starts in Seoul :) ).

@rangadi I have addressed all of the review comments.

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error classes LGTM

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@HeartSaVioR
Copy link
Contributor

Thanks! Merging to master.

def descrioptorParseError(descFilePath: String, cause: Throwable): Throwable = {
new AnalysisException(
errorClass = "CANNOT_PARSE_PROTOBUF_DESCRIPTOR",
messageParameters = Map.empty("descFilePath" -> descFilePath),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally found this. Is messageParameters set to an empty map, if so, this will fail in runtime w/ an internal error. Could you write a test which trigger the error and fix the bug.

def failedParsingDescriptorError(descFilePath: String, cause: Throwable): Throwable = {
new AnalysisException(
errorClass = "CANNOT_CONSTRUCT_PROTOBUF_DESCRIPTOR",
messageParameters = Map.empty("descFilePath" -> descFilePath),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SandishKumarHN Please, fix this. Map.empty -> Map

SandishKumarHN added a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…lasses

This is the follow-up PR to apache#37972 and apache#38212

### What changes were proposed in this pull request?
1. Move spark-protobuf error classes to the spark error-classes framework(core/src/main/resources/error/error-classes.json).
2. Support protobuf imports
3. validate protobuf timestamp and duration types.

### Why are the changes needed?
N/A

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?
Existing tests should cover the validation of this PR.

CC: rangadi mposdev21 gengliangwang

Closes apache#38344 from SandishKumarHN/SPARK-40777-ProtoErrorCls.

Authored-by: SandishKumarHN <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants