Reject non-string keys #1004
Changes from 3 commits
@@ -41,7 +41,6 @@
 import java.nio.charset.StandardCharsets;
 import java.sql.Timestamp;
 import java.util.List;
-import java.util.Set;

 /**
  *
@@ -166,13 +165,28 @@ public void convert(JsonElement value, ColumnVector vect, int row) {
   }

   static class MapColumnConverter implements JsonConverter {
-    private JsonConverter[] childrenConverters;
+    private JsonConverter[] childConverters;

     public MapColumnConverter(TypeDescription schema) {
-      List<TypeDescription> kids = schema.getChildren();
-      childrenConverters = new JsonConverter[kids.size()];
-      for (int c = 0; c < childrenConverters.length; ++c) {
-        childrenConverters[c] = createConverter(kids.get(c));
+      assertKeyType(schema);
+
+      List<TypeDescription> childTypes = schema.getChildren();
+      childConverters = new JsonConverter[childTypes.size()];
+      for (int c = 0; c < childConverters.length; ++c) {
+        childConverters[c] = createConverter(childTypes.get(c));
       }
     }

+    /**
+     * Rejects non-string keys. This is a limitation imposed by JSON specifications that only allows strings
+     * as keys.
+     */
+    private void assertKeyType(TypeDescription schema) {
+      TypeDescription keyType = schema.getChildren().get(0);
+      String keyTypeName = keyType.getCategory().getName();
+      if (!keyTypeName.equals("string")) {

Should it be a case-insensitive comparison?

That's a good point. Not sure if ORC schema is case sensitive. But it would be a good idea to perform a case-insensitive comparison here just to be safe.

I've tested what happens when ORC schema is defined with upper-case letters (e.g.,

+        throw new IllegalArgumentException(
+            String.format("Unsupported key type: %s", keyTypeName));
+      }
+    }
+
@@ -192,8 +206,8 @@ public void convert(JsonElement value, ColumnVector vect, int row) {

       int i = 0;
       for (String key : obj.keySet()) {
-        childrenConverters[0].convert(new JsonPrimitive(key), vector.keys, (int) vector.offsets[row] + i);
-        childrenConverters[1].convert(obj.get(key), vector.values, (int) vector.offsets[row] + i);
+        childConverters[0].convert(new JsonPrimitive(key), vector.keys, (int) vector.offsets[row] + i);
+        childConverters[1].convert(obj.get(key), vector.values, (int) vector.offsets[row] + i);
         i++;
       }
     }
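On the case-sensitivity question raised against the `keyTypeName.equals("string")` check, here is a rough sketch of two ways the comparison could be made independent of letter case. The `Category` enum below is a hypothetical stand-in for ORC's `TypeDescription.Category`, cut down so the snippet compiles without ORC on the classpath; `KeyTypeCheck` and both method names are made up for illustration and are not part of this PR.

```java
// Hypothetical stand-in for org.apache.orc.TypeDescription.Category,
// reduced to three members so the sketch is self-contained.
enum Category {
    INT("int"), STRING("string"), MAP("map");

    private final String name;

    Category(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class KeyTypeCheck {
    private KeyTypeCheck() {}

    // Option 1: compare the enum constant itself, so letter case never enters the picture.
    static void assertStringKeyByEnum(Category keyCategory) {
        if (keyCategory != Category.STRING) {
            throw new IllegalArgumentException(
                String.format("Unsupported key type: %s", keyCategory.getName()));
        }
    }

    // Option 2: keep the name-based comparison but ignore case.
    static void assertStringKeyByName(String keyTypeName) {
        if (!"string".equalsIgnoreCase(keyTypeName)) {
            throw new IllegalArgumentException(
                String.format("Unsupported key type: %s", keyTypeName));
        }
    }
}
```

Option 1 is probably the simpler of the two, assuming the real category enum exposes a `STRING` constant: it avoids string comparison entirely and keeps the error message unchanged.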
Does this guarantee that `schema` has children? And what does each child schema represent?
Suppose we have an ORC schema like `map<string,int>`; then the first child is a `TypeDescription` instance representing the `string` type. Not sure if `schema` is guaranteed to have at least one child. We might want to test with a malformed ORC schema like `map<>`. It may or may not reach this point when such a schema is given.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also make sure the passed-in 'schema' argument really represents the map schema
It certainly wouldn't hurt to do so, but I feel this is somewhat redundant, as it is already ensured by `VectorColumnFiller.createConverter()`. Its logic doesn't leave any room for other types to fall into `MapColumnConverter`.

If you still insist on doing so, I would suggest adding some common type-checking logic to the `JsonConverter` interface or the `VectorColumnFiller` class so that other types could also benefit from it. Any thoughts on this?
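To make the suggestion concrete, a shared guard might look roughly like the sketch below. It checks the schema's category, the number of children (which would also cover a malformed schema with no children, if one can reach this point at all), and the key type. `SchemaNode`, `SchemaChecks`, `requireShape` and `requireStringKeyedMap` are all hypothetical names; `SchemaNode` stands in for ORC's `TypeDescription` so the snippet compiles on its own.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ORC's TypeDescription: a kind plus child schemas.
final class SchemaNode {
    enum Kind { INT, STRING, MAP }

    final Kind kind;
    final List<SchemaNode> children;

    SchemaNode(Kind kind, SchemaNode... children) {
        this.kind = kind;
        this.children = Collections.unmodifiableList(Arrays.asList(children));
    }
}

final class SchemaChecks {
    private SchemaChecks() {}

    // Reusable guard that any converter could call before touching its schema.
    static void requireShape(SchemaNode schema, SchemaNode.Kind expected, int expectedChildren) {
        if (schema.kind != expected) {
            throw new IllegalArgumentException(
                String.format("Expected %s schema but got %s", expected, schema.kind));
        }
        if (schema.children.size() != expectedChildren) {
            throw new IllegalArgumentException(
                String.format("Expected %d children for %s but got %d",
                    expectedChildren, expected, schema.children.size()));
        }
    }

    // What a map converter's constructor might call first: map shape, then string key.
    static void requireStringKeyedMap(SchemaNode schema) {
        requireShape(schema, SchemaNode.Kind.MAP, 2);
        SchemaNode.Kind keyKind = schema.children.get(0).kind;
        if (keyKind != SchemaNode.Kind.STRING) {
            throw new IllegalArgumentException(
                String.format("Unsupported key type: %s", keyKind));
        }
    }
}
```

With something like this in place, `MapColumnConverter` would only need a single call at the top of its constructor, and other converters could reuse `requireShape` with their own expected kind and child count.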