feat: Visualize complex Hive column types #1072
- I'm not sure the JSON format is the best way to visualize complex Hive data types. JSON works well for the struct and array types, but I don't think it works for the map and union types; people may be confused about what the actual type is. An alternative I'm thinking of: how about simply formatting the original type string with indents, e.g.
struct<
  date:struct<
    year:int,
    month:int,
    day:int>,
  hour:int,
  minute:int,
  second:int,
  timeZoneId:string
>
so that when people see types like `map` or `uniontype`, they will go and check what they are.
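The indent-based alternative suggested above could look roughly like the sketch below. `formatHiveType` is a hypothetical helper, not part of this PR; it assumes the type string only nests with `<` and `>` and that commas inside parentheses (as in `decimal(10,2)`) should not start a new line. Unlike the hand-written example, it puts every closing `>` on its own line.

```typescript
// Hypothetical helper (not from the PR): pretty-prints a Hive type string,
// adding one indent level per nesting depth of '<'.
function formatHiveType(type: string, indent: string = "  "): string {
  // Builds the indentation prefix for a given nesting depth.
  const pad = (depth: number): string => new Array(depth + 1).join(indent);
  let depth = 0; // current '<' nesting depth
  let parens = 0; // current '(' nesting depth, e.g. inside decimal(10,2)
  let out = "";
  for (let i = 0; i < type.length; i++) {
    const ch = type.charAt(i);
    if (/\s/.test(ch)) continue; // drop existing whitespace
    if (ch === "(") parens++;
    if (ch === ")") parens--;
    if (ch === "<") {
      depth++;
      out += "<\n" + pad(depth);
    } else if (ch === ">") {
      depth = Math.max(0, depth - 1); // tolerate stray '>'
      out += "\n" + pad(depth) + ">";
    } else if (ch === "," && parens === 0) {
      out += ",\n" + pad(depth);
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(formatHiveType("struct<date:struct<year:int,month:int>,hour:int>"));
```

Because the output is still the original type string plus whitespace, `map` and `uniontype` keep their own names instead of being mapped onto JSON objects or arrays.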
- Also, I just realized that some really big type strings may get truncated. The type has a max length of 255 or 256 in our Hive metastore, e.g.
map<string,struct<userid:bigint,matchtypeused:struct<value:int>,matchtypetouseridmap:map<int,bigint>,restrictedusereason:array<struct<value:int>>,isactivated:boolean,experimentname:string,experimentgroup:string,isoptin:boolean,useridtoconfidencescoremap:m
Not sure how you would handle this case.
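One way to at least detect the truncated case is to count angle brackets that are never closed. This is a sketch under assumptions: `unclosedBrackets` and `isLikelyTruncated` are hypothetical names, and the check cannot catch a string that happens to be cut off exactly where the brackets balance.

```typescript
// Hypothetical helpers (not from the PR): detect type strings that were cut
// off by the metastore length limit, based on unbalanced '<' and '>'.
function unclosedBrackets(type: string): number {
  let depth = 0;
  for (let i = 0; i < type.length; i++) {
    const ch = type.charAt(i);
    if (ch === "<") depth++;
    else if (ch === ">" && depth > 0) depth--;
  }
  return depth; // number of '<' that never got a matching '>'
}

function isLikelyTruncated(type: string): boolean {
  return unclosedBrackets(type) > 0;
}

// Shortened version of the truncated example above.
const sample =
  "map<string,struct<userid:bigint,matchtypeused:struct<value:int>,useridtoconfidencescoremap:m";
console.log(isLikelyTruncated(sample), unclosedBrackets(sample));
```

The UI could use such a flag to show the raw string with a "truncated" hint rather than a partially parsed structure.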
json: any;
}

export const Json: React.FC<IProps & Partial<ReactJsonViewProps>> = ({
How about renaming the component to `JsonViewer`, to differentiate it from the built-in `JSON` object?
@@ -57,6 +62,13 @@ export const DataTableColumnCard: React.FunctionComponent<IProps> = ({
{column.comment}
</KeyContentDisplay>
)}
{parsedType !== column.type && (
<>
Why is the `<> </>` fragment needed? It only has a single child.
timeZoneId: 'string',
}
*/
export function parseType(type: string): any | string {
Will there be cases of:
- uppercase, e.g. `STRUCT`
- a space in between, e.g. `struct <`

And what happens if the parsing fails?
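A possible way to cover these three cases is to normalize the type string before parsing and to fall back to the raw string if the parser throws. The sketch below is an assumption about how that could be done, not what the PR implements; `normalizeHiveType` and `parseOrFallback` are hypothetical names, and the `parse` callback stands in for whatever parser is used. Only the complex-type keywords are lowercased, so field names are left untouched.

```typescript
// Matches a complex-type keyword followed by optional whitespace and '<',
// case-insensitively, e.g. "STRUCT <".
const COMPLEX_KEYWORD = /\b(array|map|struct|uniontype)\s*</gi;

// Hypothetical helper: lowercases complex-type keywords and removes the
// whitespace between the keyword and '<'.
function normalizeHiveType(type: string): string {
  return type.trim().replace(COMPLEX_KEYWORD, (_match: string, keyword: string) => {
    return keyword.toLowerCase() + "<";
  });
}

// Hypothetical wrapper: returns the original string when parsing fails, so a
// caller can still compare the result against the raw column type.
function parseOrFallback<T>(type: string, parse: (normalized: string) => T): T | string {
  try {
    return parse(normalizeHiveType(type));
  } catch (_err) {
    return type;
  }
}

console.log(normalizeHiveType("STRUCT <a:int,b:ARRAY<string>>"));
```

Returning the untouched input on failure fits the `parsedType !== column.type` check in the diff: a type that cannot be parsed would simply not render the extra view.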
Thanks for the feedback. I agree JSON isn't the clearest way to visualize some of these types. We're using another product that uses a nested-column approach for complex types, but it is also confusing when looking at [...].

The advantages were that it was relatively easy to implement and reused the same JSON viewer as the results, providing some consistency for users. It is also collapsible, which is nice for extremely nested types. I can experiment with different approaches and share some additional options.

Re: the type length limit, I noticed this as well and was going to bring it up later. The parser degrades as well as it can, but ideally we'd increase the max length. We too have a significant number of columns synced with truncated types.
This change makes it easier to understand complex Hive types, including `array<>`, `map<>`, `struct<>`, and `uniontype<>`. It adds a parser that detects these types and converts them into a representative JSON object, which is then visualized using the react-json-view library added in #991.

Currently it only works on these Hive types. If it's too specific/niche to merge, that's fine; we'll maintain it internally on our side. Just wanted to throw it out there in case it would be useful to anyone else.