[Question] Is there a way to use the GenericRow to load a complex type in a MapType? #777
Replies: 9 comments 1 reply
-
Hi @thedanfields, thank you for your question!

var schema = new StructType(
    new List<StructField> {
        new StructField("Identifier", new LongType()),
        new StructField("NestedComplexMap", new MapType(new IntegerType(),
            new MapType(new StringType(), new LongType()))),
    }
);

It showed me the following result for
-
@Niharikadutta !!! Thanks so much. Out of curiosity, what C# type should I use for a
Thanks again for your time and attention! Really appreciate it.
-
I'm not sure I understood your question, but if you want to use
-
Hey @Niharikadutta, thanks for your response. I see why I was unclear. Ultimately I do not have control of the schema; it's being dictated by a process upstream. My goal is to be able to use the

I am stuck with this schema:

new StructType(
    new List<StructField> {
        new StructField("Identifier", new LongType()),
        new StructField("NestedComplexMap", new MapType(new IntegerType(),
            new StructType(new List<StructField>() {
                new StructField("FirstProp", new LongType()),
                new StructField("SecondProp", new LongType()),
                new StructField("ThirdProp", new LongType()),
            }))),
    }
)

I've tried a number of ways of using the

This example:

var row = new GenericRow(new object[] {
    1, // Identifier
    new Dictionary<int, GenericRow> { // ?? NestedComplexMap ??
        { 1, new GenericRow(new object[] { 1L, 2L, 3L }) },
        { 2, new GenericRow(new object[] { 1L, 2L, 3L }) }
    }
});
var dataFrame = _session.CreateDataFrame(new [] { row }, schema);
var srows = dataFrame.Collect().ToArray();

Results in an Exception stating:

I've also naively tried substituting the

So given that schema, do you know how I can fabricate a DataFrame?

Thanks again for your attention here. I really appreciate the support.
-
@thedanfields Thanks for explaining your scenario and apologies for the late reply. It is currently not possible to create a DataFrame for the schema you have provided (or nested DataFrames in general) because we do not support
-
@Niharikadutta, can you check if we can make the following work?
-
@Niharikadutta, thanks again for the attention to this. Just to follow up: while I wasn't able to directly create the needed data, I managed to do it indirectly with two DataFrames and some joins.
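
The comment does not include the code, so the following is only a hedged sketch of one indirect route, not necessarily the one used above. Instead of two DataFrames and joins, it builds a single flat DataFrame (which GenericRow can hydrate, since every column is a primitive) and then assembles the map column on the Spark side with GroupBy, CollectList, Struct and MapFromEntries. The "Key" column name, the class name and the sample values are assumptions made for illustration; MapFromEntries requires Spark 2.4 or later.

```csharp
using System.Collections.Generic;
using Microsoft.Spark.Sql;
using Microsoft.Spark.Sql.Types;
using static Microsoft.Spark.Sql.Functions;

// Hypothetical class name; a sketch, not code from the thread.
class NestedMapSketch
{
    static void Main()
    {
        SparkSession spark = SparkSession.Builder().GetOrCreate();

        // Flat schema: one row per (Identifier, Key) map entry.
        // "Key" is an assumed column name for the map key.
        var flatSchema = new StructType(new List<StructField>
        {
            new StructField("Identifier", new LongType()),
            new StructField("Key", new IntegerType()),
            new StructField("FirstProp", new LongType()),
            new StructField("SecondProp", new LongType()),
            new StructField("ThirdProp", new LongType()),
        });

        // Only primitive values appear here, so no nested GenericRow is needed.
        var rows = new List<GenericRow>
        {
            new GenericRow(new object[] { 1L, 1, 1L, 2L, 3L }),
            new GenericRow(new object[] { 1L, 2, 1L, 2L, 3L }),
        };

        DataFrame flat = spark.CreateDataFrame(rows, flatSchema);

        // Per Identifier, collect (Key, struct-of-props) pairs and turn the
        // resulting array of two-field structs into a map column.
        DataFrame nested = flat
            .GroupBy("Identifier")
            .Agg(MapFromEntries(
                    CollectList(
                        Struct(
                            Col("Key"),
                            Struct("FirstProp", "SecondProp", "ThirdProp"))))
                .Alias("NestedComplexMap"));

        // Inspect the resulting schema on the Spark side.
        nested.PrintSchema();
    }
}
```

The nesting is done entirely by Spark, so the driver never has to serialize a GenericRow inside a Dictionary. The nullability flags of the resulting schema may differ from the upstream schema, and collecting the nested rows back to the driver may still run into the limitation discussed in this thread.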
-
Yes @imback82, I did test this and it failed with
-
@thedanfields could you elaborate a little more on how you created it indirectly with two DataFrames? Thanks!
-
Hi!
I'm currently using v1 of the Microsoft.Spark NuGet packages and the microsoft-spark-2-4_2.11-1.0.0.jar.
I need a DataFrame schema which looks similar to this:
I'm trying to write tests by generating data using a GenericRow to hydrate a DataFrame like so:
During execution, I receive runtime errors stating:
Is there a way to create this kind of data using the GenericRow?
Thanks so much for your time and efforts!