Add support for parsing copybooks given Spark options #672

Closed · yruslan opened this issue Apr 19, 2024 · 0 comments · Fixed by #673
Labels: enhancement (New feature or request)

yruslan commented Apr 19, 2024

Background

Sometimes we want to use RDDs and Spark schemas separately when processing input files. In that case it is important to generate a Spark schema that matches the record schema exactly. But the parser accepts its own set of options, while the Spark reader for the 'cobol' format accepts options via '.option()'. It would be useful if the copybook parser could also be driven by options taken from a Map[String, String], with the same semantics as the Spark 'cobol' format reader.
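
For context, this is roughly how those options reach the reader today (a sketch; the copybook and data paths are placeholders and spark is an existing SparkSession):

val df = spark.read
  .format("cobol")
  .option("copybook", "/path/to/copybook.cpy")
  .option("generate_record_id", "true")
  .load("/path/to/data")

There is currently no equivalent way to obtain just the Spark schema from the same option map without going through the reader.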

Feature

Add support for parsing copybooks given Spark options.

Example

// Options use the same keys and semantics as the 'cobol' Spark data source
val sparkOptions = Map("generate_record_id" -> "true")
// Proposed API: build the COBOL schema directly from the option map
val cobolSchema = CobolSchema.fromSparkOptions(sparkOptions)
// The Spark schema matches what the 'cobol' reader would produce for the same options
val sparkSchema = cobolSchema.getSparkSchema

Proposed Solution

As per the example above.
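
As a hedged sketch of the end-to-end use case described in Background, the generated schema could then be paired with records processed separately as an RDD. Here rowRdd is a hypothetical RDD[Row] produced by the caller's own record parsing, and sparkSchema comes from the example above:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row

// Hypothetical: rows already decoded from the input files by custom code
val rowRdd: RDD[Row] = ???

// Because the schema was generated from the same options the 'cobol' reader
// accepts, the resulting DataFrame layout matches the reader's output
val df = spark.createDataFrame(rowRdd, sparkSchema)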
