Under some circumstances Cobrix selects wrong record reader failing the Spark job #684

Closed
yruslan opened this issue Jun 4, 2024 · 0 comments · Fixed by #686
Labels
bug Something isn't working

Comments

yruslan (Collaborator) commented Jun 4, 2024

Describe the bug

This occurs only when a very specific combination of options is passed to Cobrix.

The error is:

Error while encoding: java.lang.RuntimeException: 
  org.apache.spark.sql.catalyst.expressions.GenericRow is not a valid external type for schema of string

Code snippet that caused the issue

This is the code snippet that causes the error:

val df = spark
  .read
  .format("cobol")
  .option("copybook_contents", copybook)
  .option("record_format", "F")
  .option("segment_field", "IND")
  .option("segment_id_level0", "A")
  .option("segment_id_prefix", "ID")
  .option("redefine-segment-id-map:0", "SEGMENT1 => A")
  .option("redefine-segment-id-map:1", "SEGMENT2 => B")
  .option("redefine-segment-id-map:2", "SEGMENT3 => C")
  .option("pedantic", "true")
  .load("/data/file/location")

(the copybook is provided below)

Expected behavior

spark-cobol should choose the variable-length record reader with the fixed-length record extractor when the user requests segment ID generation.
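
The expected selection rule can be sketched roughly as follows. This is an illustrative sketch only, not Cobrix's actual code: `ReaderSelectionSketch`, `ReaderKind`, `chooseReader` and `segmentIdGenerationRequested` are hypothetical names, and treating a missing `record_format` as `F` is an assumption made for the example.

```scala
// Hypothetical sketch: none of these names come from the Cobrix code base.
object ReaderSelectionSketch {
  sealed trait ReaderKind
  // Plain fixed-length reader.
  case object FixedLength extends ReaderKind
  // Variable-length reader that cuts records with a fixed-length record extractor.
  case object VarLenWithFixedExtractor extends ReaderKind
  // Any other variable-length setup (not discussed in this issue).
  case object VarLenOther extends ReaderKind

  // True when any `segment_id_level<N>` option is present,
  // i.e. the user asked for segment ID generation.
  def segmentIdGenerationRequested(options: Map[String, String]): Boolean =
    options.keys.exists(_.startsWith("segment_id_level"))

  def chooseReader(options: Map[String, String]): ReaderKind = {
    // Assumption: a missing record_format is treated as "F".
    val isFixedFormat = options.getOrElse("record_format", "F") == "F"
    if (!isFixedFormat) VarLenOther
    else if (segmentIdGenerationRequested(options)) VarLenWithFixedExtractor
    else FixedLength
  }
}
```

With the options from the snippet above (`record_format = F` plus `segment_id_level0 = A`), this sketch returns `VarLenWithFixedExtractor`; with `record_format = F` alone it returns `FixedLength`.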

Context

  • Cobrix version: 2.7.1
  • Spark version: 3.3.4
  • Scala version: 2.12

Copybook (if possible)

         01  R.
           05  IND           PIC X(1).
           05  SEGMENT1.
              10    FIELD1   PIC X(1).
           05  SEGMENT2 REDEFINES SEGMENT1.
              10    FIELD2   PIC X(2).
           05  SEGMENT3 REDEFINES SEGMENT1.
              10    FIELD3   PIC X(3).
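
To make the layout concrete, here is a minimal sketch of how a record under this copybook maps to a segment, using the `redefine-segment-id-map` options from the snippet above. It makes several assumptions: records are plain strings rather than EBCDIC bytes, the record length is `IND` (1 byte) plus the largest redefine (3 bytes), and `SegmentLayoutSketch`, `Segment` and `decode` are hypothetical names that are not part of Cobrix.

```scala
// Illustrative only: models the copybook layout with plain strings, not Cobrix's decoder.
object SegmentLayoutSketch {
  // One redefine of SEGMENT1: its group name, its single field, and the field length.
  final case class Segment(name: String, field: String, length: Int)

  // Segment ID (value of IND) -> redefine, per the redefine-segment-id-map options.
  val segments: Map[String, Segment] = Map(
    "A" -> Segment("SEGMENT1", "FIELD1", 1),
    "B" -> Segment("SEGMENT2", "FIELD2", 2),
    "C" -> Segment("SEGMENT3", "FIELD3", 3)
  )

  // Assumption: IND (1 byte) followed by space for the largest redefine.
  val recordLength: Int = 1 + segments.values.map(_.length).max

  // Returns (segment name, field name, field value), or None for an unknown segment ID.
  def decode(record: String): Option[(String, String, String)] = {
    require(record.length == recordLength, s"expected $recordLength characters")
    val ind = record.substring(0, 1)
    segments.get(ind).map { s =>
      (s.name, s.field, record.substring(1, 1 + s.length))
    }
  }
}
```

For example, `SegmentLayoutSketch.decode("B12 ")` yields `Some(("SEGMENT2", "FIELD2", "12"))`, while a record whose `IND` is not `A`, `B` or `C` yields `None`.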