data in string format apart from native target datatype format #545
Comments
Nice idea, but it can only work for fields with 'DISPLAY' usage, and it is also encoding-dependent (ASCII/EBCDIC).
Can you give a concrete example (a field, its PIC, and a value) where seeing it as a string would help with debugging?
After thinking about it, the above feature makes sense for ASCII files, but not for EBCDIC.
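To illustrate why a string view is only meaningful for DISPLAY fields and depends on the encoding, here is a small Python sketch. The byte values are made up for illustration, and `cp037` is just one common EBCDIC code page chosen as an assumption; a real file may use a different one.

```python
# Illustrative bytes only; not taken from a real data file.

# PIC 9(3) USAGE DISPLAY holding 123: one character per digit.
ascii_display = b"123"            # ASCII digits are 0x31 0x32 0x33
ebcdic_display = b"\xf1\xf2\xf3"  # EBCDIC (cp037) digits are 0xF1 0xF2 0xF3

# PIC S9(3) COMP-3 holding +123: two digits per byte plus a sign nibble.
packed = b"\x12\x3c"


def as_debug_string(raw: bytes, encoding: str) -> str:
    """Decode raw field bytes as text, replacing undecodable bytes."""
    return raw.decode(encoding, errors="replace")


print(as_debug_string(ascii_display, "ascii"))          # readable digits
print(as_debug_string(ebcdic_display, "cp037"))         # readable only with the right code page
print(repr(as_debug_string(ebcdic_display, "ascii")))   # replacement characters
print(repr(as_debug_string(packed, "ascii")))           # a control character and '<', not a number
```

The DISPLAY field is readable once the right code page is applied, while the packed field has no sensible string form in either encoding.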
Thanks for the reply. As you rightly said, this would be very useful for ASCII files. Could you share your thoughts on adding this feature (plan, timeline, etc.)? Thank you for the support.
It is hard to say for certain. Maybe end of this year, or Jan next year.
This is done and available in the latest 'master'.
Hi @yruslan, thanks a lot. I am from the Python world and have no idea how to create the jar file. Could you help with the steps to build it, or attach the jar to this issue, so that I can test it and let you know?
Sure. Which Spark and Scala version are you using?
Using Spark 3.1.2 and Scala 2.12. Currently using the below Cobrix version: groupId: za.co.absa.cobrix
Here, you can try this one:
Awesome, validated and working as expected. Thanks for the quick turnaround on this enhancement.
Hi @yruslan, with option("debug","string") we see the string data in <col_name>_debug fields. How about showing this string data in the actual fields instead of the <col_name>_debug fields? This would put the raw data in the actual columns, and downstream code could take care of the next steps. As one option, we can handle this after Cobrix with custom code.
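The "custom code after Cobrix" workaround mentioned here could look roughly like the sketch below. It only plans the column renames from a list of column names, assuming a flat schema and the `<col_name>_debug` naming described in this comment; the function name `promote_debug_columns` and the sample column names are hypothetical.

```python
from typing import List, Tuple

DEBUG_SUFFIX = "_debug"  # suffix of the debug columns mentioned in this thread


def promote_debug_columns(columns: List[str]) -> List[Tuple[str, str]]:
    """Return (source_column, output_name) pairs.

    Every column that has a '<name>_debug' sibling is replaced by that
    sibling under the original name; the debug columns themselves are
    not emitted separately.
    """
    present = set(columns)
    plan = []
    for name in columns:
        is_debug = name.endswith(DEBUG_SUFFIX)
        if is_debug and name[: -len(DEBUG_SUFFIX)] in present:
            continue  # already consumed by its base column
        debug_name = name + DEBUG_SUFFIX
        source = debug_name if debug_name in present else name
        plan.append((source, name))
    return plan


# Hypothetical column names for illustration.
cols = ["ID", "ID_debug", "AMOUNT", "AMOUNT_debug", "NAME"]
plan = promote_debug_columns(cols)
print(plan)
```

With PySpark, the plan could then be applied with something like `df.select([F.col(src).alias(dst) for src, dst in plan])`, assuming `F` is `pyspark.sql.functions`. Nested copybook groups would need extra handling that this sketch does not cover.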
So basically what you need is to slice ASCII records based on field lengths from a copybook, with all columns as strings, right? I think in ASCII files you can only have numbers with usage DISPLAY. So if numbers could be retained as strings, it would help you, right? Here is another feature request related to this: #25
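The slicing described here can be sketched in plain Python as follows. This is not how Cobrix does it; the layout, field names, and sample record are hypothetical, with lengths written down by hand as one might read them off PIC clauses (for example `PIC 9(5)V99` occupies 7 characters, since the decimal point is implied).

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, length in characters).
LAYOUT: List[Tuple[str, int]] = [("ID", 5), ("NAME", 10), ("AMOUNT", 7)]


def slice_record(line: str, layout: List[Tuple[str, int]]) -> Dict[str, str]:
    """Cut a fixed-width ASCII record into string fields, without trimming."""
    fields = {}
    offset = 0
    for name, length in layout:
        fields[name] = line[offset:offset + length]
        offset += length
    return fields


# Sample record: zero-padded ID, space-padded name, amount with implied decimals.
record = "00042" + "John".ljust(10) + "0001250"
print(slice_record(record, LAYOUT))
```

Keeping the values as strings preserves leading zeros and padding exactly as stored, which is the point of the request.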
Yes, correct. Is #25 already available in Cobrix? If so, please add it to the documentation, as I don't see it there.
No, it is not implemented yet. But it is in the plans to implement it in the future.
Background
Cobrix converts the data to native types (decimal, integer, etc.) based on the copybook information.
Feature
An option to simply divide each record into columns and keep them in string format (as is, without any trimming), instead of converting them to native types, would be helpful and would provide the below benefits