Reading the mainframe files data as text field irrespective of the copybook data type #291
Comments
This is a tricky question. The simple answer is no, we currently do not support this, and I'm not sure what way of supporting it would be helpful. COBOL has a system of types and formats, and some of these types have values that do not have meaning (semantic mapping). Let's consider 2 approaches.
Please, try the |
Thanks for the quick reply. I tried option #4 and set debug=true. When converting the HEX value to ASCII, do I always need to use Cp037 as the character set? Cp037 is one of the EBCDIC code pages used by IBM; are there other EBCDIC character sets that we might encounter?
For packed decimal COMP-3 S9(2)V999, for the value 12.000 we are getting the HEX 00000000012000C. But converting it to a string with character set Cp037 does not work as intended, because I suppose I need to look up the field's data type in the copybook and convert accordingly, i.e. here C means the value is positive and the 999 after the V means 3 digits after the decimal point. Please suggest whether there are other ways to get the raw values in Cobrix.
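For readers following the code page question above, here is a minimal sketch of decoding the hex dump of a text (PIC X) field with a named EBCDIC code page. The object and method names are made up for illustration and are not part of Cobrix. It assumes a JDK that ships the extended charsets, where the code page is registered as IBM037 (Cp037 is the historical Java name for it). Other EBCDIC code pages exist, so the right one depends on how the mainframe data was produced.

```scala
import java.nio.charset.Charset

// Hypothetical helper, not a Cobrix API.
object EbcdicHexDemo {
  // Turn a hex string such as "C1C2C3" into the bytes it represents.
  def hexToBytes(hex: String): Array[Byte] = {
    require(hex.length % 2 == 0, "hex string must have an even number of characters")
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
  }

  // Decode the bytes of a text field using the given code page name.
  // Assumption: the code page (for example "IBM037") is available in the running JVM.
  def decodeText(hex: String, codePage: String): String =
    new String(hexToBytes(hex), Charset.forName(codePage))
}
```

This only applies to character fields. A COMP-3 field is not text, so passing its bytes through a character set will not produce readable digits.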
The raw values are presented exactly the same way as they were in the original file; no encoding conversion is happening. I'm trying to understand what you want to achieve. Could you please provide a made-up example? Something like: "for fields having raw values so and so we want values to be so and so".
For example, my copybook file is as below:
01 StudentDetail
Name Name_debug ID ID_debug Mark Mark_debug
If you look at the Mark_debug column and I want to reverse engineer 0000000015000C to 15.00, how should I know whether the actual ASCII value in Mark was 15.000 or 015.000? Or is it the case that in the EBCDIC binary format the data won't come as 015.000? In other words, can data like the following arrive in the Mark column, which is also valid and aligned with the data type?
Name Name_debug ID ID_debug Mark Mark_debug
The difference between Another way to look at it is that |
That's correct, but our requirement is to read the raw value as it is. In this scenario, if I read the Mark field as a text field, then 15.000 and 015.000 are two different things. So basically we need to read the value as it is, without manipulating anything.
Okay. I have a question to you too. What in the copybook says that |
No, it is not mentioned. But I am assuming, for example, that if the value in my data file is 015.000, 15.000, or 0015.000, in all cases the HEX-encoded value will be 0000000015000C. So, as per my requirement, how should I get the actual raw value from the HEX encoding?
Just a suggestion: But then again, your requirement depends on what you mean by 'raw value', and your notion of 'raw value' depends on your requirements. So it is completely up to you.
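The assumption in the comment above can be checked with a small sketch of packed decimal (COMP-3) encoding: one decimal digit per nibble, left-padded with zeros, with the sign in the last nibble. The names below are hypothetical, and the scale of 3 and field size of 7 bytes are assumptions chosen to match the hex string quoted in the thread, not values taken from a real copybook.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical helper, not a Cobrix API.
object PackedEncodeDemo {
  // Pack a decimal written as text into a COMP-3 style hex string.
  // `scale` is the number of implied decimal places and `totalBytes` the field
  // size; both would normally be derived from the PIC clause (assumed here).
  def pack(text: String, scale: Int, totalBytes: Int): String = {
    val value = new JBigDecimal(text).setScale(scale, RoundingMode.UNNECESSARY)
    val unscaled = value.unscaledValue()
    val sign = if (unscaled.signum < 0) "D" else "C" // C = positive, D = negative
    val digits = unscaled.abs.toString
    val width = totalBytes * 2 - 1 // digit nibbles available before the sign nibble
    require(digits.length <= width, "value does not fit in the field")
    ("0" * (width - digits.length)) + digits + sign
  }
}
```

Calling `PackedEncodeDemo.pack` with "15.000", "015.000", and "0015.000" (scale 3, 7 bytes) yields the same hex string each time, which is why the leading zeros of a textual representation cannot be recovered from the packed bytes.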
If Cobrix won't ever unpack a COMP-3 encoded value as '015.000', then it should be fine. For now, with version 2.0.7, I am planning to use the debug option and reverse engineer the hex value to the raw value, so that I can use the data as it is without any changes. If in future a raw value could be provided instead of HEX, that would be really good. It seems that when adding debug fields in the addDebugFields function, val debugDataType = AlphaNumeric(s"X($size)", size, None, None, None) will keep the raw value. Please suggest.
Yes, we can easily add an option to generate raw values for debugging. Adding it to the backlog.
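As a rough illustration of the reverse engineering step described above, the sketch below turns the hex dump of a packed decimal field back into a number. It is not how Cobrix decodes COMP-3 internally, and the object and method names are invented for this example. The scale is an assumption the caller has to supply from the copybook, since the implied decimal point is not stored in the data.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

// Hypothetical helper, not a Cobrix API.
object PackedDecodeDemo {
  // Decode a COMP-3 hex dump: every nibble but the last is a digit,
  // the last nibble is the sign (C, A, E, F = positive; D, B = negative).
  def unpack(hex: String, scale: Int): JBigDecimal = {
    require(hex.length >= 2, "need at least one digit nibble and a sign nibble")
    val digits = hex.init
    require(digits.forall(c => c >= '0' && c <= '9'), "digit nibbles must be 0-9")
    val negative = hex.last.toUpper match {
      case 'D' | 'B' => true
      case 'C' | 'A' | 'E' | 'F' => false
      case other => throw new IllegalArgumentException(s"invalid sign nibble: $other")
    }
    val unscaled = new BigInteger(digits)
    new JBigDecimal(if (negative) unscaled.negate() else unscaled, scale)
  }
}
```

For instance, `PackedDecodeDemo.unpack("0000000015000C", 3).toPlainString` gives "15.000". The result is a numeric value, so any leading zeros in the output would be a formatting choice rather than something read from the file.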
Yes, and also you need to change this method to return raw values instead of HEX: cobrix/cobol-parser/src/main/scala/za/co/absa/cobrix/cobol/parser/decoders/StringDecoders.scala Line 122 in a4b24ce
Thanks a lot. It was great having this discussion with you. Do you want me to close this issue, or will it be tracked as part of the backlog?
No problem 😄 Let's leave this issue open. I'll use it to make the change to support raw values.
It seems that for the packed decimal field (Mark, Mark_debug) Cobrix does not convert to actual HEX. The value in Mark_debug looks as if the data was converted to be stored in an ASCII environment. Please confirm.
Data in Remember, BCD is a binary encoding format. The notion of encoding (EBCDIC vs ASCII) is not applicable here. |
Got it, thanks.
Background
If the data does not match the data type in the copybook, Cobrix returns null for the invalid data when reading.
Feature
Can we read all the data as text fields, irrespective of the copybook data type, so that no data is lost while reading? It's just like reading a CSV file with all the columns typed as string.