mrc file with utf-8 BOM fails #161

Open
patrickzurek opened this issue Sep 9, 2016 · 0 comments

JIRA issue created by: rcook
Originally opened: 2012-06-26 09:32 AM

Issue body:
Migrated from Google Code issue http://code.google.com/p/xcoaitoolkit/issues/detail?id=86, which has attachments.

Reported by project member [email protected], Jul 21, 2011

The attached file starts with the 3-byte UTF-8 byte order mark (EF BB BF):
http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

When I run convertload.sh on it, I get:

ERROR - [LIB] MarcException unable to parse record length. NumberFormatException For input string: "03".

I'm not sure if we should support this or not, but let's decide.
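
For illustration, here is a minimal Python sketch of why a leading BOM could produce an error like the one above. It assumes a binary MARC record whose first five bytes hold the record length; the leader value and the helper names are hypothetical, and this does not reproduce marc4j's actual parsing code.

```python
import codecs

# Hypothetical 24-byte MARC leader; illustrative only, not taken from the attached file.
LEADER = b"03245nam a2200301 a 4500"


def record_length_field(data: bytes) -> bytes:
    """Return the first five bytes, which a MARC reader treats as the record length."""
    return data[:5]


def parse_record_length(data: bytes) -> int:
    """Parse the record length; raises ValueError if the field is not all digits."""
    return int(record_length_field(data))


# Without a BOM the field is five ASCII digits.
print(record_length_field(LEADER))

# With a BOM prepended, the three BOM bytes push all but the first two digits out of the field.
print(record_length_field(codecs.BOM_UTF8 + LEADER))
```

With the BOM in front, only the first two digits of the length remain inside the five-byte field, which would be consistent with the "03" shown in the error message.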

Attachment: randy_urresearch.mrk (28.0 KB)
Comment 1 by project member [email protected], Jul 21, 2011

Nate reported a possibly similar failure with URResearch/IR+. The failure in IR+ was caused by the byte order mark embedded in the file. I'm guessing the same error would occur with XC, since it uses marc4j as well. If so, Nate has offered to give some advice on how to fix it. Please let me know whether it is the same problem so we can get the right discussions going.

Comment 2 by project member [email protected], Jul 21, 2011

Yes, this is the same issue that Nate encountered. I just spoke w/ him. He modified the file using MarcEdit to make it work. Steps involved:

  1. Open MarcEdit (I used version 5.5.4218.36332)
  2. File -> MARC Tools -> MarcBreaker
    a) Input File: point to attached file
    b) Output File: give it a new name
    c) check "Translate to UTF-8"
    d) click "Execute"
  3. Open the new file in MARCEditor
    a) File -> Compile file into MARC
    b) save the file somewhere (this file will then work in the oai-toolkit)
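
As a possible scripted alternative to the manual steps above, the BOM could be removed before the file is handed to the toolkit. The Python sketch below is only an assumption about what a workaround might look like: `strip_utf8_bom` is a hypothetical helper, it removes nothing but the three leading bytes, and it does not perform the "Translate to UTF-8" conversion that MarcEdit does.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(src: Path, dst: Path) -> bool:
    """Copy src to dst, dropping a leading UTF-8 BOM (EF BB BF) if present.

    Returns True if a BOM was found and removed, False otherwise.
    """
    data = src.read_bytes()
    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        # Skip only the three BOM bytes; the rest of the file is copied unchanged.
        data = data[len(codecs.BOM_UTF8):]
    dst.write_bytes(data)
    return had_bom
```

For example, `strip_utf8_bom(Path("input.mrc"), Path("input_nobom.mrc"))` would write a BOM-free copy (both file names are placeholders). Writing to a separate output file keeps the original intact in case the result still fails to load.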