Specification document on the format of the transcription (full list of XML tags) #8

Open
fabianoP opened this issue Apr 16, 2018 · 4 comments
Labels
critical need special attention

Comments

@fabianoP

To date we don't have the specification document on the transcription format from Omeka. We need to understand as soon as possible if you want to use only and or add more tags.

This input is necessary for the continuation of the project.

@fabianoP fabianoP added the critical need special attention label Apr 16, 2018
@fabianoP fabianoP added this to the Transcription specification document milestone Apr 16, 2018
@fabianoP
Author

Monika will go through the PDF file with Susan and will come up with a definitive solution in a few weeks.

@MonikaBarget
Collaborator

This is the transcription information which we currently give to volunteers. It contains all the XML tags currently displayed in OMEKA as buttons:

http://letters1916.maynoothuniversity.ie/images/ProofingXMLGuidelines.pdf

Monika has copied all these tags into a spreadsheet with further explanations:

https://docs.google.com/spreadsheets/d/1pC4KEqc8TNwfzwQoEjLgISi4Xfx95BGLSr_RrPS2IEI/edit#gid=18408623

Susan was assigned the task of letting Fabiano know which tags can be made invisible in the current OMEKA and which tags should not be added to the new system.

@MonikaBarget
Collaborator

MonikaBarget commented Apr 23, 2018

Monika spoke to Susan on the phone and explained the updated spreadsheet to her. Susan will take care of the XML tags; Monika will finalize the list of special UTF-8 characters.

@MonikaBarget MonikaBarget changed the title from "Specification document on the format of the transcription" to "Specification document on the format of the transcription (full list of XML tags)" Apr 24, 2018
@MonikaBarget
Collaborator

Susan and Monika have gone through the full list of tags currently used by volunteer transcribers AND tags added later on by Letters admins:

https://docs.google.com/spreadsheets/d/1pC4KEqc8TNwfzwQoEjLgISi4Xfx95BGLSr_RrPS2IEI/edit#gid=18408623

None of the tags I crossed out are needed for the Letters project, neither in the transcription desk nor in the background.
Tags we wish to implement via a form in the future have a note in blue.
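
Once the list of retained tags is final, existing transcriptions could be checked against it. The following is only a minimal sketch under assumptions: the function name `unexpected_tags` and the example tag names are hypothetical and are not taken from the spreadsheet, and it assumes each transcription is available as a single well-formed XML document.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def unexpected_tags(xml_text, allowed_tags):
    """Count elements whose name is not in the agreed allow-list.

    `allowed_tags` is a set of element names without namespace prefix.
    The real list would come from the project spreadsheet; the names
    used in the example below are placeholders only.
    """
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{uri}name"; keep only "name".
    counts = Counter(element.tag.rsplit("}", 1)[-1] for element in root.iter())
    return {tag: n for tag, n in counts.items() if tag not in allowed_tags}


# Hypothetical example: "foo" is not on the (made-up) allow-list.
sample = "<letter><p>Hi <hi>there</hi></p><foo/><foo/></letter>"
print(unexpected_tags(sample, {"letter", "p", "hi"}))
```

A report like this would only show where crossed-out tags are still in use; deciding what to do with them remains an editorial decision.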

Besides, there is one important issue that still needs to be discussed with @stavrosangelis and @fabianoP:

At the moment, a "change" tag is inserted automatically whenever a transcriber saves a file. All changes are tracked in multiple "change" tags throughout the XML files, which makes the files very long and hard for humans to read.

Susan would prefer to implement version control elsewhere, but Stavros has indicated that this change could mess up the XML structure and cause instability.

This issue should therefore be clarified in the final meeting.
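
To make the discussion more concrete, here is a rough sketch of what keeping the revision history outside the transcription could look like. It is an illustration under assumptions, not the current OMEKA behaviour and not an agreed design: the function name `split_change_history`, the `who` attribute, and the sample document are hypothetical, and the sketch does not address the stability concern raised by Stavros.

```python
import xml.etree.ElementTree as ET


def _local_name(tag):
    # ElementTree writes namespaced tags as "{uri}name"; keep only "name".
    return tag.rsplit("}", 1)[-1]


def _strip_changes(parent, changes):
    previous = None  # last sibling that was kept
    for child in list(parent):
        if _local_name(child.tag) == "change":
            # Text following the removed element must stay in the document.
            if child.tail:
                if previous is None:
                    parent.text = (parent.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            child.tail = None
            parent.remove(child)
            changes.append(ET.tostring(child, encoding="unicode"))
        else:
            _strip_changes(child, changes)
            previous = child


def split_change_history(xml_text):
    """Return (cleaned_xml, changes): the document without "change"
    elements, plus the removed elements serialized in document order."""
    root = ET.fromstring(xml_text)
    changes = []
    _strip_changes(root, changes)
    return ET.tostring(root, encoding="unicode"), changes


# Hypothetical sample; real files will have a different structure.
sample = '<letter><p>Dear <change who="a">v1</change>John</p><change who="b"/></letter>'
cleaned, history = split_change_history(sample)
print(cleaned)   # <letter><p>Dear John</p></letter>
print(len(history))  # 2
```

The removed entries could then be stored in a separate log or database table, so that the transcription itself stays short while the history remains available.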

Pádraig needs to give feedback on the "person name" and "place name" tags.

Issues assigned to Pádraig and @stavrosangelis are also marked in the XML spreadsheet.

As agreed in the meeting on 1 May 2018, UTF-8 characters will be added manually: users are to copy characters from our list of valid UTF-8 characters, provided either as a PDF or as a linked page on WordPress.
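
Because characters are pasted in by hand, a simple check against the agreed list could catch stray symbols before a transcription is saved. The sketch below is an assumption-laden illustration: the function name `find_invalid_characters` is hypothetical, the allowed characters in the example are placeholders rather than the project's actual list, and it assumes that all plain ASCII characters are acceptable.

```python
import unicodedata


def find_invalid_characters(text, allowed_special):
    """Return (offset, character, Unicode name) for every non-ASCII
    character in `text` that is not in `allowed_special`."""
    allowed = set(allowed_special)
    problems = []
    for offset, char in enumerate(text):
        if char.isascii() or char in allowed:
            continue
        problems.append((offset, char, unicodedata.name(char, "UNKNOWN")))
    return problems


# Placeholder allow-list: the pound sign and the one-half fraction.
print(find_invalid_characters("£5 and ½ lb → x", "£½"))
```

Reporting the Unicode name alongside the offset makes it easier for a proofreader to tell visually similar characters apart.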
