Post Alignment Change Proposal
Christopher Klapp edited this page Nov 3, 2017 · 9 revisions
- We separate concerns by continuing to store target language verses as strings.
- The extra data in the aligned USFM 3 project should be persisted in the alignment reducer.
- Create a script or function that parses data from USFM 3 to the desired Bible resource format used in tC (Chapter, Verse, word objects format).
- Some, but not all, Gateway Language Resource Bibles should now have alignment data.
- Importing Resource Bibles needs to maintain alignment data.
- Store as an array of word objects, as is done for Primary Languages such as Greek/Hebrew.
- See USFM.js changes
- Projects import as regular USFM, whether through tS or tC.
- Projects store targetLanguage verses as strings.
- Can be aligned and store alignment data separately.
- Alignment data is to be exportable as USFM 3.
- Projects import as USFM 3 with embedded alignment data in the words object.
- Parsed alignment data is persisted via alignment reducer.
- Parsed verse data could be persisted as a string to maintain backward compatibility.
- Using tools to edit the verse can modify the string but would invalidate some alignments.
- Alignment data is to be exportable as USFM 3.
- Alignment data matches the BHP version used in the app.
- Uses of the alignment data such as scripture pane and alignment tool need to gracefully handle changed BHP data.
- Verses edited in tools like Autographa and translationWords invalidate alignment data.
- App and Tools need to gracefully handle invalidated alignments due to verse edits.
- Export needs to be able to validate alignments.
- Current parsing doesn't maintain punctuation, only word objects.
- Bidirectional support for mixing word markers and non-word markers.
- Needs to maintain data like footnote markers and footnote text on import/export.
- Needs to maintain markers like quotes that are inline and have verse text in them.
- Preserve punctuation through support of mixed word markers and text
- Support for all USFM markers
- Add as few test use cases as possible to have good coverage of issues.
- Inline Markers need to be maintained in the target language verse text to ensure they are not lost.
- Inline Markers need to be filtered dynamically when handled in Alignment tool.
- Footnote text should be filtered out, other marker data should be left in but not the marker itself.
- This allows the alignment to function on the verse text but bypasses the non-verse text.
- There may be other markers to be treated as footnotes, but not sure which ones are used.
- Handle footnotes in a way that can be extended in case others arise.
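The filtering rules above could be sketched roughly as follows. This is only an illustration: `verseTextForAlignment`, `HIDDEN_CONTENT_MARKERS`, and the regular expressions are assumptions made for this example, not existing tC code, and the marker patterns cover only simple paired and unpaired markers.

```javascript
// Hypothetical sketch: names and patterns are assumptions, not tC code.
// Markers whose whole content is hidden from alignment. Extend this list
// if other footnote-like markers turn out to be in use.
const HIDDEN_CONTENT_MARKERS = ['f'];

function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function verseTextForAlignment(verseText, hiddenMarkers = HIDDEN_CONTENT_MARKERS) {
  let text = verseText;
  // 1. Drop footnote-like spans entirely, e.g. "\f + \ft note text\f*".
  for (const marker of hiddenMarkers) {
    const m = escapeRegExp(marker);
    const span = new RegExp('\\\\' + m + '\\s[\\s\\S]*?\\\\' + m + '\\*', 'g');
    text = text.replace(span, '');
  }
  // 2. Remove the remaining markers but keep the text they wrap.
  text = text.replace(/\\\+?[a-z]+\d*\*/g, '');   // closing markers, e.g. "\qt*"
  text = text.replace(/\\\+?[a-z]+\d*\s?/g, '');  // opening markers, e.g. "\qt "
  // 3. Normalize whitespace left behind by the removals.
  return text.replace(/\s+/g, ' ').trim();
}
```

The stored verse string is left untouched; the filtered text is computed only when the Alignment tool needs it, so markers are never lost.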
- Primary Bible to be the BHP, the Primary Language.
- Highlighting phrases from the tool needs to be the Primary Language.
- Gracefully handle showing or not showing word details (lexicon information) for non-Greek Bible resources.
- Add support for non-word marker objects that come from non-word markers.
- Render the ULB and BHP based on array of objects
- Pass the highlighted/quote BHP word object (word, strongs, occurrence, occurrences) to both BHP and ULB.
- Highlight the word object(s) in the BHP that matches the BHP quote.
- Highlight the word object(s) in the ULB/others that are aligned to BHP quote.
- Languages without alignment data or no aligned matches found will not highlight any words.
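A minimal sketch of the highlighting lookup described above, assuming alignments use the `topWords`/`bottomWords` shape shown later on this page and that each word object carries `word` and `occurrence`. The function name `findAlignedWords` is hypothetical.

```javascript
// Hypothetical sketch: returns the Gateway Language word objects aligned to
// the given Primary Language quote, or an empty array when there is no
// alignment data or no match (so nothing is highlighted).
function findAlignedWords(alignments, quote) {
  if (!Array.isArray(alignments)) return [];
  const match = alignments.find((alignment) =>
    alignment.topWords.some(
      (top) => top.word === quote.word && top.occurrence === quote.occurrence
    )
  );
  return match ? match.bottomWords : [];
}
```

Matching on both `word` and `occurrence` keeps the two instances of a repeated word such as υἱοῦ from highlighting each other's alignments.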
- Maintaining target language as strings ensures backward compatibility.
- Verse edits may invalidate alignments but may not be a concern here.
- Footnotes and inline markers may need to be filtered on view but not on edit to ensure they are not lost.
- Selections made can populate the word alignment reducer.
- Validate alignments as changes may have invalidated some.
- The verse may have been edited since last aligned.
- Primary Language version may have been updated.
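One way the target-side validation might look, as a sketch only. It assumes each `bottomWords` entry stores `word`, `occurrence`, and `occurrences`, and that a simple letter/number tokenizer is good enough; `countWord` and `invalidAlignments` are hypothetical names. The same idea could be applied to `topWords` against an updated Primary Language verse.

```javascript
// Hypothetical sketch: flags alignments whose target words no longer match
// the current verse string.
function countWord(verseText, word) {
  const tokens = verseText.split(/[^\p{L}\p{N}']+/u).filter(Boolean);
  return tokens.filter((token) => token === word).length;
}

function invalidAlignments(alignments, verseText) {
  return alignments.filter((alignment) =>
    alignment.bottomWords.some(
      (bottom) =>
        // The stored total must still match the verse, and the stored
        // occurrence must still be within that total.
        countWord(verseText, bottom.word) !== bottom.occurrences ||
        bottom.occurrence > bottom.occurrences
    )
  );
}
```

An empty result means every stored alignment still lines up with the verse text; a non-empty one lists the alignments the app needs to handle gracefully.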
- The tool needs an overhaul to base check off of the Strongs number associated with the check.
- Quote needs to be updated to be the Primary Language word via Strongs number instead of ULB.
- Check Card needs to display the Primary Language word and Strongs number along with GL.
- ContextIds in GroupData needs to be based on Strongs and Greek word in Quote.
- Verify how to handle multiple occurrences of a Strongs number in the Greek.
- False positives where strongs number may occur.
- Scripture Pane Greek highlighting to be done via Strongs number in check and should be present in the contextId, potentially the Greek word as the quote as well.
- Check Info Card to additionally pass Greek info into it and needs to be designed on how to display.
- Show greek lexical information in the Check Info card or anywhere else?
- Remove tons of defunct code no longer in use.
- Folders: filters/js/scripts/translation_words/utils
- Files: loader.js/USFMParse.js
- Fix linting errors
- The previous tool checking menu is showing English words from the article.
- Is this still the expected behavior? It seems as though it should be.
- Support for showing translated article titles in the menu, when source tool articles are translated.
- May need core helpers for common functionality.
- Pivot a verse that is an array of word objects to a string and alignment data.
- Pivot a verse that is a string and alignment data to an array of word objects.
- Finding Gateway Language words to be highlighted with the provided Primary Language word.
- Convert verses that are array of word objects with punctuation
- CSV Export - to prepare alignment data for CSV export
- Project Details Helper - Tool Details - Calculate Progress to handle alignment data for word alignment tool (maybe relocate?).
- Project Validation - validate alignments?
- Selection Helpers - occurrence/occurrences in a string... if used on ULB that is going to change.
- May not be needed since word objects include occurrence/occurrences.
- CSV Export - actions to export alignment data if needed
- GroupDataActions - explore changes to completed verse alignment not from empty word bank
- Import Local - handling aligned USFM 3 files and pivoting into verse strings and alignment data
- word markers with alignment data populates alignments
- text/word markers without alignment data populate the word bank.
- Merge Conflict Actions - support aligned usfm 3 files
- Resource Loading - always load/generate alignment reducer for uses with selections
- Project Details - Do we want to show if alignment data is available?
- Project Validation Actions - check/validate alignment data
- Project Selection Actions - load/generate alignment data
- Resources Actions - always load/generate alignment data for other uses.
- Selections Actions - leverage selections as alignments and alignments as selections
- Target Language Actions - update USFM import/export to handle alignment and USFM 3.
- USFM Export Actions - support alignment data and USFM 3, also support USFM 2 w/o alignment.
- Word Alignment Actions - Generate selections when alignments are created?
- May need new actions for alignment prediction.
- Rename BodyUIActions to something more specific.
- LoaderActions - see if sendProgressForKey is really being used. Do we want to finish this?
- ModalActions - see if it is needed and remove, tools has its own system.
- OnlineModeAction - see if it is needed and remove
- Recent Projects Actions vs My Projects, do we need "recent" projects or just My Projects?
- Remove all code and components related to old modal.
- SideBar Actions - Update naming conventions for consistency to GroupMenu and fix it.
- No changes needed from what we can tell.
- There are quite a few places in the code base that join the BHP words into a string.
- Create a helper for doing this so that it lives in a single place, and extend it for things other than words/punctuation.
- resourcesReducer.bibles.targetLanguage[1][1] #=> "Jesus, wept."
[
{ type: 'word', text: 'Jesus', attributes: '...' },
{ type: 'text', text: ', ' },
{ type: 'word', text: 'wept', attributes: '...' },
{ type: 'text', text: '.' },
{ type: 'footnote', text: 'Here is a foot note that needs to be preserved but not words in the verse.' }
]
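A minimal sketch of such a helper, assuming verse objects shaped like the array above and assuming that `text` objects carry their own whitespace (as `', '` does). The name `verseObjectsToString` is hypothetical.

```javascript
// Hypothetical sketch: renders a verse string from an array of verse objects.
// Only words and plain text are rendered; footnotes and any other object
// types are skipped so they are preserved in the data but not shown inline.
const RENDERED_TYPES = ['word', 'text'];

function verseObjectsToString(verseObjects) {
  return verseObjects
    .filter((obj) => RENDERED_TYPES.includes(obj.type))
    .map((obj) => obj.text)
    .join('');
}

const exampleVerse = [
  { type: 'word', text: 'Jesus', attributes: '...' },
  { type: 'text', text: ', ' },
  { type: 'word', text: 'wept', attributes: '...' },
  { type: 'text', text: '.' },
  { type: 'footnote', text: 'Here is a foot note that needs to be preserved but not words in the verse.' },
];
console.log(verseObjectsToString(exampleVerse)); // "Jesus, wept."
```

Joining with an empty string rather than a space avoids output such as `Jesus , wept .` once punctuation is stored as separate objects.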
- A new tool may need to be created to keep the code maintainable.
- May need a new reducer to handle alignment prediction look-ups efficiently.
- Alignment prediction can be included in Verse Check tool to predict selections.
- Should be able to export a USFM 3 project with aligned data information.
- Create a helper to find word alignments for USFM3 export based on the word order of target language verse.
- Pivot the data to match that of what USFM 3 library would provide when parsing USFM 3.
- Example output:
\id TIT
\c 1
\v 1
\w The book of|x-bhp-phrase="βίβλος" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*
\w the genealogy of|x-bhp-phrase="γενέσεως" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*
\w Jesus|x-bhp-phrase="ἰησοῦ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*
\w Christ|x-bhp-phrase="χριστοῦ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*,
\w son of|x-bhp-phrase="υἱοῦ" x-bhp-occurrence="1/2" x-occurrence="1/2" \w*
\w David|x-bhp-phrase="δαυεὶδ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*,
\w son of|x-bhp-phrase="υἱοῦ" x-bhp-occurrence="2/2" x-occurrence="2/2" \w*
\w Abraham|x-bhp-phrase="ἀβραάμ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*.
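A rough sketch of how lines like the example output might be produced from word objects. The object keys (`bhp-phrase`, `bhp-occurrence`, `occurrence`) follow the parsed-data example on this page; the function names are hypothetical, and only `word` and `text` objects are handled here.

```javascript
// Hypothetical sketch: serializes verse objects into USFM 3 \w markers.
function toUsfm3Word(obj) {
  const attrs = [];
  // Unaligned words simply omit the x-bhp-* attributes.
  if (obj['bhp-phrase']) {
    attrs.push(`x-bhp-phrase="${obj['bhp-phrase']}"`);
    attrs.push(`x-bhp-occurrence="${obj['bhp-occurrence']}"`);
  }
  attrs.push(`x-occurrence="${obj.occurrence}"`);
  return `\\w ${obj.text}|${attrs.join(' ')} \\w*`;
}

function verseToUsfm3(verseNumber, verseObjects) {
  const lines = [`\\v ${verseNumber}`];
  for (const obj of verseObjects) {
    if (obj.type === 'word') {
      lines.push(toUsfm3Word(obj));
    } else if (obj.type === 'text') {
      // Punctuation is appended directly after the preceding marker.
      lines[lines.length - 1] += obj.text;
    }
    // Footnotes and other marker types would need their own serialization.
  }
  return lines.join('\n');
}
```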
- Should be able to import a USFM 3 project with aligned data information, see above.
- This project should be able to generate a target bible as strings and alignment data matching app expectations.
- The data coming from USFM.js should look something like:
{
"1": {
"1": [
{
"type": "word",
"text": "The book of",
"bhp-phrase": "βίβλος",
"bhp-occurrence": "1/1",
"occurrence": "1/1"
},
{
"type": "word",
"text": "the genealogy of",
"bhp-phrase": "γενέσεως",
"bhp-occurrence": "1/1",
"occurrence": "1/1"
},
{
"type": "word",
"text": "Jesus",
"bhp-phrase": "ἰησοῦ",
"bhp-occurrence": "1/1",
"occurrence": "1/1"
},
{
"type": "word",
"text": "Christ",
"bhp-phrase": "χριστοῦ",
"bhp-occurrence": "1/1",
"occurrence": "1/1"
},
{
"type": "text",
"text": ","
},
{
"type": "word",
"text": "son of",
"bhp-phrase": "υἱοῦ",
"bhp-occurrence": "1/2",
"occurrence": "1/2"
},
{
"type": "word",
"text": "David",
"bhp-phrase": "δαυεὶδ",
"bhp-occurrence": "1/1",
"occurrence": "1/1"
},
{
"type": "text",
"text": ","
},
{
"type": "word",
"text": "son of",
"bhp-phrase": "υἱοῦ",
"bhp-occurrence": "2/2",
"occurrence": "2/2"
},
{
"type": "word",
"text": "Abraham",
"bhp-phrase": "ἀβραάμ",
"bhp-occurrence": "1/1",
"occurrence": "1/1"
},
{
"type": "text",
"text": "."
},
{
"type": "f",
"text": "This is a footnote about this verse, It shouldn't be rendered inline with the text."
}
]
}
}
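On import, the parsed objects above could be split into alignment data and a word bank along these lines. This is a sketch under assumptions: `partitionVerseObjects` is a hypothetical name, and the returned shapes are illustrative rather than the actual alignment reducer state.

```javascript
// Hypothetical sketch: word markers with alignment data populate alignments;
// word markers without alignment data populate the word bank. Non-word
// objects (punctuation, footnotes) carry no alignment data and are skipped.
function partitionVerseObjects(verseObjects) {
  const alignments = [];
  const wordBank = [];
  for (const obj of verseObjects) {
    if (obj.type !== 'word') continue;
    if (obj['bhp-phrase']) {
      alignments.push({
        top: { phrase: obj['bhp-phrase'], occurrence: obj['bhp-occurrence'] },
        bottom: { phrase: obj.text, occurrence: obj.occurrence },
      });
    } else {
      wordBank.push({ phrase: obj.text, occurrence: obj.occurrence });
    }
  }
  return { alignments, wordBank };
}
```

The result could then be dispatched to the alignment reducer, while the verse string itself is generated separately and stored as before.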
- Create a helper if not already created that renders alignment data from the array of objects.
- Alignment example for son of && υἱοῦ would be: { topWords: [{ word: 'υἱοῦ', ... }], bottomWords: [{ word: 'son', ... }, { word: 'of', ... }] }
- Splitting words would have to update occurrence(s) when pivoting.
- Ex. although the phrase son of occurs twice and son occurs twice, of occurs four times.
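The occurrence update when splitting phrases into words could be sketched as below. The input shape (`{ top, bottom }` phrase pairs in verse order) and the name `pivotToWordAlignments` are assumptions for illustration. Matching is case-sensitive, so `The` and `the` are counted separately.

```javascript
// Hypothetical sketch: splits aligned phrases into single words and
// recomputes occurrence/occurrences per word across the whole verse.
function pivotToWordAlignments(phraseAlignments) {
  const split = phraseAlignments.map((a) => ({
    top: a.top,
    words: a.bottom.split(/\s+/).filter(Boolean),
  }));

  // First pass: total count of each word in the verse.
  const totals = new Map();
  for (const { words } of split) {
    for (const w of words) totals.set(w, (totals.get(w) || 0) + 1);
  }

  // Second pass: running count gives each word its occurrence number.
  const seen = new Map();
  return split.map(({ top, words }) => ({
    topWords: [{ word: top }],
    bottomWords: words.map((w) => {
      seen.set(w, (seen.get(w) || 0) + 1);
      return { word: w, occurrence: seen.get(w), occurrences: totals.get(w) };
    }),
  }));
}
```

For the example verse, the second `son of` phrase yields `son` as occurrence 2 of 2 and `of` as occurrence 4 of 4.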
- Create a helper if not already created that renders a text string from the array of objects.
- Target Language Verse would be:
The book of the genealogy of Jesus Christ, son of David, son of Abraham.
- Something like verseArray.map((el) => el.word).join(" "); in the target language actions that create the target language bible from USFM.
- Add support for added object types for punctuation and markers such as footnotes.