Post Alignment Change Proposal

Alignment Data Changes Overview

We separate concerns by continuing to store target language verses as strings.
The extra data in the aligned USFM 3 project should be persisted in the alignment reducer.

Alignment Data Flow

https://docs.google.com/drawings/d/1H5SqAp1RKRKq7bowC1jpRGZrRrHg5RKXEalZAJgqMfU/edit

tC_Resources Importing

Create a script or function that parses USFM 3 data into the Bible resource format used in tC (chapter, verse, word objects).

Resource Bibles

Gateway Language Resource Bibles should now have alignment data, though not all of them will.
Importing Resource Bibles needs to maintain alignment data.
Store each verse as an array of word objects, as is done for Primary Languages such as Greek/Hebrew.
- See USFM.js changes

Previous Projects

Projects import as regular USFM, whether through tS or tC.
Projects store targetLanguage verses as strings.
These projects can be aligned, with the alignment data stored separately.
Alignment data is to be exportable as USFM 3.

Aligned Projects

Projects import as USFM 3 with alignment data embedded in the word objects.
Parsed alignment data is persisted via alignment reducer.
Parsed verse data could be persisted as a string to maintain backward compatibility.
Using tools to edit the verse can modify the string but would invalidate some alignments.
Alignment data is to be exportable as USFM 3.

Invalidated Alignments

Alignment data matches the BHP version used in the app.
Uses of the alignment data such as scripture pane and alignment tool need to gracefully handle changed BHP data.
Verses edited in tools like Autographa and translationWords invalidate alignment data.
App and Tools need to gracefully handle invalidated alignments due to verse edits.
Export needs to be able to validate alignments.

USFM.js

Current parsing doesn't maintain punctuation, only word objects.

Non-Word Markers

Bidirectional support for mixing word markers and non-word markers.
- Needs to maintain data like footnote markers and footnote text on import/export.
- Needs to maintain markers like quotes that are inline and have verse text in them.
- Preserve punctuation through support of mixed word markers and text
- Support for all USFM markers

Tests

Add the smallest set of test cases that still gives good coverage of these issues.

Inline Markers such as Footnotes

Inline Markers need to be maintained in the target language verse text to ensure they are not lost.
Inline Markers need to be filtered dynamically when handled in Alignment tool.
- Footnote text should be filtered out, other marker data should be left in but not the marker itself.
- This allows the alignment to function on the verse text but bypasses the non-verse text.
- There may be other markers to be treated as footnotes, but it is not yet clear which ones are used.
- Handle footnotes in a way that can be extended in case others arise.
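
The filtering rules above could be sketched roughly as follows. This is only an illustration: the object shape is borrowed from the Preserving Punctuation example, and the names NON_VERSE_TYPES and filterForAlignment are hypothetical, not existing tC code.

```javascript
// Hypothetical set of object types whose text is not verse text.
// Extend this set if other footnote-like markers turn up.
const NON_VERSE_TYPES = new Set(['footnote']);

// Returns what the alignment tool should see: footnote-like objects are
// dropped, everything else keeps its text but loses the marker (tag).
// The input array is not modified, so nothing is lost from the stored verse.
function filterForAlignment(verseObjects) {
  return verseObjects
    .filter((obj) => !NON_VERSE_TYPES.has(obj.type))
    .map(({ tag, ...rest }) => rest);
}

const verse = [
  { tag: 'w', type: 'word', text: 'Jesus' },
  { type: 'text', text: ', ' },
  { tag: 'w', type: 'word', text: 'wept' },
  { type: 'text', text: '.' },
  { tag: 'f', type: 'footnote', text: 'A footnote.' },
];

const filtered = filterForAlignment(verse);
console.log(filtered.map((obj) => obj.text).join('')); // prints "Jesus, wept."
```

Keeping the footnote-like types in one set is one way to leave room for other markers later without touching the filter itself.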

Scripture Pane

Primary Bible to be the BHP, the Primary Language.
Highlighting phrases from the tool needs to be based on the Primary Language.
Gracefully handle showing or not showing word details (lexicon information) for non-Greek Bible resources.
Add support for non-word marker objects that come from non-word markers.

Proposed workflow:

Render the ULB and BHP from arrays of objects.
Pass the highlighted/quote BHP word object (word, strongs, occurrence, occurrences) to both BHP and ULB.
Highlight the word object(s) in the BHP that matches the BHP quote.
Highlight the word object(s) in the ULB/others that are aligned to BHP quote.
Languages without alignment data, or with no aligned matches, will not highlight any words.
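
A minimal sketch of the lookup step in this workflow, assuming the gateway language word objects carry the bhp-phrase and bhp-occurrence fields shown in the Importing USFM 3 example. The function name findAlignedIndexes and the exact shape of the quote object are assumptions made for illustration.

```javascript
// Returns the indexes of the gateway language objects that are aligned
// to the given BHP quote. An empty array means nothing is highlighted.
function findAlignedIndexes(verseObjects, quote) {
  const key = `${quote.occurrence}/${quote.occurrences}`;
  return verseObjects.reduce((indexes, obj, index) => {
    const aligned =
      obj.type === 'word' &&
      obj['bhp-phrase'] === quote.word &&
      obj['bhp-occurrence'] === key;
    return aligned ? indexes.concat(index) : indexes;
  }, []);
}

const ulbVerse = [
  { type: 'word', text: 'son of', 'bhp-phrase': 'υἱοῦ', 'bhp-occurrence': '1/2', occurrence: '1/2' },
  { type: 'word', text: 'David', 'bhp-phrase': 'δαυεὶδ', 'bhp-occurrence': '1/1', occurrence: '1/1' },
  { type: 'text', text: ',' },
  { type: 'word', text: 'son of', 'bhp-phrase': 'υἱοῦ', 'bhp-occurrence': '2/2', occurrence: '2/2' },
];

// Second occurrence of υἱοῦ in the BHP verse.
const quote = { word: 'υἱοῦ', occurrence: 2, occurrences: 2 };
console.log(findAlignedIndexes(ulbVerse, quote)); // index 3 only

// A verse without alignment data yields no highlights.
console.log(findAlignedIndexes([{ type: 'text', text: 'No alignment data' }], quote));
```

Real data may need Unicode normalization before the Greek strings are compared; that is left out here.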

Verse Check

Maintaining target language as strings ensures backward compatibility.
Verse edits may invalidate alignments but may not be a concern here.
Footnotes and inline markers may need to be filtered on view but not on edit to ensure they are not lost.
Selections made can populate the word alignment reducer.

Word Alignment Tool

Validate alignments as changes may have invalidated some.
- The verse may have been edited since last aligned.
- Primary Language version may have been updated.
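
One possible shape for the verse-edit half of this validation is sketched below. It assumes the alignment structure from the Data Pivot section (topWords and bottomWords, each word carrying numeric occurrence and occurrences); countOccurrences and isAlignmentValid are hypothetical names. A similar check of topWords against the current Primary Language verse would cover the second case.

```javascript
// Counts how many times `word` appears as a whole token in the verse text.
function countOccurrences(verseText, word) {
  const tokens = verseText.match(/[\p{L}\p{N}]+/gu) || [];
  return tokens.filter((token) => token === word).length;
}

// An alignment is treated as valid only if every bottom (target) word
// still occurs in the verse as often as it did when it was aligned.
function isAlignmentValid(alignment, verseText) {
  return alignment.bottomWords.every(({ word, occurrence, occurrences }) => {
    const count = countOccurrences(verseText, word);
    return count === occurrences && occurrence <= count;
  });
}

const alignment = {
  topWords: [{ word: 'υἱοῦ', occurrence: 1, occurrences: 2 }],
  bottomWords: [
    { word: 'son', occurrence: 1, occurrences: 2 },
    { word: 'of', occurrence: 3, occurrences: 4 },
  ],
};

const original = 'The book of the genealogy of Jesus Christ, son of David, son of Abraham.';
const edited = 'The book of the genealogy of Jesus Christ, son of David, descendant of Abraham.';

console.log(isAlignmentValid(alignment, original)); // true
console.log(isAlignmentValid(alignment, edited)); // false, "son" now occurs once
```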

TranslationWords Tool

The tool needs an overhaul to base each check on the Strongs number associated with the check.
The quote needs to be updated to be the Primary Language word, found via the Strongs number, instead of the ULB word.
The Check Card needs to display the Primary Language word and Strongs number along with the GL.
ContextIds in GroupData need to be based on the Strongs number, with the Greek word in the quote.
How to handle multiple occurrences of a Strongs number in the Greek still needs to be verified.
There may be false positives wherever the Strongs number occurs.
Scripture Pane Greek highlighting should be done via the Strongs number in the check, which should be present in the contextId, potentially with the Greek word as the quote as well.
The Check Info Card additionally needs Greek info passed into it, and how to display that info still needs to be designed.
Should Greek lexical information be shown in the Check Info Card, or anywhere else?

Other Changes

Remove the large amount of defunct code that is no longer in use.
- Folders: filters/js/scripts/translation_words/utils
- Files: loader.js/USFMParse.js
Fix linting errors

Tools GroupData Menu

The previous tool checking menu is showing English words from the article.
Is this still the expected behavior? It seems as though it should be.
Support for showing translated article titles in the menu, when source tool articles are translated.

Core Helpers

May need core helpers for common functionality.
Pivot a verse that is an array of word objects to a string and alignment data.
Pivot a verse that is a string and alignment data to an array of word objects.
Finding Gateway Language words to be highlighted with the provided Primary Language word.
Convert verses that are array of word objects with punctuation
CSV Export - to prepare alignment data for CSV export
Project Details Helper - Tool Details - Calculate Progress to handle alignment data for word alignment tool (maybe relocate?).
Project Validation - validate alignments?
Selection Helpers - occurrence/occurrences in a string... if used on ULB that is going to change.
- May not be needed since word objects include occurrence/occurrences.

Core Actions

CSV Export - actions to export alignment data if needed
GroupDataActions - explore changes to completed verse alignment not from empty word bank
Import Local - handling aligned USFM 3 files and pivoting into verse strings and alignment data
- word markers with alignment data populate alignments.
- text/word markers without alignment data populate the word bank.
Merge Conflict Actions - support aligned USFM 3 files
Resource Loading - always load/generate alignment reducer for uses with selections
Project Details - Do we want to show if alignment data is available?
Project Validation Actions - check/validate alignment data
Project Selection Actions - load/generate alignment data
Resources Actions - always load/generate alignment data for other uses.
Selections Actions - leverage selections as alignments and alignments as selections
Target Language Actions - update USFM import/export to handle alignment and USFM 3.
USFM Export Actions - support alignment data and USFM 3, also support USFM 2 w/o alignment.
Word Alignment Actions - Generate selections when alignments are created?
May need new actions for alignment prediction.

Other Changes

Rename BodyUIActions to something more specific.
LoaderActions - check whether sendProgressForKey is really being used. Do we want to finish this?
ModalActions - see if it is needed and remove, tools has its own system.
OnlineModeAction - see if it is needed and remove
Recent Projects Actions vs My Projects, do we need "recent" projects or just My Projects?
Remove all code and components related to old modal.
SideBar Actions - Update naming conventions for consistency to GroupMenu and fix it.

Core Reducers

No changes needed from what we can tell.

Preserving Punctuation

There are quite a few places in the code base that join the BHP words into a string.
Create a helper for this so that the logic lives in a single place, and extend it to handle things other than words and punctuation.
resourcesReducer.bibles.targetLanguage[1][1] #=> "Jesus, wept."

[
  { tag: 'w', type: 'word', text: 'Jesus', attributes: '...' },
  { type: 'text', text: ', ' },
  { tag: 'w', type: 'word', text: 'wept', attributes: '...' },
  { type: 'text', text: '.' },
  { tag: 'f', type: 'footnote', text: 'Here is a foot note that needs to be preserved but not words in the verse.' }
]
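
A minimal sketch of such a helper, assuming the object shape shown above, where text objects already carry their own whitespace. The names RENDERERS and verseObjectsToString are hypothetical.

```javascript
// Maps each object type to the text it contributes to the verse string.
// Supporting a new type means adding one entry here.
const RENDERERS = {
  word: (obj) => obj.text,
  text: (obj) => obj.text,
  footnote: () => '', // kept in the data, not rendered in the verse string
};

// Joins an array of verse objects into a single verse string.
// Unknown types contribute nothing (an assumption of this sketch).
function verseObjectsToString(verseObjects) {
  return verseObjects
    .map((obj) => {
      const render = RENDERERS[obj.type];
      return render ? render(obj) : '';
    })
    .join('');
}

const verseObjects = [
  { tag: 'w', type: 'word', text: 'Jesus', attributes: '...' },
  { type: 'text', text: ', ' },
  { tag: 'w', type: 'word', text: 'wept', attributes: '...' },
  { type: 'text', text: '.' },
  { tag: 'f', type: 'footnote', text: 'A footnote to preserve.' },
];

console.log(verseObjectsToString(verseObjects)); // prints "Jesus, wept."
```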

Alignment Prediction

A new tool may need to be created to keep the code maintainable.
May need a new reducer to handle alignment prediction look-ups efficiently.
Alignment prediction can be included in Verse Check tool to predict selections.

Exporting USFM 3

Should be able to export a USFM 3 project with alignment data.
Create a helper to find word alignments for USFM 3 export based on the word order of the target language verse.
Pivot the data to match what the USFM 3 library would provide when parsing USFM 3.
- Example output:

\id TIT
\c 1
\v 1
\w The book of|x-bhp-phrase="βίβλος" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*
\w the genealogy of|x-bhp-phrase="γενέσεως" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*
\w Jesus|x-bhp-phrase="ἰησοῦ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*
\w Christ|x-bhp-phrase="χριστοῦ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*,
\w son of|x-bhp-phrase="υἱοῦ" x-bhp-occurrence="1/2" x-occurrence="1/2" \w*
\w David|x-bhp-phrase="δαυεὶδ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*,
\w son of|x-bhp-phrase="υἱοῦ" x-bhp-occurrence="2/2" x-occurrence="2/2" \w*
\w Abraham|x-bhp-phrase="ἀβραάμ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*.
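
As a rough illustration of how verse objects could be serialized into the layout above, here is a sketch that assumes the word object fields from the Importing USFM 3 section and assumes every word carries alignment attributes. The names wordToUsfm and verseToUsfm are hypothetical, and footnotes and other markers are not covered.

```javascript
// Serializes one aligned word object as a USFM 3 \w marker.
function wordToUsfm(obj) {
  return `\\w ${obj.text}|x-bhp-phrase="${obj['bhp-phrase']}" ` +
    `x-bhp-occurrence="${obj['bhp-occurrence']}" ` +
    `x-occurrence="${obj.occurrence}" \\w*`;
}

// Serializes a verse: one \w marker per line, with punctuation appended
// to the line of the word it follows.
function verseToUsfm(verseNumber, verseObjects) {
  const lines = [`\\v ${verseNumber}`];
  verseObjects.forEach((obj) => {
    if (obj.type === 'word') {
      lines.push(wordToUsfm(obj));
    } else if (obj.type === 'text') {
      if (lines.length > 1) {
        lines[lines.length - 1] += obj.text;
      } else {
        lines.push(obj.text); // text before the first word gets its own line
      }
    }
  });
  return lines.join('\n');
}

const exported = verseToUsfm(1, [
  { type: 'word', text: 'David', 'bhp-phrase': 'δαυεὶδ', 'bhp-occurrence': '1/1', occurrence: '1/1' },
  { type: 'text', text: ',' },
  { type: 'word', text: 'son of', 'bhp-phrase': 'υἱοῦ', 'bhp-occurrence': '2/2', occurrence: '2/2' },
]);
console.log(exported);
```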

Importing USFM 3

Should be able to import a USFM 3 project with alignment data, see above.
This project should be able to generate a target bible as strings and alignment data matching app expectations.
- The data coming from USFM.js should look something like:

{
  "1": {
    "1": [
      {
        "type": "word",
        "text": "The book of",
        "bhp-phrase": "βίβλος",
        "bhp-occurrence": "1/1",
        "occurrence": "1/1"
      },
      {
        "type": "word",
        "text": "the genealogy of",
        "bhp-phrase": "γενέσεως",
        "bhp-occurrence": "1/1",
        "occurrence": "1/1"
      },
      {
        "type": "word",
        "text": "Jesus",
        "bhp-phrase": "ἰησοῦ",
        "bhp-occurrence": "1/1",
        "occurrence": "1/1"
      },
      {
        "type": "word",
        "text": "Christ",
        "bhp-phrase": "χριστοῦ",
        "bhp-occurrence": "1/1",
        "occurrence": "1/1"
      },
      {
        "type": "text",
        "text": ","
      },
      {
        "type": "word",
        "text": "son of",
        "bhp-phrase": "υἱοῦ",
        "bhp-occurrence": "1/2",
        "occurrence": "1/2"
      },
      {
        "type": "word",
        "text": "David",
        "bhp-phrase": "δαυεὶδ",
        "bhp-occurrence": "1/1",
        "occurrence": "1/1"
      },
      {
        "type": "text",
        "text": ","
      },
      {
        "type": "word",
        "text": "son of",
        "bhp-phrase": "υἱοῦ",
        "bhp-occurrence": "2/2",
        "occurrence": "2/2"
      },
      {
        "type": "word",
        "text": "Abraham",
        "bhp-phrase": "ἀβραάμ",
        "bhp-occurrence": "1/1",
        "occurrence": "1/1"
      },
      {
        "type": "text",
        "text": "."
      },
      {
        "type": "f",
        "text": "This is a footnote about this verse, It shouldn't be rendered inline with the text."
      }
    ]
  }
}
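
To make the mapping from \w markers to these objects concrete, here is a small regex-based sketch. It is not the USFM.js implementation; parseVerse and both regular expressions are hypothetical, and it only handles \w markers followed by trailing punctuation on the same line.

```javascript
// Matches: \w <text>|<attributes> \w*<trailing punctuation>
const WORD_MARKER = /\\w ([^|\\]+)\|([^\\]*)\\w\*([^\\\r\n]*)/g;
// Matches one x-<key>="<value>" attribute.
const ATTRIBUTE = /x-([\w-]+)="([^"]*)"/g;

// Parses the \w markers of one verse into word and text objects.
function parseVerse(usfm) {
  const objects = [];
  for (const [, text, attributes, trailing] of usfm.matchAll(WORD_MARKER)) {
    const word = { type: 'word', text };
    for (const [, key, value] of attributes.matchAll(ATTRIBUTE)) {
      word[key] = value; // the "x-" prefix is dropped, as in the example data
    }
    objects.push(word);
    if (trailing.trim()) {
      objects.push({ type: 'text', text: trailing.trim() });
    }
  }
  return objects;
}

const usfm = [
  String.raw`\w David|x-bhp-phrase="δαυεὶδ" x-bhp-occurrence="1/1" x-occurrence="1/1" \w*,`,
  String.raw`\w son of|x-bhp-phrase="υἱοῦ" x-bhp-occurrence="2/2" x-occurrence="2/2" \w*`,
].join('\n');

console.log(JSON.stringify(parseVerse(usfm), null, 2));
```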

Data Pivot

Create a helper, if one does not already exist, that renders alignment data from the array of objects.
- Alignment example for son of && υἱοῦ would be:
  - { topWords: [{ word: 'υἱοῦ', ... }], bottomWords: [{ word: 'son', ... }, { word: 'of', ... }] }
- Splitting words would have to update occurrence(s) when pivoting.
  - Ex. the phrase son of occurs twice, so son also occurs twice, but of occurs four times in the verse.
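
The occurrence bookkeeping described above might look roughly like this. It is a sketch under assumptions: pivotToAlignments is a hypothetical name, phrases are split on single spaces, word matching is case-sensitive, and several phrases aligned to the same Primary Language word are not merged into one alignment.

```javascript
// Pivots an array of verse objects into alignment data, splitting each
// phrase into single words and recomputing occurrence/occurrences.
function pivotToAlignments(verseObjects) {
  const phrases = verseObjects.filter((obj) => obj.type === 'word');

  // First pass: total count of each single word across the whole verse.
  const totals = new Map();
  phrases.forEach((phrase) => {
    phrase.text.split(' ').forEach((word) => {
      totals.set(word, (totals.get(word) || 0) + 1);
    });
  });

  // Second pass: running count gives each word its occurrence number.
  const seen = new Map();
  return phrases.map((phrase) => {
    const [occurrence, occurrences] = phrase['bhp-occurrence'].split('/').map(Number);
    const bottomWords = phrase.text.split(' ').map((word) => {
      seen.set(word, (seen.get(word) || 0) + 1);
      return { word, occurrence: seen.get(word), occurrences: totals.get(word) };
    });
    return {
      topWords: [{ word: phrase['bhp-phrase'], occurrence, occurrences }],
      bottomWords,
    };
  });
}

const alignments = pivotToAlignments([
  { type: 'word', text: 'The book of', 'bhp-phrase': 'βίβλος', 'bhp-occurrence': '1/1' },
  { type: 'word', text: 'the genealogy of', 'bhp-phrase': 'γενέσεως', 'bhp-occurrence': '1/1' },
  { type: 'word', text: 'son of', 'bhp-phrase': 'υἱοῦ', 'bhp-occurrence': '1/2' },
  { type: 'text', text: ',' },
  { type: 'word', text: 'son of', 'bhp-phrase': 'υἱοῦ', 'bhp-occurrence': '2/2' },
]);

// The last alignment pairs υἱοῦ (2 of 2) with son (2 of 2) and of (4 of 4).
console.log(JSON.stringify(alignments[3]));
```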

Target Language Verse

Create a helper, if one does not already exist, that renders a text string from the array of objects.
Target Language Verse would be:
- The book of the genealogy of Jesus Christ, son of David, son of Abraham.
Something like verseArray.map((el) => el.text).join(" "); in the target language actions that create the target language bible from USFM.
Add support for the added object types for punctuation and markers such as footnotes.
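
A plain join(" ") would put a space before each punctuation object, so a helper along these lines may be needed. This sketch assumes the object shape from the Importing USFM 3 example, where punctuation objects carry no whitespace; verseObjectsToText is a hypothetical name, and opening punctuation such as a leading quotation mark is not handled.

```javascript
// Renders verse text from verse objects: words are separated by a space,
// punctuation attaches to the preceding word, and anything else
// (footnotes, other markers) is left out of the verse string.
function verseObjectsToText(verseObjects) {
  return verseObjects.reduce((verse, obj) => {
    if (obj.type === 'word') {
      return verse ? `${verse} ${obj.text}` : obj.text;
    }
    if (obj.type === 'text') {
      return verse + obj.text;
    }
    return verse;
  }, '');
}

const verseArray = [
  { type: 'word', text: 'Jesus' },
  { type: 'word', text: 'Christ' },
  { type: 'text', text: ',' },
  { type: 'word', text: 'son of' },
  { type: 'word', text: 'David' },
  { type: 'text', text: ',' },
  { type: 'word', text: 'son of' },
  { type: 'word', text: 'Abraham' },
  { type: 'text', text: '.' },
  { type: 'f', text: 'This is a footnote about this verse.' },
];

console.log(verseObjectsToText(verseArray));
// prints "Jesus Christ, son of David, son of Abraham."
```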
