Robustness of Daisy3 SMIL generation #67

rom1mouret · 2014-12-02T11:28:25Z

We are in trouble with this kind of structure:

<sent id="sent3">begin sentence <pagenum id="pn1">42</pagenum> after</sent>

DAISY3 specs expect the pagenum to have a SMIL counterpart. But if sent3 is synthesized, the current script will create a <par> node in the SMIL that links back to sent3. Yet <par> nodes can't have any children. In particular, they can't have the pagenum as child.

So far it worked because the NLP scripts transform the structure into:

<sent id="sent3">
    <span id="b1">begin sentence</span>
    <pagenum id="pn1">42</pagenum>
    <span id="a1">after</span>
</sent>

So the Pipeline can have a different <par> for b1, pn1 and a1, though depending on NLP is not a robust approach, especially if NLP scripts are mocked for testing purposes.

When this kind of structure is detected, the Daisy3 SMIL generation algorithm should ignore the audio attached to sent3 and delete the corresponding mp3 from the audio-map (otherwise it will be copied to the output dir). Alternatively, perhaps the specs allow the pagenum to be placed after the sent in the SMIL file.
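
A minimal sketch of the first option, to make the idea concrete. It is not the Pipeline's actual code: the audio-map is modelled here as a plain dict from element id to mp3 file name, the function names are made up for illustration, and XML namespaces are ignored (real DTBook elements are namespaced).

```python
import xml.etree.ElementTree as ET


def has_mixed_pagenum(sent):
    """Return True if `sent` has a direct pagenum child next to direct text."""
    has_pagenum = any(child.tag == "pagenum" for child in sent)
    has_text = bool((sent.text or "").strip()) or any(
        (child.tail or "").strip() for child in sent
    )
    return has_pagenum and has_text


def prune_audio_map(root, audio_map):
    """Return a copy of the (hypothetical) audio map without the clips
    attached to sentences that mix direct text and a pagenum."""
    pruned = dict(audio_map)
    for sent in root.iter("sent"):
        if has_mixed_pagenum(sent):
            # Dropping the entry keeps the mp3 from being copied to the output dir.
            pruned.pop(sent.get("id"), None)
    return pruned


doc = ET.fromstring(
    '<p><sent id="sent3">begin sentence <pagenum id="pn1">42</pagenum> after</sent>'
    '<sent id="sent4">plain sentence</sent></p>'
)
audio_map = {"sent3": "part1.mp3", "sent4": "part2.mp3"}
pruned_map = prune_audio_map(doc, audio_map)
```

With the structure already split into spans by NLP, the sentence only contains whitespace as direct text, so the check does not fire and nothing is removed.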


josteinaj · 2014-12-02T12:49:52Z

If NLP is not robust enough, could you instead wrap text siblings in a span (or p depending on context) in a preprocessing XSLT step?

If you have to place the pagenums after the sentence, I doubt it would be a problem. We often move pagebreaks to the end of sentences when producing DTBooks. I'm not sure what our narrators do when a pagebreak occurs in the middle of a sentence, but I can check.

rom1mouret · 2014-12-02T13:29:23Z

Yes, I can wrap text siblings in case NLP didn't do it.
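
A rough sketch of what such a wrapping step could look like, written in Python rather than XSLT for brevity. The function name and the id scheme for the generated spans are hypothetical, namespaces are ignored, and surrounding whitespace is stripped from the wrapped text, which a real preprocessing step might want to preserve.

```python
import xml.etree.ElementTree as ET


def wrap_text_siblings(sent):
    """Wrap the direct text around pagenum children of `sent` in span elements."""
    if not any(child.tag == "pagenum" for child in sent):
        return  # nothing to do for sentences without a pagenum

    counter = 0

    def make_span(text):
        nonlocal counter
        counter += 1
        # Hypothetical id scheme: sentence id plus a running number.
        span = ET.Element("span", {"id": f"{sent.get('id')}-t{counter}"})
        span.text = text.strip()
        return span

    new_children = []
    if (sent.text or "").strip():
        new_children.append(make_span(sent.text))
    sent.text = None

    for child in list(sent):
        tail, child.tail = child.tail, None
        new_children.append(child)
        if (tail or "").strip():
            # Text following a child becomes its own span.
            new_children.append(make_span(tail))

    sent[:] = new_children  # replace the children in document order


sent = ET.fromstring(
    '<sent id="sent3">begin sentence <pagenum id="pn1">42</pagenum> after</sent>'
)
wrap_text_siblings(sent)
```

After the call, sent3 holds a span, the pagenum and another span, so each of them can get its own par in the SMIL.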

I don't mind moving the pagenums. I think we are allowed to move them in the SMIL files only, thereby leaving the DTBook unchanged, but I have to check. Moving the noterefs is more questionable.

josteinaj · 2014-12-02T14:10:42Z

For the record, I asked Roald here at NLB, who produces narrated books, what we do in these cases. Where pagebreaks and noterefs are read aloud is a judgement call that narrators have to make in each individual case.

The pagebreaks are almost never read aloud mid-sentence. Usually they are moved to after the sentence, but often they are also moved to before the sentence (there's no clear rule). They can also be moved to after the paragraph, if it makes sense based on the content (and I assume they make similar judgements when the pagebreaks occur within figures, tables, etc).

For noterefs they seem to usually either read the note at the location of the noteref or at the end of the sentence where the noteref occurs, depending on what the content calls for. Endnotes are read only as noterefs it seems, while the notes themselves are read together with the other endnotes at the end.

I believe there would be fewer "judgement calls" and stricter guidelines if there were widespread support for skippable content in reading systems (but I'm not sure).

Anyway, I don't know if this helps your issue at all as we don't encounter the SMIL problem you describe (we almost exclusively do audio-only books, what full-text we have is still experimental).

rom1mouret · 2014-12-02T14:39:59Z

Thanks. This is good to know.
In normal circumstances, NLP does the job and the TTS engines will use SSML marks to find where the noterefs are located in the audio files, so I think I'll choose the less costly option that complies with the specs, even if it is not perfect with regard to end users' expectations.

rom1mouret self-assigned this Dec 2, 2014

rom1mouret added the enhancement label Dec 2, 2014
