Robustness of Daisy3 SMIL generation #67
Comments
If NLP is not robust enough, could you instead wrap text siblings in a span (or p, depending on context) in a preprocessing XSLT step? If you have to place the pagenums after the sentence, then I doubt it would be a problem. We often move pagebreaks to the end of sentences when producing DTBooks. I'm not sure what our narrators do when a pagebreak occurs in the middle of a sentence, but I can check.
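A minimal sketch of such a preprocessing step, assuming DTBook 2005 markup and, for brevity, only wrapping text that sits directly inside `p` elements next to element siblings (a real step would need to cover more parent elements):

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dtb="http://www.daisy.org/z3986/2005/dtbook/"
                xmlns="http://www.daisy.org/z3986/2005/dtbook/"
                exclude-result-prefixes="dtb">

  <!-- Identity transform: copy everything unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Wrap non-whitespace text that sits directly inside a p next to element
       siblings (e.g. a pagenum), so the text gets its own element for a
       SMIL <par> to point at. -->
  <xsl:template match="dtb:p/text()[normalize-space()][../*]">
    <span><xsl:value-of select="."/></span>
  </xsl:template>

</xsl:stylesheet>
```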
Yes, I can wrap text siblings in case NLP didn't do it. I don't mind moving the pagenums. I think we are allowed to move them in the SMIL files only, thereby leaving the DTBook unchanged, but I have to check. Moving the noterefs is more questionable.
For the record, I asked Roald here at NLB, who produces narrated books, about what we do in these cases. Where pagebreaks and noterefs are read aloud is something narrators have to decide case by case, using their own judgement. Pagebreaks are almost never read aloud mid-sentence. Usually they are moved to after the sentence, but often they are moved to before the sentence instead (there's no clear rule). They can also be moved to after the paragraph if it makes sense based on the content (and I assume they make similar judgements when a pagebreak occurs within figures, tables, etc.). For noterefs, they seem to read the note either at the location of the noteref or at the end of the sentence where the noteref occurs, depending on what the content calls for. For endnotes, it seems only the noteref is read, while the notes themselves are read together with the other endnotes at the end. I believe there would be fewer "judgement calls" and more strict guidelines if there were widespread support for skippable content in reading systems (but I'm not sure). Anyway, I don't know if this helps your issue at all, as we don't encounter the SMIL problem you describe (we almost exclusively do audio-only books, and what full-text we have is still experimental).
Thanks. This is good to know. |
We are in trouble with this kind of structure:
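A minimal, hypothetical DTBook fragment of the kind being described (the ids `sent3` and `pn1` match the names used below; the markup itself is assumed, not quoted):

```xml
<!-- Assumed example: a pagenum occurs in the middle of a sentence that
     will be synthesized as a single audio clip. -->
<p>
  <sent id="sent3">
    This sentence starts on one page
    <pagenum id="pn1" page="normal">42</pagenum>
    and ends on the next.
  </sent>
</p>
```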
DAISY3 specs expect the pagenum to have a SMIL counterpart. But if `sent3` is synthesized, the current script will create a `<par>` node in the SMIL that links back to `sent3`. Yet `<par>` nodes can't have any children; in particular, they can't have the pagenum as a child.

So far it has worked because the NLP scripts transform the structure into:
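Something along these lines, where the text before and after the pagenum is wrapped in its own element (element names and ids are assumptions based on the identifiers mentioned below):

```xml
<!-- Assumed example: b1 and a1 hold the text before and after the pagenum,
     so each of b1, pn1 and a1 can get its own SMIL <par>. -->
<p>
  <sent id="b1">This sentence starts on one page</sent>
  <pagenum id="pn1" page="normal">42</pagenum>
  <sent id="a1">and ends on the next.</sent>
</p>
```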
So the Pipeline can have a different `<par>` for `b1`, `pn1` and `a1`. Depending on NLP is not a robust approach, though, especially if the NLP scripts are mocked for testing purposes.

When this kind of structure is detected, the Daisy3 SMIL generation algorithm should ignore the audio attached to `sent3` and delete the corresponding mp3 from the audio-map (otherwise it will be copied to the output dir). Alternatively, perhaps it is okay with the specs to place the pagenum after the sent in the SMIL file.
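As a sketch of that alternative (file names, ids and clip times are made up, and spec compliance still needs to be checked), the SMIL could give the pagenum its own `<par>` right after the sentence's, instead of nesting it:

```xml
<!-- Assumed example: the pagenum's <par> follows the sentence's <par>. -->
<seq>
  <par id="par_sent3">
    <text src="book.xml#sent3"/>
    <audio src="speech_001.mp3" clipBegin="0:00:12.000" clipEnd="0:00:15.500"/>
  </par>
  <par id="par_pn1">
    <text src="book.xml#pn1"/>
    <audio src="speech_001.mp3" clipBegin="0:00:15.500" clipEnd="0:00:16.200"/>
  </par>
</seq>
```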