I took the test file and used PowerPoint to save it as an RTF file. Using textutil on OS X, I generated a baseline. Ideally, textract should produce exactly that text.
While the differences might be conscious decisions, it's worth clarifying:
A) The line "textract not ready, retrying in .5 seconds" is printed to stdout. It should probably be printed to stderr: https://github.com/dbashford/textract/blob/master/lib/extract.js#L72 should use console.error rather than console.log.
B) Newlines are completely lost. For example, slide 10 reads
Who thought this would be a good idea?
Unfortunately the arrow keys act relative to the screen rather than the text
The entire input situation is confusing
but textract is writing
Who thought this would be a good idea? Unfortunately the arrow keys act relative to the screen rather than the text The entire input situation is confusing
C) The … character (U+2026) is missing. Is that intentional?
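To make points A and B concrete, here is a rough sketch of the behavior I would expect. It is not textract's actual code: `reportRetry` and `collapseWhitespace` are hypothetical names used only for illustration. The status message goes to stderr through console.error, and whitespace is collapsed within each line so that the line breaks themselves survive. Non-ASCII characters such as … (U+2026) are left untouched.

```javascript
// Hypothetical helpers for illustration only; not part of textract's API.

// Point A: send status messages to stderr so that stdout carries
// nothing but the extracted text.
function reportRetry() {
  console.error('textract not ready, retrying in .5 seconds');
}

// Point B: collapse runs of spaces and tabs inside each line, but keep
// the line breaks. Only spaces and tabs are rewritten, so characters
// such as U+2026 pass through unchanged.
function collapseWhitespace(text) {
  return text
    .split(/\r\n|\r|\n/) // accept Windows, old Mac, and Unix line endings
    .map(function (line) {
      return line.replace(/[ \t]+/g, ' ').trim();
    })
    .join('\n');
}

// Example input with a doubled space, a tab, and a Windows line ending.
var slide = 'Who thought this  would be a good idea?\r\n' +
  'The entire input\tsituation is confusing \u2026';
console.log(collapseWhitespace(slide));
```

With this approach the two lines of the sample stay on separate lines, and piping stdout to a file would capture only the extracted text.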