Sequence assembly (thank you bioinformatics, again!) could be adapted to piece objects together using probabilities that they follow each other.
Ideas on what this could use:
- n-grams
- positioning heuristics (hand-written or learned):
  - if two blocks are positioned so that one directly follows the other on the page, it is highly probable that they also follow each other in content
  - a block NW of another block can't be the subsequent piece of content (you can only move into the NE, SE, or SW quadrants)
- formatting: content blocks that follow each other should have similar formatting (except in cases such as a title coming next)
- linking figures and footnotes to the main text could be done using the numbers that appear in the main text, assuming we can classify figures and footnotes as such before (or at the same time as) inferring the body text sequence.
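
To make the idea more concrete, here is a minimal sketch of a greedy, assembly-style ordering. It is only an illustration under stated assumptions: `Block`, `follow_score`, and `greedy_assemble` are hypothetical names, the score is a hand-written stand-in for a real probability (an n-gram or learned model could replace it), coordinates are top-left corners with `y` growing downwards, and a block directly above another is treated as NW.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A hypothetical content block; (x, y) is its top-left corner."""
    name: str
    x: float          # grows to the right
    y: float          # grows downwards
    font_size: float


def follow_score(a: Block, b: Block) -> float:
    """Hand-written stand-in for P(b directly follows a)."""
    # Quadrant rule: a block above and not to the right of `a`
    # (assumed here to cover "NW", including straight up) cannot follow it.
    if b.x <= a.x and b.y < a.y:
        return 0.0
    # Positioning: closer blocks are more likely to be consecutive.
    distance = abs(b.x - a.x) + abs(b.y - a.y)
    proximity = 1.0 / (1.0 + distance)
    # Formatting: penalise a change in font size, but do not forbid it,
    # so that a title can still be followed by body text.
    formatting = 1.0 if a.font_size == b.font_size else 0.5
    return proximity * formatting


def greedy_assemble(blocks: list[Block]) -> list[list[Block]]:
    """Greedily merge chains, in the spirit of greedy overlap assembly.

    Each block starts as its own chain. At every step the pair of chains
    whose (tail, head) score is highest is joined. Merging stops when one
    chain remains or when no remaining pair has a non-zero score.
    """
    chains = [[b] for b in blocks]
    while len(chains) > 1:
        score, i, j = max(
            ((follow_score(left[-1], right[0]), i, j)
             for i, left in enumerate(chains)
             for j, right in enumerate(chains) if i != j),
            key=lambda candidate: candidate[0],
        )
        if score == 0.0:
            break  # the remaining chains cannot be ordered by this score
        merged = chains[i] + chains[j]
        chains = [c for k, c in enumerate(chains) if k not in (i, j)] + [merged]
    return chains
```

Greedy merging is the simplest assembly strategy and can get stuck in a poor ordering; it is used here only because it is short. Any chains left unmerged at the end are candidates for things that are not part of the main sequence, such as figures or footnotes.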