ami svg
petermr edited this page Aug 21, 2020 · 8 revisions
Analyzes svg/ for text and paths
@Test
public void testExtractVectors() {
File projectDir = TEST_VECTOR;
File treeDir = new File(projectDir, "PMC4491181");
File targetDir = new File(TARGET_VECTOR, "create/");
CMineTestFixtures.cleanAndCopyDir(projectDir, targetDir);
String cmd = ""
+ " -vv"
+ " --forcemake"
// + " -t " + treeDir
+ " -p " + targetDir
+ " pdfbox"
+ " --maxprimitives=100000"
// + " --pages=4 5"
;
// AMI.execute(cmd);
// Not sure whether --panels is required
cmd = ""
+ " -vv"
+ " --forcemake"
// + " -t " + treeDir
+ " -p " + projectDir
+ " svg"
+ " --panels "
;
AMI.execute(cmd);
}
Generic values (AMISVGTool)
================================
input basename null
input basename list null
cproject /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10
ctree
cTreeList 10 trees [/Users/pm286/workspace/cmdev/ami3/src/test/resour
excludeBase {}
excludeTrees {}
forceMake true
includeBase {}
includeTrees null
log4j {}
verbose 2
Specific values (AMISVGTool)
================================
Command line options for 'ami svg':
--caches : d null
--pages : d null
--panels : m {xwidth=200,ywidth=100}
--regex : d null
--regexfile : d null
--tidysvg : d null
--vectorlog : d vectors.log
--vectordir : d vectors/
--logfile : d null
--help : d false
--version : d false
AMISVGTool cTree: PMC4491181
cTree: PMC4491181
PAGE: p.0: 5
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.0/paths.svg
PAGE: p.1: 35
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.1/paths.svg
PAGE: p.2: 168
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.2/paths.svg
PAGE: p.3: 560
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.3/paths.svg
PAGE: p.4: 1521
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.4/paths.svg
PAGE: p.5: 409
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.5/paths.svg
PAGE: p.6: 0
PAGE: p.7: 10
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.7/paths.svg
AMISVGTool cTree: PMC4503998
cTree: PMC4503998
no svg/ dir
...
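The log above can be summarised per page. The following is a minimal sketch, not part of ami3: `PageLogSummary` and `parse` are hypothetical names, and it assumes only the `PAGE: p.N: count` line format shown above. The number after the page index appears to be a count of path primitives; in the log shown, the page reporting 0 has no `paths.svg` written.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of ami3): collects "PAGE: p.N: count" log lines. */
final class PageLogSummary {
    // Matches lines such as "PAGE: p.4: 1521"; every other log line is ignored.
    private static final Pattern PAGE_LINE = Pattern.compile("^PAGE: p\\.(\\d+): (\\d+)$");

    /** Returns page index -> reported count, in the order the pages appear in the log. */
    static Map<Integer, Integer> parse(List<String> logLines) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = PAGE_LINE.matcher(line.trim());
            if (m.matches()) {
                counts.put(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample lines in the format of the log above.
        List<String> log = List.of(
            "cTree: PMC4491181",
            "PAGE: p.0: 5",
            "PAGE: p.6: 0",
            "PAGE: p.7: 10");
        System.out.println(parse(log));
    }
}
```

Pages whose count is 0 could then be skipped before looking for a `paths.svg` file.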
output tree
├── PMC4491181
│   ├── eupmc_result.json
│   ├── fulltext.pdf
│   ├── fulltext.xml
│   ├── pdfimages
│   │   ├── image.2.1.124_235.587_697.png
│   │   ├── image.2.2.246_357.587_697.png
│   │   ├── image.2.3.368_478.586_638.png
│   │   ├── image.2.4.368_474.649_698.png
│   │   ├── image.3.1.248_359.617_728.png
│   │   ├── image.4.1.70_275.70_186.png
│   │   ├── image.4.2.304_418.69_184.png
│   │   ├── image.4.3.429_532.82_173.png
│   │   ├── image.4.4.70_533.195_312.png
│   │   ├── image.4.5.69_534.318_434.png
│   │   ├── image.5.1.78_236.198_328.png
│   │   ├── image.5.2.250_371.198_328.png
│   │   └── image.6.1.121_475.205_280.png
│   ├── scholarly.html
│   └── svg
│       ├── fulltext-page.0
│       │   └── paths.svg
│       ├── fulltext-page.0.svg
│       ├── fulltext-page.1
│       │   └── paths.svg
│       ├── fulltext-page.1.svg
│       ├── fulltext-page.2
│       │   └── paths.svg
│       ├── fulltext-page.2.svg
│       ├── fulltext-page.3
│       │   └── paths.svg
│       ├── fulltext-page.3.svg
│       ├── fulltext-page.4
│       │   └── paths.svg
│       ├── fulltext-page.4.svg
│       ├── fulltext-page.5
│       │   └── paths.svg
│       ├── fulltext-page.5.svg
│       ├── fulltext-page.6.svg
│       ├── fulltext-page.7
│       │   └── paths.svg
│       ├── fulltext-page.7.svg
│       └── vectors.log
├── PMC4503998
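The layout above can be walked programmatically to collect the generated `paths.svg` files. This is a rough sketch using only `java.nio.file`; `PathsSvgFinder` and `find` are hypothetical names, not ami3 classes, and the only assumption is the `svg/fulltext-page.N/paths.svg` layout shown in the tree.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of ami3): lists paths.svg files under one CTree. */
final class PathsSvgFinder {
    /** Returns every svg/<page-dir>/paths.svg below the given CTree directory, sorted. */
    static List<Path> find(Path cTreeDir) throws IOException {
        Path svgDir = cTreeDir.resolve("svg");
        if (!Files.isDirectory(svgDir)) {
            return List.of(); // corresponds to the "no svg/ dir" case in the log
        }
        try (Stream<Path> children = Files.list(svgDir)) {
            return children
                .filter(Files::isDirectory)           // skips fulltext-page.N.svg and vectors.log
                .map(dir -> dir.resolve("paths.svg"))
                .filter(Files::isRegularFile)         // skips page directories without paths.svg
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

Calling `PathsSvgFinder.find` on the `PMC4491181` directory should, under this layout, return one entry for each page directory that contains a `paths.svg`.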
Paths can be extracted from the SVG and will often represent diagrams. Example:
NOTE:
- the PDF contains small rectangles which are added by appendRectangle(). However, this has not been debugged yet and so some are missing.
- the letters and digits are not included. Note that the bold letters are not coded characters, but stroked glyphs which will need decoding.
2020-08-21
appendRectangle is used both to draw rectangles and to clip. Because I don't understand the clipping fully, I disabled it. Occasionally it draws real rectangles, so rectangles may be missing. Hmmm...
Maybe make all rects unfilled...
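One way to try the "unfilled rects" idea on an already written SVG file is to post-process it with the JDK's DOM API. This is only a sketch under stated assumptions: `UnfillRects` and `unfill` are hypothetical names, it does not touch the PDF extraction code where appendRectangle lives, and it only sets the `fill` attribute. A `fill` declared inside a `style` attribute would still take precedence and is not handled here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical post-processing step (not part of ami3): makes every rect unfilled. */
final class UnfillRects {
    /** Parses the SVG text, sets fill="none" on each rect element, and serialises it again. */
    static String unfill(String svg) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // SVG elements normally live in the SVG namespace
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(svg)));

        // "*" matches rect elements in any namespace.
        NodeList rects = doc.getElementsByTagNameNS("*", "rect");
        for (int i = 0; i < rects.getLength(); i++) {
            Element rect = (Element) rects.item(i);
            rect.setAttribute("fill", "none"); // clip rectangles no longer hide what is under them
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

With all rects unfilled, rectangles that were really clip regions would show up as outlines (if they have a stroke) rather than as filled blocks, which may make it easier to tell them apart from real rectangles by eye.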