src: Add support for AI files (Adobe Illustrator) #323
I need to fix an error message: Borewit/strtok3#147
@Borewit while you're here, hi! I found an odd bug? (or misuse issue on my part) with |
core.js (outdated)
const buffer = Buffer.alloc(minimumBytes);
await tokenizer.peekBuffer(buffer, {position: options.position, length: 512, mayBeLess: true});
You can safely use tokenizer.readBuffer (advancing the position) once you have detected that the file is of the PDF kind.
"What you can look for however is the string <xmp:CreatorTool>Adobe Illustrator 24.0 (Windows)</xmp:CreatorTool> or similar, which is near the start of the file, before all the data!"
Do you think it is possible to use tokenizer.read... as long as we read before all the data is reached?
I assume you're talking about readToken, correct? If so, we can probably read the buffer starting at the 1350th byte, read 512 bytes (for safety), and check whether the resulting string includes "Adobe Illustrator". Is that what you're referring to?
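A minimal sketch of the check described above, assuming the whole file is already in a Buffer rather than being read through the tokenizer. The helper name looksLikeIllustrator and its option names are hypothetical and not part of the PR; the 1350 and 512 values are the ones discussed in this thread.

```javascript
// Hypothetical helper, not the PR's implementation: works on an in-memory Buffer.
const PDF_MAGIC = Buffer.from('%PDF');
const MARKER = 'Adobe Illustrator';

function looksLikeIllustrator(buffer, {skip = 1350, length = 512} = {}) {
	// AI files share the PDF magic number, so that check comes first.
	if (buffer.length < PDF_MAGIC.length || !buffer.subarray(0, PDF_MAGIC.length).equals(PDF_MAGIC)) {
		return false;
	}

	// subarray clamps its bounds to the buffer, so short files yield an empty window.
	const searchWindow = buffer.subarray(skip, skip + length);
	return searchWindow.includes(MARKER);
}
```

A marker that starts before byte 1350 or ends after byte 1862 would be missed, which is the trade-off of a fixed window.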
I have not looked at the format at all, but I was (naively) hoping that a structured way of reading it is possible, similar to how zip or png files are handled.
Here is a screenshot of how it looks. The items marked with a square are those that can (and will) change; specifically the document title (the file name you save it as) and some arbitrary length... This is why the code skips 1350 bytes (which gives plenty of leeway even for long file names) and reads 512 bytes, which should hopefully catch the CreatorTool.
If we can somehow figure out the length, then skip to the start of the title, skip the title's length, and then skip to CreatorTool, we could probably use readToken. But, at least in this case, it's easier to read 512 bytes than to do that. Plus, this method makes it work with other file types, if needed! Thoughts?
If there is a header or field length encoded, it could be worth trying to decode that. Your current solution is simple, which is also worth something.
Thought I'd give a shot at adding support for Adobe's AI files. This might also allow other sequence-based searches to be done! Closes #64 (what an oldie!)
How does it work?
There's no easy way to differentiate between AI and PDF. The first bytes are the same (%PDF in bytes). What you can look for, however, is the string <xmp:CreatorTool>Adobe Illustrator 24.0 (Windows)</xmp:CreatorTool> or similar, which is near the start of the file, before all the data!
The only other way I could think of is creating a PDF parser of sorts and reading it through that, but that doesn't sound easy enough 😝
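As an illustration of the string-based approach, here is a rough sketch that pulls the CreatorTool value out of the start of a file and checks its prefix. The function names readCreatorTool and isIllustratorCreator and the 4096-byte default are assumptions made for this example; they do not come from the PR.

```javascript
// Hypothetical helpers for illustration only.
function readCreatorTool(buffer, limit = 4096) {
	// latin1 maps every byte to one character, so binary data cannot break decoding.
	const head = buffer.subarray(0, limit).toString('latin1');
	const match = /<xmp:CreatorTool>([^<]*)<\/xmp:CreatorTool>/.exec(head);
	return match ? match[1] : undefined;
}

function isIllustratorCreator(creatorTool) {
	// The version and platform suffix vary, so only the product name is compared.
	return typeof creatorTool === 'string' && creatorTool.startsWith('Adobe Illustrator');
}
```

Matching on the prefix rather than the full string allows for other versions and platforms, in line with the "or similar" wording above.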
Link: https://en.wikipedia.org/wiki/Adobe_Illustrator_Artwork
Question
If wanted, we could also just remove the limit when reading the buffer for a sequence check, which might help for stripped files.
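To make the trade-off concrete, here is a sketch of a sequence search where the limit is optional. The findSequence name and its options are hypothetical; with a limit only a fixed window is scanned, and without one the search runs to the end of the buffer.

```javascript
// Hypothetical helper: returns the absolute index of the sequence, or -1.
function findSequence(buffer, sequence, {start = 0, limit} = {}) {
	const needle = Buffer.from(sequence);
	// Without a limit, search everything from `start` to the end of the buffer.
	const end = limit === undefined ? buffer.length : Math.min(buffer.length, start + limit);
	const index = buffer.subarray(start, end).indexOf(needle);
	return index === -1 ? -1 : index + start;
}
```

Dropping the limit finds markers that sit further into the file, at the cost of scanning (and first reading) much more data.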
Quick note
Some of the changes in this PR have been made to make xo happy.