Windows html to docx fails to embed images in the docx file #316
Comments
hi @saumcor
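Hey @JessicaTegner
import pypandoc

# Extra pandoc CLI arguments; <windows path> is the placeholder from the original comment.
extra_args = ['--data-dir=<windows path>']
pypandoc.convert_file(
    'index.html',
    to='docx',
    format='html',
    outputfile='test.docx',
    extra_args=extra_args,
)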
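For readers trying things out: the call above passes `--data-dir`, which tells pandoc where to find its user data files. Pandoc also has a separate `--resource-path` option that controls where it searches for images and other resources. The sketch below builds that argument from the HTML file's own folder. Whether this fixes the warning reported in this issue is not confirmed anywhere in the thread, and the helper names `build_extra_args` and `convert_html_to_docx` are made up for illustration.

```python
from pathlib import Path


def build_extra_args(html_file):
    """Return pandoc arguments that add the HTML file's folder to the resource search path."""
    html_dir = Path(html_file).resolve().parent
    # --resource-path is a pandoc CLI option; pointing it at the HTML file's
    # folder is an assumption about where the linked images live.
    return [f"--resource-path={html_dir}"]


def convert_html_to_docx(html_file, docx_file):
    """Hypothetical wrapper around pypandoc.convert_file using the arguments above."""
    # Imported lazily so build_extra_args can be used without pypandoc installed.
    import pypandoc

    pypandoc.convert_file(
        str(html_file),
        to="docx",
        format="html",
        outputfile=str(docx_file),
        extra_args=build_extra_args(html_file),
    )
```

Usage would be something like `convert_html_to_docx('index.html', 'test.docx')`, run on a machine where pypandoc and pandoc are installed.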
Hey @JessicaTegner, what's the issue here? Do you need additional info? Any solutions/workarounds?
hi @saumcor sorry for not getting back to you :) It seems a bunch of people have had the same issue over time, but I still don't know what the root cause is.
Hey @JessicaTegner, no worries, thanks for helping
I had the same issue and passing
@JessicaTegner It seems setting
@sanjass and others
So there are 2 options.
I would be willing to consider alternatives, such as setting sandbox to false by default. What do people think?
@JessicaTegner thanks for the prompt response. While I'm no expert on the implications of the options you provided, I don't think it's unreasonable to have Namely, when using pandoc directly one would have to explicitly provide In either case, more thorough documentation is needed, especially if we keep
@sanjass you are right. We should probably have sandbox set to false by default, to replicate the pandoc CLI.
Update: After reading through the pandoc user manual, under "General options", it seems that sandbox is indeed enabled by default. If that's the case, pypandoc currently behaves the same as the pandoc CLI. In that case we could add better documentation referencing the pandoc user manual. What do people think?
Hmm, that's weird. I found this line in the pandoc code When testing locally with pandoc version 2.19.2, it also seems Given a
hmm interesting. Yeah in that case sandbox = false should be default in pypandoc.
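Until a default is settled, a caller can state the choice explicitly instead of relying on it. A minimal sketch, assuming the installed pypandoc accepts the `sandbox` keyword on `convert_file` as discussed in this thread (check the signature of your installed version). The function name `convert_without_sandbox` and its `convert` parameter are hypothetical, added only so the call can be exercised without pandoc present.

```python
def convert_without_sandbox(html_file, docx_file, convert=None):
    """Convert HTML to docx with sandbox explicitly disabled.

    `convert` is a hypothetical injection point for testing; when omitted,
    pypandoc.convert_file is used.
    """
    if convert is None:
        # Imported lazily so the function can be tested with a stand-in converter.
        import pypandoc

        convert = pypandoc.convert_file
    return convert(
        html_file,
        to="docx",
        format="html",
        outputfile=docx_file,
        # Assumption: the `sandbox` keyword exists in the installed pypandoc.
        sandbox=False,
    )
```

Passing the flag explicitly keeps the behaviour the same regardless of which default a given pypandoc release ships with.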
@saumcor and @sanjass I have added some test logic replicating what OP had an issue with. This conversion, however, doesn't seem to produce any warnings or errors. Let me know what you think.
Hey @JessicaTegner that seems to be in line with the behaviour of pandoc without the
yes @saumcor but as you can see from the code, I didn't actually change anything, I just wrote a test case for it, matching this issue
@JessicaTegner I didn't run the test so I can't confirm, but could it be that you're seeing a different outcome because of the pandoc version? Based on L351-L353
@sanjass yes, because "sandbox" was introduced in pandoc 2.15, so on earlier versions it has no effect. I tested with pandoc 2.19.x
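Since the outcome depends on the pandoc version, it may help to check it when comparing results. A rough sketch, taking the 2.15 threshold from the comment above; `sandbox_supported` is a hypothetical helper, not part of pypandoc.

```python
def sandbox_supported(pandoc_version):
    """Return True if the given pandoc version string is 2.15 or newer."""
    # Compare only major and minor; the threshold comes from this thread.
    parts = [int(p) for p in pandoc_version.split(".")[:2]]
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) >= (2, 15)
```

With pypandoc installed, the version string can be obtained from `pypandoc.get_pandoc_version()` and passed to this helper.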
I have an HTML file that links to an image in the same folder. When converting from HTML to docx on Windows, the conversion emits this warning:
[WARNING] Could not fetch resource test.png: PandocResourceNotFound "test.png"
pypandoc-binary==1.10
html file:
python script
output of python test.py:
[WARNING] Could not fetch resource test.png: PandocResourceNotFound "test.png"