This very much depends on how the watermark is implemented technically. One thing is clear, though: it consists of text and some vector graphics.
Sorry, I took the wrong road. The watermark cannot be detected and removed through the official API. Looking at the XObject resources of each page instead:

for page in doc:
    print(doc.xref_get_key(page.xref, "Resources/XObject"))

('dict', '<</Im0 89 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 3 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 6 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 9 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 12 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 15 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 18 0 R/Fm0 94 0 R>>')
('dict', '<</Im0 21 0 R/Fm0 94 0 R>>')

From the above, we see that object 94 is referenced as Fm0 on every page, while the image Im0 differs from page to page. Its definition:

print(doc.xref_object(94))
<<
/Subtype /Form
/Length 96
/OC 96 0 R
/PieceInfo <<
/ADBE_CompoundType <<
/Private /Watermark
/LastModified (D:20080804204943+08'00')
>>
>>
/Matrix [ 1 0 0 1 0 0 ]
/Resources <<
/Font <<
/TT0 91 0 R
>>
/ProcSet [ /PDF /Text ]
>>
/BBox [ 0 -87.1205 516.025 -5.61621 ]
/LastModified (D:20080804204943+08'00')
/FormType 1
>>

The /Private /Watermark entry under /PieceInfo suggests that this object is the watermark. We can remove it by setting object 94 to an empty dictionary:

doc.update_object(94, "<<>>")

After saving, the watermark is gone.
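For a large batch of files, the manual steps above could be wrapped roughly as follows. This is only a sketch: the helper names (`shared_xobject_xrefs`, `looks_like_watermark`, `blank_watermarks`) are made up for illustration, and it assumes the watermark is an XObject referenced by every page and tagged with `/Private /Watermark`, as in the dump above. Other PDFs may implement their watermark differently, in which case this will find nothing. The only PyMuPDF calls used are the ones already shown: `xref_get_key`, `xref_object` and `update_object`.

```python
import re

# Matches indirect references such as "/Fm0 94 0 R" in a PDF dictionary string.
_REF = re.compile(r"/(\w+)\s+(\d+)\s+\d+\s+R")


def shared_xobject_xrefs(doc):
    """Return the xrefs of XObjects that every page of doc references."""
    common = None
    for page in doc:
        kind, value = doc.xref_get_key(page.xref, "Resources/XObject")
        # Only an inline dictionary is parsed here; anything else counts as empty.
        xrefs = {int(x) for _, x in _REF.findall(value)} if kind == "dict" else set()
        common = xrefs if common is None else common & xrefs
    return common or set()


def looks_like_watermark(doc, xref):
    """True if the object source contains the '/Private /Watermark' marker."""
    return re.search(r"/Private\s*/Watermark", doc.xref_object(xref)) is not None


def blank_watermarks(doc):
    """Empty every shared XObject marked as a watermark; return their xrefs."""
    removed = []
    for xref in sorted(shared_xobject_xrefs(doc)):
        if looks_like_watermark(doc, xref):
            doc.update_object(xref, "<<>>")  # same call as in the manual fix
            removed.append(xref)
    return removed
```

With PyMuPDF this would be used as `doc = fitz.open("input.pdf")`, then `blank_watermarks(doc)`, then `doc.save("output.pdf")`, where both file names are placeholders. It is worth checking the returned list before saving, so that nothing other than the watermark gets blanked.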
I have a large number of PDF files with the following textbox as a watermark or advertisement:
In this picture, the textbox has a blue border, is rotated by some degree, and contains transparent text.
I want to remove this textbox without affecting the text underneath it.
The following code snippet doesn't work, as it also removes the text under the textbox:
`
import fitz  # PyMuPDF
pf = fitz.open(fp)
for pg in range(pf.page_count):
    page = pf[pg]
`
The code above produces the following PDF:
What is the right and effective way to do this? I appreciate your help.