Transferred Annotations not Rendering Correctly #2960

Open
eth-wa opened this issue Nov 20, 2024 · 3 comments

eth-wa commented Nov 20, 2024

We often get new or revised PDFs for plan documents. Part of the transfer process is copying any annotations from the previous version to the new version. We've been doing this with our PDF software, but I wanted to try to hammer it out programmatically with pypdf.
However, what pypdf writes to the new PDF is rendered inconsistently with what existed in the old one. I'm not sure whether our PDF software is under-defining the annotation data or whether something else is going on.
If this is user error, I apologize. I'm intermediate at best in Python, so if there's a way I can filter the annotation data to get it into the correct format, I'm all ears (and very grateful).
*Note: the files contain some foobar annotations that I was using for testing.

Environment

$ python -m platform
Windows-11-10.0.22631-SP0

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==5.1.0, crypt_provider=('local_crypt_fallback', '0.0.0'), PIL=none

Code + PDF

Below is the test code I was working on. It loads the new and old PDFs, extracts the annotation objects from the old PDF, and attempts to write them to the new PDF.

import pypdf

writer = pypdf.PdfWriter()

OldPath = "\\\\UNC\\Folder\\OldPath.pdf"
NewPath = "\\\\UNC\\Folder\\NewPath.pdf"
SavePath = "\\\\UNC\\Folder\\SavePath.pdf"

#setup Writer (new plan sheet)
writer.append(NewPath)

#setup Reader (old plan sheet)
input_stream = open(OldPath, "rb")
reader = pypdf.PdfReader(input_stream, strict=False)
page = reader.pages[0]

if '/Annots' in page: #if the page has Annotations
    for annot in page['/Annots']:
        obj = annot.get_object()
        if '/Type' in obj: #Annotation has /Type
            if obj['/Type'] == '/Annot': #/Type is /Annot
                if obj['/F'] !=132: #Skip Locked Annotations
                    writer.add_annotation(page_number=0, annotation=obj) #Add annotation to Writer

with open(SavePath,"wb") as new:
    writer.write(new) #Save PDF

#Clean up
reader.close
writer.close
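
A note on the `obj['/F'] != 132` check in the script above: `/F` is a bit field, and 132 is 128 (Locked) plus 4 (Print), so an exact comparison only skips annotations whose flags are exactly Print plus Locked. It also raises a `KeyError` for annotations that have no `/F` entry at all. A minimal sketch of a bitmask test follows. `AnnotFlag` and `is_locked` are hypothetical helper names, not pypdf API, and the bit values are assumed from the annotation flags table of the PDF 1.7 specification (ISO 32000-1).

```python
from enum import IntFlag
from typing import Any, Mapping


class AnnotFlag(IntFlag):
    """Bits of the annotation /F entry (hypothetical helper, values per ISO 32000-1)."""

    INVISIBLE = 1
    HIDDEN = 2
    PRINT = 4
    NO_ZOOM = 8
    NO_ROTATE = 16
    NO_VIEW = 32
    READ_ONLY = 64
    LOCKED = 128
    TOGGLE_NO_VIEW = 256
    LOCKED_CONTENTS = 512


def is_locked(annotation: Mapping[str, Any]) -> bool:
    """Return True if the Locked bit is set; a missing /F entry counts as 0."""
    flags = int(annotation.get("/F", 0))
    return bool(flags & AnnotFlag.LOCKED)


# Plain dictionaries stand in for annotation objects here.
print(is_locked({"/F": 132}))  # Print + Locked
print(is_locked({"/F": 196}))  # Print + ReadOnly + Locked
print(is_locked({"/F": 4}))    # Print only
print(is_locked({}))           # no /F entry
```

In the loop above, the filter would then read `if not is_locked(obj):` instead of comparing against 132. Whether this changes anything for the attached files depends on which flag combinations the annotations actually carry.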

Below are the PDF files - OldPath, NewPath, and SavePath
OldPath.pdf
NewPath.pdf
SavePath.pdf

Traceback

No Traceback as the code runs without error, but the end result is unexpected.

Collaborator

stefan6419846 commented Nov 21, 2024

Not related to your issue, but reader.close and writer.close do nothing unless you actually call .close() instead ;)
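
To illustrate the point about `.close`: in Python, naming a method without parentheses only looks it up and does not call it. The sketch below uses an in-memory `io.BytesIO` stream as a stand-in for the real file, so it runs without any PDF on disk.

```python
import io

stream = io.BytesIO(b"%PDF-1.7")

stream.close                      # attribute lookup only; the method is never called
still_open = not stream.closed    # the stream is therefore still open

stream.close()                    # the parentheses perform the call
now_closed = stream.closed

# A with-block closes the stream automatically, even if an error occurs inside it.
with io.BytesIO(b"%PDF-1.7") as managed:
    header = managed.read(4)
closed_by_with = managed.closed

print(still_open, now_closed, closed_by_with, header)
```

The same reasoning applies to the `open(OldPath, "rb")` call in the script: wrapping it in a `with` statement avoids the need for a manual close.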

Apart from this: The issue should be much easier to analyze if you could provide a small example showing the same problem without all the overhead of the unrelated page content. I would appreciate a simple reproducer with the following properties:

  • A simple PDF file without annotations.
  • The simple PDF file with the annotations added.
  • The simple PDF file with the annotations copied over through pypdf.

This way, we should be able to spot more easily which values might have changed in between, without too much overhead.

Are you able to provide such an example?
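
One way to spot such differences once the files exist is to compare the annotation dictionaries key by key. The following is a rough sketch under stated assumptions, not an official pypdf utility: `diff_annotation` and `load_annotations` are hypothetical helper names, and the loader assumes that annotations appear in the same order in both files. The comparison itself works on any mapping, so plain dictionaries are used for the demonstration.

```python
from typing import Any, Mapping


def diff_annotation(old: Mapping[str, Any], new: Mapping[str, Any]) -> dict[str, list[str]]:
    """Report which keys were dropped, added, or changed between two annotation dictionaries."""
    old_keys, new_keys = set(old), set(new)
    return {
        "missing": sorted(old_keys - new_keys),
        "added": sorted(new_keys - old_keys),
        "changed": sorted(k for k in old_keys & new_keys if old[k] != new[k]),
    }


def load_annotations(path: str, page_index: int = 0) -> list[dict[str, Any]]:
    """Read the annotation dictionaries of one page (requires pypdf to be installed)."""
    from pypdf import PdfReader  # imported here so diff_annotation works without pypdf

    reader = PdfReader(path)
    page = reader.pages[page_index]
    if "/Annots" not in page:
        return []
    annotations = []
    for ref in page["/Annots"]:
        annot = ref.get_object()
        # Indexing with annot[key] resolves indirect references to their targets.
        annotations.append({str(key): annot[key] for key in annot})
    return annotations


# Plain dictionaries stand in for real annotation objects here.
expected = {"/Subtype": "/FreeText", "/Rect": [50, 550, 200, 650], "/AP": "<stream>"}
actual = {"/Subtype": "/FreeText", "/Rect": [50, 550, 200, 650], "/P": "<page>"}
result = diff_annotation(expected, actual)
print(result)
```

With real files, the expected and copied annotations could be paired with `zip(load_annotations("expected.pdf"), load_annotations("copied.pdf"))` (placeholder file names) and each pair passed to `diff_annotation`. Entries that reference pages or other objects of their own file, such as `/P`, will usually show up as changed and can be ignored.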

Author

eth-wa commented Nov 21, 2024

@stefan6419846, yes, I will be able to provide a much simpler example later, hopefully within the next 24 hours. I'll also include multiple annotation types.
Thanks for the unrelated comment, too. It actually answers an unrelated issue I was having during my troubleshooting. It seems I'm too used to VBA.

Author

eth-wa commented Nov 22, 2024

@stefan6419846, attached are four files:

File to read Annotations from (reader):
1OldFile.pdf

File to write Annotations to (writer):
2NewFile.pdf

PyPDF Output (writer.write):
3SaveFile.pdf

Expected Output (3rd party PDF software annotations transfer):
4ExpectedFile.pdf

If I need to make it any simpler, please let me know. I added just about every annotation type I had at my disposal.

And thank you again!
