pix.color_topusage raise Segmentation fault (core dumped) #3994
Comments
I cannot reproduce your problem: everything works fine under Windows and Linux. The method crashes the interpreter if the clip represents an empty rectangle or leads to an empty pixmap.

import pymupdf
doc = pymupdf.open("test.pdf")
page = doc[0]
txt_blocks = [blk for blk in page.get_text("dict")["blocks"] if blk["type"] == 0]
for blk in txt_blocks:
    clip = pymupdf.Rect([int(v) for v in blk["bbox"]])
    if clip.is_empty:
        print(f"this is empty: {clip=}")
        continue
    pix = page.get_pixmap(
        clip=clip,
        colorspace=pymupdf.csRGB,
        alpha=False,
    )
    print(f"{pix.color_topusage()=}")

You do not need to convert to an integer rectangle: the method can deal with lists or tuples of 4 numbers directly. Just pass the bbox as the clip:

import pymupdf
print(pymupdf.version)
doc = pymupdf.open("test.pdf")
page = doc[0]
pix = page.get_pixmap()
txt_blocks = [blk for blk in page.get_text("dict")["blocks"] if blk["type"] == 0]
for blk in txt_blocks:
    clip = blk["bbox"]
    print(f"{pix.color_topusage(clip=clip)=}")
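
One way a clip can end up empty is the integer conversion itself: truncating each coordinate with int() collapses a rectangle that is less than one unit wide or high. The plain-Python sketch below illustrates this without PyMuPDF. The helper names and the sample bbox are hypothetical, and the emptiness test (x0 >= x1 or y0 >= y1) is an assumption about how an empty rectangle is defined, not a copy of the library's code.

```python
def truncate_bbox(bbox):
    """Truncate each coordinate toward zero, like [int(v) for v in bbox]."""
    return tuple(int(v) for v in bbox)


def is_empty(rect):
    """Assumed emptiness test: no positive width or no positive height."""
    x0, y0, x1, y1 = rect
    return x0 >= x1 or y0 >= y1


# Hypothetical bbox of a thin text block, less than one unit wide.
thin_bbox = (10.2, 20.1, 10.9, 30.0)
clip = truncate_bbox(thin_bbox)

print(clip, is_empty(clip))   # the truncated clip has zero width
print(is_empty(thin_bbox))    # the original float bbox still has an area
```

Checking for emptiness after the conversion, as the first snippet in this comment does, catches this case before a pixmap is requested.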
BTW a colleague also tried on a Mac and doesn't see a segv either.
I am sorry, I uploaded the wrong file. This is the correct file.
import fitz
I see a text block bbox with a coordinate that is larger than the page size: Rect(35.0, 636.0, 63.0, 738.0). The value of page.get_text('dict')['width'] is 612, but the bbox starts at 636 (its second coordinate).
Ok, now that we have that file, here again is my recommendation to use the function with more care until we have immunized it against wrong calls:

import pymupdf
doc = pymupdf.open("test3.pdf")
page = doc[0]
txt_blocks = page.get_text("dict", flags=pymupdf.TEXTFLAGS_TEXT)["blocks"]
for blk in txt_blocks:
    clip = pymupdf.IRect(blk["bbox"]) & page.rect  # only inside visible page!
    if clip.is_empty:  # and never with empty clips!
        print(f"empty: {clip=}")
        continue
    pix = page.get_pixmap(clip=clip)
    pix.color_topusage()

Shows this:

empty: clip=IRect(35, 636, 63, 612)
empty: clip=IRect(523, 618, 536, 612)
empty: clip=IRect(132, 637, 142, 612)
empty: clip=IRect(159, 725, 169, 612)
empty: clip=IRect(167, 693, 177, 612)
empty: clip=IRect(288, 633, 298, 612)
empty: clip=IRect(334, 633, 344, 612)
empty: clip=IRect(395, 612, 405, 612)
empty: clip=IRect(426, 619, 435, 612)
empty: clip=IRect(464, 612, 474, 612)
empty: clip=IRect(502, 612, 512, 612)
empty: clip=IRect(514, 735, 527, 612)
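
To make the reasoning behind that output explicit, here is a plain-Python sketch of the "intersect with the page, then reject empty clips" idea. It does not use PyMuPDF and is not its implementation: the helper names are hypothetical, and the page rectangle (0, 0, 612, 612) is an assumption chosen only because it is consistent with the bottom edge of 612 seen in the printed clips. The actual page size is not stated in this thread.

```python
def intersect(a, b):
    """Component-wise intersection of two (x0, y0, x1, y1) rectangles."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def is_empty(rect):
    """Assumed emptiness test: no positive width or no positive height."""
    x0, y0, x1, y1 = rect
    return x0 >= x1 or y0 >= y1


def usable_clip(bbox, page_rect):
    """Return the bbox limited to the page, or None if nothing is left."""
    clip = intersect(bbox, page_rect)
    return None if is_empty(clip) else clip


# Assumption: the real page rectangle is not given in the thread.
PAGE_RECT = (0, 0, 612, 612)

outside = usable_clip((35, 636, 63, 738), PAGE_RECT)  # bbox from the report
inside = usable_clip((35, 100, 63, 120), PAGE_RECT)   # hypothetical bbox
print(outside, inside)
```

Under that assumption the reported bbox intersects to (35, 636, 63, 612), whose top edge lies below its bottom edge, so it is rejected, while a bbox fully inside the page passes through unchanged.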
Thanks for the updated test file. I've reproduced the segv with the current release, PyMuPDF-1.24.12. Happily, the bug is already fixed in PyMuPDF git (with the fix for #3848), so it will be fixed in our next release.
Fixed in 1.24.13.
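
For code that has to run on both older and newer releases, a version comparison can decide whether the empty-clip guard is still needed. This is only a sketch under assumptions: the helper names are hypothetical, version strings are assumed to be purely numeric and dot-separated, and how the installed version string is obtained (for example from the value printed by pymupdf.version earlier in this thread) is left to the caller.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.24.12' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Release named in this thread as containing the fix.
FIXED_IN = (1, 24, 13)


def needs_clip_guard(version_text):
    """True if the given version predates the release with the fix."""
    return parse_version(version_text) < FIXED_IN


print(needs_clip_guard("1.24.12"))
print(needs_clip_guard("1.24.13"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where "1.24.9" would sort after "1.24.13".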
Thanks very much!
Take the
Got it, thanks very much!
Description of the bug
test3.pdf
I tested different colorspaces and reduced the bbox, but it always raises "Segmentation fault". I don't know why.
How to reproduce the bug
PyMuPDF version
1.24.12
Operating system
Linux
Python version
3.12