PyPDF2 can't read French Accented Characters (œ) #532

hamzaAmier · 2020-01-03T17:04:22Z

Hello,
I'm working on a text mining project using PyPDF2 and I'm having trouble extracting the character "œ" (a French character). The `extractText()` method of a page object doesn't detect it at all.

Thank you for your help.
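
To make the report easier to reproduce, here is a minimal sketch of how the missing character could be checked. It assumes the PyPDF2 1.x API (`PdfFileReader`, `getPage`, `extractText`); the helper names `missing_characters` and `extract_first_page` and the file path are hypothetical, made up only for illustration.

```python
def missing_characters(extracted_text, expected_chars):
    """Return the expected characters that do not appear in the extracted text."""
    return [ch for ch in expected_chars if ch not in extracted_text]


def extract_first_page(path):
    """Extract the text of the first page (assumes PyPDF2 1.x is installed)."""
    # Imported here so the helper above can be used without PyPDF2.
    from PyPDF2 import PdfFileReader

    with open(path, "rb") as fh:
        reader = PdfFileReader(fh)
        return reader.getPage(0).extractText()


# Usage (hypothetical path; the PDF is known to contain "œ"):
#   text = extract_first_page("document.pdf")
#   print(missing_characters(text, "œ"))
```

If the list printed for a PDF that visibly contains "œ" is not empty, the character was dropped during extraction.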

MartinThoma · 2022-04-08T22:10:36Z

I think #464 might solve this

MartinThoma · 2022-04-16T10:58:03Z

Duplicates #524

MartinThoma added the is-bug label (from a user's perspective, this is a bug: a violation of the expected behavior with a compliant PDF) Apr 8, 2022

MartinThoma added the workflow-text-extraction label (from a user's perspective, text extraction is the affected feature/workflow) Apr 16, 2022

MartinThoma closed this as completed Apr 16, 2022
