cannot extract Chinese comma symbol #99

Closed
lcc19941214 opened this issue Oct 14, 2016 · 3 comments

Comments

@lcc19941214

I'm using textract and it's awesome! I can easily extract content from any of my .doc or .docx files.

However, most of the time I'm handling documents full of Chinese characters, and it seems textract has a problem extracting the Chinese comma symbol ',': it comes out as a space instead.
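For anyone trying to confirm the symptom, here is a minimal sketch of a check. It assumes the "Chinese comma" is the fullwidth comma U+FF0C, and it works on plain strings rather than calling textract. The helper names `countFullwidthCommas` and `commasWereLost` are hypothetical, and the `extracted` string is an assumed stand-in for real extraction output, not an actual result.

```javascript
// Hypothetical helper, not part of textract: counts fullwidth commas (U+FF0C).
function countFullwidthCommas(text) {
  return (text.match(/\uFF0C/g) || []).length;
}

// Returns true when the extracted text contains fewer fullwidth commas than
// the original text, which is the symptom described in this issue.
function commasWereLost(original, extracted) {
  return countFullwidthCommas(extracted) < countFullwidthCommas(original);
}

// Sample data: the second string is an assumed illustration of the bug,
// with the comma replaced by a space.
const original = '你好\uFF0C世界';
const extracted = '你好 世界';

console.log(commasWereLost(original, extracted));
```

In practice, `extracted` would be whatever text the extraction step returns for a document whose original text is known.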

@dbashford
Owner

👍

I tend to update textract every 3 months or so and I am overdue. Hoping to be taking a good look at all the various issues/PRs this time next week.

@dbashford
Owner

@lcc19941214 Not an issue anymore?

The last time I spent time on textract, I was working on this. I didn't check anything in, but I believe I was close to resolving it.

dbashford added a commit that referenced this issue Dec 23, 2016
@dbashford
Owner

FWIW, the fix for this was just published as textract 2.1, thanks!
