chore: make doc extractor node also can extract text by file extension #9543
Do you mind providing a file that causes the error? In my attempt, the .md file was correctly identified as text/markdown.
I tried these files, and the mimetype is always [...]. I think this behavior depends on the browser; I use [...].
Someone else encountered this issue in #9757. I think extracting by mimetype for remote_file and by extension for local files is more reasonable. I tried the same file with the Firefox browser and it works, so the mimetype of a local file depends on the browser.
lgtm
LGTM
Checklist:
Important
Please review the checklist below before submitting your pull request.
dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods
Description
Currently, extracting a document by its mimetype is not very reliable. For example, a markdown file always raises an error.
Extracting by the filename when the mimetype cannot be recognized can therefore improve the success rate.
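The fallback described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the function name `resolve_kind`, the two lookup tables, and the returned kind strings are all hypothetical and chosen only to illustrate the idea of trying the mimetype first and the file extension second.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical lookup tables, for illustration only (not the node's real mapping).
MIME_TO_KIND = {
    "text/plain": "text",
    "text/markdown": "markdown",
    "application/pdf": "pdf",
}
EXTENSION_TO_KIND = {
    ".txt": "text",
    ".md": "markdown",
    ".markdown": "markdown",
    ".pdf": "pdf",
}


def resolve_kind(mime_type: Optional[str], filename: Optional[str]) -> str:
    """Pick an extractor kind: mimetype first, file extension as a fallback."""
    if mime_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        kind = MIME_TO_KIND.get(mime_type.split(";")[0].strip().lower())
        if kind is not None:
            return kind
    if filename:
        # The mimetype was missing or unrecognized, so try the extension.
        suffix = PurePosixPath(filename).suffix.lower()
        kind = EXTENSION_TO_KIND.get(suffix)
        if kind is not None:
            return kind
    raise ValueError(
        f"unsupported file: mime_type={mime_type!r}, filename={filename!r}"
    )
```

With this shape, a local upload that arrives with an unrecognized mimetype (for instance `application/octet-stream`, used here purely as an example) but a `.md` filename still resolves to the markdown extractor, while a file with neither a known mimetype nor a known extension still fails loudly.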
Type of Change
Testing Instructions
test locally