[ML] File upload: Adds support for PDF files #186956
Thank you for the guidance, this makes perfect sense!
+1 having this call out and the ability to add the
kibana-gis changes LGTM
code review only
tags: ['access:fileUpload:analyzeFile'],
body: {
  accepts: ['application/json'],
  maxBytes: MAX_FILE_SIZE_BYTES,
Should this be MAX_TIKA_FILE_SIZE_BYTES?
Good spot, thanks. Updated in e756a3f
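The point raised in this thread can be sketched as below: a route that accepts files for Tika analysis would take its `maxBytes` from `MAX_TIKA_FILE_SIZE_BYTES` rather than the general `MAX_FILE_SIZE_BYTES`. This is only an illustration. The two constant names come from the review comments, but their values here, the `RouteBodyOptions` type, and the `analyzeRouteBody` helper are assumptions and are not taken from the Kibana source.

```typescript
// Hypothetical values for illustration; the real constants are defined in the Kibana source.
const MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024;
const MAX_TIKA_FILE_SIZE_BYTES = 60 * 1024 * 1024;

// Hypothetical shape mirroring the `body` options shown in the diff excerpt.
interface RouteBodyOptions {
  accepts: string[];
  maxBytes: number;
}

// Hypothetical helper: picks the size limit that matches the kind of analysis.
function analyzeRouteBody(usesTika: boolean): RouteBodyOptions {
  return {
    accepts: ['application/json'],
    maxBytes: usesTika ? MAX_TIKA_FILE_SIZE_BYTES : MAX_FILE_SIZE_BYTES,
  };
}
```

Keeping the limit next to the route definition makes a mismatch like the one spotted here easier to catch in review.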
@serenachou I've gone with a combination of your suggestion and my original text, with a link out to the semantic_text documentation.
…sualizer/components/import_settings/semantic_text_info.tsx Co-authored-by: Liam Thompson <[email protected]>
💛 Build succeeded, but was flaky
Latest changes LGTM ⚡
Latest change LGTM. Tested with a selection of pdf, txt and docx files.
With some datasets the find structure API will not generate an ingest pipeline. A recent [change](#186956) to how we catch and display errors during file upload means an upload with no pipeline now produces an error which aborts the upload. Previously all pipeline creation errors were ignored and hidden from the user.
This PR changes the file upload endpoint to allow it to receive no ingest pipeline and also changes the UI to not display the pipeline creation step during upload.
This file can be used to test the fix: https://github.com/elastic/eland/blob/main/tests/flights.json.gz
(cherry picked from commit ee1a147)
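The behaviour described in that fix can be illustrated with a minimal sketch: when no ingest pipeline was generated, the pipeline creation step is simply left out instead of failing. The `IngestPipeline` type, the step names, and `planImportSteps` are hypothetical and are not the actual Kibana implementation.

```typescript
// Hypothetical, simplified pipeline shape.
interface IngestPipeline {
  description?: string;
  processors: Array<Record<string, unknown>>;
}

// Hypothetical step names for the upload progress display.
type ImportStep = 'createIndex' | 'createPipeline' | 'uploadData';

// Builds the list of steps to run and display. The pipeline step is
// included only when a pipeline was actually provided.
function planImportSteps(pipeline: IngestPipeline | undefined): ImportStep[] {
  const steps: ImportStep[] = ['createIndex'];
  if (pipeline !== undefined) {
    steps.push('createPipeline');
  }
  steps.push('uploadData');
  return steps;
}
```

With this shape, a missing pipeline is a normal input rather than an error, so nothing needs to be caught or hidden.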
# Backport

This will backport the following commits from `main` to `8.x`:
- [[ML] Fix file upload with no ingest pipeline (#193744)](#193744)

### Questions?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: James Gowdy <[email protected]>
Also adds support for txt, rtf, doc, docx, xls, xlsx, ppt, pptx, odt, ods, and odp files.
Adds the ability to automatically add a semantic text field to the mappings and a `copy_to` processor to duplicate the field. This is needed for the mappings generated for the attachment processor, which adds a nested `attachment.content` field that cannot be used as a semantic text field.
After a successful import, a link to Search's Playground app is shown. Navigating there lets the user instantly query the newly uploaded file.
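As a rough sketch of the duplication described above, the extracted text can be copied into a top-level field that is mapped as `semantic_text`. This example assumes the copy is done with an Elasticsearch `set` ingest processor using `copy_from`, which may differ from what the PR actually does. The target field name `content`, the inference endpoint id, and the `addSemanticTextField` helper are all hypothetical.

```typescript
type FieldMapping = Record<string, unknown>;

interface IndexMappings {
  properties: Record<string, FieldMapping>;
}

interface IngestPipelineSketch {
  processors: Array<Record<string, unknown>>;
}

// Hypothetical helper: returns new mappings and a new pipeline in which
// `targetField` is a semantic_text field filled from `sourceField`.
function addSemanticTextField(
  mappings: IndexMappings,
  pipeline: IngestPipelineSketch,
  sourceField: string,
  targetField: string,
  inferenceId: string
): { mappings: IndexMappings; pipeline: IngestPipelineSketch } {
  return {
    mappings: {
      ...mappings,
      properties: {
        ...mappings.properties,
        // semantic_text fields reference an inference endpoint by id.
        [targetField]: { type: 'semantic_text', inference_id: inferenceId },
      },
    },
    pipeline: {
      ...pipeline,
      // Assumption: copy the nested value into the top-level field at ingest time.
      processors: [...pipeline.processors, { set: { field: targetField, copy_from: sourceField } }],
    },
  };
}

// Example input: mappings and pipeline as they might look for an attachment upload.
const semanticTextExample = addSemanticTextField(
  { properties: { attachment: { properties: { content: { type: 'text' } } } } },
  { processors: [{ attachment: { field: 'data' } }] },
  'attachment.content',
  'content',
  'my-inference-endpoint' // hypothetical inference endpoint id
);
```

The helper returns new objects rather than mutating its inputs, so the originally generated mappings and pipeline stay available if the user turns the option off.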