-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not extract file ending from Urls #4547
Conversation
Should also fix #4480. Please check @tobiasdiez I cannot access JSTOR stuff due to 403. However with any URL and some appended key values like ?accetp=true it works for me, again. |
Please fix the checkstyle issue (import with * is not allowed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That fixes the problem but I am not sure about other unwanted side effects. For example, the file https://someurl.io/file.pdf
is no longer recognized as a pdf file. What do you think about using url.openConnection().getContentType()
and or https://docs.oracle.com/javase/7/docs/api/java/net/URLConnection.html#guessContentTypeFromName(java.lang.String) in case the path is a valid url?
Just to better understand the problem, at which point is the FileHelper.getFileExtension method called in
public void download() { |
|
||
import org.junit.jupiter.api.Test; | ||
|
||
import static org.junit.jupiter.api.Assertions.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember we have a no * for imports convention.
AutoDetectParser parser = new AutoDetectParser(); | ||
Detector detector = parser.getDetector(); | ||
MediaType mediaType = detector.detect(bis, metaData); | ||
MimeType mimeType = TikaConfig.getDefaultConfig().getMimeRepository().forName(mediaType.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need a whole new library simply for the detection of the file extension?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually not that straight forward to detect media types and map them to file extensions. The library itself claims to be very small when only reliying on the core package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me. The tika library is only 600kb, so that's not a big deal especially when considering that there might be synergy effects with Lucene in the future.
* upstream/master: Add uncaught exception message (#4565) Converts integrity check dialog to JavaFX (#4559) Do not extract file ending from Urls (#4547) Bump checkstyle from 8.15 to 8.16 (#4562) Bump xmpbox from 2.0.12 to 2.0.13 (#4561) Delete the deprecated BibEntry Constructor (#4560) # Conflicts: # src/main/java/org/jabref/gui/DialogService.java # src/main/java/org/jabref/gui/FXDialogService.java # src/main/java/org/jabref/gui/actions/CopyFilesAction.java # src/main/java/org/jabref/gui/util/CurrentThreadTaskExecutor.java
* Fixes JabRef#4544 Do not extract file ending from Urls * Add tests * file type for any resource type * Keep simple file name extraction for files * checkstyle
* upstream/master: (583 commits) update jfoenix and gradle plugins Replace outdated transformer log4j2 with official new one Update journalList.txt Fix for Issue #4437 - Some bugs in preference->Entry table columns (#4546) Don't set column sort type at startup (#4577) Add uncaught exception message (#4565) Converts integrity check dialog to JavaFX (#4559) Do not extract file ending from Urls (#4547) Bump checkstyle from 8.15 to 8.16 (#4562) Bump xmpbox from 2.0.12 to 2.0.13 (#4561) Delete the deprecated BibEntry Constructor (#4560) Refactor BibEntry deprecated method (#4554) Added extra stats to be sent with MrDLib recommendations (#4452) improve styling of preferences side menu (#4556) Cleanup interfaces (#4553) Bump fontbox from 2.0.12 to 2.0.13 (#4552) Bump pdfbox from 2.0.12 to 2.0.13 (#4551) Bump wiremock from 2.19.0 to 2.20.0 (#4550) Fixes that renaming a group did not change the group name in the interface (#4549) Bump applicationinsights-logging-log4j2 from 2.2.1 to 2.3.0 (#4540) Bump antlr4-runtime from 4.7.1 to 4.7.2 (#4542) ... # Conflicts: # src/main/java/org/jabref/gui/JabRefFrame.java # src/main/java/org/jabref/model/entry/BibtexString.java
* Cleanup interfaces (#4553) * improve styling of preferences side menu (#4556) * Added extra stats to be sent with MrDLib recommendations (#4452) * Refactor BibEntry deprecated method (#4554) * Refactor BibEntry deprecated method * Fixed error * More on checkstyle fixing * Fixed checkstyle issues * Added custom entrytype for types not registered in the enumerator. * Added getTypeOrDefault method refactor code to use it and fix NPE problem * Fixing checkstyle rules * More on checkstyle * More on getType getTypeOrDefault replacement * Revert Article EntryType into Electronic * Added break line between different packages * Refactor BibtextEntryTypes.getTypeOrDefault method * Removed unused import * Removed extra new line, checkstyle error fixing * Delete the deprecated BibEntry Constructor (#4560) * Bump xmpbox from 2.0.12 to 2.0.13 (#4561) Bumps xmpbox from 2.0.12 to 2.0.13. Signed-off-by: dependabot[bot] <[email protected]> * Bump checkstyle from 8.15 to 8.16 (#4562) Bumps [checkstyle](https://github.com/checkstyle/checkstyle) from 8.15 to 8.16. - [Release notes](https://github.com/checkstyle/checkstyle/releases) - [Commits](checkstyle/checkstyle@checkstyle-8.15...checkstyle-8.16) Signed-off-by: dependabot[bot] <[email protected]> * Do not extract file ending from Urls (#4547) * Fixes #4544 Do not extract file ending from Urls * Add tests * file type for any resource type * Keep simple file name extraction for files * checkstyle * Converts integrity check dialog to JavaFX (#4559) * Converts integrity check dialog to JavaFX Moreover: - Show entry by reference and not by id. Fixes #2181. - Fixes a few issues that occurred when opening the entry editor by code from the integrity dialog - Reuse gridpane in entry editor (should have a slightly superior performance) - Improve display of progress dialog - Invoke copy files task using central task executor * fix l10n fix aborting of copy files task and showing of integrity check dialog * fix l10n * Add uncaught exception message (#4565) * Add error message for uncaught exceptions Added a new view to the project. It's shown after the uncaught exception is logged. * Add simple text to fallback error view Added a label asking the user to look into the logfiles for more details. * Add error message to language files Added the error message to the german and english language files. * Add ErrorDialogAndWait Removed the FallbackErrorView. Added the showErrorDialogAndWait call instead. * BibTex parser triggered on focus out of the text area instead of on value change event * BibTex parser triggered on focus out of the text area instead of on value change event * Added a post-parsing validation to be sure there's no relevant unconsidered content into epilog. Added DialogService in SourceTab as current NotificationPane seems not working. * Added a post-parsing validation to be sure there's no relevant unconsidered content into epilog. Added DialogService in SourceTab as current NotificationPane seems not working. * Codacy/PR Quality Review change. * Codacy/PR Quality Review change.
Fixes #4544 Do not extract file ending from Urls