Add cleanup activity for URL field #9970
Merged
22 commits
a8a1251 Make the necessary adjustments to several classes (Alexandra-Stath)
07973f7 Add cleanup activity regarding URL field (Alexandra-Stath)
e251b39 Add parameterized tests for URLCleanup java class (Alexandra-Stath)
ef5ff9d Include cleanup activity description in properties files (Alexandra-Stath)
00dd484 Update javadoc comment (Alexandra-Stath)
a586e68 Add description in CHANGELOG.md (Alexandra-Stath)
d572905 Resolve merge conflicts (Alexandra-Stath)
940c3ec Update properties files (Alexandra-Stath)
9c83833 Update URLCleanup java class with javadoc comment about urlRegex (Alexandra-Stath)
7b3878e Update URLCleanup java class with javadoc comment about urlRegex (Alexandra-Stath)
d69d98c Update Latex command usage and fix checkstyle errors (Alexandra-Stath)
7db3f62 Remove comment from CleanupPreferences class (Alexandra-Stath)
07db0a6 Update cleanup activity in URLCleanup class (Alexandra-Stath)
846dda5 Update tests structure and add more cases (Alexandra-Stath)
3b8b5db Update comment in URLCleanup class (Alexandra-Stath)
907d774 Resolve conflicts caused in CHANGELOG.md (Alexandra-Stath)
e48c418 Merge branch 'main' into fix-for-koppor-216 (Alexandra-Stath)
583917b Correction of CleanupPresetPanel and property files (Alexandra-Stath)
dbb41af Merge branch 'fix-for-koppor-216' of https://github.com/Alexandra-Sta… (Alexandra-Stath)
ec6ff74 Update CHANGELOG.md (koppor)
0a0ce4f Apply suggestions from code review (koppor)
c8d45b3 Update of URLCleanup class (Alexandra-Stath)
@@ -0,0 +1,76 @@
package org.jabref.logic.cleanup;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jabref.model.FieldChange;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.Field;
import org.jabref.model.entry.field.StandardField;

/**
 * Checks whether a URL exists in the note field and stores it in the url field.
 */
public class URLCleanup implements CleanupJob {

    private static final Field NOTE_FIELD = StandardField.NOTE;
    private static final Field URL_FIELD = StandardField.URL;

    @Override
    public List<FieldChange> cleanup(BibEntry entry) {
        List<FieldChange> changes = new ArrayList<>();

        String noteFieldValue = entry.getField(NOTE_FIELD).orElse(null);

        /*
         * The urlRegex is taken from a suggested solution in
         * https://stackoverflow.com/questions/28185064/python-infinite-loop-in-regex-to-match-url.
         * To make it work in Java, we made the necessary adjustments
         * (mainly doubling the backslashes).
         */
        String urlRegex = "(?i)\\b((?:https?://|www\\d{0,3}[.]|[a-z0-9.\\-]+[.]"
                + "[a-z]{2,4}/)(?:[^\\s()<>\\\\]+|\\(([^\\s()<>\\\\]+|(\\([^\\s()"
                + "<>\\\\]+\\)))*\\))+(?:\\(([^\\s()<>\\\\]+|(\\([^\\s()<>\\\\]+\\"
                + ")))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?«»“”‘’]))";

        final Pattern pattern = Pattern.compile(urlRegex, Pattern.CASE_INSENSITIVE);
        final Matcher matcher = pattern.matcher(noteFieldValue);

        if (matcher.find()) {
            String url = matcher.group();

            // Remove the URL from the NoteFieldValue
            String newNoteFieldValue = noteFieldValue
                    .replace(url, "")

                    /*
                     * The following regex erases unnecessary remaining
                     * content in the note field. Explanation:
                     * <ul>
                     * <li>"(, )?": Matches an optional comma followed by a space</li>
                     * <li>"\\\\?": Matches an optional backslash</li>
                     * <li>"url\\{\\}": Matches the literal string "url{}"</li>
                     * </ul>
                     * Note that the backslashes are doubled because Java string literals require it.
                     */
                    .replaceAll("(, )?\\\\?url\\{\\}(, )?", "");

            /*
             * In case the url and note fields hold the same URL, we just
             * remove it from the note field, and no other action is performed.
             */
            if (entry.hasField(URL_FIELD)) {
                String urlFieldValue = entry.getField(URL_FIELD).orElse(null);
                if (urlFieldValue.equals(url)) {
                    entry.setField(NOTE_FIELD, newNoteFieldValue).ifPresent(changes::add);
                }
            } else {
                entry.setField(NOTE_FIELD, newNoteFieldValue).ifPresent(changes::add);
                entry.setField(URL_FIELD, url).ifPresent(changes::add);
            }
        }
        return changes;
    }
}
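As a rough, stand-alone sketch of what the cleanup above does to the note text, the snippet below applies the same URL pattern and the same leftover-removal regex to a plain string, without any JabRef classes. The class and method names (`NoteUrlSketch`, `firstUrl`, `stripUrl`) are hypothetical, and the non-ASCII quote characters of the original pattern are written as regex unicode escapes so that the source stays ASCII-only.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of JabRef.
final class NoteUrlSketch {

    // Same pattern as urlRegex in URLCleanup; only the typographic quote
    // characters at the end are spelled as regex unicode escapes.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "(?i)\\b((?:https?://|www\\d{0,3}[.]|[a-z0-9.\\-]+[.]"
            + "[a-z]{2,4}/)(?:[^\\s()<>\\\\]+|\\(([^\\s()<>\\\\]+|(\\([^\\s()"
            + "<>\\\\]+\\)))*\\))+(?:\\(([^\\s()<>\\\\]+|(\\([^\\s()<>\\\\]+\\"
            + ")))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?\\u00AB\\u00BB\\u201C\\u201D\\u2018\\u2019]))");

    // Returns the first URL found in the note text, if any.
    static Optional<String> firstUrl(String note) {
        Matcher matcher = URL_PATTERN.matcher(note);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    // Removes the URL and the empty LaTeX url command that is left behind,
    // together with an adjacent ", " separator.
    static String stripUrl(String note, String url) {
        return note.replace(url, "")
                   .replaceAll("(, )?\\\\?url\\{\\}(, )?", "");
    }

    public static void main(String[] args) {
        String note = "this is a note, \\url{https://hdl.handle.net/10442/hedi/6089}";
        firstUrl(note).ifPresent(url -> {
            System.out.println("url:  " + url);
            System.out.println("note: " + stripUrl(note, url));
        });
    }
}
```

The closing brace of `\url{...}` is not part of the match because the last character class of the pattern excludes `}`, so the matcher backtracks to the final digit of the URL. For the note used in `main`, this leaves `this is a note` as the remaining note text.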
@@ -1055,7 +1055,7 @@ exportFormat=Format d'exportation
 Output\ file\ missing=Fichier de sortie manquant
 The\ output\ option\ depends\ on\ a\ valid\ input\ option.=L'option de sortie dépend d'une option d'entrée valide.
 Linked\ file\ name\ conventions=Conventions pour les noms de fichiers liés
-Filename\ format\ pattern=Modèle de format de nom de fichier
+Filename\ format\ pattern=Modèle de format de nom de fichier

These changes will be overwritten. To fix issues in the translations, you need to use the Crowdin service.

 Additional\ parameters=Paramètres additionnels
 Cite\ selected\ entries\ between\ parenthesis=Citer les entrées sélectionnées entre parenthèses
 Cite\ selected\ entries\ with\ in-text\ citation=Citer les entrées sélectionnées comme incluse dans le texte
118 changes: 118 additions & 0 deletions
src/test/java/org/jabref/logic/cleanup/URLCleanupTest.java
@@ -0,0 +1,118 @@
package org.jabref.logic.cleanup;

import java.util.stream.Stream;

import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.StandardField;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class URLCleanupTest {

    @ParameterizedTest
    @MethodSource("provideURL")
    public void testChangeURL(BibEntry expected, BibEntry urlInputField) {
        URLCleanup cleanUp = new URLCleanup();
        cleanUp.cleanup(urlInputField);

        assertEquals(expected, urlInputField);
    }

    private static Stream<Arguments> provideURL() {
        return Stream.of(

                // Input Note field has two arguments stored, with the latter being a URL.
                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                        "https://hdl.handle.net/10442/hedi/6089")
                                .withField(StandardField.NOTE,
                                        "this is a note"),
                        new BibEntry().withField(StandardField.NOTE,
                                "this is a note, \\url{https://hdl.handle.net/10442/hedi/6089}")),

                // Input Note field has two arguments stored, with the former being a URL.
                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                        "https://hdl.handle.net/10442/hedi/6089")
                                .withField(StandardField.NOTE,
                                        "this is a note"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{https://hdl.handle.net/10442/hedi/6089}, this is a note")),

                // Input Note field has more than one URL stored.
                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                        "https://hdl.handle.net/10442/hedi/6089")
                                .withField(StandardField.NOTE,
                                        "\\url{http://142.42.1.1:8080}"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{https://hdl.handle.net/10442/hedi/6089}, "
                                        + "\\url{http://142.42.1.1:8080}")),

                // Input Note field has several values stored.
                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                        "https://example.org")
                                .withField(StandardField.NOTE,
                                        "cited by Kramer, 2002."),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{https://example.org}, cited by Kramer, 2002.")),

                /*
                 * Several input URL types (e.g., non-secure protocol, password included for
                 * authentication, IP address, port etc.) to be correctly identified.
                 */
                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "https://hdl.handle.net/10442/hedi/6089"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{https://hdl.handle.net/10442/hedi/6089}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "http://hdl.handle.net/10442/hedi/6089"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{http://hdl.handle.net/10442/hedi/6089}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "http://userid:password@example.com:8080"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{http://userid:password@example.com:8080}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "http://142.42.1.1:8080"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{http://142.42.1.1:8080}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "http://☺.damowmow.com"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{http://☺.damowmow.com}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "http://-.~_!$&'()*+,;=:%40:80%2f::::::@example.com"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{http://-.~_!$&'()*+,;=:%40:80%2f::::::@example.com}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "https://www.example.com/foo/?bar=baz&inga=42&quux"),
                        new BibEntry().withField(StandardField.NOTE,
                                "\\url{https://www.example.com/foo/?bar=baz&inga=42&quux}")),

                Arguments.of(
                        new BibEntry().withField(StandardField.URL,
                                "https://www.example.com/foo/?bar=baz&inga=42&quux"),
                        new BibEntry().withField(StandardField.NOTE,
                                "https://www.example.com/foo/?bar=baz&inga=42&quux"))
        );
    }
}
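All parameterized cases above start from an entry that has a note but no url field. The remaining branches of `URLCleanup.cleanup` (url field already present, with the same or a different value) can be illustrated with a small stand-alone sketch. It uses a plain `Map` as a stand-in for `BibEntry`, so `UrlBranchSketch` and `storeUrl` are hypothetical names, and dropping an emptied note from the map is an assumption drawn from the expected entries in the tests, where a note that contained only a URL no longer appears.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a plain map replaces JabRef's BibEntry.
final class UrlBranchSketch {

    // Mirrors the branch structure of URLCleanup.cleanup once a URL was found.
    static void storeUrl(Map<String, String> entry, String url, String cleanedNote) {
        String existingUrl = entry.get("url");
        if (existingUrl != null && !existingUrl.equals(url)) {
            // A different URL is already stored: leave both fields untouched.
            return;
        }
        // Assumption based on the tests: an emptied note disappears from the entry.
        if (cleanedNote.isEmpty()) {
            entry.remove("note");
        } else {
            entry.put("note", cleanedNote);
        }
        if (existingUrl == null) {
            // No url field yet: move the URL there.
            entry.put("url", url);
        }
    }

    public static void main(String[] args) {
        Map<String, String> entry = new HashMap<>();
        entry.put("url", "https://example.com");
        entry.put("note", "see \\url{https://example.org}");
        storeUrl(entry, "https://example.org", "see");
        System.out.println(entry); // both fields are unchanged in this branch
    }
}
```

Keeping the "different URL" branch a no-op means the note keeps its URL, so no information is lost when the two fields disagree.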
I suspected this could lead to issues, but I did not spot the problem at the time. There were no tests covering the orElse branch.
For the future: when Optionals are available, make use of them!
Fix and link to the issue at #10435
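To illustrate the remark about Optionals: in the diff above, `entry.getField(NOTE_FIELD).orElse(null)` hands a possible `null` to `pattern.matcher(...)`, which fails for an entry without a note. A minimal sketch of staying inside the `Optional` instead is shown below. It is not the fix referenced in #10435, whose content is not shown here; the class name `OptionalNoteSketch`, the method `findUrl`, and the simplified URL pattern are assumptions made for the example.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; it is not the change made for #10435.
final class OptionalNoteSketch {

    // Simplified stand-in pattern, not the urlRegex used in URLCleanup.
    private static final Pattern SIMPLE_URL = Pattern.compile("https?://[^\\s{}]+");

    // The parameter mirrors what a getField call returns: an Optional that is
    // empty when the note field is absent. No null is ever unwrapped.
    static Optional<String> findUrl(Optional<String> note) {
        return note.map(SIMPLE_URL::matcher)
                   .filter(Matcher::find)
                   .map(Matcher::group);
    }

    public static void main(String[] args) {
        System.out.println(findUrl(Optional.empty()).isPresent());
        System.out.println(findUrl(Optional.of("see https://example.org for details")));
    }
}
```

With this shape, an entry without a note simply yields an empty result, and the absent-note branch becomes straightforward to cover with a test.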