-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cleanup activity for URL field #9970
Conversation
|
||
String NoteFieldValue = entry.getField(NOTE_FIELD).orElse(null); | ||
String urlRegex = ( | ||
"(?i)\\b((?:https?://|www\\d{0,3}[.]|[a-z0-9.\\-]+[.]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a comment/link on the regex, e.g where you got it from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was originally fetched from stackoverflow, but there were some modifications in order for it to be functional. Shall I comment that link in my code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and the modifications you did
@@ -1136,6 +1136,7 @@ Cleanup\ entries=Καθαρισμός καταχωρήσεων | |||
Automatically\ assign\ new\ entry\ to\ selected\ groups=Αυτόματη ανάθεση νέας καταχώρησης στις επιλεγμένες ομάδες | |||
%0\ mode=λειτουργία %0 | |||
Move\ DOIs\ from\ note\ and\ URL\ field\ to\ DOI\ field\ and\ remove\ http\ prefix=Μετακίνηση του DOI (Ψηφιακό Αναγνωριστικό Αντικειμένου) από το πεδίο σημείωσης και διεύθυνσης URL, στο πεδίο DOI και αφαίρεση του προθέματος http | |||
Move\ URLs\ in\ note\ field\ to\ url\ field= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please only edit the en file. All other languages are handled by crowdin
Thanks for your contribution and really great that you added tests as well! |
There is one code style check failing (open rewrite)
Run 'gradle rewriteRun' to apply the recipes. |
// Expected entry in case note field holds more than one URLs | ||
BibEntry expectedWithMultipleURLs = new BibEntry() | ||
.withField(StandardField.URL, "https://hdl.handle.net/10442/hedi/6089") | ||
.withField(StandardField.NOTE, "/url{http://142.42.1.1:8080}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please test with \url{}
? See also the link from koppor in the description. That would make more sense because it's a LaTeX command that is very liked to end up in the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
import org.jabref.model.entry.field.StandardField; | ||
|
||
/** | ||
* Checks whether URL exists in note field, and stores it under url field . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No space before final dot of a sentence.
* Checks whether URL exists in note field, and stores it under url field . | |
* Checks whether URL exists in note field, and stores it under url field. |
/** | ||
* Moves URL from NOTE to URL field. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No comment necessary - the comment is in URLCleanup
class - similar to the other cleanup jobs.
/** | |
* Moves URL from NOTE to URL field. | |
*/ |
@@ -1055,7 +1055,7 @@ exportFormat=Format d'exportation | |||
Output\ file\ missing=Fichier de sortie manquant | |||
The\ output\ option\ depends\ on\ a\ valid\ input\ option.=L'option de sortie dépend d'une option d'entrée valide. | |||
Linked\ file\ name\ conventions=Conventions pour les noms de fichiers liés | |||
Filename\ format\ pattern=Modèle de format de nom de fichier | |||
Filename\ format\ pattern=Modèle de format de nom de fichier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes will be overwritten. You need to go to Crowdin service to update issues at the translations.
|
||
private static Stream<Arguments> provideURL() { | ||
/* | ||
* Expected entries with various types of URLs(e.g, not secure protocol, password included for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space before opening brace in English text
* Expected entries with various types of URLs(e.g, not secure protocol, password included for | |
* Expected entries with various types of URLs (e.g, not secure protocol, password included for |
BibEntry[] expectedwithURL = { | ||
new BibEntry().withField(StandardField.URL, "https://hdl.handle.net/10442/hedi/6089"), | ||
new BibEntry().withField(StandardField.URL, "http://hdl.handle.net/10442/hedi/6089"), | ||
new BibEntry().withField(StandardField.URL, "http://userid:[email protected]:8080"), | ||
new BibEntry().withField(StandardField.URL, "http://142.42.1.1:8080"), | ||
new BibEntry().withField(StandardField.URL, "http://☺.damowmow.com"), | ||
new BibEntry().withField(StandardField.URL, "http://-.~_!$&'()*+,;=:%40:80%2f::::::@example.com"), | ||
new BibEntry().withField(StandardField.URL, "https://www.example.com/foo/?bar=baz&inga=42&quux"), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values are used only ones. Please inline them below.
Background: We aim to keep expected and actual togeher so that one can read them easily. Using this array approach, reading will get difficult. See one inline-proposal below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@koppor So I follow this approach expected/actual together, as well for the other expected entries (e.g., expectedWithTwoNoteValues
and expectedWithMultipleURLs
Arguments.of(expectedwithURL[0], new BibEntry().withField( | ||
StandardField.NOTE, "\\url{https://hdl.handle.net/10442/hedi/6089}")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguments.of(expectedwithURL[0], new BibEntry().withField( | |
StandardField.NOTE, "\\url{https://hdl.handle.net/10442/hedi/6089}")), | |
Arguments.of( | |
new BibEntry().withField(StandardField.URL, "https://hdl.handle.net/10442/hedi/6089"), | |
new BibEntry().withField(StandardField.NOTE, "\\url{https://hdl.handle.net/10442/hedi/6089}")), |
String newNoteFieldValue = NoteFieldValue | ||
.replace(url, "") | ||
.replace("\\url{}", "") | ||
.replace(",", "").trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work. in call cases. Maybe remove it - or fix it with maybe checking for regex "^, " - so that ,
appears at the beginning of the line AND is followed by a space.
Examples for note content:
\url{http://example.org}, cited by Kramer, 2002.
And some other strings one cannot forsee. Thus, simply removing all commas will produce wrong content.
I agree, this happens in edge cases. If you don't find a quick solution, keep as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@koppor, three proposed solutions here:
- Either
url
is preceded or followed by a comma
String newNoteFieldValue = NoteFieldValue
.replace(url, "")
.replaceAll("(, )?\\\\?url\\{\\}(, )?", "").trim();
- Not preferred but works
String newNoteFieldValue = NoteFieldValue
.replace(url, "")
.replace(", \\url{}", "")
.replace("\\url{}, ", "")
.replace("\\url{}", "").trim();
- Another solution, working with the current cases
// works with: \url{http://example.org}, cited by Kramer, 2002.
// doesn't work with: Cited by Kramer, 2002, \url{http://example.org}
String newNoteFieldValue = NoteFieldValue
.replace(url, "")
.replace("\\url{}", "")
.replaceFirst(", ", "").trim();
Overall, I could go with the first proposed solution as is more effective and covers plenty of cases
Also added a test case for: \url{http://example.org}, cited by Kramer, 2002.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fist one looks good, you can also add a comment before the RegEx to explain it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Failing test is not related
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nitpicks regarding casing of variable names (they should all be lower case in Java). I'll fix them for myself and then merge.
Thank you for working on this!
@Alexandra-Stath Thank you for the follow-up. I did not use any IDE and thus missed one thing. Merged now! |
@koppor @Siedlerchr |
* remove it from the note field, and no other action is performed. | ||
*/ | ||
if (entry.hasField(URL_FIELD)) { | ||
String urlFieldValue = entry.getField(URL_FIELD).orElse(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought, this could lead to issues, but I did not see it. There were no tests covering the orElse branch.
In future: In case Optionals are available, make use of them!
Fix and link to issue at #10435
* upstream/main: (68 commits) Fix issue 9863 - Change Tab selection order (#9907) New Crowdin updates (#9994) Enable editing typo with double clicking on field name in custom entry type (#9977) Remove "Field name:" from localization - we already have "Field name" (#9991) Changed database to catalog in org.jabref.gui.slr and org.jabref.logic.crawler (#9989) Use environment variables for hostname detection (#9910) Add cleanup activity for URL field (#9970) Fix freezing when fetching IBSN and no results are found (#9987) Make Group(Node)TreeViewModel more OO (#9978) Fix container for group item count still visible if display count is off (#9980) Fix paste of BibTeX data (#9985) Bring back SimplifyBoolean* and UnnecessaryParantheses (and refine guide) (#9981) Add new menu entry to remove braces from selection, aka unprotect it. (#9968) User specific comment (#9727) Bump org.openrewrite.recipe:rewrite-recipe-bom from 1.19.3 to 2.0.0 (#9975) Fix typos Remove non-needed link Fix typos Enable RemoveJavaDocAuthorTag (#9973) Bump com.fasterxml.jackson.datatype:jackson-datatype-jsr310 (#9974) ...
Fixes koppor#216
Add cleanup activity for URL field. Specifically when a url is found in the note field of a BibTex entry, we place it under the URL field.
Mandatory checks
CHANGELOG.md
described in a way that is understandable for the average user (if applicable)