Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OutOfBoundsException when importing multiple entries in medline format #1611

Merged

Conversation

tschechlovdev
Copy link
Contributor

@tschechlovdev tschechlovdev commented Jul 21, 2016

@zellerdev found out a bug when importing more than one entry in the medline format into JabRef.
JabRef version
JabRef 3.5
windows 8.1 6.3 amd64
Java 1.8.0_25

Steps to reproduce:

  1. Select more than entry from here .
  2. Select "Send to File" and then choose "Medline" Format.
  3. Open Jabref and select "Import into new database".
  4. I get an IndexOutOfBoundException shown on the console and no entries are imported.

Alternatively use the testfiles I added in this PR.

Quick fix: Add an if clause around the substring method, that causes the Exception.

  • Tests created for changes

@boceckts
Copy link
Contributor

I can confirm the bug but next time please create an issue for this and assign it to yourself if you want to fix it. Also please use the "copy version to clipboard" functionallity in the About dialog in the help menu to easily paste your JabRef-, OS- and Java-version.

@@ -127,91 +126,93 @@ public ParserResult importDatabase(BufferedReader reader) throws IOException {
}
}
String entry = current.toString();
if (entry.contains("-")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the while-loop above this is already trying to check the current line for validity and tries to skip all the invalid ones. I would propose to extract a new method called private boolean checkLineValidity(String line) and do all the checks in there such as: empty line, line doesn't contain "-", "-" symbol not at 4th position, etc...
then the while loop above can also be replaced and you don't need to modify the counter variable "j". You can then simply skip any invalid lines with:

if (!checkLineValidity(lines[j])) {
    continue;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also mean you don't need to modify as much code as you did now because then you don't have the if-clause around all the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I wouldn't extract the while loop above, because in the while loop are also correcctions, like adding space if '-' is at third position. And putting the while loop into a method that returns a boolean could have side effects, since you don't expect, that this method is changing something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also extract to multiple methods, private boolean first checkLineValidity(String line), if false try to correct it by calling another method that tries to insert spaces or similar things.

}

return new ParserResult(bibitems);

}

private boolean checkLineValidity(String line) {
if (line.length() >= 4) {
return line.contains("-") || !(line.charAt(4) == '-') ? true : false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!(line.charAt(4) == '-') this will throw an OutOfBoundsException if the length of the line is exactly 4. Please cover this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please check the if statement again, I can't find a case where this will return false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? true : false is unnecessary.

@boceckts boceckts added status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers and removed stupro-ready-for-internal-review labels Jul 26, 2016
} else {
return false;
}
return (line.length() >= 5) ? line.contains("-") && (line.charAt(4) == '-') : false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think line.contains("-") is unnecessary when you also check for it's position.

@tschechlovdev tschechlovdev changed the title [WIP] Fix OutOfBoundsException when importing multiple entries in medline format Fix OutOfBoundsException when importing multiple entries in medline format Jul 26, 2016
@tschechlovdev
Copy link
Contributor Author

No remarks?

public void testWithMultipleEntries() throws IOException, URISyntaxException {
Path file = Paths
.get(MedlinePlainImporter.class.getResource("MedlinePlainImporterStringOutOfBounds.txt").toURI());
List<BibEntry> entries = importer.importDatabase(file, Charset.defaultCharset()).getDatabase().getEntries();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the bib file is in UTF8, then please set the Charset explicit to UTF8, too. DefaultCharsets.UTF8 (or similar), otherwise test will fail on windows in gradle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@Siedlerchr
Copy link
Member

In general good. 👍 But I would recommend adding some test with invalid formats, e.g. if an exceptions is thrown (expected) or whatever.

@tschechlovdev
Copy link
Contributor Author

Hm an Excpetion should be not thrown, except if the reader would be null. If the format is invalid then the entries list will be empty, but no exception will be thrown. I'll added tests for both.

@tschechlovdev
Copy link
Contributor Author

Any other remarks?

@stefan-kolb
Copy link
Member

LGTM, just one remark: can you keep the test files as minimal as needed inside the tests? No need for 300 attributes when you can recreate the test issue with one or two.

@tschechlovdev
Copy link
Contributor Author

Ok, I've minimized the testfile for the issue.

@stefan-kolb
Copy link
Member

Maybe fix the two Codacy warnings (https://www.codacy.com/app/simonharrer/jabref/pullRequest?prid=289709) and add a key to the @misc entries. Otherwise im 👍 We need a second review from a @JabRef/developers tho.

Path file = Paths
.get(MedlinePlainImporter.class.getResource("MedlinePlainImporterTestInvalidFormat.xml").toURI());
List<BibEntry> entries = importer.importDatabase(file, Charsets.UTF_8).getDatabase().getEntries();
assertEquals(Collections.EMPTY_LIST, entries);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the method Collections,emptyList() rather than the constant, because the constant does not return a generic list and is therefore not typesafe.
(In this case not so important, but could possinly produce an unchecked warning)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@Siedlerchr
Copy link
Member

Just the collections,emptyList and the codacy and it's fine 👍 ( I already reviewd the main parts before)

@stefan-kolb stefan-kolb merged commit f6ccf27 into JabRef:master Aug 3, 2016
@tschechlovdev tschechlovdev deleted the MedlinePlainStringOutOfBounds branch August 3, 2016 11:44
Siedlerchr added a commit to Siedlerchr/jabref that referenced this pull request Aug 5, 2016
* master:
  Fixed OO/LO manual connection dialog on Linux
  Removed thrown Exception declarations (JabRef#1673)
  Fix JabRef#1288 Newly opened bib-file is not focused (JabRef#1671)
  Refactor DB loading
  Fix OutOfBoundsException when importing multiple entries in medline format (JabRef#1611)
  Removed the possibility to auto show or hide the groups interface (JabRef#1668)
  Add test to describe workaround for JabRef#1633
  Fixed JabRef#1643: Searching with double quotes in a specific field ignores the last character
  fix build
  Fixes JabRef#1554: JabRefFrame is set as owner for ImportInspectionDialog
  Fixed most of the ErrorProne warnings
  Replaced output of getResolvedField to Optional<String> (JabRef#1650)
  PushToApplication cleanup and refactoring (JabRef#1659)
  Replaced Object with appropriate class where possible (JabRef#1660)
  Replaced some array return types (JabRef#1661)
  Fix XMP test
  Localization
  Moved the main part of XMPUtil to jabref.XMPUtilMain and injected a b… (JabRef#1642)
  Made possible to make the OO/LO panel a bit more narrow (JabRef#1652)
  French localization: Jabref_fr: empty strings + some cleaning
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants