Searching for ( in specific fields #1633

Closed
oscargus opened this issue Jul 27, 2016 · 12 comments
Labels: bug, search

Comments

@oscargus
Contributor

JabRef version: Latest developer

Search for e.g. booktitle=( without regex or booktitle=\( with regex. The parser does not seem to recognize the field, as it reports "any field that contains booktitle=(". Maybe a weird case, but I actually wanted to do that right now.

Searching for ( works well though.

@oscargus added the bug and search labels Jul 27, 2016
@simonharrer
Contributor

The problem is that this is not detected as a grammar based search, but as a contains-based or simple search. You can circumvent the problem by searching for booktitle="(" which should work as expected. We could improve the search grammar, but this just takes too much time right now. Hence, I close this.
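
To illustrate the distinction described above, here is a minimal sketch in Python. It is not JabRef's actual grammar: it assumes a hypothetical rule under which a query counts as a field search only when it has the form field=value, where the value is either a quoted string or a bare word without parentheses. Anything else falls back to a simple contains search. The pattern and the function name are invented for illustration.

```python
import re

# Hypothetical, simplified pattern: field name, '=', then either a quoted
# string or a bare word that contains no whitespace, quotes, or parentheses.
FIELD_QUERY = re.compile(
    r'^(?P<field>\w+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>[^\s"()]+))$'
)


def classify(query):
    """Return ('field', name, value) or ('simple', query)."""
    match = FIELD_QUERY.match(query)
    if match is None:
        # Not recognised as a field expression: treat the whole text literally.
        return ("simple", query)
    value = match.group("quoted")
    if value is None:
        value = match.group("bare")
    return ("field", match.group("field"), value)


print(classify('booktitle=('))    # falls back to a simple search
print(classify('booktitle="("'))  # parsed as a search in the booktitle field
```

Under these assumptions the unquoted parenthesis never reaches the field branch, which mirrors the reported behaviour, while the quoted form does.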

@oscargus
Contributor Author

oscargus commented Aug 2, 2016

OK! Didn't think of using "", but clearly that is a good (enough) approach!

Siedlerchr added a commit to Siedlerchr/jabref that referenced this issue Aug 5, 2016
* master:
  Fixed OO/LO manual connection dialog on Linux
  Removed thrown Exception declarations (JabRef#1673)
  Fix JabRef#1288 Newly opened bib-file is not focused (JabRef#1671)
  Refactor DB loading
  Fix OutOfBoundsException when importing multiple entries in medline format (JabRef#1611)
  Removed the possibility to auto show or hide the groups interface (JabRef#1668)
  Add test to describe workaround for JabRef#1633
  Fixed JabRef#1643: Searching with double quotes in a specific field ignores the last character
  fix build
  Fixes JabRef#1554: JabRefFrame is set as owner for ImportInspectionDialog
  Fixed most of the ErrorProne warnings
  Replaced output of getResolvedField to Optional<String> (JabRef#1650)
  PushToApplication cleanup and refactoring (JabRef#1659)
  Replaced Object with appropriate class where possible (JabRef#1660)
  Replaced some array return types (JabRef#1661)
  Fix XMP test
  Localization
  Moved the main part of XMPUtil to jabref.XMPUtilMain and injected a b… (JabRef#1642)
  Made possible to make the OO/LO panel a bit more narrow (JabRef#1652)
  French localization: Jabref_fr: empty strings + some cleaning
@ajbelle

ajbelle commented Aug 10, 2016

Although this is closed, I am posting here as a user report because I think my experience is related to the "uncertain" search nomenclature that is required. Also related is issue #1505 regarding the apparently heterogeneous regular-expression implementation. IMHO being able to search reliably IS ESSENTIAL and of high priority for large databases.
I was attempting a general search with a specific exclusion (word1 and not word2) and conclude that the current help files for Search, http://help.jabref.org/en/SearchHelp and http://jabref.sourceforge.net/help/SearchHelp.php, are at best misleading and incomplete. The poor search behaviour has been present within JabRef, and the recent changes have unfortunately not corrected these issues, it seems.

Hovering over the search entry shows the full logic in words. This is a great feature, but it doesn't seem to reflect what the help file suggests it should, nor what Search seems to do.

"word1 and not word2" selects all entries and highlights all those words, including 'and' and 'not'.
"word1 not word2" selects all entries and highlights all those words, including 'not'.

If I enter "word1 and not field2=word2", e.g. "flow and not memo=PhD", the search fails totally, with the hover-over showing "... flowand and not and memo=PhD". It appears not to recognise the and, instead ignoring the space and adding it to the word, then inserting extra ands!

I concluded the algorithm has been rewritten to look for Field2=word1...Field2=word2..., but how do you specify to search all fields for one word? From the help file, the following should work:
title|keywords = "image processing"
title--keywords = "image processing"
but neither works.
e.g. (author = miller or title|keywords = "image processing") and not author = brown, cut and pasted from the help file, does work (neither with a hyphen or a double hyphen).

Interestingly, Search automatically decides that content in a non-regex entry is regex, but it doesn't seem to work.
A very useful trick to solve my composite entry requiring a search on all fields would be .* =flow . The hover-over pop-up states "any entries in which the field matching .* contains the word flow"; however, it does not do this in practice.

entrytype=Phdthesis (where = means contains) works
entrytype!=PhD (where != means does not contain) works
entrytype==Phdthesis (where == means is exactly) works
"word1 word2 word 3" (where "" indicates a string) works

Bracketing doesn't work at all as I expect from the help files, with spaces before and after words changing the meaning displayed in the hover-over.

I expect the problems identified stem from the different logic and regexpertise levels of the programmers, exacerbated by the limited help available. If so, could this flag an upgrade of the help file with more extensive examples of multipart searches?

@koppor added the on-hold label Aug 10, 2016
@ajbelle

ajbelle commented Aug 28, 2016

An update to my previous post: this is part of my confusion, and I think it is a BUG.

When searching for an expression, it can be enclosed in quotes, as in "expression words"; however, the closing " removes the s, so it finds expression word! I doubt debugging would pick this up because it will still find the target expression words. It will also annoyingly pick up expression wordsmith and badexpression word. To avoid these you have to insert a space before and two after to get the expected result (i.e. " expression words  " with two trailing spaces!)

Once you figure out the cryptic entry requirement it isn't a problem, until you attempt to find something like LES. The issue is that LES also appears in words such as bubbles and coalescence. To avoid these you can use the spaces trick; however, this then does not pick up a field entry whose first word is LES. If all LES entries were capitals that would be the solution, but they are not always :-(

What I really wanted was a way to specify the keyword "LES", meaning just LES, not ILES or LES boundary conditions. So using spaces to isolate a keyword is an imperfect kludge.

I have still to find how to search allfields=xyz and keywords!="just-LES", as there is no allfields search option. The outcome is that capturing all entries while cutting out garbage hits is time-consuming on a large database.
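
The substring problem described above can be made concrete with a short Python sketch. It is an illustration only, not JabRef code: the sample strings are invented, the comparison is assumed to ignore case, and the keywords field is assumed to be a comma-separated list. A word-boundary regex (\b, which Java's regex engine also supports) filters out bubbles and coalescence but still accepts LES boundary conditions; only a keyword-by-keyword comparison isolates the bare keyword.

```python
import re

samples = ["LES", "ILES", "LES boundary conditions", "bubbles", "coalescence"]

# Plain substring search, ignoring case: every sample contains "les".
substring_hits = [s for s in samples if "les" in s.lower()]

# Whole-word search: \b requires a word boundary on both sides of LES.
word = re.compile(r"\bLES\b", re.IGNORECASE)
word_hits = [s for s in samples if word.search(s)]


def has_keyword(keywords_field, wanted, separator=","):
    """True if one keyword in the field equals `wanted`, ignoring case."""
    keywords = [k.strip().lower() for k in keywords_field.split(separator)]
    return wanted.lower() in keywords


print(substring_hits)
print(word_hits)
print(has_keyword("turbulence, LES, channel flow", "LES"))
print(has_keyword("ILES, LES boundary conditions", "LES"))
```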

@oscargus
Contributor Author

@ajbelle The issue of the last character disappearing was reported in #1643 and is claimed to be fixed in 2c9c7dd (with tests showing that it probably is). Have you tried version 3.6?

The allfields approach should be quite feasible to implement, although somehow it would be even better if not specifying a field leads to that behaviour (that I do not know how to do though).

Although it is not solving what you are actually asking (searching the keywords field on a keyword by keyword basis rather than the complete field), I just want to point out the possibility to write something like not entrytype==book which would match e.g. inbook and booklet as opposed to entrytype!=book.
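
The difference between the two forms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not JabRef's implementation: = is modelled as a case-insensitive substring test, == as a case-insensitive exact match, and the entry types are sample values.

```python
def contains(value, term):
    """Model of `field=term`: the field value contains the term."""
    return term.lower() in value.lower()


def matches(value, term):
    """Model of `field==term`: the field value equals the term exactly."""
    return value.lower() == term.lower()


entry_types = ["book", "inbook", "booklet", "article"]

# entrytype!=book  -> keep entries whose type does not contain "book"
not_containing = [t for t in entry_types if not contains(t, "book")]

# not entrytype==book  -> keep entries whose type is not exactly "book"
not_exactly = [t for t in entry_types if not matches(t, "book")]

print(not_containing)  # ['article']
print(not_exactly)     # ['inbook', 'booklet', 'article']
```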

From a code perspective adding special handling of the keywords field should be quite feasible, but I do not know how it will hurt performance. Would it make sense to introduce a pseudo-field keyword for searching on a keyword by keyword basis or is it just confusing as it is so close to keywords?

I may have time to look at this in the near future, although the person with the best knowledge of the search system is having a break to finish his dissertation.

We should probably have a look at the documentation as well, especially if these new things are introduced.

@oscargus
Contributor Author

@ajbelle It turned out that I could just as well add those lines right now. Please try it out at https://builds.jabref.org/bettersearch in a few minutes. Both allfields and keyword are added.

However, when trying it I realized that not keyword==LES would find entries that have at least one keyword that isn't LES, rather than entries not containing any keyword that is LES. I wonder if I should add yet another pseudo-field, anykeyword? What do you think?

@oscargus
Contributor Author

Sorry, I was wrong about the discussion above regarding anykeyword.

@oscargus
Contributor Author

@ajbelle In the very latest version I renamed these fields to anyfield and anykeyword, as the query then reads a bit more fluently:

anyfield contains fruit and anykeyword matches banana
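
A rough Python model of how such a query could be evaluated against one entry. This is a sketch, not the code from the bettersearch build. It assumes entries are plain dictionaries, that contains is a case-insensitive substring test, that matches is an exact comparison, and that keywords are separated by commas. The function names are hypothetical.

```python
def anyfield_contains(entry, term):
    """True if any field value contains the term (case-insensitive)."""
    term = term.lower()
    return any(term in str(value).lower() for value in entry.values())


def anykeyword_matches(entry, term, separator=","):
    """True if at least one individual keyword equals the term exactly."""
    keywords = entry.get("keywords", "").split(separator)
    return any(k.strip().lower() == term.lower() for k in keywords)


entry = {
    "title": "Ripening of tropical fruit",
    "keywords": "banana, storage, banana bread",
}

# anyfield contains fruit and anykeyword matches banana
hit = anyfield_contains(entry, "fruit") and anykeyword_matches(entry, "banana")
print(hit)

# "bread" is only part of a keyword, so an exact keyword match fails.
print(anykeyword_matches(entry, "bread"))
```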

@ajbelle

ajbelle commented Sep 5, 2016

Thank you @oscargus, your solution is brilliant. It all seems to be working as you say, and I can now use anyfield contains ‘searchWord’ and abstract!= ‘searchWord’ and review!= ‘searchWord’, which picks up all my own fields without all the junk hits!
You were correct, the #1643 issue is fixed in Ver3.6 :-) Sorry I missed that :-(
Thanks also for the heads-up on the identically-equals ==, as in not entrytype==book.
I note that entrytype!==book does not work, which seems inconsistent (a UI logic issue).
I also note that when you simply type a word in the Search field it highlights each occurrence of the word in the BibTeX source panel, but when you generate a freeform search expression it does not. It is not a great problem, but if it were possible to patch in the highlighting code, it would be helpful and provide consistent behaviour.
It would be great to have an example of each syntax (like you provided) that works with the current version in the help file. Debugging is far easier when you know what should work ;-)

@ajbelle

ajbelle commented Sep 13, 2016

I have a regular issue with https://builds.jabref.org/bettersearch crashing my Win7 box. The most consistent cause is editing something and then selecting the undo icon. There seem to be other situations, but I am never sure I haven't pressed a key by mistake.

This did not occur for me with ver3.6

@oscargus
Contributor Author

Sorry to hear about your problem @ajbelle . My guess is that it is not directly related to the actual search PR, but to something else we have changed, but I will double check the code I added.

Indeed, the help file should be improved, especially with good examples!

@ajbelle

ajbelle commented Sep 19, 2016

Yes @oscargus, I have had no time to check which code change it may be, but the undo capability for edit seems totally broken. Sorry for posting on your issue; it was simply the bettersearch build I downloaded. I am very impressed with your update (and from the behaviour do not think it is the cause).


4 participants