[Brainstorming] Groups / keywords architecture #12133

ThiloteE · 2024-10-30T19:07:25Z

There has been a lot of discussion and confusion about folders, groups, tags, keywords, and labels and what differentiates them. See, for example, #11026 (comment) and #8739 (comment).

What we have right now in JabRef 5.15:

- A Frankenstein groups feature that sits somewhere between groups and keywords (more keyword than group, to be honest). We call it groups, and the information is stored in the "groups" field, but also partially as binary information in JabRef-internal syntax at the bottom of the library file. Entries can be added to and removed from groups via the entry editor and via the groups side pane on the left. See https://docs.jabref.org/finding-sorting-and-cleaning-entries/groups.
- The "keywords" field, whose definition follows the BibTeX standard. These keywords are accessible via the entry editor.

What primary characteristics differentiate group/keyword systems?

Can groups/keywords be nested? (Is there a hierarchy?)
- Currently, yes, but groups and sub-groups cannot have the same name.
Can entries be part of multiple groups?
- Currently, yes, that's why it might be more accurate to call it a keyword system.
Are groups/keywords associated with the entry itself?
- Currently, yes; that's why it might be more accurate to call it a keyword system.
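
To make the three answers above concrete, here is a minimal Python sketch of a keyword-like grouping model. It is only an illustration, not JabRef's implementation: the `Entry` class, the `parse_groups` and `ancestors` helpers, the `PARENT` mapping, and the comma separator for the "groups" field are all assumptions made for this example.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: group names in the "groups" field are comma-separated.
SEPARATOR = ","


@dataclass
class Entry:
    citation_key: str
    fields: dict[str, str] = field(default_factory=dict)


def parse_groups(entry: Entry) -> list[str]:
    """Return the group names stored on the entry itself."""
    raw = entry.fields.get("groups", "")
    return [name.strip() for name in raw.split(SEPARATOR) if name.strip()]


# Hypothetical hierarchy: child group name -> parent group name.
# In this sketch the entry stores only a name, not a path, so a name
# shared by a group and a sub-group would be ambiguous.
PARENT = {"Dark": "Chocolate", "Milk": "Chocolate"}


def ancestors(group: str) -> list[str]:
    """Walk up the hierarchy and collect all parent groups."""
    path = []
    while group in PARENT:
        group = PARENT[group]
        path.append(group)
    return path


entry = Entry("Smith2020", {"groups": "Dark, ToRead", "keywords": "cocoa, health"})
print(parse_groups(entry))  # ['Dark', 'ToRead']  (one entry, several groups)
print(ancestors("Dark"))    # ['Chocolate']       (nesting)
```

Because membership lives in a field of the entry, the model behaves like keywords: the same entry can carry any number of group names, and the hierarchy is extra information kept outside the entry.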

What secondary characteristics exist? Those are qualities that depend on how the primary qualities are implemented.

If you change the entry's keyword, will it change the name of the group(s)?
If you change the entry's keyword, will it create a new group, if it doesn't already exist?
Can entries be automatically assigned to groups/keywords?
Where is the data stored? In a (.bib) library file or a database?
What happens if entries / groups / keywords are shared remotely? Will the remote (server) copy take precedence, or the local file? Which preferences have to be shared?

Database structure of JabRef

As far as I am aware, JabRef 5.15 stores everything in the library file.

Database structure of external Apps

Thunderbird's experiments with MySQL. See https://bugzilla.mozilla.org/show_bug.cgi?id=1921394. It's interesting. I basically opened this issue to post that link, lol. Didn't know where to put it.

koppor · 2024-10-30T19:56:49Z

This refs #11026 (comment).

ThiloteE · 2024-10-30T19:59:27Z

Koppor, you think like me. Funny. I referenced the same comment in my second sentence of this issue here.

koppor · 2024-10-30T20:01:58Z

For a structured approach, one needs to write down what existing tools are doing. For instance: BibDesk.

They are good at distinguishing automatic and non-automatic groups.

I think users need both: automatic (e.g., based on citations, keywords, ...) and non-automatic (manual categorization).

This also refs https://github.com/JabRef/jabref/blob/main/docs/decisions/0019-implement-special-fields-as-separate-fields.md. Thus, the dimension is not only automatic versus manual, but also how to render groups in the entry table.

koppor · 2024-10-30T20:03:28Z

To really come up with a solution, one needs a minimal example showing the different options. One can start with Chocolate.bib. In other words: requirements analysis 😅

ThiloteE · 2024-10-30T20:13:00Z

This issue here is not yet about a solution. Just brainstorming for now. I wanted you to look at the Thunderbird link I posted.

koppor · 2024-10-30T20:37:54Z

We need "Draft issues" 🤣🤣

I personally use OneNote for such things, but in the web browser, it's awful 🙈.

ryan-carpenter · 2024-11-05T06:04:17Z

> For instance: BibDesk ... are good at distinguishing automatic and non-automatic groups

Other reference managers also separate or distinguish between these, and I agree that JabRef could benefit from an easier way to do this. I often use colours or icons to indicate when a group is search-based.

> I think users need both: automatic (e.g., based on citations, keywords, ...) and non-automatic (manual categorization)

Absolutely essential.

ryan-carpenter · 2024-11-05T08:20:00Z

> If you change the entry's keyword, will it create a new group, if it doesn't already exist?

Changing an entry does not automatically create "graphical" groups, so for some time my workflow included adding (text) groups to entries, and then creating graphical search-groups to appear in the panel. One day, I finally discovered that explicit groups also located the entries of interest automatically and that renaming the graphical group had the same effect on the text entry. This makes sense to me, though I did make sure to test carefully to avoid unexpected "corruption" of my grouping.

Automatic creation of graphical groups from the text groups seems like a more predictable/discoverable approach. If users don't want the panel to show every group contained in the entries, then the settings for each group could include a "hide" option. Creating/showing the groups by default also has the advantage of revealing errors, such as typos and accidental variations, in the text groups.
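
As a rough illustration of that idea, the sketch below derives the list of groups to display from the text groups found in the entries, supports a per-group "hide" option, and flags names that look like typos of one another. Everything in it is hypothetical and not JabRef code: the entries are plain dictionaries, the comma separator is an assumption, and `difflib` from the Python standard library stands in for whatever similarity check one would actually want.

```python
from collections import Counter
from difflib import get_close_matches

entries = [
    {"key": "a", "groups": "Chocolate, ToRead"},
    {"key": "b", "groups": "Chocolate"},
    {"key": "c", "groups": "Chocolat"},  # typo on purpose
]


def text_groups(entry):
    """Split the entry's text 'groups' field into individual names."""
    return [g.strip() for g in entry.get("groups", "").split(",") if g.strip()]


def derive_groups(entries, hidden=frozenset()):
    """Map every group name found in the entries to its entry count,
    leaving out names the user chose to hide."""
    counts = Counter(g for e in entries for g in text_groups(e))
    return {name: n for name, n in counts.items() if name not in hidden}


def suspicious_names(groups, cutoff=0.8):
    """Pair each group name with similar-looking names (possible typos)."""
    names = sorted(groups)
    result = {}
    for name in names:
        others = [n for n in names if n != name]
        close = get_close_matches(name, others, n=3, cutoff=cutoff)
        if close:
            result[name] = close
    return result


shown = derive_groups(entries, hidden={"ToRead"})
print(shown)  # {'Chocolate': 2, 'Chocolat': 1}
print(suspicious_names(derive_groups(entries)))
```

Showing every derived group by default is what makes the stray "Chocolat" visible next to "Chocolate"; the `hidden` set keeps the panel manageable for users who do not want to see everything.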

If people think having keywords and groups is too complex, consider, for example, that PubMed records usually include at least two types of keywords (Other terms and MeSH terms) that have already lost resolution by the time they land in JabRef. Having a means of organising entries that does not contaminate or get contaminated by keywords is very important. Import batches are another kind of grouping that is conceptually separate from keywords.

I am not sure about the architectural implications of "folders", "groups", "keywords", and "tags". However, it is clear from the discussions about nested groups (already linked above) that users need at least one layer of personal organisation. Consider too that commercial reference managers allow "piggy back" and "daisy-chain" groups (created from combinations or series of existing groups). Perhaps a "pivot table" is a better metaphor for user need than folders, groups, keywords, or tags (storing, clustering, indexing, and classifying). All of these could have the same underlying architecture and still be useful as separate inputs to my "data model". The important part is having a dynamic view of the entries in the collection.
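
One way to picture the "same underlying architecture, separate inputs" idea is to treat every group, keyword, or tag as a predicate over entries and to build combined groups from existing ones. The following Python sketch is only my illustration of that reading: the helper names (`has_value`, `all_of`, `any_of`, `view`) and the comma-separated fields are assumptions, not anything JabRef provides.

```python
from typing import Callable

Entry = dict[str, str]
Group = Callable[[Entry], bool]


def has_value(field_name: str, value: str) -> Group:
    """Group matching entries whose comma-separated field contains value."""
    def matches(entry: Entry) -> bool:
        items = [v.strip() for v in entry.get(field_name, "").split(",")]
        return value in items
    return matches


def all_of(*groups: Group) -> Group:
    """Combined group: entry must be in every given group."""
    return lambda entry: all(g(entry) for g in groups)


def any_of(*groups: Group) -> Group:
    """Combined group: entry must be in at least one given group."""
    return lambda entry: any(g(entry) for g in groups)


def view(entries: list[Entry], group: Group) -> list[str]:
    """Dynamic view: recomputed from the entries each time it is called."""
    return [e["key"] for e in entries if group(e)]


library = [
    {"key": "a", "groups": "ToRead", "keywords": "cocoa, health"},
    {"key": "b", "groups": "Thesis", "keywords": "cocoa"},
    {"key": "c", "groups": "ToRead, Thesis", "keywords": "sugar"},
]

to_read = has_value("groups", "ToRead")
thesis = has_value("groups", "Thesis")
cocoa = has_value("keywords", "cocoa")

print(view(library, all_of(to_read, cocoa)))                  # ['a']
print(view(library, any_of(thesis, all_of(to_read, cocoa))))  # ['a', 'b', 'c']
```

Here the "groups" and "keywords" fields stay separate inputs, yet both feed the same kind of predicate, and a combined group can itself be reused as a building block for further combinations.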

ThiloteE added groups keywords needs-refinement type: documentation labels Oct 30, 2024

koppor changed the title ~~Groups / keywords architecture~~ [Brainstorming] Groups / keywords architecture Oct 30, 2024

ThiloteE mentioned this issue Nov 15, 2024

The new tag system ignores hierarchical keywords. Only the parent keyword is shown. #11390

Open
