Store Citation Relations in an LRU cache #10958

Closed
HoussemNasri opened this issue Mar 2, 2024 · 7 comments
Assignees
Labels
good first issue, type: enhancement

Comments

@HoussemNasri
Member

HoussemNasri commented Mar 2, 2024

Problem

The current implementation caches the related entries (references and citations) as soon as they are fetched when the user visits the Citation Relations tab. The fetched entries stay cached in RAM until JabRef is restarted. This approach bloats memory unnecessarily and could cause an out-of-memory error in special circumstances.

Suggested solution

Instead of caching all entries, we could cache the citation relations of only the top N entries that are most likely to be visited again. A Least Recently Used (LRU) cache operates on the principle that the data most recently accessed is likely to be accessed again in the near future, so when the cache is full it evicts the entry that was accessed least recently.

Implementation details

  • You don't have to write an LRU cache from scratch. JabRef depends on a LOT of libraries, and some of them already provide an LRU cache, so try to use one of those if possible.
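
For illustration only, here is a minimal sketch of the LRU behaviour described above. It uses `java.util.LinkedHashMap` from the standard library rather than any of JabRef's dependencies, since the issue does not say which library should be used. The names `LruCache`, `LruCacheDemo`, and the string keys and values are hypothetical and not taken from the JabRef code base.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch: a bounded map that drops the least recently used
 * entry once it holds more than maxEntries entries.
 */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently accessed
        super(16, 0.75f, true);
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry
        return size() > maxEntries;
    }
}

class LruCacheDemo {
    public static void main(String[] args) {
        // Assumed shape: entry identifier -> identifiers of its related entries
        LruCache<String, List<String>> relations = new LruCache<>(2);
        relations.put("entryA", List.of("citation1"));
        relations.put("entryB", List.of("citation2"));
        relations.get("entryA");                        // entryA becomes most recently used
        relations.put("entryC", List.of("citation3"));  // cache is full, so entryB is evicted
        System.out.println(relations.keySet());         // prints [entryA, entryC]
    }
}
```

Note that `LinkedHashMap` is not thread-safe, so a cache like this would need external synchronization (for example `Collections.synchronizedMap`) if it were read and written from several threads. A library-provided cache may handle that already.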
@HoussemNasri added the type: enhancement and good first issue labels Mar 2, 2024
@github-project-automation bot moved this to Free to take in Good First Issues Mar 2, 2024
@cardionaut
Contributor

I'd love to give this a shot!

@HoussemNasri
Member Author

@cardionaut Sure, go ahead. You're assigned.

@HoussemNasri moved this from Free to take to Reserved in Good First Issues Mar 3, 2024
@cardionaut
Contributor

@HoussemNasri Is there a large example .bib I could download from anywhere for testing purposes?
I assume the RAM effects are only visible for larger collections.

@Siedlerchr
Member

@cardionaut We have a bib file generator script under the scripts directory.

@cardionaut
Contributor

@Siedlerchr Thanks!
These dummy entries won't have any References or Citations though, right?

@HoussemNasri
Member Author

HoussemNasri commented Mar 3, 2024

> Is there a large example .bib I could download from anywhere for testing purposes?
> I assume the RAM effects are only visible for larger collections.

If you're looking for examples of BibTeX libraries, you can have a look at testbib. The problem could be observable when using a large library on a low-end computer over a long period of time, so it's more of a small optimization that would only benefit a small number of users. Overall, I don't think you have to reproduce the issue to fix it. You can launch the debugger and observe the number of allocated citations/references in the cache over time, but it's up to you.

@koppor
Member

koppor commented Mar 4, 2024

@cardionaut You are right, the generated libraries do not help here. https://github.com/JabRef/jabref/blob/main/src/test/resources/testbib/Chocolate.bib also does not help. What I did: I queried dblp for "Kopp" and imported the 100 entries. Then I queried "Breitenbücher", added 100 other entries, and then went through them. OK, I got HTTP 429, because of some API rate limits IMHO.

4 participants