Automatically reload cache when metadata changes #337

Closed
agx opened this issue Jul 16, 2021 · 8 comments

agx commented Jul 16, 2021

I'm using AsPool like

  self->as_pool = as_pool_new ();
  as_pool_set_flags (self->as_pool, AS_POOL_FLAG_READ_METAINFO);

  as_pool_load_async (self->as_pool, self->cancel,
                      on_as_pool_load_ready,
                      data);

so I only bother with the metainfo of installed applications (/usr/share/metainfo, etc.). However, I need to refresh the pool when new metainfo gets added there, and for that as_pool_add_metadata_locations () would be nice so I can set up appropriate file monitors.

The alternative would be to explicitly add all metainfo locations, but that looks odd given AsPool's built-in knowledge. But maybe I'm using it incorrectly?

Or would adding file watches via g_file_monitor_directory in AsPool itself be an option so caches stay up to date automatically for these?

Owner

ximion commented Jul 16, 2021

There is as_pool_add_metadata_location, but I don't think that method will work for metainfo files (its purpose is for metadata collections, like the ones Flatpak and asgen produce). That doesn't mean the scope of that function couldn't be expanded though...

Or would adding file watches via g_file_monitor_directory in AsPool itself be an option so caches stay up to date automatically for these?

I would definitely prefer that above all other options. I even wanted to write this feature, but so far it hasn't had a high enough priority. Annoyingly, due to the way caching currently works in AsPool, this means we would need to reload everything once a change is detected, but that will probably be okay for most cases (and in the future, things can be made smarter when the new caching backend gets implemented).
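
The "reload everything once a change is detected" approach can be sketched without any GLib dependency. In AsPool the change notification would presumably come from g_file_monitor_directory; the sketch below swaps in plain POSIX polling as a stand-in, and metainfo_dir_fingerprint and metainfo_dir_changed are hypothetical names for illustration, not libappstream API.

```c
#include <assert.h>
#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* FNV-1a hash over a byte buffer, continuing from the state h. */
static uint64_t
fnv1a (uint64_t h, const void *data, size_t len)
{
  const unsigned char *p = data;
  for (size_t i = 0; i < len; i++) {
    h ^= p[i];
    h *= 1099511628211ULL;
  }
  return h;
}

/* Hypothetical helper: order-independent fingerprint of a directory,
 * built from each entry's name, size and mtime.
 * Returns 0 if the directory cannot be opened. */
uint64_t
metainfo_dir_fingerprint (const char *dir_path)
{
  assert (dir_path != NULL);
  DIR *dir = opendir (dir_path);
  if (dir == NULL)
    return 0;

  uint64_t sum = 1; /* an empty directory yields 1, not the error value 0 */
  struct dirent *ent;
  while ((ent = readdir (dir)) != NULL) {
    if (strcmp (ent->d_name, ".") == 0 || strcmp (ent->d_name, "..") == 0)
      continue;
    char path[4096];
    struct stat st;
    int n = snprintf (path, sizeof path, "%s/%s", dir_path, ent->d_name);
    if (n < 0 || (size_t) n >= sizeof path || stat (path, &st) != 0)
      continue;
    int64_t size = (int64_t) st.st_size;
    int64_t mtime = (int64_t) st.st_mtime;
    uint64_t h = fnv1a (14695981039346656037ULL, ent->d_name, strlen (ent->d_name));
    h = fnv1a (h, &size, sizeof size);
    h = fnv1a (h, &mtime, sizeof mtime);
    sum += h; /* addition keeps the result independent of readdir order */
  }
  closedir (dir);
  return sum;
}

/* Returns 1 and updates *last_seen when the directory content changed
 * since the previous call, 0 otherwise. */
int
metainfo_dir_changed (const char *dir_path, uint64_t *last_seen)
{
  uint64_t now = metainfo_dir_fingerprint (dir_path);
  if (now == *last_seen)
    return 0;
  *last_seen = now;
  return 1;
}
```

A client would call metainfo_dir_changed () periodically for /usr/share/metainfo and start a full pool reload (for example via as_pool_load_async as in the snippet above) whenever it returns 1. An event-based monitor avoids the polling cost, which is why doing this inside AsPool is the preferred option.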

Author

agx commented Jul 17, 2021

Thanks @ximion. That looks preferable to me as well. I'm not sure I'll get to this right away, but I will put it on my list.

Owner

ximion commented Aug 28, 2021

This will likely be much easier to add once libappstream has been ported to use xmlb as caching backend, and some cache refactoring has been done. AsPool comes from a time when AppStream wasn't so ubiquitous and accepted, so it contains a lot of now unneeded hacks and workarounds and doesn't handle cases with many metadata sources that well, while AsCache & Co. come from a time when Flatpak didn't exist and applications were mainly shipped by distribution packages. So a lot of assumptions there don't hold true anymore, and this needs refactoring (and the resulting code will likely be much simpler).
There are some pretty major dragons hiding in that code at the moment though, so I don't know yet when I can get to it.

ximion changed the title from "Need a way to query metainfo locations" to "Automatically reload cache when metadata changes" Sep 7, 2021
ximion added this to the AppStream 1.0 milestone Sep 9, 2021
Author

agx commented Nov 15, 2021

@ximion gentle ping. The flatpak side for being able to filter apps on form factor in shells is now merged (flatpak/flatpak#4350 (comment)) and if we had this I could finish up the phosh side so we don't need to rely on X-Purism-* in desktop files but use the app metadata.

Owner

ximion commented Nov 15, 2021

If this is a blocker, I can actually make it a priority instead of the archive work that I'm currently doing.
The thing is that this is not a small change, but actually a huge piece of work - but has to be done eventually, so I might as well start now...

ximion added a commit that referenced this issue Nov 20, 2021
The new cache is now backed by xmlb instead of an LMDB database, which
allows us to perform a lot more complex queries with low effort.
The cache is also now shared between all applications by default (by
popular request), for every grouping of metadata.
In addition to that, what AsPool understands as "cache" is now a
collection of partitions, called sections, which represent AppStream
metadata from one domain, e.g. one Flatpak repository, the OS'
collection metadata, the combined metainfo/desktop-entry data of the
system, etc. This permits updating those sections independently, which
means that if a MetaInfo file changes, we will not have to rebuild the
whole cache, but only a small section of it.

This is a prerequisite for efficient monitoring of metadata directories,
and a lot of other neat optimizations. Since AppStream is used in
desktop shells nowadays, support for this is a needed addition to not
keep the system busy with needless work and cause lag.

In addition to that, a lot of cruft and complex code has also been
cleaned up. The current code runs about 60% slower than the previous
cache on cache rebuilds, query time is about 10% slower. There is a lot
of room for improvements though, and we will likely get to the previous
times before release.

Caution! The new code is not yet fully threadsafe and has various rough
edges, but it compiles and passes the testsuite. Further improvements
are located in smaller, easier to manage follow-up patches.

CC: #337
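
The idea of a cache made of independently refreshable sections can be illustrated with a small, dependency-free sketch. This is not the libappstream implementation: SectionedCache, CacheSection, the cache_* functions and the section keys are all hypothetical, and the "rebuild" step is only a generation counter standing in for re-parsing metadata.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SECTIONS 8

/* One partition of the cache: metadata from a single domain. */
typedef struct {
  const char *key;        /* made-up key, e.g. "os-metainfo" */
  unsigned    generation; /* bumped each time this section is rebuilt */
  int         dirty;      /* set when the section's source data changed */
} CacheSection;

typedef struct {
  CacheSection sections[MAX_SECTIONS];
  size_t       n_sections;
} SectionedCache;

/* Register a section; returns its index, or -1 if the cache is full. */
int
cache_add_section (SectionedCache *cache, const char *key)
{
  assert (cache != NULL && key != NULL);
  if (cache->n_sections >= MAX_SECTIONS)
    return -1;
  CacheSection *s = &cache->sections[cache->n_sections];
  s->key = key;
  s->generation = 1;
  s->dirty = 0;
  return (int) cache->n_sections++;
}

/* Mark the section with the given key as changed; returns 1 if found. */
int
cache_mark_changed (SectionedCache *cache, const char *key)
{
  for (size_t i = 0; i < cache->n_sections; i++) {
    if (strcmp (cache->sections[i].key, key) == 0) {
      cache->sections[i].dirty = 1;
      return 1;
    }
  }
  return 0;
}

/* Rebuild only the dirty sections; returns how many were rebuilt. */
size_t
cache_refresh (SectionedCache *cache)
{
  size_t rebuilt = 0;
  for (size_t i = 0; i < cache->n_sections; i++) {
    CacheSection *s = &cache->sections[i];
    if (!s->dirty)
      continue;
    s->generation++; /* stand-in for re-parsing this section's metadata */
    s->dirty = 0;
    rebuilt++;
  }
  return rebuilt;
}
```

With this layout, a changed MetaInfo file only marks its own section dirty, so a refresh leaves the sections for other domains (such as a Flatpak repository) untouched.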
ximion added a commit that referenced this issue Nov 21, 2021
This will be used by AsPool to automatically reload its cache whenever
metadata changes.
CC: #337
@ximion ximion closed this as completed in 4d75e0e Nov 21, 2021
Owner

ximion commented Nov 21, 2021

This was an insane week-long effort... But also necessary maintenance work for the future of the project. After the essential cache refactoring was done, implementing cache auto-reload was rather easy. Currently, queries are a lot slower than before (at least twice as slow, depending on the results returned), but speeding that up will require some API additions in libxmlb (planned for later). Aside from that, everything else works better and is more robust.
On Debian at least, the cache reloads can also be done automatically in the background via a dpkg trigger on /usr/share/metainfo. For that to work efficiently we will also need to monitor the cache file itself, though, which is not yet implemented.

The tests should cover all the common cases, but please test the code to see if everything (still) works fine! A lot of API has also been deprecated, so clients will need to adjust to that in the long run.

Author

agx commented Nov 26, 2021

I checked that adding/removing items triggers the signal as expected, thanks!

I'll open separate bugs should I hit any issues.

Owner

ximion commented Nov 26, 2021

@agx I will likely also add a helper function (probably not in the next release but the one after that) that will check the dependency relations (requires/recommends/supports) for the client tool, so that not every client has to reimplement the "has enough memory" check etc. Anything that can't be checked in libappstream would be passed to the client to verify in a callback (because I can't depend on a GUI toolkit to check screen resolutions, for example).
This, combined with a filter to only check specific subsets of relations, may turn out to be useful for your use case as well.
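
The split described above (library-side checks where possible, a client callback for everything else) might look roughly like the following. This is only a sketch of the pattern under assumed names: Relation, SystemInfo, check_relation and example_display_check are hypothetical and do not correspond to any existing or planned libappstream API.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { REL_MEMORY_MIB, REL_DISPLAY_LENGTH_PX } RelationKind;
typedef enum { CHECK_SATISFIED, CHECK_NOT_SATISFIED, CHECK_UNKNOWN } CheckResult;

/* A simplified "at least min_value" requirement. */
typedef struct {
  RelationKind  kind;
  unsigned long min_value;
} Relation;

/* What the library can find out by itself, without a GUI toolkit. */
typedef struct {
  unsigned long memory_mib;
} SystemInfo;

/* Callback the client supplies for checks the library cannot perform. */
typedef CheckResult (*ClientCheckFunc) (const Relation *rel, void *user_data);

/* Check one relation: memory is handled here, anything else is
 * delegated to the client, or reported as unknown without a callback. */
CheckResult
check_relation (const Relation *rel, const SystemInfo *sys,
                ClientCheckFunc client_check, void *user_data)
{
  assert (rel != NULL && sys != NULL);
  switch (rel->kind) {
  case REL_MEMORY_MIB:
    return sys->memory_mib >= rel->min_value ? CHECK_SATISFIED
                                             : CHECK_NOT_SATISFIED;
  default:
    if (client_check == NULL)
      return CHECK_UNKNOWN;
    return client_check (rel, user_data);
  }
}

/* Example client callback: user_data points at the longest screen
 * edge in pixels, as a shell would know it. */
CheckResult
example_display_check (const Relation *rel, void *user_data)
{
  const unsigned long *screen_px = user_data;
  if (rel->kind != REL_DISPLAY_LENGTH_PX || screen_px == NULL)
    return CHECK_UNKNOWN;
  return *screen_px >= rel->min_value ? CHECK_SATISFIED
                                      : CHECK_NOT_SATISFIED;
}
```

A shell such as phosh would pass its own display information through user_data, while tools without a display could pass NULL as the callback and treat CHECK_UNKNOWN as "cannot decide".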
