Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #5046: Prevent multiple data rebuild runs on module listing #3600

Closed
wants to merge 1 commit into from

Conversation

indigoxela
Copy link
Member

@indigoxela indigoxela commented Apr 19, 2021

POC fo fix backdrop/backdrop-issues#5046

It prevents multiple runs of the rather expensive function system_rebuild_module_data() on the module listing page form.

On "admin/modules" the cache refresh, which should prevent stale data, is already handled by the system module.

@backdrop-ci
Copy link
Collaborator

Related to: backdrop/backdrop-issues#5046

@dragonbot
Copy link
Collaborator

Tugboat has finished building a preview for this pull request!

Website: https://pr3600-ck7nzxtzxivt632vjzsdfjzvv7usukrh.tugboat.qa/
Username: admin
Password: 8c0ddce929d5

@backdrop-ci
Copy link
Collaborator

Website: http://3600.backdrop.backdrop.qa.backdropcms.org
Username: admin
Password: qZQpnqMn

Copy link
Contributor

@alanmels alanmels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just tested this and works as expected. RTBC!

@alanmels
Copy link
Contributor

Will you re-launch PR, so it passed zenci/deploy/pr-3600 — request timed out

@indigoxela
Copy link
Member Author

indigoxela commented Apr 19, 2021

Will you re-launch PR, so it passed zenci/deploy/pr-3600 — request timed out

@alanmels, yes sure will. I didn't bother this morning, as I was short on time and if it runs into a timeout, it usually does again. 😏

@indigoxela indigoxela closed this Apr 19, 2021
@backdrop-ci
Copy link
Collaborator

Website: http://3600.backdrop.backdrop.qa.backdropcms.org Removed

@indigoxela indigoxela reopened this Apr 19, 2021
@dragonbot
Copy link
Collaborator

Tugboat has finished building a preview for this pull request!

Website: https://pr3600-vc2yffkhhs6bbdpqwiouqsbcuucxucto.tugboat.qa/
Username: admin
Password: 7cea95f18d4b

@backdrop-ci
Copy link
Collaborator

Website: http://3600.backdrop.backdrop.qa.backdropcms.org
Username: admin
Password: jG5DbsAa

@docwilmot
Copy link
Contributor

Well, though, I'm still not sure this is a good idea. If we remove admin/modules from that array, then none of the rebuilding that normally happens in update_get_projects() will happen, and note that system_rebuild_module_data() is only one of the many cache rebuilding steps that function performs.

@indigoxela
Copy link
Member Author

Well, though, I'm still not sure this is a good idea. If we remove admin/modules from that array, then none of the rebuilding that normally happens in update_get_projects() will happen.

@docwilmot Data do get refreshed, the system module already takes care of that, the update module doesn't have to and shouldn't interfere on that page. My PR prevents the double rebuild call, but system_rebuild_module_data() still runs - but only once.

@docwilmot
Copy link
Contributor

The way I read this is as follows:

  • when update_get_projects() runs, it calls update_project_cache(). For paths in that array, the update cache is cleared and an empty array is returned.
  • if (empty($projects)), (that is, if the path is in that array) then a bunch of rebuilding happens:
      $module_data = system_rebuild_module_data();
      $theme_data = system_rebuild_theme_data();
      $layout_data = _update_build_layout_data();
      _update_process_info_list($projects, $module_data, 'module', TRUE);
      _update_process_info_list($projects, $theme_data, 'theme', TRUE);
      _update_process_info_list($projects, $layout_data, 'layout', TRUE);
      if (config_get('update.settings', 'update_disabled_extensions')) {
        _update_process_info_list($projects, $module_data, 'module', FALSE);
        _update_process_info_list($projects, $theme_data, 'theme', FALSE);
        _update_process_info_list($projects, $layout_data, 'layout', FALSE); //todo
      }
  • system module does not do all that. So if we remove the path, yes system_rebuild_module_data() happens, but not all the rest.

I could be wrong.

@indigoxela
Copy link
Member Author

system module does not do all that.

@docwilmot why in the world should the system module on page /admin/modules - clearly only meant for modules - bother refreshing theme and layout stuff?

@docwilmot
Copy link
Contributor

I suspect we're misunderstanding each other.

Summary: On certain paths Update module requires certain rebuild functions to run. It seems admin/modules is one of those paths. If we remove admin/modules from that array, those rebuild functions will not run. Thats it, that's my point.

My statement "system module does not do all that" was in response to your statement that "data do get refreshed, the system module already takes care of that". System only does one of those rebuild processes, that is, it runs system_rebuild_module_data().

@indigoxela
Copy link
Member Author

I suspect we're misunderstanding each other.

I suspect the same. 😏

Function system_rebuild_module_data() prevents stale data on the page that provides the ability to enable and disable modules. This page does not run any updates, nor does it handle layouts or themes. The only information needed there is recent info of all enabled/disabled modules.

If you have any evidence, that something's actually missing, please clarify.

@alanmels
Copy link
Contributor

alanmels commented Apr 20, 2021

@docwilmot, my understanding is that when visiting the admin/modules page, the system_rebuild_module_data() is triggered and whatever was needed from the function is already satisfied by that point. And there is absolutely no need for another run of the same function.

As for:

On certain paths Update module requires certain rebuild functions to run.

we should really operate based not on guesses, but specific examples. I've run number of scenarios to trigger the Update module (such as deliberately downloading the older version of modules bot core and contributed modules) and seen no problem at all. If you give at least one of those certain paths, then I'm happy to run additional tests. But even in that unlikely scenario I would think it's a bug that needs to be fixed elsewhere.

@docwilmot
Copy link
Contributor

"guesses"

function update_project_cache($cid) {
  $projects = array();

  // On certain paths, we should clear the cache and recompute the projects for
  // update status of the site to avoid presenting stale information.
  $q = $_GET['q'];
  $paths = array(
    'admin/modules',
    'admin/modules/update',
    'admin/appearance',
    ...

@alanmels
Copy link
Contributor

alanmels commented Apr 20, 2021

@docwilmot I see that. What I meant was give me one use case scenario when we would run on a real problem. When triggering the update function by placing the older versions of modules I didn't see any problems with stale data. Updates works just fine. Probably already present 'admin/modules/update' is just enough.

@indigoxela
Copy link
Member Author

indigoxela commented Apr 20, 2021

@docwilmot many thanks for that snippet from file update.compare.inc.

// On certain paths, we should clear the cache and recompute the projects for
// update status of the site to avoid presenting stale information.

recompute the projects for update status That's why I explicitly mentioned in my previous comment, what data are necessary on /admin/modules. It doesn't run any updates. It's a form to enable or disable modules, that are already present.

Another question: why should this be necessary on /admin/modules but not on /admin/modules/list? That's the same page.

Wild guessing: back in 2010 or even earlier (git blame) someone thought, that the page /admin/modules also needs this, but they didn't verify if this is actually the case. Since then the page is slow (expensive code run twice). Almost twelve years later... maybe we can fix that mistake.

But probably not today. Although I appreciate passionate discussions about code details - phew, I need a pause. And I assume, I'm not the only one. 😉

@indigoxela
Copy link
Member Author

indigoxela commented May 1, 2021

@docwilmot how about that:

The system module should be the only one doing cache resets on that page, but...

If the update module is turned on (module_exists), it does the same as the update module would do on that page, if it's not turned on it only runs the refresh as it currently does.

The update module should do nothing with that page. That way we make sure that expensive code only runs once, but no stale data are around - even if this flushes a little more than page /admin/modules and /admin/modules/list actually need.

The current culprit is that system_rebuild_module_data() runs twice - that's not hard to fix IMO.

@docwilmot
Copy link
Contributor

@indigoxela if we fix the static cache in system_rebuild_module_data(), then the expensive code will only ever run once, regardless of which module(s) call that function. That still seems to me to be the most logical solution to this issue.

@indigoxela
Copy link
Member Author

@docwilmot if you insist on multiple calls (static cache fixed or not), I guess, we're stuck here.

My branch has conflicts anyway, closing.

@indigoxela indigoxela closed this May 7, 2021
@backdrop-ci
Copy link
Collaborator

Website: http://3600.backdrop.backdrop.qa.backdropcms.org Removed

@quicksketch
Copy link
Member

What I meant was give me one use case scenario when we would run on a real problem.

I think I have found the situation which explains why this line is there and what its intention is. Here is an example scenario:

  1. Download the Webform contrib module and enable it.
  2. Edit the .info file to make Webform be an older version, such as 1.x-4.20.0. Webform 4.21 was a security update, so this causes a big warning on the modules page:
    image
  3. Update the .info file again to the latest version 1.x-4.23.0, pretending that we are "updating" the module.
  4. Now visit /admin/module again.

The behavior changes at this point.

Before this PR: The warning immediately goes away, because the comparison of available module versions versus installed versions is recalculated.
After this PR: The message stays, notifying the administrator that updates are available, even though they are not. Visiting the admin/reports/updates page does rebuild the cache. And afterwards visiting admin/modules the message is no longer shown.

How frequently would this actually be a problem? It would only occur if the administrator manually updated the module on disk, did not clear the caches before visiting /admin/modules, did not run update.php (perhaps if no run was needed), and did not load another page that still is in the list of pages that initiate rebuilds (such as admin/reports/updates).

Unfortunately, there's a pretty good chance that administrators will visit admin/modules after applying updates, since it's the most obvious place to check the installed module version.

For now, I think keeping the intended behavior would be a good thing (the message should go away immediately), and we can fix the performance problem using @docwilmot's approach in #3601.

Longer-term, I personally really dislike the way Update module yells at you all over the place. IMO the red dot in the admin bar and the big errors on the Available Updates page and the Status Report are sufficient. If we were to rework/remove these messages from other pages of the site, it would be an alternative way to save expensive processing when visiting the main admin pages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fixed performance on the Modules listing page /admin/modules
6 participants