Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Temp Files in Cache Directory Fail to Clear #397

Closed
raamdev opened this issue Jan 12, 2015 · 23 comments
Closed

Temp Files in Cache Directory Fail to Clear #397

raamdev opened this issue Jan 12, 2015 · 23 comments
Assignees
Milestone

Comments

@raamdev
Copy link
Contributor

raamdev commented Jan 12, 2015

Forked from Issue #288 ("Fatal Error: 'Unable to clear dir'").

This has been an ongoing issue since implementing cache locking in #288. The bug presents itself as a cache directory with many *-tmp cache files that fail to clear when the Clear Cache button is clicked or when an automatic clear cache routine is run.


A Quick Cache Pro user reports in a support ticket the following error appears at various times when Quick Cache runs a clear routine, such as when adding a new post:

Warning: rename(/var/sites/f/mydomain.co.uk/public_html/wp-content/cache/quick-cache/cache/http/www-mydomain-co-uk-54b3ed17443a4352558262-tmp,/var/sites/f/mydomain.co.uk/public_html/wp-content/cache/quick-cache/cache/http/www-mydomain-co-uk): File exists in /var/sites/f/mydomain.co.uk/public_html/wp-content/plugins/quick-cache-pro/includes/share.php on line 1688

Fatal error: Uncaught exception 'Exception' with message 'Unable to delete files. Rename failure on tmp directory: `/var/sites/f/mydomain.co.uk/public_html/wp-content/cache/quick-cache/cache/http/www-mydomain-co-uk-54b3ed17443a4352558262-tmp`.' in /var/sites/f/mydomain.co.uk/public_html/wp-content/plugins/quick-cache-pro/includes/share.php:1689 
Stack trace: 
#0 /var/sites/f/mydomain.co.uk/public_html/wp-content/plugins/quick-cache-pro/includes/share.php(1428): quick_cache\share->delete_files_from_host_cache_dir('/^\\//aaegory\\//...') 
#1 /var/sites/f/mydomain.co.uk/public_html/wp-content/plugins/quick-cache-pro/quick-cache-pro.inc.php(2152): quick_cache\share->clear_files_from_host_cache_dir('/^\\//aaegory\\//...') 
#2 [internal function]: quick_cache\plugin->auto_clear_post_terms_cache(9502) 
#3 /var/sites/f/mydomain.co.uk/public_html/wp-includes/plugin.php(496): call_user_func_array(Array, Array) 
#4 /var/sites/f/mydomain.co.uk/public_html/wp-includes/taxonomy.php(3025): do_action(' in/var/sites/f/mydomain.co.uk/public_html/wp-content/plugins/quick-cache-pro/includes/share.php on line 1689

@mchlbrry reports in #288 (comment):

[08-Jan-2015 21:17:03 UTC] PHP Fatal error:  Uncaught exception 'RuntimeException' with message 'SplFileInfo::getMTime(): stat failed for /var/www/html/wordpress/wp-content/cache/quick-cache/cache/http/www-domain-com-54aef3cf69cc8928126844-tmp/album-of-the-year-2013.html' in /var/www/html/wordpress/wp-content/plugins/quick-cache-pro/includes/share.php:1675
Stack trace:
#0 /var/www/html/wordpress/wp-content/plugins/quick-cache-pro/includes/share.php(1675): SplFileInfo->getMTime()
#1 /var/www/html/wordpress/wp-content/plugins/quick-cache-pro/includes/share.php(1462): quick_cache\share->delete_files_from_host_cache_dir('/^.+/i', true)
#2 /var/www/html/wordpress/wp-content/plugins/quick-cache-pro/quick-cache-pro.inc.php(1291): quick_cache\share->purge_files_from_host_cache_dir('/^.+/i')
#3 [internal function]: quick_cache\plugin->purge_cache()
#4 /var/www/html/wordpress/wp-includes/plugin.php(580): call_user_func_array(Array, Array)
#5 /var/www/html/wordpress/wp-cron.php(103): do_action_ref_array('_cron_quick_cac...', Array)
#6 {main}
  thrown in /var/www/html/wordpress/wp-content/plugins/quick-cache-pro/includes/share.php on line 1675

@jaswsinc also reported seeing this 'SplFileInfo::getMTime(): stat failed error on s2Member.com.

@jaswrks
Copy link

jaswrks commented Jan 12, 2015

This has been an ongoing issue since implementing cache locking in #288.

Thanks! Just wanted to clarify that errors like these were present both before and after cache locking. The recent cache locking improvements were aimed at resolving this issue.


I'm seeing this issue occur randomly two or three times each week. Looking over the codebase it would seem impossible, and thus I'm guessing at this point that it's still due to some corruption in the stat cache on some versions of PHP. A possible workaround for this might include being more aggressive with clearstatcache() in other areas of Quick Cache.


That said; I'm curious to know if anyone else reporting this problem is still seeing a large number of these errors in a log file; i.e. more than a few times daily? For instance, in the previous issue (#288) it would seem that our latest clearstatcache() adjustments resolved this for most. Anyone out there still with major issues related to this sort of error?

@raamdev
Copy link
Contributor Author

raamdev commented Jan 12, 2015

Just wanted to clarify that errors like these were present both before and after cache locking. The recent cache locking improvements were aimed at resolving this issue.

Right, sorry. I should have said that temp files getting left behind have been ongoing since implementing cache locking, since before cache locking we did not have any temp directories. Also, this issue (prior to cache locking) only exhibited itself rarely, on very large and busy sites. Cache locking (with the new creation and renaming of temp directories) seems to have exasperated the issue (for a yet unknown reason), so more users are now experiencing an issue that previously was only experienced by very large and busy sites.

@jaswrks jaswrks self-assigned this Jan 12, 2015
@eurobank
Copy link

I'm sorry to say that i have also found today some tmp leftovers in the "cache/http" directory.

I'm running LiteSpeed, Cloudlinux, php 5.5.20 and the server is pretty stable. I guess this is a plugin problem and has not been resolved yet.

@jaswrks
Copy link

jaswrks commented Jan 15, 2015

Referencing this block of code. The line with getMTime() is where the error occurs.

/** @var $_file_dir \RecursiveDirectoryIterator Regex iterator reference for IDEs. */
foreach(($_dir_regex_iteration = $this->dir_regex_iteration($_host_cache_dir_tmp, $_host_cache_dir_tmp_regex)) as $_file_dir)
{
    if(($_file_dir->isFile() || $_file_dir->isLink()) // Files and/or symlinks only.

       // Don't delete files in the immediate directory; e.g. `zc-advanced-cache` or `.htaccess`, etc.
       // Actual `http|https/...` cache files are nested. Files in the immediate directory are for other purposes.
       && ($_host_cache_dir !== $cache_dir || strpos($_file_dir->getSubpathname(), '/') !== FALSE)

       // If NOT checking max age; or if we ARE, and the file has expired now.
       && (!$check_max_age || (!empty($max_age) && $_file_dir->getMTime() < $max_age))

    ) // Throw an exception if a deletion failure occurs.
    {
        if(!unlink($_file_dir->getPathname())) // Throw exception if unable to delete.
            throw new \exception(sprintf(__('Unable to delete file: `%1$s`.', $this->text_domain), $_file_dir->getPathname()));
        $counter++; // Increment counter for each file we delete.
    }
    else if(!$check_max_age && $regex === '/^.+/i' && $_file_dir->isDir()) // Directories too?
    {
        if(!rmdir($_file_dir->getPathname())) // Throw exception if unable to delete the directory itself.
            throw new \exception(sprintf(__('Unable to delete dir: `%1$s`.', $this->text_domain), $_file_dir->getPathname()));
        # $counter++; // Increment counter for each directory we delete. ~ NO don't do that here.
    }
}
unset($_dir_regex_iteration, $_file_dir); // Housekeeping after this `foreach()` loop.

I'm currently researching PHP bug reports related to SplFileInfo::getMTime(). Something tells me this is not a problem with Quick Cache, but a PHP bug that we need to work around. As you can see from this block of code (which is also wrapped by a cache lock routine and preceded by a clearstatcache() call). It clearly checks that the file exists before it runs getMTime(). So for getMTime() to fail, something very unexpected is taking place at what appears to be random times throughout the week.

I suppose it is possible that DirectoryIterator::isFile() is not working in the way it is documented. If that's the case, we could try adding an additional is_file() call before running SplFileInfo::getMTime().

@jaswrks
Copy link

jaswrks commented Jan 15, 2015

I just noticed that getMTime() will look at the modification time of the realpath() in the case of a symlink. That could explain some of this, because in the case of 404 errors being cached, the underlying 404 error file may have already been deleted in the iteration, before we reach the symlinks that pointed to it. Thus, getMTime() would fail in that scenario.

However, in my error logs I'm seeing entries associated with paths that do exist. So that doesn't explain the entire issue reported here in my view; at least not yet.

The fact that symlinks are not returning a file modification time of their own though; that's going to be an issue of it's own for us to look at.

@jaswrks
Copy link

jaswrks commented Jan 15, 2015

Referencing: http://php.net/manual/en/function.lstat.php

@raamdev
Copy link
Contributor Author

raamdev commented Jan 16, 2015

I just noticed that getMTime() will look at the modification time of the realpath() in the case of a symlink. That could explain some of this, because in the case of 404 errors being cached, the underlying 404 error file may have already been deleted in the iteration, before we reach the symlinks that pointed to it. Thus, getMTime() would fail in that scenario.

Good catch!

The fact that symlinks are not returning a file modification time of their own though; that's going to be an issue of it's own for us to look at.

@jaswsinc Would you like me to open a new GitHub Issue for that, or do you see that being worked on as part of this issue?

@jaswrks
Copy link

jaswrks commented Jan 16, 2015

Would you like me to open a new GitHub Issue for that, or do you see that being worked on as part of this issue?

I think I can work on it as part of this issue. I'm working to improve the way this routine handles symlinks as part of the work to help resolve this, so I think I can just include it here.

@raamdev
Copy link
Contributor Author

raamdev commented Jan 16, 2015

I can just include it here.

Cool. Sounds great. Thanks!

@jaswrks
Copy link

jaswrks commented Jan 16, 2015

@raamdev I submitted some pull requests that are now ready to merge. They work to resolve this issue. However, I'd like to leave this open for a couple of days while I work on a second PR to take things a step further after additional testing.

@raamdev
Copy link
Contributor Author

raamdev commented Jan 16, 2015

I submitted some pull requests that are now ready to merge.

Thank you. I reviewed this and merged them.

I'd like to leave this open for a couple of days while I work on a second PR to take things a step further after additional testing.

Copy that. I'll leave this open.

@jaswrks
Copy link

jaswrks commented Jan 16, 2015

I'm running the latest changes at s2Member.com to see if the getMTime() issue still occurs there after the last round of improvements.

@jaswrks
Copy link

jaswrks commented Jan 16, 2015

So far so good. I'll leave it for another day or so.

@jaswrks
Copy link

jaswrks commented Jan 17, 2015

So far so good. I'll leave it until Monday and provide another update.

@jaswrks
Copy link

jaswrks commented Jan 19, 2015

@raamdev I think we have knocked this out finally. I'm not seeing the same issue since the latest improvements in the PRs submitted prior. I think it's safe to close this now and gather additional feedback during the RC phase.

@raamdev
Copy link
Contributor Author

raamdev commented Jan 19, 2015

I think it's safe to close this now and gather additional feedback during the RC phase.

Copy that. Thank you for the testing! I will close this for further testing in the Release Candidate.

@raamdev
Copy link
Contributor Author

raamdev commented Jan 19, 2015

Next release changelog:

  • Bug Fix: Several fixes for a stubborn bug related to "Fatal Error: 'Unable to clear dir'" error messages and errors referencing "SplFileInfo::getMTime(): stat failed". Props @jaswsinc. See #397.

@raamdev
Copy link
Contributor Author

raamdev commented Jan 22, 2015

@eurobank Quick Cache v150121 (Release Candidate) was released yesterday that includes further fixes for this issue. If you'd like to help test the Release Candidate, that would be greatly appreciated! Please see http://www.websharks-inc.com/post/quick-cache-v150121-release-candidate/

@eurobank
Copy link

Last time i spoke tooo soon and the problem was not fixed. This time, after 3 days, i see no tmp leftover files or anything.

@raamdev
Copy link
Contributor Author

raamdev commented Jan 28, 2015

i see no tmp leftover files or anything.

@eurobank That's great to hear! Thank you for the update. :)

@Venique
Copy link

Venique commented Jan 29, 2015

Unfortunately, this update does not solve the problem for me... still many temp folders. Which information you need for diagnose?

@raamdev
Copy link
Contributor Author

raamdev commented Jan 29, 2015

@Venique writes...

this update does not solve the problem for me

Which update are you referring to? Which version number specifically?

@raamdev
Copy link
Contributor Author

raamdev commented Jan 30, 2015

The fixes applied from this issue have been released today with v150129:

  • Bug Fix: Several fixes for a stubborn bug related to "Fatal Error: 'Unable to clear dir'" error messages and errors referencing "SplFileInfo::getMTime(): stat failed". Props @jaswsinc. See #397.

See the v150129 announcement post.


This issue will now be locked to further updates. If you have something to add related to this Issue, please open a new GitHub Issue and mention this one (#397).

@wpsharks wpsharks locked and limited conversation to collaborators Jan 30, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants