
Introduce new cronjob to regularly cleanup outdated lock files if file based lock provider is being used #39372

Open · wants to merge 1 commit into base: 2.4-develop
Conversation

@hostep (Contributor) commented Nov 15, 2024

Description (*)

When Magento is set up to use file-based locking, we need to keep the directory that stores these lock files under control.
I'm introducing a cronjob here that runs once per day and searches for lock files that haven't been modified in the last 24 hours and can thus be safely removed. This keeps the contents of the lock files directory under control.
This cronjob only does something when the lock provider is configured to use files, not when one of the others is used (database, which is the default, Zookeeper, or cache).
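The cleanup the cronjob performs can be sketched as a shell equivalent (a sketch only; the PR implements this in PHP, and `LOCK_DIR` is an assumed stand-in for the directory configured via `--lock-file-path`):

```shell
# Hypothetical shell equivalent of the cronjob's cleanup, assuming the
# file lock provider keeps its lock files directly in LOCK_DIR.
LOCK_DIR="${LOCK_DIR:-$(pwd)/var/locks}"

# -mmin +1440 matches files whose modification time is more than
# 24 hours (1440 minutes) ago; -delete removes each match.
find "$LOCK_DIR" -type f -mmin +1440 -delete
```

Files touched within the last 24 hours are left alone, so locks that are still in use are never removed.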

Related Pull Requests

Fixed Issues (if relevant)

  1. Partially fixes #39369: "When using file storage for the lock provider, we get an ever growing directory of files without any cleanup happening"

Manual testing scenarios (*)

  1. Set up a clean Magento installation with sample data
  2. Make sure cronjobs are running
  3. From the root of Magento, execute these commands:
     $ mkdir "$(pwd)/var/locks"
     $ bin/magento setup:config:set --lock-provider=file --lock-file-path="$(pwd)/var/locks"
     $ bin/magento cache:enable
     $ bin/magento cache:flush
  4. Start visiting the frontend; visit many different pages: homepage, category pages, product detail pages, ...
  5. Also create some orders
  6. Keep the shop running for at least 48 hours, preferably longer
  7. After more than 48 hours have passed, take a look at the files in var/locks; expected is that no file should be older than 48 hours
  8. Also search the var/log/system.log file for the phrase "Deleted xxx old lock files". After 48 hours, it should appear at least once, maybe twice or more; the xxx should in theory be bigger than 0 at least once.
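The last two checks above can be scripted; a sketch, using the var/locks path from the test setup and the log phrase quoted above (run from the Magento root):

```shell
# Step: verify no lock file is older than 48 hours (2880 minutes).
# This should print nothing once the cronjob has run.
find "$(pwd)/var/locks" -type f -mmin +2880

# Step: count how often the cronjob logged its work.
# Expect a count of at least 1 after 48 hours.
grep -c 'Deleted .* old lock files' "$(pwd)/var/log/system.log"
```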

Questions or comments

  • Since there isn't a module in app/code/Magento related to locks, I picked the Magento/Backend one, because it already has a similar cronjob to keep caches clean
  • I've implemented cleanup for file-based locking only. As far as I can see, this isn't needed for the database, cache & Zookeeper providers; those keep themselves maintained, as far as I understand it
  • Should we be worried about users selecting a directory for file-based locking that can potentially contain other files? The assumption in this PR is that such a directory only contains lock files, and we just delete any file we find.
    If this isn't a good idea, how should we identify lock files? Maybe an extra check can be added to only delete files that are 0 bytes, and skip files that are bigger?
    The existing integration test used /tmp, so maybe people use this directory in real life as well, and deleting all kinds of random files from that directory might not be a good idea?
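The 0-byte safeguard suggested above is easy to express; a sketch, assuming lock files are always created empty (which would need to be verified against the file lock provider before relying on it):

```shell
# Hypothetical safer variant of the cleanup: LOCK_DIR is a stand-in
# for the configured lock directory.
LOCK_DIR="${LOCK_DIR:-$(pwd)/var/locks}"

# -size 0 restricts deletion to empty files, so a non-empty file that
# happens to live in the directory (e.g. under /tmp) is left alone.
find "$LOCK_DIR" -type f -size 0 -mmin +1440 -delete
```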

Contribution checklist (*)

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • README.md files for modified modules are updated and included in the pull request if any README.md predefined sections require an update
  • All automated tests passed successfully (all builds are green)


m2-assistant bot commented Nov 15, 2024

Hi @hostep. Thank you for your contribution!
Here are some useful tips on how you can test your changes using Magento test environment.
❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names.

Allowed build names are:
  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests
  13. Semantic Version Checker

You can find more information about the builds here
ℹ️ Run only required test builds during development. Run all test builds before sending your pull request for review.


For more details, review the Code Contributions documentation.
Join Magento Community Engineering Slack and ask your questions in #github channel.

@hostep (Contributor, Author) commented Nov 15, 2024

@magento run all tests

@kandy (Contributor) commented Nov 15, 2024

You can achieve almost the same by pointing the lock directory to the system tmp directory.

@hostep (Contributor, Author) commented Nov 17, 2024

Thanks @kandy, that sounds like a great idea.

However, when checking some of the servers that our projects are hosted on, the files that are kept in /tmp vary wildly.
On some of our servers, indeed, it seems like the files in there get cleaned up regularly, but then on some other servers, I still find files that are over a year old for example.

So I don't think we can trust the operating system to clean up /tmp, because how this is configured seems to differ a lot between operating systems, versions, and hosting providers.

@hostep (Contributor, Author) commented Nov 18, 2024

@magento run all tests

@m2-community-project m2-community-project bot added the Priority: P2 A defect with this priority could have functionality issues which are not to expectations. label Nov 18, 2024
@hostep (Contributor, Author) commented Nov 19, 2024

@kandy, I reached out to one of the hosting companies we work with on a regular basis and they provide us with servers that at the moment run on Debian 11 or 12. And they told me that Debian only cleans out the /tmp directory when the server is being rebooted, so if the server has an uptime of 2 years, it's expected that /tmp can still contain files that are 2 years old.

So I wouldn't suggest using /tmp as the general solution, because of this and also because its cleanup seems to vary wildly between operating systems, if I can trust this 12-year-old Server Fault answer: https://serverfault.com/a/377349

Labels
Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: pending review