Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[16.0][IMP] fs_storage: invalidate filesystem cache when connection fails #320

Merged

Conversation

JordiMForgeFlow
Copy link
Contributor

@JordiMForgeFlow JordiMForgeFlow commented Jan 12, 2024

@JordiMForgeFlow JordiMForgeFlow marked this pull request as draft January 12, 2024 10:00
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch 4 times, most recently from 3707e52 to 302a29a Compare January 19, 2024 11:23
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 302a29a to 91f673f Compare January 23, 2024 12:32
@JordiMForgeFlow JordiMForgeFlow changed the title [IMP] fs_storage: add flag to decide whether connection is stored and reused [IMP] fs_storage: add setting to store connection and invalidating filesystem cache Jan 23, 2024
@JordiMForgeFlow JordiMForgeFlow marked this pull request as ready for review January 23, 2024 12:32
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 91f673f to 3c08fcc Compare January 23, 2024 12:38
@JordiMForgeFlow JordiMForgeFlow changed the title [IMP] fs_storage: add setting to store connection and invalidating filesystem cache [16.0][IMP] fs_storage: add setting to store connection and invalidating filesystem cache Jan 23, 2024
Copy link
Contributor

@lmignon lmignon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JordiMForgeFlow Thank you for this improvement. A few comments... In the same time can you add a file named 320.feature into the fs_storage/readme/newsframent with a description of this new feature. The content will be used at merge time to fill the addon's changelog.
Could you also update the USAGE.rst file to describe the new parameters?

fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 3c08fcc to 0d6a88a Compare January 23, 2024 14:58
Copy link
Contributor

@LoisRForgeFlow LoisRForgeFlow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

invalidate_cache_on_error maybe is not even needed as a setting and what it adds should be just the standard behavior, but not a big deal.

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch 2 times, most recently from 20a1f72 to 17fb3c3 Compare January 25, 2024 07:40
@JordiMForgeFlow
Copy link
Contributor Author

JordiMForgeFlow commented Jan 25, 2024

@lmignon after trying using fs.exists and looking into more detail, I have put back using fs.ls.

The fs.exists does not raise an error, but simply returns true or false. Also, it basically calls the method info which ends up calling the method fs.ls but with detail=True, which is worse than what we had. (code can be seen here: https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/spec.html#AbstractFileSystem)

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch 3 times, most recently from 36a5624 to 4b50530 Compare January 25, 2024 09:25
fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
@lmignon
Copy link
Contributor

lmignon commented Jan 26, 2024

@lmignon after trying using fs.exists and looking into more detail, I have put back using fs.ls.

The fs.exists does not raise an error, but simply returns true or false. Also, it basically calls the method info which ends up calling the method fs.ls but with detail=True, which is worse than what we had. (code can be seen here: https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/spec.html#AbstractFileSystem)

My concern is about the performance cost when the root directory contains hundred of thousand files... To limit the scope of the ls command, it could be performed on a specific path to a marker file. What do you thing about?

@simahawk
Copy link
Contributor

@lmignon after trying using fs.exists and looking into more detail, I have put back using fs.ls.
The fs.exists does not raise an error, but simply returns true or false. Also, it basically calls the method info which ends up calling the method fs.ls but with detail=True, which is worse than what we had. (code can be seen here: filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/spec.html#AbstractFileSystem)

My concern is about the performance cost when the root directory contains hundred of thousand files... To limit the scope of the ls command, it could be performed on a specific path to a marker file. What do you thing about?

I agree. We should probably create a file on 1st check (eg: .odoo_fs_storage_$id.marker).

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 4b50530 to c076645 Compare January 29, 2024 08:48
@JordiMForgeFlow JordiMForgeFlow changed the title [16.0][IMP] fs_storage: add setting to store connection and invalidating filesystem cache [16.0][IMP] fs_storage: invalidate filesystem cache when connection fails Jan 29, 2024
fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
@simahawk simahawk requested a review from sebalix January 29, 2024 10:07
@simahawk
Copy link
Contributor

@sebalix can you try by installing it from https://github.com/fsspec/filesystem_spec ?

@sebalix
Copy link
Contributor

sebalix commented Feb 2, 2024

Regarding connection failure, I replaced the use of the deprecated functions in edi_storage_oca there: OCA/edi-framework#65 (comment)
Copying my comment:

I'm suspecting (but I'm not sure) it was the use of these deprecated methods that triggers connection issues over time. These methods are decorated, and it could be that the decorator is keeping an old reference to self.__fs having a socket that gets closed after some time. That could explain why the "Test connection" button of fs_storage provides the expected result, while the EDI jobs are failing even when requeued.
This will be tested anyway.

Maybe it's not related at all.

@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from f2ffb04 to 7f810aa Compare February 5, 2024 09:12
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 7f810aa to 1f4c9f0 Compare February 6, 2024 13:42
fs_storage/models/fs_storage.py Outdated Show resolved Hide resolved
@JordiMForgeFlow JordiMForgeFlow force-pushed the 16.0-imp-fs_storage-store_connection branch from 1f4c9f0 to 46aab5b Compare February 6, 2024 15:02
@lmignon
Copy link
Contributor

lmignon commented Feb 6, 2024

/ocabot merge minor

@OCA-git-bot
Copy link
Contributor

What a great day to merge this nice PR. Let's do it!
Prepared branch 16.0-ocabot-merge-pr-320-by-lmignon-bump-minor, awaiting test results.

@OCA-git-bot OCA-git-bot merged commit f3b8084 into OCA:16.0 Feb 6, 2024
5 of 7 checks passed
@OCA-git-bot
Copy link
Contributor

Congratulations, your PR was merged at 916060f. Thanks a lot for contributing to OCA. ❤️

# Use a marker file to limit the scope of the LS command for performance.
try:
self._check_connection(self.__fs)
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some tests, we are still facing connections issues and the current implementation doesn't cover it: OSError: Socket is closed
This errors arrives only after some time (several hours, workers have a high lifespan here and FS object is never recreated meanwhile).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebalix maybe then we should not store the FS object at all in the slot, and always call to _get_filesystem

Copy link
Contributor

@lmignon lmignon Feb 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The module can also be used in contexts of heavy traffic between odoo and external filesystems. This is particularly the case when it is used to manage Odoo's default attachment storage. In such circumstances, the cost of the reconnection required each time the fs proxy would be recreated is not negligible. Perhaps it would be a good idea to set up a 'keep-alive' mechanism via a cron running every 'X' minutes...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmignon we could do that or putting it as a configuration in the FS storage to decide if you want to check the connection everytime or not. The cron approach however seems to be the solution covering both cases, and one can decide to run it more/less frequently. What do you think @sebalix ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure to understand the cron approach regarding this in-memory object?
But yes, the only fix that is working for us right now is to return a new FS object every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand it would be to have a cron that invalidates the __fs slot after some time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cron process could updates memory objects of other Odoo process (workers)? Not sure it's possible 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raise e is an issue here (testing the code during a workshop), it is blocking somehow the queue-job.
Clearing the fsspec cache should be enough here, so a new connection will be established in case of error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, after one day of testing, we encountered no connection issues with the help of the skip_instance_cache parameter of fsspec (set in the options). This will add some overhead but at least we are safe... dealing with this cache in Odoo seems error prone (connection stability, configuration of the SFTP server...).

I don't know how huge is this overhead, load testing will tell us I guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[16.0] fs_storage + EDI: connection closed after some time
6 participants