[Flaky] rspec ./spec/system/consumer/caching/shops_caching_spec.rb:17 #11010
Comments
Looks like it got introduced by merging: Confusingly, the next merge introduced another failing spec: |
I can't reproduce this. 🤔 |
Hmm, this one always fails for me locally - which is a good thing I guess 😁 |
This error still appears after merging your pull request, @filipefurtad0. |
I managed to make it fail once in 100 tries...
config.cache_store = :file_store, Rails.root.join("tmp", "cache", "paralleltests#{ENV['TEST_ENV_NUMBER']}")
The only thing I can think of is that the hard drive may be overloaded, so the cache entry hasn't been written yet by the time we check for it. Adding a sleep might fix the issue? @mkllnk @filipefurtad0 any other ideas? |
Interesting. It did fail for Filipe almost every time. I don't think that a file system would be that slow. I would also imagine a filesystem keeping track of changes to be written and blocking if needed. All environments except test use Redis. And we do set up Redis on CI because it's used by several parts of the application. Maybe we should try to switch test to Redis as well? It may even be faster, who knows. Definitely more realistic. |
This patch works for me but I have no idea if it will be flaky on CI:
diff --git a/config/environments/test.rb b/config/environments/test.rb
index 71228c5e3..70ebf9678 100644
--- a/config/environments/test.rb
+++ b/config/environments/test.rb
@@ -14,7 +14,11 @@ Openfoodnetwork::Application.configure do
   config.public_file_server.headers = { 'Cache-Control' => 'public, max-age=3600' }

   # Separate cache stores when running in parallel
-  config.cache_store = :file_store, Rails.root.join("tmp", "cache", "paralleltests#{ENV['TEST_ENV_NUMBER']}")
+  config.cache_store = :redis_cache_store, {
+    driver: :hiredis,
+    url: ENV.fetch("OFN_REDIS_URL", "redis://localhost:6379/1"),
+    reconnect_attempts: 1
+  }

   # Show full error reports and disable caching
   config.consider_all_requests_local = true |
I think I found the issue, I managed to reproduce it by pausing the |
I tried the Redis version on CI and it still failed. So your finding is much better. |
CI is running many containers with groups of specs but each container runs only one RSpec process, not in parallel, as far as I know. Is Redis shared on GitHub Actions? Is cache clearing executed async? We had the flaky spec before Filipe introduced the explicit cache clearing on the two caching specs. Is the cache automatically cleared by RSpec? In that case it would be difficult to be more granular. Can we isolate our cache with a thread id or something similar? Others must have had this problem before... |
Interestingly, I just found this bit of code in
config.before(:each) do
  reset_spree_preferences do |spree_config|
    # These are all settings that differ from Spree's defaults
    spree_config.default_country_id = default_country_id
    spree_config.checkout_zone = checkout_zone
    spree_config.currency = currency
    spree_config.shipping_instructions = true
  end
end
|
Indeed: rails/rails#48341, but by the look of it Rails doesn't offer any solution. |
I am out of ideas, the only thing I can say is @filipefurtad0 's fix isn't changing anything because a |
Thanks for that investigation @rioug. I guess that explains why it still keeps failing... Locally, it went from constant failing to passing; hence, I was confident that the change would bring improvement. Let's revert it, once we have another approach for this 👍 |
No worries, this one seems to consistently fail on my fork. But it's a weird one because we put something in the cache and try to check the cache straight after and the entry is missing 😕 |
@mkllnk Sorry that this is biting you folks as well. I'm dropping in because I noticed the link from rails/rails#48341 back to this. I was able to isolate my cache by thread identifier using this sort of code as part of the parallelization spin-up in test/test_helper.rb:
# Protect from collisions during parallelized testing by namespacing our caching prefix by our worker identifier.
# Without this protection, all test executors share the same cache, which can lead to transient failures due to cache collisions.
# This protection is partial -- even with it, cache state still travels across tests executing in the same worker, which the author must be cautious of.
parallelize_setup do |worker|
  Rails.cache.options[:namespace] += "#{Time.now.to_i}.#{worker}:"
end
I used Not sure if this will help you, but I thought maybe it would so I'd drop it in here in case it was useful -- hope that you find a relatively easy way past this contention! |
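To make the collision described above concrete, here is a minimal plain-Ruby sketch. It does not use Rails. `SharedStore` and `NamespacedCache` are hypothetical stand-ins invented for illustration, and the prefix-aware `clear` is a simplification: whether a real cache store's `clear` respects a namespace depends on the store.

```ruby
# Hypothetical stand-in for a cache store shared by several test workers.
class SharedStore
  def initialize
    @data = {}
  end

  def write(key, value)
    @data[key] = value
  end

  def exist?(key)
    @data.key?(key)
  end

  # Remove only the keys under one prefix, or everything if no prefix is given.
  def clear(prefix = nil)
    if prefix
      @data.delete_if { |key, _| key.start_with?(prefix) }
    else
      @data.clear
    end
  end
end

# Hypothetical per-worker view that prefixes every key with a namespace.
class NamespacedCache
  def initialize(store, namespace)
    @store = store
    @namespace = namespace
  end

  def write(key, value)
    @store.write("#{@namespace}#{key}", value)
  end

  def exist?(key)
    @store.exist?("#{@namespace}#{key}")
  end

  def clear
    @store.clear(@namespace)
  end
end

store = SharedStore.new

# Without namespacing: another worker clears the cache between
# this worker's write and its check, so the entry is gone.
store.write("views/shops", "<html>")
store.clear
unnamespaced_hit = store.exist?("views/shops")

# With namespacing: each worker only clears its own keys.
worker1 = NamespacedCache.new(store, "worker-1:")
worker2 = NamespacedCache.new(store, "worker-2:")
worker1.write("views/shops", "<html>")
worker2.clear
namespaced_hit = worker1.exist?("views/shops")

puts "shared store hit: #{unnamespaced_hit}"
puts "namespaced hit: #{namespaced_hit}"
```

As the comment above notes, this kind of protection is partial: state still travels between examples that run in the same worker.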
Another new occurrence (just FYI). |
As discussed here (openfoodfoundation#11010 (comment)), reset_spree_preferences already does Rails.cache.clear
I've been thinking about this spec more. The cache entry expires after 15 seconds which seems plenty but there may be a reason why the system is waiting somewhere. It is the first spec in the spec run which can trigger more boot time like compiling assets. So here are ways it can fail:
visit shops_path
# wait until cache expires
sleep 16
key, options = CacheService::FragmentCaching.ams_shops
expect_cached "views/#{key}", options

visit shops_path
sleep 10
visit shops_path
sleep 6
key, options = CacheService::FragmentCaching.ams_shops
expect_cached "views/#{key}", options
I tried the second example to know if another spec could influence this but having the visits in two different Locally, I'm using Spring to run tests quicker but to simulate CI conditions I stopped Spring and then tried again. It usually passes on my machine. But if I add Then I had the idea of clearing the cache with
After that, I ran the test again. Some JS has been compiled and the rest isn't taking as long. The test ran for 36 seconds and failed. I tried to surround the example with |
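The two failure scenarios in the comment above can be sketched without Rails or real sleeping. `ExpiringCache` and `visit_shops` are hypothetical names made up for this illustration. The sketch assumes fetch-style caching in which a cache hit does not extend the entry's expiry, and it reuses the 15-second expiry mentioned above.

```ruby
# Hypothetical cache with expiry, driven by a manually advanced clock
# so the example runs instantly instead of sleeping.
class ExpiringCache
  def initialize
    @now = 0
    @entries = {}
  end

  # Simulate time passing (stands in for `sleep` or a slow page load).
  def advance(seconds)
    @now += seconds
  end

  # Return the cached value, or compute and store it if missing or expired.
  # A hit returns early and leaves the original expiry untouched.
  def fetch(key, expires_in:)
    return @entries[key][:value] if exist?(key)

    value = yield
    @entries[key] = { value: value, expires_at: @now + expires_in }
    value
  end

  def exist?(key)
    entry = @entries[key]
    !entry.nil? && @now < entry[:expires_at]
  end
end

# Stand-in for `visit shops_path`: rendering the page fills the cache.
def visit_shops(cache)
  cache.fetch("views/shops", expires_in: 15) { "<rendered shops>" }
end

# Baseline: the check happens soon after the visit.
cache = ExpiringCache.new
visit_shops(cache)
cache.advance(1)
quick_check_cached = cache.exist?("views/shops")

# Scenario 1: one visit, then 16 seconds pass before the check.
cache = ExpiringCache.new
visit_shops(cache)
cache.advance(16)
single_visit_cached = cache.exist?("views/shops")

# Scenario 2: a second visit after 10 seconds is a cache hit, so the
# entry still expires 15 seconds after the first visit.
cache = ExpiringCache.new
visit_shops(cache)
cache.advance(10)
visit_shops(cache)
cache.advance(6)
double_visit_cached = cache.exist?("views/shops")

puts "quick check cached: #{quick_check_cached}"
puts "single visit, 16s later: #{single_visit_cached}"
puts "two visits, 16s total: #{double_visit_cached}"
```

In this model anything that delays the check past the expiry window, such as asset compilation on the first spec of a run, makes the entry look missing even though it was written.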
Nice find! |
This came back: The previous PR fixed one spec example but if the execution order of examples changed then this could happen again. I'm looking into a more sustainable solution. |
Some failures were also tracked in this issue. |
@filipefurtad0 @mkllnk any of you planning to work on this issue? |
Thank you for pinging @sigmundpetersen. From my side, we can move it to All the things. Would be great to fix it, but I don't think this is a priority. |
I might continue this again this week. Otherwise I'll move it back. |
What we should change and why (this is tech debt)
rspec ./spec/system/consumer/caching/shops_caching_spec.rb:17
Context
https://github.com/openfoodfoundation/openfoodnetwork/actions/runs/5263043188/jobs/9512817149
https://openfoodnetwork.slack.com/archives/C012LE8LLDS/p1686718490274599?thread_ts=1686601349.929099&cid=C012LE8LLDS
Impact and timeline