Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AO3-6328 Try to fix large tags not updating in uses sort #4637

Merged
merged 5 commits into from
Oct 16, 2023

Conversation

weeklies
Copy link
Contributor

@weeklies weeklies commented Oct 7, 2023

Pull Request Checklist

Issue

https://otwarchive.atlassian.net/browse/AO3-6328 (Second PR)

Purpose

Tries to ensure that when the number of uses of a large tag increases (e.g. when a work is posted), a reindex is triggered. After some investigation, it looks like the cached tag count prevents write_taggings_to_redis from being called when the tag has at least 1000 uses (TAGGINGS_COUNT_MIN_CACHE_COUNT).

I am not 100% sure of this approach, which is to remove the old cached count in after_update. Another way I believe to be possible is setting the taggings_count_cache range in ScheduledTagJob to be a lower value like TAGGINGS_COUNT_MIN_CACHE_COUNT. Not sure of the broader implications of these methods.

Testing Instructions

See Jira issue.

References

#4632

Credit

weeklies (she/her)

@sarken sarken added the Priority: High - Broken on Test Merge immediately after approval label Oct 8, 2023
@ceithir
Copy link
Contributor

ceithir commented Oct 9, 2023

def self.taggings_count_expiry(count)
# What we are trying to do here is work out a resonable amount of time for a work to be cached for
# This should take the number of taggings and divide it by TAGGINGS_COUNT_CACHE_DIVISOR ( defaults to 1500 )
# such that for example 1500, would be naturally be tagged for one minute while 105,000 would be cached for
# 70 minutes. However we then apply a filter such that the minimum amount of time we will cache something for
# would be TAGGINGS_COUNT_MIN_TIME ( defaults to 3 minutes ) and the maximum amount of time would be
# TAGGINGS_COUNT_MAX_TIME ( defaulting to an hour ).
expiry_time = count / (ArchiveConfig.TAGGINGS_COUNT_CACHE_DIVISOR || 1500)
[[expiry_time, (ArchiveConfig.TAGGINGS_COUNT_MIN_TIME || 3)].max, (ArchiveConfig.TAGGINGS_COUNT_MAX_TIME || 50) + count % 20 ].min
end

Do we know what are the values of TAGGINGS_COUNT_CACHE_DIVISOR, TAGGINGS_COUNT_MIN_TIME and TAGGINGS_COUNT_MAX_TIME on staging? Just to check they were not set to extremes quite different from the defaults to test things, ending up with an abnormally high taggings_count_expiry.

@sarken
Copy link
Collaborator

sarken commented Oct 9, 2023

@ceithir I just checked and all three are nil on staging.

@ceithir
Copy link
Contributor

ceithir commented Oct 10, 2023

I don't think it's the sole source of the current issue, as it only affects tags with a count < 1000, but dropping it here to not forget:
update_tag_cache tries to update the cache if the cached value is less than TAGGINGS_COUNT_MIN_CACHE_COUNT... But it seems to be doing nothing in that case, as there's an early return for any non nil value of the cache, be it 1, 999 or 1001.

@ceithir
Copy link
Contributor

ceithir commented Oct 11, 2023

Out of the blue, I'm wondering if the call to write_taggings_to_redis, and the whole job that goes with it, couldn't be replaced with something straightforward like:

if value != taggings_count_cache
  # Skipping callback to avoid changing the current in memory object, and cascading side effects in general, as we're in a getter
  Tag.where(id: self.id).update_all(taggings_count_cache: value)
  self.enqueue_to_index
end

What lock/performance issue am I missing here?

@weeklies
Copy link
Contributor Author

@ceithir I didn't want to try to redesign the tags update code, because I worry about unintended consequences considering the complexity of the various async jobs. If AD&T decides to do so, I think it should be in a separate issue.

@brianjaustin brianjaustin added this to the 0.9.351 milestone Oct 16, 2023
@sarken sarken merged commit 4cacf7b into otwcode:master Oct 16, 2023
24 checks passed
sarken added a commit that referenced this pull request Oct 17, 2023
sarken added a commit that referenced this pull request Oct 17, 2023
)

Revert "AO3-6328 Try to fix large tags not updating in uses sort (#4637)"

This reverts commit 4cacf7b.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants