Sync: Full Sync: Send immediately. #13963
Miguel / Dan, I'll run through the code tomorrow AM. Conceptually, we are bypassing the queue, so on shutdown or on a request to pull content we use the current state to build the chunks and send that data package. Do we maintain a checkout/checkin approach to verify that the data sent is also processed, or are there any checks built in to make sure that the data is being processed?
I have a concern that this patch adds a lot of complexity. For example, we have a lot of duplication between send_ids... and enqueue_ids... I think this is really valuable, but maybe could use some refactoring before we merge. What do you think? Or maybe we say that if this works, we're going to switch completely to the new technique and so refactoring would be a waste of time and we'll just end up deleting the old code later. |
☝️ |
@gravityrail what do you think about merging to master while making the new way the default, and asking friends to test?
Correct. 😁
In this new mode we send one chunk/item at a time. If a send fails, we don't update the status, so the same chunk is retried on the next attempt. It should give the same results as the "queue mode", but I think it gives us room to go more granular and maybe retry per object instead of per chunk.
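The retry behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the PR's PHP code: `send_next_chunk`, the `status` dictionary, its `"sent"` key and the `send` callback are all hypothetical names, and the real status format is not shown in this thread.

```python
from typing import Callable, Dict, List, Sequence


def send_next_chunk(
    ids: Sequence[int],
    status: Dict[str, int],
    send: Callable[[List[int]], bool],
    chunk_size: int = 10,
) -> bool:
    """Send one chunk starting at status['sent']; advance only on success."""
    start = status.get("sent", 0)
    chunk = list(ids[start:start + chunk_size])
    if not chunk:
        return False  # nothing left to send
    if send(chunk):
        # Progress is recorded only after the remote side accepted the chunk.
        status["sent"] = start + len(chunk)
        return True
    # On failure the status is untouched, so the next call retries this chunk.
    return False
```

Because the stored position moves only after a successful send, a failed request costs one repeated chunk rather than a lost one. Retrying per object would amount to the same loop with `chunk_size=1`.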
TODO: update the checkout endpoint. |
I would like to see a couple of really big sites sync using the new setting before I'm willing to make it the default in master. We are not completely certain of the approach until that happens. |
@lezama can you please add a parameter to the sync API request so that, when we receive it, we know whether it is a queued sync request or an immediate sync request? We may end up changing the behaviour on the WPCOM side when it's an immediate sync request, e.g. putting the request into a queue to be processed in the background. In this way, we can provide the benefits of both async and queue-less sending. An unrelated concern: what do you think about 100 posts as the upper limit? Might we face memory issues with that on sites that have lots of large posts? How can we detect that we have hit a memory limit and perhaps back off? Should the number of rows we're sending be a setting in the DB?
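One way to approach the large-posts concern is to cap a chunk by estimated payload size as well as by row count. The sketch below is a Python illustration only, not something this PR implements: `build_chunk`, `max_rows` and `max_bytes` are hypothetical names, and a serialized-size budget is only a stand-in for real memory-limit detection.

```python
import json
from typing import Any, Dict, Iterable, List


def build_chunk(
    rows: Iterable[Dict[str, Any]],
    max_rows: int = 100,
    max_bytes: int = 1_000_000,
) -> List[Dict[str, Any]]:
    """Collect rows until either the row cap or the byte budget is reached."""
    chunk: List[Dict[str, Any]] = []
    used = 0
    for row in rows:
        # Estimate the row's cost from its JSON-encoded size.
        size = len(json.dumps(row).encode("utf-8"))
        # Always take at least one row so a single oversized post still gets sent.
        if chunk and (len(chunk) >= max_rows or used + size > max_bytes):
            break
        chunk.append(row)
        used += size
    return chunk
```

With a budget like this, sites with small posts still send up to `max_rows` per request, while sites with very large posts shrink the chunk automatically. Both limits could be stored as per-module settings.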
I think it should be a setting per module 👍 |
they already have |
@lezama how do you feel about the stability of the above? Is it at a place where I can apply this PR to the latest JP release on VIP and do some Full Sync tests? If so, I'll work to get testing going so we can get some data.
Testing right now by updating my site, sandboxing it to a wpcom sandbox that's up to date with current SVN, installing the snippet and scheduling a Full Sync using the JP debugger. I'm getting these errors on the sandbox side:
What should I try? Is it my site, or is it the patch? Edit: thanks to @david-binda's help we figured out that the problem was caused by several jobs running simultaneously.
@zinigor this seems unrelated to this PR. I have also seen those errors randomly. Can you reproduce?
Cherry-picked to |
* Changelog: 8.1 additions
* Changelog: add #13858
* Changelog: add #13963
* Changelog: add #14174
* Changelog: add #14178
* Changelog: add #14175
* Changelog: add #14192
* Changelog: add #14196
* Changelog: add #14182
* Changelog: add #14218
* Changelog: add #14214
* Changelog: add #13757
* Changelog: add #14190
* Changelog: add #14131
* Changelog: add #14101
* Changelog: add #14203
* Changelog: add #14211
* Changelog: add #14224
* Changelog: add #14230
* Changelog: add #14241
* Changelog: add #14249
* Changelog: add #14264
* Changelog: add #14263
* Changelog: add #14256
* Changelog: add #10189
* Changelog: add #14240
* Changelog: add #14239

Also added some new entries to the testing file.

Co-authored-by: Igor Zinovyev <[email protected]>
The new full sync pattern introduced in #13963 should now be the default way to perform a full sync. We will still support the original full sync by allowing sites to use a filter.
* Sync Package: use Full_Sync_Immediately by default

The new full sync pattern introduced in #13963 should now be the default way to perform a full sync. We will still support the original full sync by allowing sites to use a filter.

* [not verified] Sync Unit Tests

Use the legacy full sync module for default tests.
Until this PR, Full Sync relied on the "full_sync" queue to organise and keep track of which objects have been sent. The queue mechanism has its benefits, but it is also complex and prone to errors.
This PR intends to introduce a simpler way of "full syncing" the content of a site.
Changes proposed in this Pull Request:
New Full Sync immediately mode.
Testing instructions: