-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The ingest tracker drops updates if they come in too close together #4781
The ingest tracker drops updates if they come in too close together #4781
Comments
I don't think we'll know the best way to fix this until we can see what's going on. I've put up an initial patch which means we'll log the proper error (wellcomecollection/storage-service#738) – once we've got that merged, we can deploy it in staging and fire a stream of ingest updates into the tracker to see what happens. |
Looking in the logs, I saw the following sequence of events:
The ingest tracker API returns a Conflict if the underlying IngestTracker class returns an instance of I wrote a small test to see what happens in the Dynamo tracker if you fire in lots of updates at the same time. Mostly I got So here's my new theory for what happens: If multiple ingest updates arrive simultaneously, depending on consistency, DynamoDB returns either:
The latter only occurs rarely, subject to DynamoDB consistency, which is why we haven't seen this more often – but if we're putting a lot of load on DynamoDB (say, sending an update for every ingest in the storage service), you'd expect to see this more often. I'm going to fix this by adding a new status code to the ingest tracker: if you try to update an ingest that doesn't exist, you get an HTTP 404 Not Found error, rather than a StateConflict. |
When all the services are running and processing messages immediately, we've seen the ingest tracker drop updates, e.g.
If those two messages arrive in quick succession, one of them gets dropped.
e.g. 49737f31-35f7-4d09-b63b-514f2a46b29c is missing its message from the bag versioner.
The error in the logs is extremely unhelpful and unenlightening in this regard:
because we swallow the entire error.
The text was updated successfully, but these errors were encountered: