Clear CFF `Upload`s as they are overwritten #745
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## main #745 +/- ##
=======================================
Coverage 98.02% 98.02%
=======================================
Files 438 438
Lines 36283 36305 +22
=======================================
+ Hits 35566 35588 +22
Misses 717 717
bd39c1f to aadcb43
@sentry_sdk.trace
def delete_uploads_by_sessionid(upload: Upload, session_ids: list[int]):
3 things here:
- I could see this taking some time, because some customers have a gargantuan number of CFFs in their commits, so this might benefit from being a standalone task, unless we want to do this synchronously. In that case, processing would take a bit longer, so that is something to contemplate and measure in some way if we can.
- How could this interact with other sessions being processed in parallel? I'm not sure whether we could run into errors while deleting if two or more processes try to delete the same entry.
- How could we test this in practice? I just want to make sure we aren't silently corrupting data, for example by not noticing that our reports were using data from CFFs we deleted, or something like that.
- I'm not sure the overhead of a separate task (in both complexity and runtime) is reasonable for that, but you are right that this might be quite slow. However, it is limited to only those uploads/sessions which have been overridden by an upload, so that should be rather limited.
- We are behind the "modify report" (upload processing) lock, so there is no parallelism here. Concurrent deletes shouldn't be a problem either way, as we filter things by `IN`, which SQL just ignores when no matching rows exist.
- Good point, I don't have a good idea for this yet. Though this change aims to align the data in SQL with the data inside of the `report_json`. So this is actually fixing a pre-existing "data corruption", if you think about it that way.

With these questions, you have brought up another good point: do we need to care about SQL-level locking here? I don't think we need to, as the deletes only lock the affected rows for the duration of the transaction (which I guess spans the whole task?), and those particular rows are only touched behind the upload processing lock.
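To illustrate the point about `IN` filters, here is a minimal sketch using the standard-library `sqlite3` module. The table name `uploads`, its columns, and the helper `delete_uploads_for_sessions` are hypothetical and are not the schema or code from this PR; the sketch only shows that repeating a `DELETE ... WHERE ... IN (...)` is harmless when the rows are already gone.

```python
import sqlite3


def delete_uploads_for_sessions(conn, report_id, session_ids):
    """Delete upload rows matching the given session ids.

    Hypothetical helper: table and column names are assumptions.
    Returns the number of rows actually deleted.
    """
    if not session_ids:
        return 0
    placeholders = ",".join("?" for _ in session_ids)
    cursor = conn.execute(
        "DELETE FROM uploads "
        f"WHERE report_id = ? AND order_number IN ({placeholders})",
        [report_id, *session_ids],
    )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uploads ("
    "id INTEGER PRIMARY KEY, report_id INTEGER, order_number INTEGER)"
)
conn.executemany(
    "INSERT INTO uploads (report_id, order_number) VALUES (?, ?)",
    [(1, 0), (1, 1), (1, 2)],
)

# The first call removes the two matching rows.
first = delete_uploads_for_sessions(conn, 1, [0, 1])
# Repeating the same delete matches nothing and raises no error.
second = delete_uploads_for_sessions(conn, 1, [0, 1])
remaining = [
    row[0]
    for row in conn.execute("SELECT order_number FROM uploads ORDER BY order_number")
]
```

A second delete with the same ids simply affects zero rows, which is the behavior the comment above relies on. This says nothing about lock contention in a real database, which is covered by the upload processing lock discussed above.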
Assuming the deletions take only a short amount of time, I don't think we'd need to worry.
aadcb43 to 2b7fdda
Once this is merged, let's keep a close eye on how the DB and customers react to the change (e.g. watch for any major complaints from our customers, anything we hear from support, etc.). I'd post a brief message in the eng/support channel to explain the change so that everyone is prepared.
2b7fdda to ea42b7f
The `Upload`s in the database should mirror the `Session`s within a stored `Report`. However, so far this was not the case for carry-forwarded sessions/uploads. The `Session`s were removed from the `Report` as they were being overridden by new uploads, but the `Upload` rows in the database, which were created from the carry-forwarded `Session`s on initial `Report` carry-forwarding, were not removed. This change makes sure that the `Upload`s in the database match the `Session`s in the `Report`.
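As a rough illustration of the invariant described above, the sketch below computes which upload rows no longer have a matching session in the report. `UploadRow`, its `session_id` field, and `stale_upload_ids` are hypothetical names chosen for this example, not the models or functions used in this PR.

```python
from dataclasses import dataclass


@dataclass
class UploadRow:
    """Hypothetical stand-in for a database upload row."""

    id: int
    session_id: int  # assumed link between an upload and a report session


def stale_upload_ids(report_session_ids: set[int], uploads: list[UploadRow]) -> list[int]:
    """Return ids of uploads whose session is no longer in the report."""
    return [u.id for u in uploads if u.session_id not in report_session_ids]


# Session 0 was carried forward and later overridden by a new upload,
# so it is gone from the report while its upload row still exists.
uploads = [UploadRow(id=10, session_id=0), UploadRow(id=11, session_id=1), UploadRow(id=12, session_id=2)]
stale = stale_upload_ids({1, 2}, uploads)
```

Under these assumptions, the rows returned by `stale_upload_ids` are the ones that would need deleting for the database to mirror the report again.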
ea42b7f to de2c496