Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear CFF Uploads as they are overwritten #745

Merged
merged 1 commit into from
Oct 2, 2024

Conversation

Swatinem
Copy link
Contributor

The Uploads in the database should mirror the Sessions within a stored Report.

However, so far this was not the case for carryforwarded sessions/uploads. The Sessions were removed from the Report as they were being overridden by new uploads, but that was not the case for the Upload rows in the database which were created from the carry-forwarded Sessions on initial Report carry-forwarding.

This change will now make sure that the Uploads in the database match the Sessions in the Report.

@Swatinem Swatinem requested review from adrian-codecov and a team September 27, 2024 09:13
@Swatinem Swatinem self-assigned this Sep 27, 2024
Copy link

codecov bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 97.43590% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.02%. Comparing base (1fd3bd6) to head (de2c496).
Report is 1 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
services/report/__init__.py 96.66% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #745   +/-   ##
=======================================
  Coverage   98.02%   98.02%           
=======================================
  Files         438      438           
  Lines       36283    36305   +22     
=======================================
+ Hits        35566    35588   +22     
  Misses        717      717           
Flag Coverage Δ
integration 98.02% <97.43%> (+<0.01%) ⬆️
unit 98.02% <97.43%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.87% <96.77%> (+<0.01%) ⬆️
OutsideTasks 98.03% <96.77%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
services/report/raw_upload_processor.py 95.75% <100.00%> (ø)
services/tests/test_report.py 100.00% <ø> (ø)
tasks/tests/integration/test_upload_e2e.py 100.00% <100.00%> (ø)
services/report/__init__.py 96.96% <96.66%> (+0.11%) ⬆️

@codecov-notifications
Copy link

codecov-notifications bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 97.43590% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
services/report/__init__.py 96.66% 1 Missing ⚠️

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #745   +/-   ##
=======================================
  Coverage   98.02%   98.02%           
=======================================
  Files         438      438           
  Lines       36283    36305   +22     
=======================================
+ Hits        35566    35588   +22     
  Misses        717      717           
Flag Coverage Δ
integration 98.02% <97.43%> (+<0.01%) ⬆️
unit 98.02% <97.43%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.87% <96.77%> (+<0.01%) ⬆️
OutsideTasks 98.03% <96.77%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
services/report/raw_upload_processor.py 95.75% <100.00%> (ø)
services/tests/test_report.py 100.00% <ø> (ø)
tasks/tests/integration/test_upload_e2e.py 100.00% <100.00%> (ø)
services/report/__init__.py 96.96% <96.66%> (+0.11%) ⬆️

@codecov-qa
Copy link

codecov-qa bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 97.43590% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.02%. Comparing base (1fd3bd6) to head (de2c496).
Report is 1 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
services/report/__init__.py 96.66% 1 Missing ⚠️

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #745   +/-   ##
=======================================
  Coverage   98.02%   98.02%           
=======================================
  Files         438      438           
  Lines       36283    36305   +22     
=======================================
+ Hits        35566    35588   +22     
  Misses        717      717           
Flag Coverage Δ
integration 98.02% <97.43%> (+<0.01%) ⬆️
unit 98.02% <97.43%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.87% <96.77%> (+<0.01%) ⬆️
OutsideTasks 98.03% <96.77%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
services/report/raw_upload_processor.py 95.75% <100.00%> (ø)
services/tests/test_report.py 100.00% <ø> (ø)
tasks/tests/integration/test_upload_e2e.py 100.00% <100.00%> (ø)
services/report/__init__.py 96.96% <96.66%> (+0.11%) ⬆️

Copy link

codecov-public-qa bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 97.43590% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.02%. Comparing base (1fd3bd6) to head (de2c496).

✅ All tests successful. No failed tests found.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #745   +/-   ##
=======================================
  Coverage   98.02%   98.02%           
=======================================
  Files         438      438           
  Lines       36283    36305   +22     
=======================================
+ Hits        35566    35588   +22     
  Misses        717      717           
Flag Coverage Δ
integration 98.02% <97.43%> (+<0.01%) ⬆️
unit 98.02% <97.43%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.87% <96.77%> (+<0.01%) ⬆️
OutsideTasks 98.03% <96.77%> (+<0.01%) ⬆️
Files Coverage Δ
services/report/raw_upload_processor.py 95.75% <100.00%> (ø)
services/tests/test_report.py 100.00% <ø> (ø)
tasks/tests/integration/test_upload_e2e.py 100.00% <100.00%> (ø)
services/report/__init__.py 96.96% <96.66%> (+0.11%) ⬆️



@sentry_sdk.trace
def delete_uploads_by_sessionid(upload: Upload, session_ids: list[int]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 things here:

  1. I could see this taking some time because some customers have a gargantuan number of CFFs in their commits, so this might benefit from making a standalone task - unless we're wanting to do this synchronously. In which case, we'd take a bit extra time processing, so something to contemplate/measure in some way if we can
  2. How could this react with other sessions being processed in parallel? Not sure if while deleting we could run into errors if two or more processes are trying to dleete the same entry
  3. How could we test this in practice? I just want to make sure we aren't silently corrupting data by not acknowledging our reports were using data from cff's we deleted or something like that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. not quite sure the overhead of a separate task (both complexity and runtime) is reasonable for that. but you are right that this might be quite slow. However it is limited to only those uploads/sessions which have been overridden by an upload, so that should be rather limited.
  2. we are behind the "modify report" (upload processing) lock, so there is no parallelism here. also concurrent deletes shouldn’t be a problem either way, as we filter things by IN, which sql just ignores when no matching rows exist.
  3. good point, I don’t have a good idea for this yet. Though this change aims to align data in SQL with data inside of the report_json. So this is actually fixing a pre-existing "data corruption" if you think about it that way.

with these questions, you have brought up another good point: do we need to care about SQL-level locking here? I don’t think we need to, as the deletes are only locking the effected rows for the duration of the transaction (which I guess spans the whole task?), and those particular rows are only touched behind the upload processing lock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the deletions take place a short amount of time, I don't think we'd need to worry

Copy link
Contributor

@adrian-codecov adrian-codecov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When merged, lets keep a close look at how the DB and customers react to the data (e.g. be aware if there's any huge complains from our customers, if we hear anything from support, etc). I'd post a brief message in the eng/support channel to explain the change and have everyone ready

The `Upload`s in the database should mirror the `Session`s within a stored `Report`.

However, so far this was not the case for carryforwarded sessions/uploads.
The `Session`s were removed from the `Report` as they were being overridden by new uploads,
but that was not the case for the `Upload` rows in the database which were created from the carry-forwarded `Session`s on initial `Report` carry-forwarding.

This change will now make sure that the `Upload`s in the database match the `Session`s in the `Report`.
@Swatinem Swatinem added this pull request to the merge queue Oct 2, 2024
Merged via the queue into main with commit c435a68 Oct 2, 2024
36 of 40 checks passed
@Swatinem Swatinem deleted the swatinem/clear-cff-uploads branch October 2, 2024 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants