Make iter_diff() robust to lstat-only changes #650

Merged 2 commits on Apr 7, 2024

mih (Member) commented Apr 7, 2024

With this change, a dedicated `update-index` refresh run is performed before running `diff-index`. This fixes undesired modification reports that are triggered by an mtime change alone.

This comes at a performance cost. For a repo with 36k files (many-ish), it adds about 5% to the total runtime:

%timeit next_status('.', result_renderer='disabled', untracked='all', recursive='datasets')
229 ms ± 537 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit next_status('.', result_renderer='disabled', untracked='all', recursive='datasets')
240 ms ± 2.59 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Closes #639


codecov bot commented Apr 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.07%. Comparing base (12c6d53) to head (c6df601).

❗ Current head c6df601 differs from pull request most recent head 168ee4b. Consider uploading reports for the commit 168ee4b to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #650   +/-   ##
=======================================
  Coverage   93.07%   93.07%           
=======================================
  Files         171      171           
  Lines       11971    11973    +2     
  Branches     1805     1806    +1     
=======================================
+ Hits        11142    11144    +2     
  Misses        642      642           
  Partials      187      187           


mih (Member, Author) commented Apr 7, 2024

Failures are unrelated to the changes.

@mih mih merged commit c8ad167 into datalad:main Apr 7, 2024
4 of 6 checks passed
@mih mih deleted the bf-639 branch April 7, 2024 15:37
@mih mih added this to the 1.4 milestone May 7, 2024
Development

Successfully merging this pull request may close these issues.

Spurious modification reports from next-status