Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional limits on hole reporting #14641

Merged
merged 1 commit into from
Mar 28, 2023
Merged

Conversation

behlendorf
Copy link
Contributor

@behlendorf behlendorf commented Mar 16, 2023

Motivation and Context

Issue #14512

Description

Holding the zp->z_rangelock as a RL_READER over the range 0-UINT64_MAX is sufficient to prevent the dnode from being re-dirtied by concurrent writers. To avoid potentially looping multiple times for external caller which do not take the rangelock holes are not reported after the first sync. While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to zvol_cdev_ioctl().

How Has This Been Tested?

Verified the fix behaves as expected using the test case in #14512.

The check in dnode_is_dirty() could probably be further refined. However, we need to be careful about making it too expensive. It's reasonable to allow a safe early abort in the uncommon case where a constantly modified file is being searched for holes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@rincebrain
Copy link
Contributor

I'm not sure, thinking about it, why we can't just return holes that were there after at most one sync, rather than returning EBUSY?

We're not allowed to return holes that didn't exist, and returning no holes is definitely always safe, but there's not some atomic operation for "seek to hole/data and write/read thing there", so I don't think you get any promises about it still being there after the call returns, just that it was when you did it.

I could be missing something, of course, it just seems a shame to have to say "no holes at all" for remotely busy files, when I don't think you get any promises about how long the call is accurate for...

@behlendorf
Copy link
Contributor Author

I think you could make a reasonable case for that behavior. Once the system call has returned all bets are off anyway regarding if that hole still exists. My feeling however is that in this case we want to play it as safe as possible. Just to make sure we avoid any other unexpected issues.

Ideally, I think what we'd want to do is not force the sync at all. But instead walk the dirty records for the file and apply them to the current on-disk blocks to determine what the file will look like when it finally hits disk. Though that of course comes with it's own set of challenges since we need to be able to make some assertions about how the I/O pipeline will handle those records.

Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#14512
@behlendorf
Copy link
Contributor Author

I'm not sure, thinking about it, why we can't just return holes that were there after at most one sync, rather than returning EBUSY?

I've updated the patch to handle this a little bit better. By overextending the rangelock to the end of file we can properly lock out any process appending to the file and at worst require a single sync. However I refrained from adding an ASSERT for this because Lustre uses this interface without the rangelock and we don't want to break that.

@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Mar 28, 2023
@behlendorf behlendorf merged commit 64bfa6b into openzfs:master Mar 28, 2023
behlendorf added a commit to behlendorf/zfs that referenced this pull request Mar 28, 2023
Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#14512
Closes openzfs#14641
andrewc12 pushed a commit to andrewc12/openzfs that referenced this pull request Mar 28, 2023
Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#14512
Closes openzfs#14641
behlendorf added a commit that referenced this pull request Mar 29, 2023
Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14512
Closes #14641
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Mar 29, 2023
Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#14512
Closes openzfs#14641
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Apr 5, 2023
Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#14512
Closes openzfs#14641
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this pull request Sep 26, 2023
Holding the zp->z_rangelock as a RL_READER over the range
0-UINT64_MAX is sufficient to prevent the dnode from being
re-dirtied by concurrent writers.  To avoid potentially
looping multiple times for external caller which do not
take the rangelock holes are not reported after the first
sync.  While not optimal this is always functionally correct.

This change adds the missing rangelock calls on FreeBSD to
zvol_cdev_ioctl().

Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#14512
Closes openzfs#14641
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants