Add Windows native method to retrieve the number of allocated bytes on disk for file #79698

Merged
merged 24 commits into from
Nov 5, 2021
@@ -322,8 +322,11 @@ protected List<String> getFieldOrder() {
native int GetCompressedFileSizeW(WString lpFileName, IntByReference lpFileSizeHigh);
Contributor

Should we define this in test/framework rather than server since it's not used in prod code today? Or are we planning to use it in prod later on (e.g. expose sparse sizes in stats)?

Member Author

Personally, I'd like this to be returned as part of the searchable snapshots Cache Stats API, so that we have information about the shared cache and the cold cache in one place. What do you think?

I can try to make it work in the test framework; it should be OK as long as the static instance is created before the security manager is installed.

Member Author

Also: I'd like to have an assertion in CacheFile itself that retrieves the allocated size on disk, compares it with the completed ranges of the cache file, and verifies that the size is not completely out of line with those ranges.

That would also require the method to be in server. I tried to make it work in the plugin itself, but with the JNA kernel32 library it is very difficult given all the permissions it requires (createClassLoader, setSecurityManager, etc.). Linux is OK though.

Contributor

👍 having these stats in future seems like a good plan, so I'm ok to leave the method where it is.
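
As a rough illustration of the CacheFile assertion proposed above, here is a minimal sketch. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the 4096-byte block size is a guess at the allocation granularity, and the upper-bound policy (each completed range rounded outward to block boundaries) is one possible reading of "not completely out of bounds", not what the PR implements.

```java
// Hypothetical sketch, not code from this PR: checks that the allocated size on disk
// is plausible given the completed byte ranges of a cache file.
final class AllocatedSizeCheckSketch {

    /**
     * Upper bound on allocated bytes: every completed range [start, end) is rounded
     * outward to block boundaries. Blocks shared by two ranges are counted twice,
     * which only makes the bound looser.
     */
    static long upperBound(long[][] completedRanges, long blockSize) {
        long total = 0L;
        for (long[] range : completedRanges) {
            long firstBlock = range[0] / blockSize;                        // round start down
            long lastBlockExclusive = (range[1] + blockSize - 1) / blockSize; // round end up
            total += (lastBlockExclusive - firstBlock) * blockSize;
        }
        return total;
    }

    /** Returns true if the allocated size does not exceed the bound (assumed policy). */
    static boolean isPlausible(long allocatedSize, long[][] completedRanges, long blockSize) {
        return allocatedSize <= upperBound(completedRanges, blockSize);
    }

    public static void main(String[] args) {
        // One byte written at the start and one at the end of a 1 MiB file.
        long[][] ranges = { { 0L, 1L }, { 1048575L, 1048576L } };
        System.out.println(upperBound(ranges, 4096L));            // prints 8192
        System.out.println(isPlausible(1048576L, ranges, 4096L)); // prints false
    }
}
```

No lower bound is checked here, since compression or delayed allocation could legitimately make the allocated size smaller than the completed ranges.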


/**
- * Returns the number of allocated bytes on disk for a given file. This method uses Kernel32 DDL native method
- * {@link #GetCompressedFileSizeW(WString, IntByReference)} to retrieve the allocated size of the file.
+ * Retrieves the actual number of bytes of disk storage used to store a specified file. If the file is located on a volume that supports
+ * compression and the file is compressed, the value obtained is the compressed size of the specified file. If the file is located on a
+ * volume that supports sparse files and the file is a sparse file, the value obtained is the sparse size of the specified file.
+ *
+ * This method uses Win32 DLL native method {@link #GetCompressedFileSizeW(WString, IntByReference)}.
*
* @param path the path to the file
* @return the number of allocated bytes on disk for the file or {@code null} if the allocated size is invalid
@@ -342,7 +345,7 @@ Long allocatedSizeInBytes(Path path) {

final long allocatedSize = (((long) lpFileSizeHigh.getValue()) << 32) | (lpFileSizeLow & 0xffffffffL);
if (logger.isTraceEnabled()) {
- logger.trace("native method GetCompressedFileSizeW returned [{}] for file [{}]", allocatedSize, path);
+ logger.trace("executing native method GetCompressedFileSizeW returned [{}] for file [{}]", allocatedSize, path);
}
return allocatedSize;
}
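
The hunk above rebuilds a 64-bit size from the two 32-bit halves returned by GetCompressedFileSizeW. The following is a minimal standalone sketch of that bit arithmetic only; the class and method names are hypothetical, and it makes no native call. It shows why the low half is masked with 0xffffffffL before the OR.

```java
// Hypothetical sketch: combining the high and low 32-bit halves of a file size.
final class AllocatedSizeSketch {

    /** Same expression as in the diff: the mask keeps the low half unsigned. */
    static long combine(int high, int low) {
        return (((long) high) << 32) | (low & 0xffffffffL);
    }

    /** Without the mask, a low half with its top bit set is sign-extended and corrupts the result. */
    static long combineWithoutMask(int high, int low) {
        return (((long) high) << 32) | low;
    }

    public static void main(String[] args) {
        int high = 1;
        int low = 0x80000000; // top bit set, so negative as a signed int
        System.out.println(combine(high, low));            // prints 6442450944
        System.out.println(combineWithoutMask(high, low)); // prints -2147483648
    }
}
```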
@@ -405,10 +405,13 @@ public void testCacheFileCreatedAsSparseFile() throws Exception {
Long sizeOnDisk = Natives.allocatedSizeInBytes(file);
assertThat(sizeOnDisk, equalTo(0L));

+ // write 1 byte at the last position in the cache file.
+ // For non sparse files, Windows would allocate the full file on disk in order to write a single byte at the end,
+ // making the next assertion fails.
fill(cacheFile.getChannel(), Math.toIntExact(cacheFile.getLength() - 1L), Math.toIntExact(cacheFile.getLength()));

sizeOnDisk = Natives.allocatedSizeInBytes(file);
- assertThat(sizeOnDisk, not(equalTo(1048576L)));
+ assertThat("Cache file should be sparse and not fully allocated on disk", sizeOnDisk, not(equalTo(1048576L)));
Contributor

Should we be asserting its size is <1MiB rather than ≠1MiB? I worry that some funny rounding (e.g. counting the size of the directory entry) could pass this assertion even for non-sparse files.

After we make this assertion could we fill the file with genuine data and assert that its size on disk is now ≥1MiB?

Member Author

> Should we be asserting its size is <1MiB rather than ≠1MiB?

Yes, that is more appropriate (I think I did it and then reverted it somehow 🤔).

> After we make this assertion could we fill the file with genuine data and assert that its size on disk is now ≥1MiB?

Yes to that too.

I pushed f109191
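
To make the concern in this thread concrete, here is a small sketch comparing the two checks. The names are hypothetical, and the 4096 bytes of extra overhead is an invented figure standing in for the "funny rounding" mentioned above; the actual change is in the pushed commit, not here.

```java
// Hypothetical sketch: why "< 1 MiB" is a stronger sparse-file check than "!= 1 MiB".
final class SparseAssertionSketch {

    static final long ONE_MIB = 1024L * 1024L;

    /** The original check: passes for anything that is not exactly 1 MiB. */
    static boolean passesNotEqual(long sizeOnDisk) {
        return sizeOnDisk != ONE_MIB;
    }

    /** The suggested check: passes only if less than the full file is allocated. */
    static boolean passesLessThan(long sizeOnDisk) {
        return sizeOnDisk < ONE_MIB;
    }

    public static void main(String[] args) {
        // Assumed case: a fully allocated (non-sparse) file reported with some extra overhead.
        long fullyAllocatedWithOverhead = ONE_MIB + 4096L;
        System.out.println(passesNotEqual(fullyAllocatedWithOverhead)); // prints true (false positive)
        System.out.println(passesLessThan(fullyAllocatedWithOverhead)); // prints false
    }
}
```

The follow-up check suggested in the thread (size on disk ≥ 1 MiB after filling the file with real data) would be the mirror image of `passesLessThan`.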

} finally {
cacheFile.release(listener);
}