Add Windows native method to retrieve the number of allocated bytes on disk for file #79698

Merged
merged 24 commits into from
Nov 5, 2021
Changes from 9 commits
@@ -14,12 +14,15 @@
import com.sun.jna.Pointer;
import com.sun.jna.Structure;
import com.sun.jna.WString;
import com.sun.jna.ptr.IntByReference;
import com.sun.jna.win32.StdCallLibrary;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.lucene.util.Constants;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
@@ -306,4 +309,50 @@ protected List<String> getFieldOrder() {
* @return true if the function succeeds
*/
native boolean SetInformationJobObject(Pointer job, int infoClass, Pointer info, int infoLength);

/**
* Retrieves the actual number of bytes of disk storage used to store a specified file.
*
* https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-getcompressedfilesizew
*
* @param lpFileName the path string
* @param lpFileSizeHigh pointer to high-order DWORD for compressed file size (or null if not needed)
* @return the low-order DWORD for compressed file size
*/
native int GetCompressedFileSizeW(WString lpFileName, IntByReference lpFileSizeHigh);
Contributor

Should we define this in test/framework rather than server since it's not used in prod code today? Or are we planning to use it in prod later on (e.g. expose sparse sizes in stats)?

Member Author

Personally, I'd like this to be returned as part of the searchable snapshots Cache Stats API, so that information about the shared cache and the cold cache sits together in one place. What do you think?

I can try to make it work in the test framework; it should be OK as long as the static instance is created before the security manager is installed.

Member Author

Also: I'd like to have an assertion in CacheFile itself that retrieves the allocated size on disk, compares it with the completed ranges of the cache file, and verifies that the size is not completely out of bounds relative to those ranges.

That would also require the method to be in server. I tried to make it work in the plugin itself, but for the JNA kernel32 library it's very difficult given all the permissions it requires (createClassLoader, setSecurityManager, etc.). Linux is OK though.

Contributor

👍 Having these stats in the future seems like a good plan, so I'm OK with leaving the method where it is.
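
The assertion idea discussed in this thread (comparing the allocated size on disk with the completed ranges of a cache file) could look roughly like the sketch below. Everything here is an assumption made for illustration: `AllocatedSizeCheck`, `completedBytes`, `isPlausible`, the `long[]{start, end}` range encoding, and the block-size slack are hypothetical and do not come from the PR or from `CacheFile`.

```java
import java.util.List;

// Hypothetical helper, not part of the PR: checks that an allocated size is plausible
// given the byte ranges that have been written so far.
final class AllocatedSizeCheck {
    private AllocatedSizeCheck() {}

    /** Sums the lengths of half-open ranges encoded as {start, end}. */
    static long completedBytes(List<long[]> completedRanges) {
        long total = 0L;
        for (long[] range : completedRanges) {
            total += range[1] - range[0];
        }
        return total;
    }

    /**
     * Returns true if the allocated size is non-negative and does not exceed the completed bytes
     * by more than the assumed slack. The slack assumes each range can be rounded outwards to a
     * block boundary at both ends, which is an illustrative choice rather than a file system rule.
     */
    static boolean isPlausible(long allocatedBytes, List<long[]> completedRanges, long blockSize) {
        final long slack = 2L * blockSize * completedRanges.size();
        return allocatedBytes >= 0L && allocatedBytes <= completedBytes(completedRanges) + slack;
    }
}
```

Only an upper bound is checked, since compression or sparse regions can make the allocated size smaller than the number of bytes written.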


/**
* Retrieves the actual number of bytes of disk storage used to store a specified file. If the file is located on a volume that supports
* compression and the file is compressed, the value obtained is the compressed size of the specified file. If the file is located on a
* volume that supports sparse files and the file is a sparse file, the value obtained is the sparse size of the specified file.
*
* This method uses Win32 DLL native method {@link #GetCompressedFileSizeW(WString, IntByReference)}.
*
* @param path the path to the file
* @return the number of allocated bytes on disk for the file or {@code null} if the allocated size is invalid
*/
Long allocatedSizeInBytes(Path path) {
assert Files.isRegularFile(path) : path;
final WString fileName = new WString("\\\\?\\" + path);
final IntByReference lpFileSizeHigh = new IntByReference();

final int lpFileSizeLow = GetCompressedFileSizeW(fileName, lpFileSizeHigh);
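if (lpFileSizeLow == 0xffffffff) {
// 0xffffffff (INVALID_FILE_SIZE) can also be a valid low-order value when lpFileSizeHigh is provided,
// so it only indicates a failure if the last error is not 0 (NO_ERROR)
final int err = Native.getLastError();
if (err != 0) {
logger.warn("error [{}] when executing native method GetCompressedFileSizeW for file [{}]", err, path);
return null;
}
}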

final long allocatedSize = (((long) lpFileSizeHigh.getValue()) << 32) | (lpFileSizeLow & 0xffffffffL);
if (logger.isTraceEnabled()) {
logger.trace(
"executing native method GetCompressedFileSizeW returned [high={}, low={}, allocated={}] for file [{}]",
lpFileSizeHigh.getValue(),
lpFileSizeLow,
allocatedSize,
path
);
}
return allocatedSize;
}
}
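
The bit arithmetic in `allocatedSizeInBytes` is easy to get wrong because Java's `int` is signed. A minimal standalone sketch shows why the low-order DWORD is masked before it is OR-ed into the result. The names `FileSizeWords`, `combine`, and `combineNaive` are hypothetical and not part of the PR.

```java
// Hypothetical illustration of combining two 32-bit halves into one 64-bit size.
final class FileSizeWords {
    private FileSizeWords() {}

    /** Combines high and low 32-bit words, treating the low word as unsigned. */
    static long combine(int high, int low) {
        // Masking with 0xffffffffL widens the low word without sign extension.
        return (((long) high) << 32) | (low & 0xffffffffL);
    }

    /** Same combination without the mask: a negative low word overwrites the high bits. */
    static long combineNaive(int high, int low) {
        return (((long) high) << 32) | low;
    }

    public static void main(String[] args) {
        System.out.println(combine(0, 0xffffffff));      // prints 4294967295
        System.out.println(combineNaive(0, 0xffffffff)); // prints -1
        System.out.println(combine(1, 0));               // prints 4294967296
    }
}
```

Without the mask, any file whose low-order word has its top bit set (sizes from 2 GiB up to 4 GiB, modulo 4 GiB) would be reported with a corrupted high half.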
15 changes: 15 additions & 0 deletions server/src/main/java/org/elasticsearch/bootstrap/JNANatives.java
@@ -11,9 +11,11 @@
import com.sun.jna.Native;
import com.sun.jna.Pointer;
import com.sun.jna.WString;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.lucene.util.Constants;
import org.elasticsearch.core.Nullable;
import org.elasticsearch.monitor.jvm.JvmInfo;

import java.nio.file.Path;
@@ -260,4 +262,17 @@ static void tryInstallSystemCallFilter(Path tmpFile) {
}
}

/**
* Returns the number of allocated bytes on disk for a given file.
*
* @param path the path to the file
* @return the number of allocated bytes on disk for the file or {@code null} if the allocated size could not be returned
*/
@Nullable
static Long allocatedSizeInBytes(Path path) {
if (Constants.WINDOWS) {
Member

If the intent is to have this for stats, shouldn't we have a nix implementation as well?

Member Author

I plan to add it in a follow-up pull request.

return JNAKernel32Library.getInstance().allocatedSizeInBytes(path);
}
return null;
}
}
16 changes: 15 additions & 1 deletion server/src/main/java/org/elasticsearch/bootstrap/Natives.java
@@ -10,14 +10,15 @@

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.core.Nullable;

import java.nio.file.Path;

/**
* The Natives class is a wrapper class that checks if the classes necessary for calling native methods are available on
* startup. If they are not available, this class will avoid calling code that loads these classes.
*/
final class Natives {
public final class Natives {
Member

I'm not sure about making this public. We've kept this package-private because all this native stuff, thus far, is for Elasticsearch startup, hence the bootstrap package. If we want to start using other native functionality for utilities like this, perhaps we should have, in this case for example, a filesystem utils class (outside of bootstrap; we could still force init early, before the SM is set). But I would like to know @ChrisHegarty's thoughts.

Member Author

Yes, that's the most problematic point of this change, so I wanted to have your opinion.

I initially wanted it to be in the searchable snapshots plugin, but I failed to make it work (I think it is doable for Linux, but definitely hard to do right for Windows/kernel32).

I like the idea of having it outside of bootstrap: I'm going to update the PR in this direction, and I'll ask for Chris's opinion when he comes back.

/** no instantiation */
private Natives() {}

@@ -133,4 +134,17 @@ static boolean isSystemCallFilterInstalled() {
return JNANatives.LOCAL_SYSTEM_CALL_FILTER;
}

/**
* Returns the number of allocated bytes on disk for a given file.
*
* @param path the path to the file
* @return the number of allocated bytes on disk for the file or {@code null} if the allocated size could not be returned
*/
@Nullable
public static Long allocatedSizeInBytes(Path path) {
if (JNA_AVAILABLE == false) {
return null;
}
return JNANatives.allocatedSizeInBytes(path);
}
}
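
The guard in `Natives.allocatedSizeInBytes` follows the pattern described in the class comment: check once whether the native binding classes can be loaded, and return `null` instead of touching them when they cannot. The sketch below illustrates that pattern in isolation. It is not the actual `Natives` code: `OptionalNative`, `isClassPresent`, and the `Function` standing in for the native call are assumptions made for illustration.

```java
import java.nio.file.Path;
import java.util.function.Function;

// Hypothetical stand-in for a wrapper that avoids calling into optional native bindings.
final class OptionalNative {
    private final boolean available;
    private final Function<Path, Long> nativeCall;

    OptionalNative(String requiredClassName, Function<Path, Long> nativeCall) {
        // The availability check runs once, when the wrapper is created.
        this.available = isClassPresent(requiredClassName);
        this.nativeCall = nativeCall;
    }

    /** Returns true if the class can be located without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, OptionalNative.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the allocated size, or null when the native binding is unavailable. */
    Long allocatedSizeInBytes(Path path) {
        if (available == false) {
            return null; // never reach code that would load the missing classes
        }
        return nativeCall.apply(path);
    }
}
```

Callers then have to handle `null` as "unknown", which is the contract the `@Nullable` annotation on the PR's methods documents.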
@@ -7,7 +7,10 @@
package org.elasticsearch.xpack.searchablesnapshots.cache.common;

import org.apache.lucene.store.AlreadyClosedException;
import org.apache.lucene.util.Constants;
import org.apache.lucene.util.LuceneTestCase;
import org.apache.lucene.util.SetOnce;
import org.elasticsearch.bootstrap.Natives;
import org.elasticsearch.common.UUIDs;
import org.elasticsearch.common.util.concurrent.DeterministicTaskQueue;
import org.elasticsearch.core.PathUtils;
@@ -20,11 +23,13 @@
import org.hamcrest.Matcher;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.FileSystem;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
@@ -38,13 +43,16 @@
import static org.elasticsearch.xpack.searchablesnapshots.cache.common.TestUtils.randomPopulateAndReads;
import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.greaterThanOrEqualTo;
import static org.hamcrest.Matchers.hasSize;
import static org.hamcrest.Matchers.instanceOf;
import static org.hamcrest.Matchers.is;
import static org.hamcrest.Matchers.lessThan;
import static org.hamcrest.Matchers.notNullValue;
import static org.hamcrest.Matchers.nullValue;
import static org.hamcrest.Matchers.sameInstance;

@LuceneTestCase.SuppressFileSystems("DisableFsyncFS") // required by {@link #testCacheFileCreatedAsSparseFile()}
public class CacheFileTests extends ESTestCase {

private static final CacheFile.ModificationListener NOOP = new CacheFile.ModificationListener() {
@@ -57,7 +65,7 @@ public void onCacheFileDelete(CacheFile cacheFile) {}

private static final CacheKey CACHE_KEY = new CacheKey("_snap_uuid", "_snap_index", new ShardId("_name", "_uuid", 0), "_filename");

public void testGetCacheKey() throws Exception {
public void testGetCacheKey() {
final Path file = createTempDir().resolve("file.new");
final CacheKey cacheKey = new CacheKey(
UUIDs.randomBase64UUID(random()),
@@ -380,6 +388,51 @@ public void testFSyncFailure() throws Exception {
}
}

public void testCacheFileCreatedAsSparseFile() throws Exception {
assumeTrue("This test uses a native method implemented only for Windows", Constants.WINDOWS);
final long ONE_MB = 1 << 20;

final Path file = createTempDir().resolve(UUIDs.randomBase64UUID(random()));
final CacheFile cacheFile = new CacheFile(
new CacheKey("_snap_uuid", "_snap_name", new ShardId("_name", "_uid", 0), "_filename"),
ONE_MB,
file,
NOOP
);
assertFalse(Files.exists(file));

final TestEvictionListener listener = new TestEvictionListener();
cacheFile.acquire(listener);
try {
final FileChannel fileChannel = cacheFile.getChannel();
assertTrue(Files.exists(file));

Long sizeOnDisk = Natives.allocatedSizeInBytes(file);
assertThat(sizeOnDisk, equalTo(0L));

// Write 1 byte at the last position in the cache file.
// For non-sparse files, Windows would allocate the full file on disk in order to write a single byte at the end,
// making the next assertion fail.
fill(fileChannel, Math.toIntExact(cacheFile.getLength() - 1L), Math.toIntExact(cacheFile.getLength()));
fileChannel.force(false);

sizeOnDisk = Natives.allocatedSizeInBytes(file);
assertThat("Cache file should be sparse and not fully allocated on disk", sizeOnDisk, lessThan(ONE_MB));

fill(fileChannel, 0, Math.toIntExact(cacheFile.getLength()));
fileChannel.force(false);

sizeOnDisk = Natives.allocatedSizeInBytes(file);
assertThat(
"Cache file should be fully allocated on disk (maybe more given cluster/block size)",
sizeOnDisk,
greaterThanOrEqualTo(ONE_MB)
);
} finally {
cacheFile.release(listener);
}
}

static class TestEvictionListener implements EvictionListener {

private final SetOnce<CacheFile> evicted = new SetOnce<>();
@@ -440,4 +493,24 @@ private static FSyncTrackingFileSystemProvider setupFSyncCountingFileSystem() {
PathUtilsForTesting.installMock(provider.getFileSystem(null));
return provider;
}

private static void fill(FileChannel fileChannel, int from, int to) {
final byte[] buffer = new byte[Math.min(Math.max(0, to - from), 1024)];
Arrays.fill(buffer, (byte) 0xff);
assert fileChannel.isOpen();

try {
int written = 0;
int remaining = to - from;
while (remaining > 0) {
final int len = Math.min(remaining, buffer.length);
fileChannel.write(ByteBuffer.wrap(buffer, 0, len), from + written);
remaining -= len;
written += len;
}
assert written == to - from;
} catch (IOException e) {
throw new AssertionError(e);
}
}
}
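
The core step of `testCacheFileCreatedAsSparseFile` (write a single byte at the last position, then look at the file) can be tried outside the test framework with plain NIO. The sketch below assumes the file is opened with `StandardOpenOption.SPARSE`, which is only a hint and is ignored on file systems without sparse file support; whether `CacheFile` opens its channel this way is not shown in this diff. `SparseFileSketch`, `writeLastByte`, and `demo` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration: the logical size grows to the full length after one positional write.
final class SparseFileSketch {
    private SparseFileSketch() {}

    /** Creates the file, writes one byte at position length - 1 and returns the logical file size. */
    static long writeLastByte(Path file, long length) throws IOException {
        try (FileChannel channel = FileChannel.open(
                file,
                StandardOpenOption.CREATE_NEW, // SPARSE is only taken into account together with CREATE_NEW
                StandardOpenOption.WRITE,
                StandardOpenOption.SPARSE)) {
            // A positional write past the current end of file grows the file to fit the new byte.
            channel.write(ByteBuffer.wrap(new byte[] { (byte) 0xff }), length - 1);
            channel.force(false);
        }
        return Files.size(file);
    }

    /** Runs the sketch in a temporary directory and cleans up afterwards. */
    static long demo() {
        try {
            final Path dir = Files.createTempDirectory("sparse-sketch");
            final Path file = dir.resolve("cache.bin");
            try {
                return writeLastByte(file, 1L << 20);
            } finally {
                Files.deleteIfExists(file);
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

`Files.size` reports the logical length (1 MiB here) regardless of sparseness, which is why the test needs a native call such as `GetCompressedFileSizeW` to observe how many bytes are actually allocated on disk.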