-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create cache files with CREATE_NEW & SPARSE options #79371
Conversation
Pinging @elastic/es-distributed (Team:Distributed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though maybe adjust the naming a little :)
* Indicates if the file should be created when it is open for the first time. | ||
* This is required to pass the right options for sparse file support. | ||
*/ | ||
private volatile boolean created; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe name this fileExists
and just describe it as "true
if the physical cache file exists on disk" or so? I find the description and name very confusing tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I pushed 3bbef1e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* Indicates if the file should be created when it is open for the first time. | ||
* This is required to pass the right options for sparse file support. | ||
*/ | ||
private volatile boolean created; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Thanks Yannick and Armin! |
* Create cache files with CREATE_NEW & SPARSE options * assert + doc * rename
💔 Backport failed
You can use sqren/backport to manually backport by running |
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1.
* upstream/master: Validate tsdb's routing_path (elastic#79384) Adjust the BWC version for the return200ForClusterHealthTimeout field (elastic#79436) API for adding and removing indices from a data stream (elastic#79279) Exposing the ability to log deprecated settings at non-critical level (elastic#79107) Convert operator privilege license object to LicensedFeature (elastic#79407) Mute SnapshotBasedIndexRecoveryIT testSeqNoBasedRecoveryIsUsedAfterPrimaryFailOver (elastic#79456) Create cache files with CREATE_NEW & SPARSE options (elastic#79371) Revert "[ML] Use a new annotations index for future annotations (elastic#79151)" [ML] Use a new annotations index for future annotations (elastic#79151) [ML] Removing legacy code from ML/transform auditor (elastic#79434) Fix rate agg with custom `_doc_count` (elastic#79346) Optimize SLM Policy Queries (elastic#79341) Fix execution of exists query within nested queries on field with doc_values disabled (elastic#78841) Stricter UpdateSettingsRequest parsing on the REST layer (elastic#79227) Do not release snapshot file download permit during recovery retries (elastic#79409) Preserve request headers in a mixed version cluster (elastic#79412) Adjust versions after elastic#79044 backport to 7.x (elastic#79424) Mute BulkByScrollUsesAllScrollDocumentsAfterConflictsIntegTests (elastic#79429) Fail on SSPL licensed x-pack sources (elastic#79348) # Conflicts: # server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473
This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473
…n disk for file (#79698) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371
…n disk for file (elastic#79698) In elastic#79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates elastic#79371
…n disk for file (elastic#79698) In elastic#79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates elastic#79371
…n disk for file (#79698) (#80426) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371 Co-authored-by: Elastic Machine <[email protected]>
…n disk for file (#79698) (#80427) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371 Co-authored-by: Elastic Machine <[email protected]>
Cold and Frozen tiers use a local cache for some Lucene files. This cache is based on files stored on disk, and to limit the required disk space it relies on the operating system's support for sparse files.
When Elasticsearch creates the cache files it uses the following standard options:
Sadly those options are not enough to enable sparse file for the cache files. Instead it should use the
CREATE_NEW
option, as it is documented in the Java NIO API:The consequence today is that sparse file support if not enable for operating systems that do not enable it by default, like NTFS.
Since partially mounted indices ignores disk watermarks and report a size on disk of 0 bytes it means that the cache files can grow and use much more disk space than needed, potentially filling up the disk.
Operating systems that enables sparse file by default, like most Linux distributions, should not be affected.
This pull request adds a
created
flag to the CacheFile class that indicates if the file is expected to already exist on disk or not. This flag is used to create or open the file with the correct options.This bug will be documented in a follow up. I also plan to spend some time looking if we can use JNA to add assertions that the shared cache file is effectively sparse on Windows/Linux etc.