Create cache files with CREATE_NEW & SPARSE options #79371

tlrx · 2021-10-18T16:01:23Z

Cold and Frozen tiers use a local cache for some Lucene files. This cache is based on files stored on disk, and to limit the required disk space it relies on the operating system's support for sparse files.

When Elasticsearch creates the cache files it uses the following standard options:

StandardOpenOption.READ,
StandardOpenOption.WRITE,
StandardOpenOption.CREATE,
StandardOpenOption.SPARSE

Sadly those options are not enough to enable sparse file for the cache files. Instead it should use the CREATE_NEW option, as it is documented in the Java NIO API:

/**
     * Sparse file. When used with the {@link #CREATE_NEW} option then this
     * option provides a <em>hint</em> that the new file will be sparse. The
     * option is ignored when the file system does not support the creation of
     * sparse files.
     */
    SPARSE,

The consequence today is that sparse file support if not enable for operating systems that do not enable it by default, like NTFS.

Since partially mounted indices ignores disk watermarks and report a size on disk of 0 bytes it means that the cache files can grow and use much more disk space than needed, potentially filling up the disk.

Operating systems that enables sparse file by default, like most Linux distributions, should not be affected.

This pull request adds a created flag to the CacheFile class that indicates if the file is expected to already exist on disk or not. This flag is used to create or open the file with the correct options.

This bug will be documented in a follow up. I also plan to spend some time looking if we can use JNA to add assertions that the shared cache file is effectively sparse on Windows/Linux etc.

elasticmachine · 2021-10-18T16:01:26Z

Pinging @elastic/es-distributed (Team:Distributed)

original-brownbear

LGTM, though maybe adjust the naming a little :)

original-brownbear · 2021-10-19T10:11:26Z

...pshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/common/CacheFile.java

+     * Indicates if the file should be created when it is open for the first time.
+     * This is required to pass the right options for sparse file support.
+     */
+    private volatile boolean created;


Maybe name this fileExists and just describe it as "true if the physical cache file exists on disk" or so? I find the description and name very confusing tbh.

Sure, I pushed 3bbef1e

ywelsch

LGTM

ywelsch · 2021-10-19T10:31:43Z

...pshots/src/main/java/org/elasticsearch/xpack/searchablesnapshots/cache/common/CacheFile.java

+     * Indicates if the file should be created when it is open for the first time.
+     * This is required to pass the right options for sparse file support.
+     */
+    private volatile boolean created;


tlrx · 2021-10-19T12:18:51Z

Thanks Yannick and Armin!

* Create cache files with CREATE_NEW & SPARSE options * assert + doc * rename

elasticsearchmachine · 2021-10-19T12:19:44Z

💔 Backport failed

Status	Branch	Result
✅	7.x
❌	7.15	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 79371

Backport of elastic#79371

* Create cache files with CREATE_NEW & SPARSE options * assert + doc * rename

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1.

* upstream/master: Validate tsdb's routing_path (elastic#79384) Adjust the BWC version for the return200ForClusterHealthTimeout field (elastic#79436) API for adding and removing indices from a data stream (elastic#79279) Exposing the ability to log deprecated settings at non-critical level (elastic#79107) Convert operator privilege license object to LicensedFeature (elastic#79407) Mute SnapshotBasedIndexRecoveryIT testSeqNoBasedRecoveryIsUsedAfterPrimaryFailOver (elastic#79456) Create cache files with CREATE_NEW & SPARSE options (elastic#79371) Revert "[ML] Use a new annotations index for future annotations (elastic#79151)" [ML] Use a new annotations index for future annotations (elastic#79151) [ML] Removing legacy code from ML/transform auditor (elastic#79434) Fix rate agg with custom `_doc_count` (elastic#79346) Optimize SLM Policy Queries (elastic#79341) Fix execution of exists query within nested queries on field with doc_values disabled (elastic#78841) Stricter UpdateSettingsRequest parsing on the REST layer (elastic#79227) Do not release snapshot file download permit during recovery retries (elastic#79409) Preserve request headers in a mixed version cluster (elastic#79412) Adjust versions after elastic#79044 backport to 7.x (elastic#79424) Mute BulkByScrollUsesAllScrollDocumentsAfterConflictsIntegTests (elastic#79429) Fail on SSPL licensed x-pack sources (elastic#79348) # Conflicts: # server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java

Backport of #79371

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1.

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of #79473

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of #79473

…n disk for file (#79698) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371

…n disk for file (elastic#79698) In elastic#79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates elastic#79371

…n disk for file (#79698) (#80426) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371 Co-authored-by: Elastic Machine <[email protected]>

…n disk for file (#79698) (#80427) In #79371 we fixed a bug where cache files were not created as sparse files on Windows platforms because the wrong options were used when creating the files for the first time. This bug got unnoticed as we were lacking a way to retrieve the exact number of bytes allocated for a given file on disk. This commit adds a FileSystemNatives.allocatedSizeInBytes(Path) method for that exact purpose (only implemented for Windows for now) and a test in CacheFileTests that would fail on Windows if the cache file is not sparse. Relates #79371 Co-authored-by: Elastic Machine <[email protected]>

Create cache files with CREATE_NEW & SPARSE options

8b78ed2

tlrx added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Oct 18, 2021

elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Oct 18, 2021

ywelsch added the v7.15.2 label Oct 18, 2021

tlrx added 2 commits October 19, 2021 09:40

Merge branch 'master' into create-sparse_new

66b1e07

assert + doc

174954f

tlrx requested review from DaveCTurner, ywelsch and original-brownbear October 19, 2021 09:23

original-brownbear approved these changes Oct 19, 2021

View reviewed changes

ywelsch approved these changes Oct 19, 2021

View reviewed changes

rename

3bbef1e

tlrx added auto-backport-and-merge auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Oct 19, 2021

elasticsearchmachine merged commit a3f057c into elastic:master Oct 19, 2021

tlrx mentioned this pull request Oct 19, 2021

[7.x] Create cache files with CREATE_NEW & SPARSE options (#79371) #79458

Merged

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Create cache files with CREATE_NEW & SPARSE options (elastic#79371)

c772d36

* Create cache files with CREATE_NEW & SPARSE options * assert + doc * rename

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Create cache files with CREATE_NEW & SPARSE options (elastic#79371)

8b5ad36

Backport of elastic#79371

tlrx mentioned this pull request Oct 19, 2021

Create cache files with CREATE_NEW & SPARSE options (#79371) #79465

Merged

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021

Create cache files with CREATE_NEW & SPARSE options (#79371) (#79458)

68a6ce0

* Create cache files with CREATE_NEW & SPARSE options * assert + doc * rename

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Add known issue docs for elastic#79371

3e6a44d

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1.

tlrx mentioned this pull request Oct 19, 2021

Add known issue docs for #79371 #79473

Merged

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021

Create cache files with CREATE_NEW & SPARSE options (#79371) (#79465)

49f7318

Backport of #79371

tlrx added a commit that referenced this pull request Oct 19, 2021

Add known issue docs for #79371 (#79473)

44b8450

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1.

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Add known issue docs for elastic#79371

53343b6

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473

tlrx mentioned this pull request Oct 19, 2021

[7.15] Add known issue docs for #79371 #79484

Merged

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Add known issue docs for elastic#79371

b46aea6

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473

tlrx mentioned this pull request Oct 19, 2021

[7.14] Add known issue docs for #79371 #79485

Merged

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021

Add known issue docs for #79371 (#79484)

c0e1dda

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of #79473

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Add known issue docs for elastic#79371

28a8877

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473

tlrx mentioned this pull request Oct 19, 2021

[7.13] Add known issue docs for #79371 #79487

Merged

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Oct 19, 2021

Add known issue docs for elastic#79371

7bb0809

This commit adds the bug elastic#79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of elastic#79473

tlrx mentioned this pull request Oct 19, 2021

[7.12] Add known issue docs for #79371 #79488

Merged

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021

Add known issue docs for #79371 (#79485)

e954c1d

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of #79473

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021

Add known issue docs for #79371 (#79487)

c779a1b

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of #79473

elasticsearchmachine pushed a commit that referenced this pull request Oct 19, 2021

Add known issue docs for #79371 (#79488)

5e2ce98

This commit adds the bug #79371 as a known issue in documentation from 7.12.0 to 7.15.1. Backport of #79473

tlrx mentioned this pull request Oct 25, 2021

Add Windows native method to retrieve the number of allocated bytes on disk for file #79698

Merged

jakelandis added v8.0.0-beta1 and removed v8.0.0 labels Oct 27, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create cache files with CREATE_NEW & SPARSE options #79371

Create cache files with CREATE_NEW & SPARSE options #79371

tlrx commented Oct 18, 2021 •

edited

Loading

elasticmachine commented Oct 18, 2021

original-brownbear left a comment

original-brownbear Oct 19, 2021

ywelsch Oct 19, 2021

tlrx Oct 19, 2021

ywelsch left a comment

ywelsch Oct 19, 2021

tlrx commented Oct 19, 2021

elasticsearchmachine commented Oct 19, 2021

Create cache files with CREATE_NEW & SPARSE options #79371

Create cache files with CREATE_NEW & SPARSE options #79371

Conversation

tlrx commented Oct 18, 2021 • edited Loading

elasticmachine commented Oct 18, 2021

original-brownbear left a comment

Choose a reason for hiding this comment

original-brownbear Oct 19, 2021

Choose a reason for hiding this comment

ywelsch Oct 19, 2021

Choose a reason for hiding this comment

tlrx Oct 19, 2021

Choose a reason for hiding this comment

ywelsch left a comment

Choose a reason for hiding this comment

ywelsch Oct 19, 2021

Choose a reason for hiding this comment

tlrx commented Oct 19, 2021

elasticsearchmachine commented Oct 19, 2021

💔 Backport failed

tlrx commented Oct 18, 2021 •

edited

Loading