Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Flaky-test: testBookieContinueWritingIfMultipleLedgersPresent #3421

Merged

Conversation

wenbingshen
Copy link
Member

@wenbingshen wenbingshen commented Jul 24, 2022

Motivation

Fixes issue #2665

When the disk is full, InterleavedLedgerStorage creates a new entryLog and index file, copies and moves the old index (newFile) FileInfo to the new directory, and finally deletes the old index file.

The old file of FileInfo was deleted, but it was copied and moved a newFile to the new directory. Therefore, the delete status flag should be restored. If it is not set to false,checkOpen will throw FileInfoDeletedException when judging the fence status of ledgerHandle when adding an entry, causing the writing to fail. When the client changes the ensemble, it throws an exception because the test has only two bookie nodes and cannot satisfy the two writeSet: Not enough non-faulty bookies available.

    public synchronized boolean isFenced() throws IOException {
        checkOpen(false);
        return (stateBits & STATE_FENCED_BIT) == STATE_FENCED_BIT;
    }

    private synchronized void checkOpen(boolean create, boolean openBeforeClose)
            throws IOException {
        if (deleted) {
            throw new FileInfoDeletedException();
        }
      .....
    }

Changes

When FileInfo moves to a new directory by moveToNewLocation,delete the old index file and restore the delete flag, because the new index file is ready.

Set the disk check interval to the Integer.MAX_VALUE, which is stable
assertEquals("writable dirs should have one dir", 1, ledgerDirsManager .getWritableLedgerDirs().size());
Because once the test thrashes, the disk check thread will restore the full disk to a writable state.

newConf.setDiskCheckInterval(Integer.MAX_VALUE);

@wenbingshen
Copy link
Member Author

ping @hangc0276 @dlg99 @shoothzj @eolivelli PTAL.

@hangc0276 hangc0276 merged commit 1908b7e into apache:master Jul 26, 2022
@wenbingshen wenbingshen deleted the fixFlakyTestWritingMultipleLedgers branch July 29, 2022 10:14
zymap pushed a commit that referenced this pull request Aug 2, 2022
### Motivation

Fixes issue #2665

When the disk is full, `InterleavedLedgerStorage` creates a new entryLog and index file, copies and moves the old index (newFile) `FileInfo` to the new directory, and finally deletes the old index file.

The old file of `FileInfo` was `deleted`, but it was copied and moved a newFile to the new directory. Therefore, the delete status flag should be restored. If it is not set to false,`checkOpen` will throw `FileInfoDeletedException` when judging the fence status of `ledgerHandle` when adding an entry, causing the writing to fail. When the client changes the `ensemble`, it throws an exception because the test has only two bookie nodes and cannot satisfy the two writeSet: `Not enough non-faulty bookies available`.
```java
    public synchronized boolean isFenced() throws IOException {
        checkOpen(false);
        return (stateBits & STATE_FENCED_BIT) == STATE_FENCED_BIT;
    }

    private synchronized void checkOpen(boolean create, boolean openBeforeClose)
            throws IOException {
        if (deleted) {
            throw new FileInfoDeletedException();
        }
      .....
    }
```

### Changes
When `FileInfo` moves to a new directory by `moveToNewLocation`,delete the old index file and restore the `delete` flag, because the new index file is ready.

Set the disk check interval to the `Integer.MAX_VALUE`, which is stable
`assertEquals("writable dirs should have one dir", 1, ledgerDirsManager
                .getWritableLedgerDirs().size());`
Because once the test thrashes, the disk check thread will restore the full disk to a writable state.
```java
newConf.setDiskCheckInterval(Integer.MAX_VALUE);
```

(cherry picked from commit 1908b7e)
hangc0276 pushed a commit to hangc0276/bookkeeper that referenced this pull request Nov 5, 2022
…#3421)

Fixes issue apache#2665

When the disk is full, `InterleavedLedgerStorage` creates a new entryLog and index file, copies and moves the old index (newFile) `FileInfo` to the new directory, and finally deletes the old index file.

The old file of `FileInfo` was `deleted`, but it was copied and moved a newFile to the new directory. Therefore, the delete status flag should be restored. If it is not set to false,`checkOpen` will throw `FileInfoDeletedException` when judging the fence status of `ledgerHandle` when adding an entry, causing the writing to fail. When the client changes the `ensemble`, it throws an exception because the test has only two bookie nodes and cannot satisfy the two writeSet: `Not enough non-faulty bookies available`.
```java
    public synchronized boolean isFenced() throws IOException {
        checkOpen(false);
        return (stateBits & STATE_FENCED_BIT) == STATE_FENCED_BIT;
    }

    private synchronized void checkOpen(boolean create, boolean openBeforeClose)
            throws IOException {
        if (deleted) {
            throw new FileInfoDeletedException();
        }
      .....
    }
```

When `FileInfo` moves to a new directory by `moveToNewLocation`,delete the old index file and restore the `delete` flag, because the new index file is ready.

Set the disk check interval to the `Integer.MAX_VALUE`, which is stable
`assertEquals("writable dirs should have one dir", 1, ledgerDirsManager
                .getWritableLedgerDirs().size());`
Because once the test thrashes, the disk check thread will restore the full disk to a writable state.
```java
newConf.setDiskCheckInterval(Integer.MAX_VALUE);
```

(cherry picked from commit 1908b7e)
hangc0276 pushed a commit to hangc0276/bookkeeper that referenced this pull request Nov 7, 2022
…#3421)

Fixes issue apache#2665

When the disk is full, `InterleavedLedgerStorage` creates a new entryLog and index file, copies and moves the old index (newFile) `FileInfo` to the new directory, and finally deletes the old index file.

The old file of `FileInfo` was `deleted`, but it was copied and moved a newFile to the new directory. Therefore, the delete status flag should be restored. If it is not set to false,`checkOpen` will throw `FileInfoDeletedException` when judging the fence status of `ledgerHandle` when adding an entry, causing the writing to fail. When the client changes the `ensemble`, it throws an exception because the test has only two bookie nodes and cannot satisfy the two writeSet: `Not enough non-faulty bookies available`.
```java
    public synchronized boolean isFenced() throws IOException {
        checkOpen(false);
        return (stateBits & STATE_FENCED_BIT) == STATE_FENCED_BIT;
    }

    private synchronized void checkOpen(boolean create, boolean openBeforeClose)
            throws IOException {
        if (deleted) {
            throw new FileInfoDeletedException();
        }
      .....
    }
```

When `FileInfo` moves to a new directory by `moveToNewLocation`,delete the old index file and restore the `delete` flag, because the new index file is ready.

Set the disk check interval to the `Integer.MAX_VALUE`, which is stable
`assertEquals("writable dirs should have one dir", 1, ledgerDirsManager
                .getWritableLedgerDirs().size());`
Because once the test thrashes, the disk check thread will restore the full disk to a writable state.
```java
newConf.setDiskCheckInterval(Integer.MAX_VALUE);
```

(cherry picked from commit 1908b7e)
nicoloboschi pushed a commit to datastax/bookkeeper that referenced this pull request Jan 11, 2023
…#3421)

Fixes issue apache#2665

When the disk is full, `InterleavedLedgerStorage` creates a new entryLog and index file, copies and moves the old index (newFile) `FileInfo` to the new directory, and finally deletes the old index file.

The old file of `FileInfo` was `deleted`, but it was copied and moved a newFile to the new directory. Therefore, the delete status flag should be restored. If it is not set to false,`checkOpen` will throw `FileInfoDeletedException` when judging the fence status of `ledgerHandle` when adding an entry, causing the writing to fail. When the client changes the `ensemble`, it throws an exception because the test has only two bookie nodes and cannot satisfy the two writeSet: `Not enough non-faulty bookies available`.
```java
    public synchronized boolean isFenced() throws IOException {
        checkOpen(false);
        return (stateBits & STATE_FENCED_BIT) == STATE_FENCED_BIT;
    }

    private synchronized void checkOpen(boolean create, boolean openBeforeClose)
            throws IOException {
        if (deleted) {
            throw new FileInfoDeletedException();
        }
      .....
    }
```

When `FileInfo` moves to a new directory by `moveToNewLocation`,delete the old index file and restore the `delete` flag, because the new index file is ready.

Set the disk check interval to the `Integer.MAX_VALUE`, which is stable
`assertEquals("writable dirs should have one dir", 1, ledgerDirsManager
                .getWritableLedgerDirs().size());`
Because once the test thrashes, the disk check thread will restore the full disk to a writable state.
```java
newConf.setDiskCheckInterval(Integer.MAX_VALUE);
```

(cherry picked from commit 1908b7e)
(cherry picked from commit da532fc)
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
…#3421)

### Motivation

Fixes issue apache#2665 

When the disk is full, `InterleavedLedgerStorage` creates a new entryLog and index file, copies and moves the old index (newFile) `FileInfo` to the new directory, and finally deletes the old index file.

The old file of `FileInfo` was `deleted`, but it was copied and moved a newFile to the new directory. Therefore, the delete status flag should be restored. If it is not set to false,`checkOpen` will throw `FileInfoDeletedException` when judging the fence status of `ledgerHandle` when adding an entry, causing the writing to fail. When the client changes the `ensemble`, it throws an exception because the test has only two bookie nodes and cannot satisfy the two writeSet: `Not enough non-faulty bookies available`.
```java
    public synchronized boolean isFenced() throws IOException {
        checkOpen(false);
        return (stateBits & STATE_FENCED_BIT) == STATE_FENCED_BIT;
    }

    private synchronized void checkOpen(boolean create, boolean openBeforeClose)
            throws IOException {
        if (deleted) {
            throw new FileInfoDeletedException();
        }
      .....
    }
``` 

### Changes
When `FileInfo` moves to a new directory by `moveToNewLocation`,delete the old index file and restore the `delete` flag, because the new index file is ready.


Set the disk check interval to the `Integer.MAX_VALUE`, which is stable 
`assertEquals("writable dirs should have one dir", 1, ledgerDirsManager
                .getWritableLedgerDirs().size());`
Because once the test thrashes, the disk check thread will restore the full disk to a writable state.
```java
newConf.setDiskCheckInterval(Integer.MAX_VALUE);
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants