Comparing changes

base repository: khonsulabs/okaywal
base: main
head repository: spaceandtimelabs/okaywal
compare: main
Can’t automatically merge. Don’t worry, you can still create the pull request.
  • 15 commits
  • 7 files changed
  • 3 contributors

Commits on Aug 1, 2023

  1. Add the ability to checkpoint-on-commit and to wait for a given entry to be checkpointed

    This adds two abilities to okaywal:
    - The ability to tell it to perform a checkpoint right after committing
      an entry. This allows control over when the checkpointing code runs.
    - The ability to wait for a certain entry_id to be checkpointed, with
      a timeout.
    David Alves committed Aug 1, 2023
    2729070
  2. Merge pull request #1 from spaceandtimelabs/dralves-checkpointing

    Add the ability to checkpoint-on-commit and to wait for a given entry to be checkpointed
    dralves authored Aug 1, 2023
    00a1339
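
The "wait for a certain entry_id to be checkpointed, with a timeout" behavior described above can be sketched with a mutex and a condition variable. This is only an illustration under assumptions: `CheckpointTracker`, `mark_checkpointed`, and `wait_for_checkpoint` are hypothetical names, entry ids are modeled as plain `u64`, and none of this is taken from the fork's actual code.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical tracker of the highest checkpointed entry id.
/// Illustrative only; not okaywal's real type.
#[derive(Default)]
pub struct CheckpointTracker {
    last_checkpointed: Mutex<Option<u64>>,
    changed: Condvar,
}

impl CheckpointTracker {
    /// Records that every entry up to and including `entry_id` is checkpointed
    /// and wakes all waiters so they can re-check their condition.
    pub fn mark_checkpointed(&self, entry_id: u64) {
        let mut last = self.last_checkpointed.lock().unwrap();
        let is_newer = match *last {
            Some(current) => entry_id > current,
            None => true,
        };
        if is_newer {
            *last = Some(entry_id);
        }
        self.changed.notify_all();
    }

    /// Blocks until `entry_id` has been checkpointed or `timeout` elapses.
    /// Returns `true` if the entry was checkpointed in time.
    pub fn wait_for_checkpoint(&self, entry_id: u64, timeout: Duration) -> bool {
        let guard = self.last_checkpointed.lock().unwrap();
        let (_guard, result) = self
            .changed
            // Keep waiting while the condition "not yet checkpointed" holds.
            .wait_timeout_while(guard, timeout, |last| match *last {
                Some(done) => done < entry_id,
                None => true,
            })
            .unwrap();
        !result.timed_out()
    }
}
```

A caller would commit an entry, keep its id, and then call `wait_for_checkpoint(id, timeout)`; a `false` result means the timeout expired first, not that the checkpoint failed.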

Commits on Aug 11, 2023

  1. Cleanup: Added some logging and prevented checkpointing threads from quitting

    David Alves committed Aug 11, 2023
    0ac9343
  2. Merge pull request #2 from spaceandtimelabs/dralves-no-quitting

    Cleanup: Added some logging and prevented checkpointing threads from quitting
    dralves authored Aug 11, 2023
    909783f

Commits on Aug 16, 2023

  1. Change the checkpointing logic to reuse reclaim

    Checkpointing depends on the file being switched before
    the checkpointing is started but that was never happening
    because reclaim would only occur when the file was full
    (and not when the file was forced to roll over).
    
    This fixes that by embedding the checkpointing logic in
    the pre-existing roll-over logic.
    David Alves committed Aug 16, 2023
    f4b9ccd
  2. Merge pull request #3 from spaceandtimelabs/dralves-fix-cp

    Change the checkpointing logic to reuse reclaim
    yjshen authored Aug 16, 2023
    82f2da2

Commits on Aug 22, 2023

  1. Allow to obtain the number of pending checkpoints

    This is a measure of backpressure: if the number of pending
    checkpoints grows too large, it means we're falling behind.
    
    This also makes us panic!() when a checkpoint fails.
    It's important to fail here because most of the time we don't
    know how to recover from a failed checkpoint.
    David Alves committed Aug 22, 2023
    578f797
  2. Merge pull request #4 from spaceandtimelabs/dralves-len

    Allow to obtain the number of pending checkpoints
    yjshen authored Aug 22, 2023
    f526ba3
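
To illustrate how a pending-checkpoint count can serve as a backpressure signal, here is a minimal sketch. The `CheckpointQueue` type, its methods, the numeric file ids, and the `max_pending` threshold are all assumptions made for the example; the fork's real accessor may look quite different.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical queue of segment files awaiting a checkpoint.
/// Illustrative only; not okaywal's real type.
#[derive(Default)]
pub struct CheckpointQueue {
    pending: Mutex<VecDeque<u64>>,
}

impl CheckpointQueue {
    /// Registers a file (identified by a made-up numeric id) as waiting
    /// for a checkpoint.
    pub fn enqueue(&self, file_id: u64) {
        self.pending.lock().unwrap().push_back(file_id);
    }

    /// Marks the oldest pending checkpoint as finished and returns its id.
    pub fn complete_next(&self) -> Option<u64> {
        self.pending.lock().unwrap().pop_front()
    }

    /// Number of checkpoints requested but not yet completed.
    pub fn pending_checkpoints(&self) -> usize {
        self.pending.lock().unwrap().len()
    }
}

/// Returns `true` when a writer should slow down.
/// `max_pending` is an assumed, caller-chosen threshold.
pub fn should_throttle(queue: &CheckpointQueue, max_pending: usize) -> bool {
    queue.pending_checkpoints() > max_pending
}
```

A writer could poll `should_throttle` before starting a new entry and pause or shed load while it returns `true`.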

Commits on Dec 21, 2023

  1. Add a hard quota for disk usage in percent

    This adds a hard quota on disk usage, in percent, beyond which:
    - New entries will be rejected
    - Activating new files will fail
    
    The default quota is 95%.
    David Alves committed Dec 21, 2023
    b4570ec
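
A percentage quota check like the one described above might look as follows. This is a sketch under assumptions: `check_disk_quota` is a hypothetical helper, the caller is assumed to supply used and total bytes from some filesystem query, and the choice to reject only when usage is strictly above the quota is a guess, not something the commit message states.

```rust
use std::io;

/// Hypothetical quota check; not the fork's actual implementation.
/// Returns an error when `used_bytes / total_bytes` exceeds `max_percent`.
pub fn check_disk_quota(used_bytes: u64, total_bytes: u64, max_percent: u16) -> io::Result<()> {
    if max_percent > 100 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "max_percent must be between 0 and 100",
        ));
    }
    if total_bytes == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "total_bytes must be non-zero",
        ));
    }
    // Compare used/total > max/100 without floating point:
    // used * 100 > total * max. Widen to u128 so the products cannot overflow.
    let used_scaled = u128::from(used_bytes) * 100;
    let limit_scaled = u128::from(total_bytes) * u128::from(max_percent);
    if used_scaled > limit_scaled {
        Err(io::Error::new(
            io::ErrorKind::Other,
            "disk usage is above the configured quota",
        ))
    } else {
        Ok(())
    }
}
```

With the default of 95, a disk at 96% usage would cause new entries to be rejected in this sketch, while one at exactly 95% would still be accepted.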

Commits on Jan 2, 2024

  1. Merge pull request #6 from spaceandtimelabs/max_quota

    Add a hard quota for disk usage in percent
    yjshen authored Jan 2, 2024
    f132251

Commits on Jan 11, 2024

  1. fix space check logic

    yjshen committed Jan 11, 2024
    f241ae3
  2. Merge pull request #7 from spaceandtimelabs/fix_disk_quota_check

    fix disk space check logic
    dralves authored Jan 11, 2024
    0c42d33

Commits on Feb 5, 2024

  1. No more panics on errors & allow to inspect whether the checkpointing thread is running (#8)

    This adds two things. The first stops the checkpointing thread from
    panicking and instead lets it return the error. This will help
    in debugging cases where there are many checkpointing threads.
    
    The second adds a method to inspect whether the
    checkpointing thread is running.
    dralves authored Feb 5, 2024
    f0ab33a
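
Both ideas in this commit (returning the error instead of panicking, and inspecting whether the thread is still running) can be sketched with the standard library alone. `CheckpointWorker` and its methods are hypothetical names for illustration; the only real APIs used are `std::thread::spawn`, `JoinHandle::is_finished`, and `JoinHandle::join`.

```rust
use std::io;
use std::thread::{self, JoinHandle};

/// Hypothetical wrapper around a checkpointing thread.
/// Illustrative only; not okaywal's real type.
pub struct CheckpointWorker {
    handle: JoinHandle<io::Result<()>>,
}

impl CheckpointWorker {
    /// Spawns the worker. The closure returns an `io::Result` so that a
    /// failure ends the thread with an error value instead of a panic.
    pub fn spawn<F>(work: F) -> Self
    where
        F: FnOnce() -> io::Result<()> + Send + 'static,
    {
        Self {
            handle: thread::spawn(work),
        }
    }

    /// Reports whether the thread is still running.
    pub fn is_running(&self) -> bool {
        !self.handle.is_finished()
    }

    /// Waits for the thread to exit and surfaces its error, if any.
    /// A panic inside the thread is mapped to an `io::Error` as well.
    pub fn join(self) -> io::Result<()> {
        match self.handle.join() {
            Ok(result) => result,
            Err(_) => Err(io::Error::new(
                io::ErrorKind::Other,
                "checkpoint thread panicked",
            )),
        }
    }
}
```

Returning the error keeps the failure attributable to a specific worker, which is the debugging benefit the commit message mentions when many checkpointing threads are in play.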

Commits on Feb 22, 2024

  1. 0bc962b

Commits on Mar 12, 2024

  1. Truncate files on startup (#5)

    When the checkpoint thread fails, we accumulate a lot of files
    but the startup process doesn't trim them. This adds a max_inactive_files
    configuration and enforces it on startup.
    
    TODO: Finish unit test
    
    Co-authored-by: David Alves <david.alves@dfinity.org>
    dralves and David Alves authored Mar 12, 2024
    09db7a9
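
One way to enforce a `max_inactive_files` limit at startup is to pick the excess files before the log is opened for writing. The sketch below assumes files are identified by increasing numeric ids and that the oldest (lowest) ids are the ones to drop; both are assumptions for illustration, and `files_to_remove` is a hypothetical helper, not the fork's code.

```rust
/// Hypothetical selection of inactive files to delete at startup.
/// Assumes lower ids are older and should be removed first.
pub fn files_to_remove(mut inactive_ids: Vec<u64>, max_inactive_files: usize) -> Vec<u64> {
    // Oldest first, so the excess sits at the front of the vector.
    inactive_ids.sort_unstable();
    // Number of files beyond the limit; zero when already within it.
    let excess = inactive_ids.len().saturating_sub(max_inactive_files);
    // Keep only the excess prefix: these are the files to delete.
    inactive_ids.truncate(excess);
    inactive_ids
}
```

Keeping the selection separate from the actual deletion makes the policy easy to unit test without touching the filesystem.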
Showing with 387 additions and 54 deletions.
  1. +2 −1 .gitignore
  2. +2 −1 Cargo.toml
  3. +15 −0 src/config.rs
  4. +36 −17 src/entry.rs
  5. +172 −32 src/lib.rs
  6. +10 −2 src/log_file.rs
  7. +150 −1 src/tests.rs
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
 /target
 /Cargo.lock
 perf.data*
-.tmp*
+.tmp*
+.idea/
3 changes: 2 additions & 1 deletion Cargo.toml
@@ -15,7 +15,8 @@ parking_lot = "0.12.1"
 crc32c = "0.6.3"
 flume = "0.10.14"
 tracing = { version = "0.1.36", optional = true }
-file-manager = { git = "https://github.com/khonsulabs/file-manager", branch = "main" }
+file-manager = { git = "https://github.com/spaceandtimelabs/file-manager", branch = "main" }
+log = "0.4.19"

 [dev-dependencies]
 tempfile = "3.3.0"
15 changes: 15 additions & 0 deletions src/config.rs
@@ -32,6 +32,12 @@ pub struct Configuration<M> {
     /// was to allow detection of what format or version of a format the data
     /// was inside of the log without needing to parse the entries.
     pub version_info: Arc<Vec<u8>>,
+    /// The max number of inactive files to keep around.
+    /// This is a soft limit.
+    pub max_inactive_files: u32,
+    /// The maximum disk usage, in percent, before writes start to be rejected.
+    /// Must be a value between 0 and 100.
+    pub max_disk_usage_percent: u16,
 }

 impl Default for Configuration<StdFileManager> {
@@ -72,6 +78,8 @@ where
             checkpoint_after_bytes: kilobytes(768),
             buffer_bytes: kilobytes(16),
             version_info: Arc::default(),
+            max_inactive_files: 10,
+            max_disk_usage_percent: 95,
         }
     }
     /// Sets the number of bytes to preallocate for each segment file. Returns `self`.
@@ -105,6 +113,13 @@ where
         self
     }

+    /// Sets the maximum number of inactive files to keep around.
+    /// Returns 'self'.
+    pub fn max_inactive_files(mut self, max_inactive_files: u32) -> Self {
+        self.max_inactive_files = max_inactive_files;
+        self
+    }
+
     /// Opens the log using the provided log manager with this configuration.
     pub fn open<Manager: LogManager<M>>(self, manager: Manager) -> io::Result<WriteAheadLog<M>> {
         WriteAheadLog::open(self, manager)
53 changes: 36 additions & 17 deletions src/entry.rs
@@ -1,8 +1,7 @@
-use std::io::{self, Read, Write};
-
 use crc32c::crc32c_append;
 use file_manager::FileManager;
 use parking_lot::MutexGuard;
+use std::io::{self, Read, Write};

 use crate::{
     log_file::{LogFile, LogFileWriter},
@@ -59,6 +58,32 @@ where
         self.id
     }

+    fn commit_internal<F: FnOnce(&mut LogFileWriter<M::File>) -> io::Result<()>>(
+        &mut self,
+        callback: F,
+    ) -> io::Result<u64> {
+        let file = self.file.as_ref().expect("Already committed");
+        let mut writer = file.lock();
+        writer.write_all(&[END_OF_ENTRY])?;
+        let new_length = writer.position();
+        writer.set_last_entry_id(Some(self.id));
+        callback(&mut writer)?;
+        drop(writer);
+        Ok(new_length)
+    }
+
+    /// Commits this entry to the log and forces a checkpoint to happen.
+    ///
+    /// See `commit`.
+    pub fn commit_and_checkpoint(mut self) -> io::Result<EntryId> {
+        let new_length = self.commit_internal(|_file| Ok(()))?;
+        let id = self.id;
+        let file = self.file.take().expect("Already committed");
+        self.log
+            .reclaim(file, WriteResult::Entry { new_length }, true)?;
+        Ok(id)
+    }
+
     /// Commits this entry to the log. Once this call returns, all data is
     /// atomically updated and synchronized to disk.
     ///
@@ -73,19 +98,12 @@ where
         mut self,
         callback: F,
     ) -> io::Result<EntryId> {
-        let file = self.file.take().expect("already committed");
-
-        let mut writer = file.lock();
-
-        writer.write_all(&[END_OF_ENTRY])?;
-        let new_length = writer.position();
-        callback(&mut writer)?;
-        writer.set_last_entry_id(Some(self.id));
-        drop(writer);
-
-        self.log.reclaim(file, WriteResult::Entry { new_length })?;
-
-        Ok(self.id)
+        let new_length = self.commit_internal(callback)?;
+        let id = self.id;
+        let file = self.file.take().expect("file already dropped");
+        self.log
+            .reclaim(file, WriteResult::Entry { new_length }, false)?;
+        Ok(id)
     }

     /// Abandons this entry, preventing the entry from being recovered in the
@@ -101,8 +119,9 @@ where
         let mut writer = file.lock();
         writer.revert_to(self.original_length)?;
         drop(writer);
-
-        self.log.reclaim(file, WriteResult::RolledBack).unwrap();
+        self.log
+            .reclaim(file, WriteResult::RolledBack, false)
+            .unwrap();

         Ok(())
     }