Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bindings to git_indexer #911

Merged
merged 5 commits into from
Feb 27, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 32 additions & 0 deletions libgit2-sys/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ pub const GIT_REFDB_BACKEND_VERSION: c_uint = 1;
pub const GIT_CHERRYPICK_OPTIONS_VERSION: c_uint = 1;
pub const GIT_APPLY_OPTIONS_VERSION: c_uint = 1;
pub const GIT_REVERT_OPTIONS_VERSION: c_uint = 1;
pub const GIT_INDEXER_OPTIONS_VERSION: c_uint = 1;

macro_rules! git_enum {
(pub enum $name:ident { $($variants:tt)* }) => {
Expand Down Expand Up @@ -91,6 +92,7 @@ pub enum git_odb_object {}
pub enum git_worktree {}
pub enum git_transaction {}
pub enum git_mailmap {}
pub enum git_indexer {}

#[repr(C)]
pub struct git_revspec {
Expand Down Expand Up @@ -354,6 +356,14 @@ pub type git_indexer_progress_cb =
)]
pub type git_transfer_progress = git_indexer_progress;

#[repr(C)]
pub struct git_indexer_options {
pub version: c_uint,
pub progress_cb: git_indexer_progress_cb,
pub progress_cb_payload: *mut c_void,
pub verify: c_uchar,
}

pub type git_remote_ready_cb = Option<extern "C" fn(*mut git_remote, c_int, *mut c_void) -> c_int>;

#[repr(C)]
Expand Down Expand Up @@ -3801,6 +3811,28 @@ extern "C" {
) -> c_int;
pub fn git_packbuilder_free(pb: *mut git_packbuilder);

// indexer
pub fn git_indexer_new(
out: *mut *mut git_indexer,
path: *const c_char,
mode: c_uint,
odb: *mut git_odb,
opts: *mut git_indexer_options,
) -> c_int;
pub fn git_indexer_append(
idx: *mut git_indexer,
data: *const c_void,
size: size_t,
stats: *mut git_indexer_progress,
) -> c_int;
pub fn git_indexer_commit(idx: *mut git_indexer, stats: *mut git_indexer_progress) -> c_int;
#[deprecated = "use `git_indexer_name` to retrieve the filename"]
pub fn git_indexer_hash(idx: *const git_indexer) -> *const git_oid;
pub fn git_indexer_name(idx: *const git_indexer) -> *const c_char;
pub fn git_indexer_free(idx: *mut git_indexer);

pub fn git_indexer_options_init(opts: *mut git_indexer_options, version: c_uint) -> c_int;

// odb
pub fn git_repository_odb(out: *mut *mut git_odb, repo: *mut git_repository) -> c_int;
pub fn git_odb_new(db: *mut *mut git_odb) -> c_int;
Expand Down
166 changes: 164 additions & 2 deletions src/indexer.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,13 @@
use std::marker;
use std::ffi::CStr;
use std::mem::MaybeUninit;
use std::path::Path;
use std::{io, marker, mem, ptr};

use crate::raw;
use libc::c_void;

use crate::odb::{write_pack_progress_cb, OdbPackwriterCb};
use crate::util::Binding;
use crate::{raw, Error, IntoCString, Odb};

/// Struct representing the progress by an in-flight transfer.
pub struct Progress<'a> {
Expand Down Expand Up @@ -94,3 +100,159 @@ impl<'a> Binding for Progress<'a> {
)]
#[allow(dead_code)]
pub type TransportProgress<'a> = IndexerProgress<'a>;

/// A stream to write and index a packfile
///
/// This is equivalent to [`crate::OdbPackwriter`], but allows to store the pack
/// and index at an arbitrary path. It also does not require access to an object
/// database if, and only if, the pack file is self-contained (i.e. not "thin").
pub struct Indexer<'odb> {
raw: *mut raw::git_indexer,
progress: MaybeUninit<raw::git_indexer_progress>,
progress_payload_ptr: *mut OdbPackwriterCb<'odb>,
}

impl<'a> Indexer<'a> {
/// Create a new indexer
///
/// The [`Odb`] is used to resolve base objects when fixing thin packs. It
/// can be `None` if no thin pack is expected, in which case missing bases
/// will result in an error.
///
/// `mode` is the permissions to use for the output files, use `0` for defaults.
///
/// If `verify` is `false`, the indexer will bypass object connectivity checks.
pub fn new(odb: Option<&Odb<'a>>, path: &Path, mode: u32, verify: bool) -> Result<Self, Error> {
let path = path.into_c_string()?;

let odb = odb.map(Binding::raw).unwrap_or_else(ptr::null_mut);

let mut out = ptr::null_mut();
let progress = MaybeUninit::uninit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble seeing where progress is ever initialized. I see it created here, and then a few calls to as_mut_ptr, but I don't see where it gets initialized before as_mut_ptr is called. Can you help me understand how this works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, actually, OdbPackwriter does the same. I assumed it works because MaybeUninit<T> has the same size and alignment as T, so the C side can just write to it (i.e. it is initialized after the first time we pass as_mut_ptr over to libgit2).

I don't know why this would be preferable over deriving Default for git_indexer_progress, and casting to a raw pointer where needed. If I made this change, would you want me to do it for OdbPackwriter as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, it looks like OdbPackwriter should also be updated. From what I can tell, the C side is writing to uninitialized memory which is UB to my understanding.

let progress_cb: raw::git_indexer_progress_cb = Some(write_pack_progress_cb);
let progress_payload = Box::new(OdbPackwriterCb { cb: None });
let progress_payload_ptr = Box::into_raw(progress_payload);

unsafe {
let mut opts = mem::zeroed();
try_call!(raw::git_indexer_options_init(
&mut opts,
raw::GIT_INDEXER_OPTIONS_VERSION
));
opts.progress_cb = progress_cb;
opts.progress_cb_payload = progress_payload_ptr as *mut c_void;
opts.verify = verify.into();

try_call!(raw::git_indexer_new(&mut out, path, mode, odb, &mut opts));
}

Ok(Self {
raw: out,
progress,
progress_payload_ptr,
})
}

/// Finalize the pack and index
///
/// Resolves any pending deltas and writes out the index file. The returned
/// string is the hexadecimal checksum of the packfile, which is also used
/// to name the pack and index files (`pack-<checksum>.pack` and
/// `pack-<checksum>.idx` respectively).
pub fn commit(mut self) -> Result<String, Error> {
unsafe {
try_call!(raw::git_indexer_commit(
self.raw,
self.progress.as_mut_ptr()
));

let name = CStr::from_ptr(raw::git_indexer_name(self.raw));
Ok(name.to_str().expect("pack name not utf8").to_owned())
}
}

/// The callback through which progress is monitored. Be aware that this is
/// called inline, so performance may be affected.
pub fn progress<F>(&mut self, cb: F) -> &mut Self
where
F: FnMut(Progress<'_>) -> bool + 'a,
{
let progress_payload =
unsafe { &mut *(self.progress_payload_ptr as *mut OdbPackwriterCb<'_>) };
progress_payload.cb = Some(Box::new(cb) as Box<IndexerProgress<'a>>);

self
}
}

impl io::Write for Indexer<'_> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
unsafe {
let ptr = buf.as_ptr() as *mut c_void;
let len = buf.len();

let res = raw::git_indexer_append(self.raw, ptr, len, self.progress.as_mut_ptr());

if res < 0 {
Err(io::Error::new(io::ErrorKind::Other, "Write error"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate that it erases the libgit2 error. I assume you are just copying this from OdbWriter. I'm curious why this was implemented as io::Write instead of an inherent write method that would expose a native error message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think io::Write makes sense because we can just io::copy the pack bytes from e.g. a network socket.

That being said, Error::last_error would indeed give the native error here (while it wouldn't for OdbPackwriter), so I'll change that.

} else {
Ok(buf.len())
}
}
}

fn flush(&mut self) -> io::Result<()> {
Ok(())
}
}

impl Drop for Indexer<'_> {
fn drop(&mut self) {
unsafe {
raw::git_indexer_free(self.raw);
drop(Box::from_raw(self.progress_payload_ptr))
}
}
}

#[cfg(test)]
mod tests {
use crate::{Buf, Indexer};
use std::io::prelude::*;

#[test]
fn indexer() {
let (_td, repo_source) = crate::test::repo_init();
let (_td, repo_target) = crate::test::repo_init();

let mut progress_called = false;

// Create an in-memory packfile
let mut builder = t!(repo_source.packbuilder());
let mut buf = Buf::new();
let (commit_source_id, _tree) = crate::test::commit(&repo_source);
t!(builder.insert_object(commit_source_id, None));
t!(builder.write_buf(&mut buf));

// Write it to the standard location in the target repo, but via indexer
let odb = repo_source.odb().unwrap();
let mut indexer = Indexer::new(
Some(&odb),
repo_target.path().join("objects").join("pack").as_path(),
0o644,
true,
)
.unwrap();
indexer.progress(|_| {
progress_called = true;
true
});
indexer.write(&buf).unwrap();
indexer.commit().unwrap();

// Assert that target repo picks it up as valid
let commit_target = repo_target.find_commit(commit_source_id).unwrap();
assert_eq!(commit_target.id(), commit_source_id);
assert!(progress_called);
}
}
2 changes: 1 addition & 1 deletion src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ pub use crate::error::Error;
pub use crate::index::{
Index, IndexConflict, IndexConflicts, IndexEntries, IndexEntry, IndexMatchedPath,
};
pub use crate::indexer::{IndexerProgress, Progress};
pub use crate::indexer::{Indexer, IndexerProgress, Progress};
pub use crate::mailmap::Mailmap;
pub use crate::mempack::Mempack;
pub use crate::merge::{AnnotatedCommit, MergeOptions};
Expand Down
8 changes: 5 additions & 3 deletions src/odb.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
use std::io;
use std::marker;
use std::mem::MaybeUninit;

kim marked this conversation as resolved.
Show resolved Hide resolved
use std::ptr;
use std::slice;

Expand All @@ -10,6 +11,7 @@ use libc::{c_char, c_int, c_uint, c_void, size_t};

use crate::panic;
use crate::util::Binding;

kim marked this conversation as resolved.
Show resolved Hide resolved
use crate::{
raw, Error, IndexerProgress, Mempack, Object, ObjectType, OdbLookupFlags, Oid, Progress,
};
Expand Down Expand Up @@ -438,8 +440,8 @@ impl<'repo> io::Write for OdbWriter<'repo> {
}
}

struct OdbPackwriterCb<'repo> {
cb: Option<Box<IndexerProgress<'repo>>>,
pub(crate) struct OdbPackwriterCb<'repo> {
pub(crate) cb: Option<Box<IndexerProgress<'repo>>>,
}

/// A stream to write a packfile to the ODB
Expand Down Expand Up @@ -542,7 +544,7 @@ extern "C" fn foreach_cb(id: *const raw::git_oid, payload: *mut c_void) -> c_int
.unwrap_or(1)
}

extern "C" fn write_pack_progress_cb(
pub(crate) extern "C" fn write_pack_progress_cb(
stats: *const raw::git_indexer_progress,
payload: *mut c_void,
) -> c_int {
Expand Down