FEAT: Iterators API #809

igchor · 2020-10-06T11:42:17Z

FEAT: Iterators

Rationale

Iterators are more user-friendly (especially in languages like Python or Java). The iterators API is also more 'composable'. Since finding an element and modifying/reading from it would be separate actions we would not have to expose functions like (get_lower, get_higher, update_lower, update_higher, etc.).

Description

The capabilities of iterators will be at least the same as of existing get/get_all/get_above functions. With iterators we can also implement update feature without callbacks (#387).

API Changes

C++

class db {
    template <bool Const>
    iterator new_iterator(); // TODO: should be pass some configuration structure to this func to allow for future extensions?
};

namespace internal {
template <bool IsConst>
class iterator {
public:
    iterator(); // in ctor, we might take global locks if necessary (e.g. for csmap)
    ~iterator(); // drops any locks

    iterator(const iterator&) = delete;
    iterator(iterator&&);

    iterator& operator=(iterator&&); // TODO: do we want move assignment op?
    iterator& operator=(const iterator&) = delete;

    status seek(string_view key);
    status seek_lower(string_view key);
    status seek_lower_eq(string_view key);
    status seek_higher(string_view key);
    status seek_higher_eq(string_view key);

    status seek_to_first();
    status seek_to_last();

    status next();
    status prev(); // it can return NOT_SUPPORTED (e.g, for csmap)

    result<string_view, status> key();

    result<string_view, status> value(); // only for IsConst

    result<accessor, status> value();    // only for !IsConst, starts a transaction
                                       // TODO: should this function consume iterator?  
};

/* For some engines (e.g. cmap) accessor takes locks internally and starts a transacion
 * which makes it impossible to create other iterators while accessor is live.  For now, we should just
 * prohibit using multiple accessors for pmemobj engines. We can relax this requirement if there is
 * an active pmemkv transaction (see description below for batch update). */
class accessor {
    result<slice<InputIt<char>>, status> read_range(size_t pos, size_t n); // only for reading
    result<slice<OutputIt<char>>, status> write_range(size_t pos, size_t n); // only for writing (can point to the same memory as read_range or to WAL).
    // It is unspecified whether writes made through write_range are visible though read_range

    // TODO: should we also have read_for_update_range here (for MVCC based engines)?
    

     // TODO: if commit and abort consumes iterator (like proposed in C API) maybe it would be better
     // to make those functions as standalone functions taking rvalue ref, e.g.: status commit(accessor&&);
    status commit();
    void abort();
};
}

// Based on https://doc.rust-lang.org/std/result/enum.Result.html
template <typename Ok, typename Err>
class result {
    bool is_ok();
    bool is_error();
    
    // std::optional is available only since C++17 so we might need to think about some other approach
    std::optional<Ok> ok(); // discards error, if any
    std::optional<Err> err(); // discards value, if any
};

using read_iterator = internal::iterator<true>;
using write_iterator = internal::iterator<false>;

C

typedef struct pmemkv_read_iterator pmemkv_read_iterator;
typedef struct pmemkv_write_iterator pmemkv_write_iterator;
typedef struct pmemkv_accessor pmemkv_accessor;

pmemkv_read_iterator* pmemkv_read_iterator_new();
pmemkv_write_iterator* pmemkv_write_iterator_new();

void pmemkv_read_iterator_delete(pmemkv_read_iterator*);
void pmemkv_write_iterator_delete(pmemkv_write_iterator*);

// TODO: since we have two iterator types we either have to implement all those functions twice or take some generic
// paramater like void* or pmemkv_iterator*. In case of pmemkv_iterator* user would have to explicitly cast the pointer.     
int pmemkv_iterator_seek(void* it, const char* k, size_t kb);
int pmemkv_iterator_seek_lower(void* it, const char* k, size_t kb); 
int pmemkv_iterator_seek_lower_eq(void* it, const char* k, size_t kb);
int pmemkv_iterator_seek_higher(void* it, const char* k, size_t kb);
int pmemkv_iterator_seek_higher_eq(void* it, const char* k, size_t kb);

int pmemkv_iterator_seek_to_first(void* it);
int pmemkv_iterator_seek_to_last(void* it);

int pmemkv_iterator_next(void *it);
int pmemkv_iterator_prev(void *it);

int pmemkv_iterator_key(void* it, const char** k, size_t* kb);
int pmemkv_iterator_key(void *it, const char** k, size_t* kb);

int pmemkv_read_iterator_value(pmemkv_read_iterator* it, const char** val, size_t* kb);
int pmemkv_write_iterator_value(pmemkv_write_iterator* it, pmemkv_accessor** acc);

int pmemkv_accessor_read_range(pmemkv_accessor* acc, size_t pos, size_t n, const char** data, size_t *kb);
int pmemkv_accessor_write_range(pmemkv_accessor* acc, size_t pos, size_t n, char** data, size_t *kb);

// both commit and abort will consume the accessor (destroy it)
int pmemkv_accessor_commit(pmemkv_accessor*); 
void pmemkv_accessor_abort(pmemkv_accessor*);

// void pmemkv_accessor_delete(pmemkv_accessor*); // probably not needed if commit/abort consumes the iterator

Batch updates (Just an initial idea)

For current engines (at least cmap) we cannot use iterators inside of a transaction.
If we would like to add support for batch updates the only way would be to first obtain iterators
and only then start transaction/batch operation like this (this assumes we can hold several iterators per thread):

auto it1 = db.new_iterator<false>().seek("k1");
auto it2 = db.new_iterator<false>().seek("k2");

// auto acc0 = it1.get().accessor;
// acc0.write_range(...);
// TODO: if commit was not called before starting a tx, this is an error (or do we want the db.tx() to just create an inner tx?)
auto tx = db.tx();

auto acc1 = it1.get().accessor;
auto r1 = acc1.write_range(0, 10);
std::memset(r1.begin(), 2, r1.size());

// acc1.commit() // TODO: this is an error since this would commit the outer tx OR this does nothing

auto acc2 = it2.get().accessor;
auto r2 = acc2.write_range(0, 10);
std::memset(r2.begin(), 2, r2.size());

tx.update(std::move(acc1)); // consumes accessor and transfers any locks, any modifications to acc1 are visible only after tx.commit()
tx.update(std::move(acc2));

tx.commit();

Holding mutliple iterators

The biggest problem with holding multiple iterators (for concurrent engines) is that we can call lock on an already locked element. A solution to that might be keeping a list of all active iterators and if a user requests second iterator to the same element we just return him the same one (wrapped in some shared_ptr?). If then one of the iterators is modified (via this shared_ptr) we can use CoW approach to just modify one of them.

Other methods

The problem is similar to multiple iterators. If we call get(key) while already holding an iterator for key we might get undefined behavior. For put/get we could detect if we already hold an iterator and return appropriate status (calling get/put on different keys can suceed). For methods like defrag() we should probably always return an error in case any iterator is held.

Ref:
#387
#713

The text was updated successfully, but these errors were encountered:

karczex · 2020-10-07T11:42:27Z

It may be useful to add

* begin()
* end()

and optionally

* advance() 
* distance()

igchor · 2020-10-07T11:50:32Z

@karczex are you suggesting begin() and end() for db class? I think it's better to have just seek_to_first/last in iterator class to avoid duplication. We wouldn'y be able to just replace new_iterator() with begin or end because that would decrease performance (we would have to lock the first or last element even if you just want to seek to some other one).

karczex · 2020-10-07T12:27:29Z

why name it seek_to_first/last instead of begin/end ?

igchor · 2020-10-07T13:07:05Z

Well, traditionally begin()/end() returns you the iterator. Here, I already have an iterator and want to proceed.

Update from the meeting: any checks (for single accessor/single iterator per thread) should probably be done just in debug (asserts). Unless we want to actually track all iterator, then returning a status in case we call get() on key which is already hold by iterator would not introduce much additional overhead.

igchor added the Type: Feature label Oct 6, 2020

igchor mentioned this issue Oct 12, 2020

FEAT: transactions #810

Open

JanDorniak99 mentioned this issue Oct 21, 2020

Iterators API #831

Merged

szyrom closed this as completed in #831 Dec 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FEAT: Iterators API #809

FEAT: Iterators API #809

igchor commented Oct 6, 2020 •

edited

Loading

karczex commented Oct 7, 2020

igchor commented Oct 7, 2020

karczex commented Oct 7, 2020

igchor commented Oct 7, 2020

FEAT: Iterators API #809

FEAT: Iterators API #809

Comments

igchor commented Oct 6, 2020 • edited Loading

FEAT: Iterators

Rationale

Description

API Changes

Batch updates (Just an initial idea)

Holding mutliple iterators

Other methods

karczex commented Oct 7, 2020

igchor commented Oct 7, 2020

karczex commented Oct 7, 2020

igchor commented Oct 7, 2020

igchor commented Oct 6, 2020 •

edited

Loading