crud: add readview support
Added readview support for select and pairs.

Closes #343
better0fdead committed Sep 26, 2023
1 parent 2d3d479 commit 6c8952a
Showing 9 changed files with 3,815 additions and 36 deletions.
17 changes: 11 additions & 6 deletions CHANGELOG.md
@@ -5,12 +5,17 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

## Unreleased

### Added
* Read view support for select and pairs (#343).

## [1.2.0] - 07-06-23

### Added
* Add `noreturn` option for operations:
`insert`, `insert_object`, `insert_many`, `insert_object_many`,
`replace`, `replace_object`, `replace_many`, `replace_object_many`,
`upsert`, `upsert_object`, `upsert_many`, `upsert_object_many`,
`update`, `delete` (#267).

@@ -39,16 +44,16 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [1.0.0] - 02-02-23

### Added
* Add timeout condition for the validation of master presence in
replicaset and for the master connection (#95).
* Support Cartridge clusterwide configuration for `crud.cfg` (#332).

### Changed
* **Breaking**: forbid using space id in `crud.len` (#255).

### Fixed
* Add validation of the master presence in replicaset and the
master connection to the `utils.get_space` method before
receiving the space from the connection (#331).
* Fix fiber cancel on schema reload timeout in `call_reload_schema` (PR #336).

238 changes: 218 additions & 20 deletions README.md
@@ -37,6 +37,12 @@ It also provides the `crud-storage` and `crud-router` roles for
- [Count](#count)
- [Call options for crud methods](#call-options-for-crud-methods)
- [Statistics](#statistics)
- [Read view](#read-view)
- [Creating a read view](#creating-a-read-view)
- [Closing a read view](#closing-a-read-view)
- [Read view select](#read-view-select)
- [Read view select conditions](#read-view-select-conditions)
- [Read view pairs](#read-view-pairs)
- [Cartridge roles](#cartridge-roles)
- [Usage](#usage)
- [License](#license)
@@ -237,8 +243,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one inserted row, error.
@@ -308,8 +314,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuples
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array with inserted rows, array of errors.
@@ -450,8 +456,8 @@ where:
vshard router instance. Set this parameter if your space is not
a part of the default vshard cluster
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one row, error.
@@ -493,8 +499,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one updated row, error.
@@ -535,8 +541,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array contains one deleted row (empty for vinyl), error.
@@ -588,8 +594,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns inserted or replaced rows and metadata or nil with error.
@@ -659,8 +665,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuples
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array with inserted/replaced rows, array of errors.
@@ -801,8 +807,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuple
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and empty array of rows or nil, error.
@@ -868,8 +874,8 @@ where:
* `noreturn` (`?boolean`) - suppress successfully processed tuples
(first return value is `nil`). `false` by default
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default

Returns metadata and array of errors.
@@ -1014,8 +1020,8 @@ where:
* `yield_every` (`?number`) - number of tuples processed on storage to yield after,
`yield_every` should be > 0, default value is 1000
* `fetch_latest_metadata` (`?boolean`) - guarantees the
up-to-date metadata (space format) in first return value, otherwise
it may not take into account the latest migration of the data format.
Performance overhead is up to 15%. `false` by default


@@ -1541,6 +1547,198 @@ support preserving stats between role reload
(see [tarantool/metrics#334](https://github.com/tarantool/metrics/issues/334)),
thus this feature will be unsupported for `metrics` driver.

### Read view

A read view is an in-memory snapshot of the entire database that isn’t affected by future data modifications. Read views allow you to retrieve data using the `read_view_object:select()` and `read_view_object:pairs()` operations.

Read views can be used to make complex analytical queries. This reduces the load on the main database and improves RPS for a single Tarantool instance.

To reduce memory consumption and improve performance, Tarantool creates read views using the copy-on-write technique. Duplicating the entire data set is not required: Tarantool duplicates only the blocks that are modified after a read view is created.
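
The copy-on-write idea can be illustrated with a toy model in plain Lua. This is only a sketch of the general technique, not Tarantool's implementation: the `new_store` constructor, the `snapshot` method, and the `BLOCK_SIZE` constant are hypothetical names made up for this example.

```lua
-- Toy copy-on-write store (illustrative only, unrelated to Tarantool internals).
-- Keys are positive integers grouped into fixed-size blocks. A snapshot shares
-- blocks with the live store until the live store modifies one of them.
local BLOCK_SIZE = 2 -- hypothetical block size for the example

local Store = {}
Store.__index = Store

local function new_store()
    return setmetatable({blocks = {}, shared = {}, copies = 0}, Store)
end

local function block_index(key)
    return math.floor((key - 1) / BLOCK_SIZE) + 1
end

function Store:get(key)
    local block = self.blocks[block_index(key)]
    return block and block[key]
end

function Store:set(key, value)
    local i = block_index(key)
    local block = self.blocks[i]
    if block == nil then
        block = {}
        self.blocks[i] = block
    elseif self.shared[i] then
        -- A snapshot still references this block: copy it before writing.
        local copy = {}
        for k, v in pairs(block) do
            copy[k] = v
        end
        block = copy
        self.blocks[i] = block
        self.shared[i] = nil
        self.copies = self.copies + 1
    end
    block[key] = value
end

function Store:snapshot()
    local view = {blocks = {}}
    for i, block in pairs(self.blocks) do
        view.blocks[i] = block -- share the block, do not copy it
        self.shared[i] = true
    end
    -- The snapshot is read-only: it exposes only `get`.
    return setmetatable(view, {__index = {get = Store.get}})
end

local store = new_store()
for id = 1, 4 do
    store:set(id, 'v' .. id)
end

local view = store:snapshot()
store:set(1, 'changed') -- copies block 1 only
store:set(5, 'new')     -- creates a new block, nothing is copied

print(view:get(1), store:get(1)) -- the snapshot still sees 'v1', the store sees 'changed'
print(view:get(5), store:get(5)) -- the snapshot sees nil, the store sees 'new'
print(store.copies)              -- 1: a single block was duplicated
```

In this model the cost of a snapshot is proportional to the number of blocks modified afterwards, not to the size of the data set.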

Read views have the following limitations:

* Only the memtx engine is supported.
* Read views are available starting from Tarantool Enterprise v2.11.0.

#### Creating a read view

To create a read view, call the `crud.readview()` function.

```lua
local foo = crud.readview(opts)
```

where:

* `opts`:
  * `name` (`?string`) - name of the read view
  * `timeout` (`?number`) - `vshard.call` timeout (in seconds)

**Example:**

```lua
local foo = crud.readview({name = 'foo', timeout = 3})
```

#### Closing a read view

When a read view is no longer needed, close it using the `read_view_object:close()` method because a read view may consume a substantial amount of memory.

```lua
local foo = crud.readview()
foo:close(opts)
```

where:

* `opts`:
  * `timeout` (`?number`) - `vshard.call` timeout (in seconds)

Otherwise, a read view is closed implicitly when the read view object is collected by the Lua garbage collector.

**Example:**

```lua
local foo = crud.readview()
foo:close({timeout = 3})
```
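
Because an unclosed read view keeps consuming memory until it is garbage collected, it can help to close it on every code path, including errors. The sketch below shows one possible way to do that. `with_readview` is a hypothetical helper, not part of crud, and it assumes that `crud.readview()` returns `nil, err` on failure. The `crud` module is passed in as an argument so that the helper does not depend on how the module is loaded.

```lua
-- Hypothetical helper (not part of crud): open a read view, run `fn` with it,
-- and close the read view even if `fn` raises an error.
-- Assumption: crud.readview() returns nil and an error on failure.
local function with_readview(crud, opts, fn)
    local rv, err = crud.readview(opts)
    if rv == nil then
        return nil, err
    end

    -- pcall keeps an error raised inside `fn` from skipping the close() call.
    local ok, result, result_err = pcall(fn, rv)
    rv:close()

    if not ok then
        -- `result` holds the raised error here.
        return nil, result
    end
    return result, result_err
end

-- Possible usage on a router (commented out because it needs a running cluster):
-- local crud = require('crud')
-- local res, err = with_readview(crud, {name = 'report', timeout = 3}, function(rv)
--     return rv:select('customers', {{'<=', 'age', 35}}, {first = 10})
-- end)
```

The helper forwards up to two return values of `fn`, which matches the `result, err` convention used by `read_view_object:select()`.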

#### Read view select

`read_view_object:select()` supports multi-conditional selects, treating a cluster as a single space, same as `crud.select`.

```lua
local foo = crud.readview()
local objects, err = foo:select(space_name, conditions, opts)
foo:close()
```

where:

* `space_name` (`string`) - name of the space
* `conditions` (`?table`) - array of [select conditions](#select-conditions)
* `opts`:
  * `first` (`?number`) - the maximum count of the objects to return.
    If a negative value is specified, the objects behind `after` are returned
    (`after` option is required in this case). [See pagination examples](doc/select.md#pagination).
  * `after` (`?table`) - tuple after which objects should be selected
  * `batch_size` (`?number`) - number of tuples to process per one request to storage
  * `bucket_id` (`?number|cdata`) - bucket ID
  * `force_map_call` (`?boolean`) - if `true`
    then the map call is performed without any optimizations even
    if full primary key equal condition is specified
  * `timeout` (`?number`) - `vshard.call` timeout (in seconds)
  * `fields` (`?table`) - field names for getting only a subset of fields
  * `fullscan` (`?boolean`) - if `true` then a critical log entry will be skipped
    on potentially long `select`, see [avoiding full scan](doc/select.md#avoiding-full-scan).
  * `vshard_router` (`?string|table`) - Cartridge vshard group name or
    vshard router instance. Set this parameter if your space is not
    a part of the default vshard cluster
  * `yield_every` (`?number`) - number of tuples processed on storage to yield after,
    `yield_every` should be > 0, default value is 1000
  * `fetch_latest_metadata` (`?boolean`) - guarantees the
    up-to-date metadata (space format) in the first return value, otherwise
    it may not take into account the latest migration of the data format.
    Performance overhead is up to 15%. `false` by default


Returns metadata and array of rows, error.

**Example:**

```lua
local foo = crud.readview()
foo:select('customers', nil, {batch_size=1, fullscan=true})
---
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
rows:
- [1, 477, 'Elizabeth', 12]
- [2, 401, 'Mary', 46]
- [3, 2804, 'David', 33]
- [4, 1161, 'William', 81]
- [5, 1172, 'Jack', 35]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- null
...
crud.insert('customers', {8, box.NULL, 'Elizabeth', 23})
---
- rows:
- [8, 185, 'Elizabeth', 23]
metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
- null
...
foo:select('customers', nil, {batch_size=1, fullscan=true})
---
- metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
{'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
rows:
- [1, 477, 'Elizabeth', 12]
- [2, 401, 'Mary', 46]
- [3, 2804, 'David', 33]
- [4, 1161, 'William', 81]
- [5, 1172, 'Jack', 35]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- null
...
foo:close()
```
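
Since the data seen through a read view does not change, the `first` and `after` options can be combined to walk a large result page by page without rows shifting between requests. The following is a minimal sketch under that assumption. `select_all` is a hypothetical helper, not part of crud, and `rv` is expected to be an object returned by `crud.readview()`.

```lua
-- Hypothetical helper (not part of crud): collect every row matching
-- `conditions` from a read view, fetching `page_size` rows per request.
local function select_all(rv, space_name, conditions, page_size)
    local all_rows = {}
    local after = nil

    while true do
        local res, err = rv:select(space_name, conditions, {
            first = page_size,
            after = after, -- nil on the first request
        })
        if res == nil then
            return nil, err
        end

        for _, row in ipairs(res.rows) do
            table.insert(all_rows, row)
        end

        -- A short page means there is nothing left to fetch.
        if #res.rows < page_size then
            return all_rows
        end

        -- Continue after the last tuple of the current page.
        after = res.rows[#res.rows]
    end
end
```

When the total number of rows is an exact multiple of `page_size`, the loop issues one extra request that returns an empty page before it stops.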

##### Read view select conditions

Select conditions for `read_view_object:select()` are the same as [select conditions](#select-conditions) for `crud.select`.

**Example:**

```lua
foo = crud.readview()
foo:select('customers', {{'<=', 'age', 35}}, {first = 10})
---
- metadata:
- {'name': 'id', 'type': 'unsigned'}
- {'name': 'bucket_id', 'type': 'unsigned'}
- {'name': 'name', 'type': 'string'}
- {'name': 'age', 'type': 'number'}
rows:
- [5, 1172, 'Jack', 35]
- [3, 2804, 'David', 33]
- [6, 1064, 'William', 25]
- [7, 693, 'Elizabeth', 18]
- [1, 477, 'Elizabeth', 12]
...
foo:close()
```

#### Read view pairs

You can iterate across a distributed space using the `read_view_object:pairs()` method.
Its arguments are the same as the [`read_view_object:select()`](#read-view-select) arguments,
with two exceptions: `fullscan` does not exist (because `crud.pairs` does not generate
a critical log entry on potentially long requests), and negative `first` values
aren't allowed.
Pass the `use_tomap` flag (`false` by default) to iterate over objects instead of flat tuples.

**Example:**

```lua
foo = crud.readview()
local tuples = {}
for _, tuple in foo:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = false}) do
    -- {5, 1172, 'Jack', 35}
    table.insert(tuples, tuple)
end

local objects = {}
for _, object in foo:pairs('customers', {{'<=', 'age', 35}}, {use_tomap = true}) do
    -- {id = 5, name = 'Jack', bucket_id = 1172, age = 35}
    table.insert(objects, object)
end
foo:close()
```

## Cartridge roles

`cartridge.roles.crud-storage` is a Tarantool Cartridge role that depends on the
6 changes: 6 additions & 0 deletions crud.lua
@@ -20,6 +20,7 @@ local borders = require('crud.borders')
local sharding_metadata = require('crud.common.sharding.sharding_metadata')
local utils = require('crud.common.utils')
local stats = require('crud.stats')
local readview = require('crud.readview')

local crud = {}

@@ -147,6 +148,10 @@ crud.reset_stats = stats.reset
-- @function storage_info
crud.storage_info = utils.storage_info

-- @refer readview.new
-- @function readview
crud.readview = readview.new

--- Initializes crud on node
--
-- Exports all functions that are used for calls
@@ -174,6 +179,7 @@ function crud.init_storage()
count.init()
borders.init()
sharding_metadata.init()
readview.init()

_G._crud.storage_info_on_storage = utils.storage_info_on_storage
end