Support registering Lua API functions on storages and routers in runtime #1799

Closed
akudiyar opened this issue Oct 30, 2020 · 4 comments
Labels
feature A new functionality in design

Comments

akudiyar commented Oct 30, 2020

/* Moved from crud to cartridge. */

Problem statement

To use Tarantool Cartridge from connectors, customers have to define custom roles with custom methods that perform basic data manipulation on storages, and then call these methods by name. This forces customers to write a lot of boilerplate and duplicated code, which frustrates them.

Examples of typical use-cases:

  1. Customer wants to perform local (same bucket) aggregation of data from different spaces (aka "joining", see "Implement select API for simple LEFT JOIN cases", crud#77) with filtering and reducing (see "Implement count() method", crud#74).
  2. Customer wants to perform aggregation of data on the router with filtering (map-reduce) for basic scenarios (see "Implement select API for simple LEFT JOIN cases", crud#77).
  3. Customer wants to perform batch updates/deletes of data with filtering, using asynchronous background tasks that work on microbatches of data (see "Support update() with conditions", crud#76).

Letting connectors register the custom Lua functions they generate for such typical tasks will help customers spend less time on them.

Proposed API

I think it may be implemented in a separate module or in the Cartridge itself since it is not a "CRUD" functionality.

crud.register

    crud.register(function_signature, check_params, body, opts)

Registers the function on all cluster nodes.

  • function_signature (string) -- function name and parameters
  • check_params (table) -- checks() call parameters used to validate the passed arguments
  • body (string) -- function body
  • opts (table) -- function metadata:
    • roles (list) -- list of Cartridge role names. The function will be available only on instances that have all of the passed roles enabled
    • acl (table) -- access metadata:
      • allow (table) -- which users/roles will be able to invoke the function
        • users (list) -- list of user logins
        • roles (list) -- list of user role names
      • deny (table) -- which users/roles will not be able to invoke the function
        • users (list) -- list of user logins
        • roles (list) -- list of user role names

Example:

    crud.register(
        'filter_by_name(space, status)',
        {'string', 'string'},
        "return crud.select('test_space', {{'=', 'status', 'NEW'}})",
        {roles = {'crud-storage'}, acl = {allow = {users = {'admin'}}}}
    )
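
A minimal single-instance sketch of how such a registration could work, assuming the body is compiled with Lua's loadstring/load and that plain type() stands in for the checks module. The names register, parse_signature and registry are hypothetical and not part of crud or Cartridge; cluster-wide distribution, roles and ACLs are left out.

```lua
-- Hypothetical helpers: none of these names exist in crud or Cartridge.
local loadstring = loadstring or load -- Lua 5.1/LuaJIT vs. Lua 5.2+

local registry = {}

-- Split 'name(a, b)' into the function name and the raw parameter list.
local function parse_signature(signature)
    local name, params = signature:match('^%s*([%a_][%w_]*)%s*%((.-)%)%s*$')
    if name == nil then
        error('malformed signature: ' .. tostring(signature), 2)
    end
    return name, params
end

local function register(signature, check_params, body)
    local name, params = parse_signature(signature)
    -- Wrap the body so that the declared parameters become locals.
    local chunk, err = loadstring(
        ('return function(%s) %s end'):format(params, body), name)
    if chunk == nil then
        error('cannot compile ' .. name .. ': ' .. err, 2)
    end
    local fn = chunk()
    registry[name] = function(...)
        -- Assumption: type() replaces checks(), so qualifiers that only
        -- checks() understands (such as 'integer') are not supported here.
        for i, expected in ipairs(check_params) do
            local actual = type((select(i, ...)))
            if actual ~= expected then
                error(('%s: argument #%d must be %s, got %s')
                    :format(name, i, expected, actual), 2)
            end
        end
        return fn(...)
    end
end

register('concat_pair(a, b)', {'string', 'string'}, "return a .. ':' .. b")
print(registry.concat_pair('x', 'y')) -- x:y
```

Because the body is compiled as ordinary Lua, it runs with full access to the instance, which matches the "not sandboxed" requirement of the proposal.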

crud.unregister

    crud.unregister(function_signature)

Unregisters the function on all cluster nodes.

  • function_signature (string) -- function name and parameters

Example:

    crud.unregister('filter_by_name(space, status)')

Requirements

  1. If a function with the same signature was already registered, it will not be registered again and no error will be returned.
  2. If the function is not registered, the client receives the normal error about calling a non-existing function.
  3. Functions should not be persistent; clients will have to register them again after the instance restarts.
  4. Functions will not be sandboxed.
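
The first three requirements can be illustrated with a small in-memory registry. This is only a sketch: register, unregister and call are hypothetical names, functions are passed as Lua values rather than strings for brevity, and the error text is illustrative, not an actual Tarantool message.

```lua
-- Hypothetical in-memory registry. A plain table is lost on restart,
-- so nothing is persisted (requirement 3).
local registry = {}

-- Requirement 1: a repeated registration is a silent no-op.
local function register(signature, fn)
    if registry[signature] == nil then
        registry[signature] = fn
    end
end

local function unregister(signature)
    registry[signature] = nil
end

-- Requirement 2: calling an unknown function raises an ordinary error.
local function call(signature, ...)
    local fn = registry[signature]
    if fn == nil then
        error(("function '%s' is not registered"):format(signature), 2)
    end
    return fn(...)
end

register('double(x)', function(x) return x * 2 end)
register('double(x)', function(x) return x * 3 end) -- ignored, no error
print(call('double(x)', 21)) -- 42

unregister('double(x)')
print(pcall(call, 'double(x)', 21)) -- false, followed by the error message
```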

Totktonada added the feature (A new functionality) label on Jun 18, 2021
Member

Totktonada commented Oct 28, 2021

vshard is about calling functions, crud is about accessing spaces. It looks a bit strange to duplicate support of calling functions in crud. But some enhancements for the 'calling a function' flow are proposed above, so we can discuss them outside the crud module context. (Maybe it is closer to https://github.com/tarantool/cartridge-extensions / https://github.com/tarantool/easy-function, but I'm not familiar with those projects.)

The example shows calling of a user-defined function on a storage ('crud-storage' role), but the function calls crud.select(), which is a router function (defined by the 'crud-router' role). So it is not clear what your usage scenario is.

Calling of a function only on instances that have given cartridge role (one from a provided list) is a bit unclear too. Should the function be called on all instances with this role, or on one of them according to the mode and prefer_replica parameters? Where can I pass those parameters? It seems the shown API proposal is incomplete: it does not show how to call a registered function.

ACL idea is not clear for me too. Tarantool has a built-in ACL, so if we allow adding, say, crud.my_func, we can control who will be able to call the function over iproto. If integration with some external list of users / roles is assumed, I think we should design a general auth provider API (and make a canonical implementation) and use it across the Tarantool ecosystem.

I'll reassign the task to the Product team to elaborate a problem statement(s) and determine importance.

@akudiyar
Author

@Totktonada thanks for triaging this. Let me address some of the questions you put up.

> It looks a bit strange to duplicate support of calling functions in crud

Of course, and I stated this. However, there is still no appropriate module to put this functionality into. vshard is just a module whose functions may be used to implement this API, but there is no such requirement, and in the common case the call mechanism uses just IPROTO and net.box facilities. I believe that Cartridge is best suited for this; see the arguments below.

> The example shows calling of a user-defined function on a storage

What example? If you mean the "JOIN" cases, they are extensions of the select function, and I thought that we might use the function registration mechanism to implement such extensions.

The main idea is to spare users and connector authors from registering functions manually in Tarantool by writing code in roles and initialization scripts. Instead, they would pass the function signature and ACLs to some API and have all of this set up automatically. A close analogy is the stored procedure registration commands in various SQL dialects.

Here's an example of how I envisioned using this API:

    -- In the initialization code of some Cartridge role
    function init(self)
        -- Register only from the master instance
        if self.is_master then
            cartridge.register(
                'my_function(foo, bar, baz)',
                {'integer', 'string', 'boolean'},
                "require('log').error('hello')",
                {}
            )
        end
    end

    -- In the client code. This looks a lot like the current API, where
    -- functions are registered manually in Tarantool and called via vshard.
    -- `option = value` is a placeholder for the call options.
    tuples = cartridge.rpc_call('my_function', {1, 'abc', true}, {option = value})

> Calling of a function only on instances that have given cartridge role (one from a provided list) is a bit unclear too.

The intention was to guarantee some "dependencies" for a function, which may be expressed as a list of enabled roles. This supports the situation where only some of the nodes run the initialization code that provides the local tables and auxiliary functions used by the registered function.
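
As a rough sketch of that dependency check, assuming each instance knows the set of roles enabled on it (has_all_roles and the sample tables are hypothetical, not Cartridge API):

```lua
-- Hypothetical check: `enabled_roles` is a set keyed by role name,
-- `required` is the opts.roles list from the proposal.
local function has_all_roles(enabled_roles, required)
    for _, role in ipairs(required or {}) do
        if not enabled_roles[role] then
            return false
        end
    end
    return true
end

local storage = {['crud-storage'] = true, ['vshard-storage'] = true}
local router = {['crud-router'] = true}

print(has_all_roles(storage, {'crud-storage'})) -- true
print(has_all_roles(router, {'crud-storage'}))  -- false
print(has_all_roles(router, {}))                -- true
```

An empty or missing list imposes no dependencies, so the function would be available everywhere.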

As for the mode parameters, they definitely need to be taken into account when designing this feature, maybe as an additional parameter in opts.

> ACL idea is not clear for me too.

The proposed ACL options give users a way to set up all the necessary access control in one place. All the necessary calls, including public API registration, grants, and possible use of some third-party security manager, are done for the user automatically; the user only specifies what needs to be done in a declarative form.
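
One way to read the declarative acl table is sketched below. The proposal does not say how allow and deny interact, so this example assumes that deny takes precedence and that a user matching neither list is refused; is_allowed and its helpers are hypothetical names.

```lua
-- Hypothetical evaluation of the proposed opts.acl table.
local function contains(list, value)
    for _, item in ipairs(list or {}) do
        if item == value then
            return true
        end
    end
    return false
end

-- A rule matches if it names the user or any of the user's roles.
local function matches(rule, user, user_roles)
    if rule == nil then
        return false
    end
    if contains(rule.users, user) then
        return true
    end
    for _, role in ipairs(user_roles) do
        if contains(rule.roles, role) then
            return true
        end
    end
    return false
end

-- Assumption: deny wins over allow, and no match means no access.
local function is_allowed(acl, user, user_roles)
    if matches(acl.deny, user, user_roles) then
        return false
    end
    return matches(acl.allow, user, user_roles)
end

local acl = {
    allow = {users = {'admin'}, roles = {'operator'}},
    deny = {users = {'guest'}},
}
print(is_allowed(acl, 'admin', {}))           -- true
print(is_allowed(acl, 'alice', {'operator'})) -- true
print(is_allowed(acl, 'guest', {'operator'})) -- false
print(is_allowed(acl, 'bob', {}))             -- false
```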

However, if there are efforts towards improving the security API in Tarantool, I support that, but it has nothing to do with this task.

@Totktonada
Member

Moved from crud to cartridge.

@filonenko-mikhail
Contributor

Sorry, no capacity for such a feature right now. Gently closing. Feel free to reopen at any time.
