Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Straightforward distinguishing between arrays and maps #89

Open
zaglex opened this issue Jun 16, 2016 · 14 comments
Open

Straightforward distinguishing between arrays and maps #89

zaglex opened this issue Jun 16, 2016 · 14 comments
Assignees
Labels
feature A new functionality
Milestone

Comments

@zaglex
Copy link

zaglex commented Jun 16, 2016

Conversion from php-array to lua-table may be little bit tricky.

Consider the case:

$data[0] = 'a';
$data[1] = 'b';
$data[2] = 'c';
$tarantool->call('myfunc', [$data]);

On lua side you will se such a table:

data[0] == nil
data[1] == 'a'
data[2] == 'b'
data[3] == 'c'

But then you do this on php-side:

unset($data[1]);
$tarantool->call('myfunc', [$data]);

And on lua side you will se this:

data[0] == 'a'
data[1] == 'nil'
data[2] == 'c'

I understand why it happens, but anyway this may be confusing and may cause unexpected errors.

Is there any way to force tarantool client to trait array either as "real array" or as "map"? Would be great to have special classes like MessagePackArray and MessagePackMap, or even LuaArray and LuaTable for this purpose.

@rybakit
Copy link
Collaborator

rybakit commented Jun 16, 2016

From what i see, [0 => 'a', 1 => 'b', 2 => 'c'] is packed to MP array and [0 => 'a', 2 => 'c'] to MP map and on Lua side these structures are converted to Lua array and Lua table accordingly. Why it's confusing and what result you expect to see? Could you please elaborate on this a bit further?

@zaglex
Copy link
Author

zaglex commented Jun 17, 2016

Imagine space where tuples consist of 2 fields: NUM primary key and TABLE "payload".
Payload is a variable-length map: { fieldId => fieldValue}, where fieldId is integer >= 0.
When you pass from php to lua something like this:

$fields = [];
$fields[0] = 'field0Val';
$fields[5] = 'field5Val';
$field[7] = 'field7Val';
  • it is treated as map and everything is ok. But if in some case you have sequential fieldIds in some set:
$fields = [];
$fields[0] = 'field0Val';
$fields[1] = 'field1Val';
$field[2] = 'field2Val';
  • then it will be unexpectedly treated as array.

Of course you can disallow to use 0 as field identifier on the app level, so it doesn't look like a critical issue, or may be it is not an issue at all - just bad choice of field identifier in someone's app)

@rybakit
Copy link
Collaborator

rybakit commented Jun 17, 2016

Ah, I see. I have the similar issue with the msgpack php extension ;) In my own msgpack implementation I use MP ext type to pack arbitrary types (see https://github.com/rybakit/msgpack.php#custom-types) but, afaik, Tarantool's msgpack doesn't support it yet.

@akalend
Copy link

akalend commented Jun 17, 2016

I make the issue rtsisyk/msgpuck#10 at 3 month ago. The part code of the implenentation MP ext type in the hhvm_msgpack: https://github.com/akalend/hhvm-msgpack/blob/hhvm-v-3.12/msgpuck.h

@bigbes
Copy link
Contributor

bigbes commented Jun 17, 2016

@rybakit As I get it - there's no connection between MP_EXT and this issue. @zaglex want a way to enforce something to be Map or Array. That's why some problems with indexing are happened.

@bigbes
Copy link
Contributor

bigbes commented Jun 17, 2016

@zaglex what'll happened if you'll do something like this?

$data[1] = NULL;
$tarantool->call('myfunc', [$data]);

@zaglex
Copy link
Author

zaglex commented Jun 17, 2016

Yes, @bigbes, you are right, I'm speaking about enforcing Map or Array.
As for your example, when I do this:

$data[1] = NULL;
$tarantool->call('myfunc', [$data]);

I get this:

data[0]  - nil
data[1]  - cdata<void *>: NULL

@zaglex
Copy link
Author

zaglex commented Jun 17, 2016

Ah, you mean that this solves problem with unset from my first example. Yes, it looks like a workaround.

@rybakit
Copy link
Collaborator

rybakit commented Jun 17, 2016

@bigbes I don't see any other sane way to force php array to be packed as MP map. And with ext it's quite easy to implement. It's already possible with pure client, so I can do something like:

// register your own map transformer 
$coll = new Collection([new MapTransformer(5)]);
$packer->setTransformers($coll);

...

// send a map object directly, MapTransformer will take care about the rest
$client->call('myfunc', [new Map(['a', 'b', 'c'])]); // <- will be packed as MP map

The only problem atm is that on Lua side you will not be able to unpack this extension.

@bigbes
Copy link
Contributor

bigbes commented Jun 17, 2016

@rybakit it was said - you're creating MessagePackMap/MessagePackArray instance (that inherits ArrayObject) and use it like array. Maybe you can create object, that will store Array.
Every other thing will happen in connector (type checking and applying right encoding procedure)

@rybakit
Copy link
Collaborator

rybakit commented Jun 17, 2016

@bigbes Of course, you can create the classes and do some checks it in the connector, but:

  1. The connector should know about those classes to apply different packing/unpacking rules based on the class name (or other marker, e.g. interface).

  2. Note, that the problem is a bit wider than only map/array distinguishing. The same issue is with MP str/bin format family. Sometimes you need to pack/unpack php string as MP binary and vise versa.

I'm thinking out loud now, but you may consider to add the Tarantool\Ext class:

$tnt->call('myfunct', new Ext(1, ['a', 'b', 'c'])); // we know that 1 means "array"
$tnt->call('myfunct', new Ext(2, ['a', 'b', 'c'])); // and 2 means "map" in our app

And Tarantool can even have predefined ext types for common cases:

Tarantool::EXT_ARRAY = 1
Tarantool::EXT_MAP = 2
Tarantool::EXT_UTF8 = 3
Tarantool::EXT_BIN = 4
...
(we can reserve 0-16 values for future types)

So instead of creating custom MessagePackMap / MessagePackArray / MessagePackUtf8 / MessagePackBin classes you will have one generic Ext class which will do the same. And you can decide if you want to pack/unpack those structures directly in the connector or forward them to Tarantool as MP ext and let him do packing/unpacking (but then, Tarantool should support MP ext out of the box).

@bigbes
Copy link
Contributor

bigbes commented Jun 18, 2016

  1. It's what we discuss here - we need those classes or not
  2. Yes we can, but that needs changing of server - and that's more fundamental task, than just rewrite part of connector. It's overingeneering.

@bigbes bigbes added the feature A new functionality label Jun 22, 2016
@bigbes bigbes self-assigned this Jun 23, 2016
@rybakit
Copy link
Collaborator

rybakit commented Jun 30, 2016

There is a related issue regarding bin/utf8 distinguishing: msgpack/msgpack-php#89.

@rybakit
Copy link
Collaborator

rybakit commented Apr 14, 2018

For the record, here is an example of what I was talking above regarding the custom Map objects: https://github.com/rybakit/msgpack.php/blob/64adceae711932b7c55f658884c110f5c2e18e50/examples/map.php#L13-L14. The only difference is that the new implementation doesn't use the MessagePack extension, but directly packs the wrapped array into MP map. Though I'm not sure how this functionality can be ported to the pecl connector.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new functionality
Projects
None yet
Development

No branches or pull requests

5 participants