Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add load_mangled_object_vars(), the inverse of get_mangled_object_vars() #9408

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nicolas-grekas
Copy link
Contributor

@nicolas-grekas nicolas-grekas commented Aug 23, 2022

Being able to hydrate an object from its "mangled vars" is a quite common operation in eg ORMs.

Doing so right now requires writing complex and slow logic that involve many calls to reflection and/or rebound closures.
I propose to remove this overhead by adding load_mangled_object_vars(), which can be understood as the inverse of get_mangled_object_vars().

Internally, the implementation just calls object_properties_load() for now, which means it's missing checks for property types. I'm wondering if these checks should be added to object_properties_load() or to load_mangled_object_vars()?

load_mangled_object_vars($ro, []);
var_dump($ro);

load_mangled_object_vars($ro, ['ro' => 'abc']);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just here to prove that the implementation is incomplete for now, see PR description.

@TimWolla
Copy link
Member

#9354 is related here.

@cmb69
Copy link
Member

cmb69 commented Aug 24, 2022

I think that needs to be discussed on the internals mailing list.

@TysonAndre
Copy link
Contributor

  1. What's the intended behavior for read-only properties?

  2. For internal classes, the fact that handlers.write_property is bypassed might be a problem - it'd allow creating properties that shouldn't be overriden or created. Maybe throw if that isn't the standard handler

  3. Maybe get_serialized_properties(array $object): array and unserialize_object_from_members(string $className, array $members): object is a possible alternative (accounting for __sleep()/__serialize()/__wakeup()/__unserialize(), throwing for the deprecated Serializable interface).
    (This would be a shallow clone of the values of the member properties that could be serialized and unserialized)

    This would hopefully be able to reuse checks in both internal and user-defined classes that the resulting object is

    1. Valid
    2. Has expected fields/combinations of fields
    3. Isn't malicious data, e.g. not going to have unexpected side effects such as deleting files or executing commands or calling methods on unexpected classes in __destruct due to passing values that wouldn't be set to a given type/value in a real serialized object. Or in the cases of internal classes, not going to crash due to being an unexpected type for the property that write_property would forbid

    This may also help in avoiding unexpected mixes of loaded properties and properties set in __construct in some edge cases, by not requiring calling __construct

    • Although I guess new ReflectionClass(...)->newInstanceWithoutConstructor() existing makes this less of a concern
// ext/reflection/php_reflection.c
/* {{{ _reflection_write_property */
static zval *_reflection_write_property(zend_object *object, zend_string *name, zval *value, void **cache_slot)
{
	if (zend_hash_exists(&object->ce->properties_info, name)
		&& (zend_string_equals_literal(name, "name") || zend_string_equals_literal(name, "class")))
	{
		zend_throw_exception_ex(reflection_exception_ptr, 0,
			"Cannot set read-only property %s::$%s", ZSTR_VAL(object->ce->name), ZSTR_VAL(name));
		return &EG(uninitialized_zval);
	}
	else
	{
		return zend_std_write_property(object, name, value, cache_slot);
	}
}
/* }}} */

I think that needs to be discussed on the internals mailing list.

Agreed

@nicolas-grekas
Copy link
Contributor Author

The function I'm looking for would do the following:

function load_mangled_object_vars(object $object, array $vars): void
{
    foreach ($vars as $name => $value) {
        try {
            if ('' === $name || "\0" !== $name[0]) {
                $reflector = new ReflectionProperty($object, $name);
            } else {
                $i = strrpos($name, "\0");
                $reflector = new ReflectionProperty(substr($name, 1, $i - 1), substr($name, $i + 1));

            }
        } catch (ReflectionException) {
            $object->$name = $value;
            continue;
        }

        $reflector->setValue($object, $value);
    }
}

Of course, it would do so much more efficiently, and it would also preserve references found in $vars (exactly like the array-cast operator preserves references found in the original object.)

I realize that object_properties_load misses many checks and that the current implementation is not enough.
I'm not sure I have the C-skills to work on that but I'd be interesting in gathering feedback about the proposal, and help also about the implementation 🙏

Here first maybe, then on internals?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants