Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo edgedb basic types to inherit from builtin types #366

Merged
merged 14 commits into from
Sep 28, 2022
Merged

Conversation

fantix
Copy link
Member

@fantix fantix commented Sep 14, 2022

  • edgedb.Tuple is tuple
    • weakref.ref() will stop working on edgedb.Tuple objects.
  • edgedb.NamedTuple is a subclass of tuple
    • Actual named tuple objects will be an instance of a transient heap type DerivedNamedTuple (subclass of edgedb.NamedTuple)
    • weakref.ref() will stop working on edgedb.NamedTuple objects.
  • edgedb.Set is list
  • edgedb.Array is list
  • edgedb.EnumValue is a subclass of enum.Enum
  • link and linkprop? We'll keep them as they are now, and they are not included in the dataclasses.asdict() result.
  • docs
  • tests

Benchmark:

2.3 GHz Intel 4 Core, Turbo Boost Off, concurrency=10, time=100s

image

image

datatypes branch (left):

queries:	184141
qps:		1841 q/s
min latency:	1.88ms
avg latency:	5.42ms
max latency:	28.36ms

0.24 (right):

queries:	188075
qps:		1880 q/s
min latency:	1.37ms
avg latency:	5.30ms
max latency:	88.56ms

Tested query:

    SELECT User {
        id,
        name,
        image,
        enumval := <Color>'Red',
        namedtuple := (a := 1, b := 2.2),
        arr := [1, 2, 3],
        tup := (1, 2, 3),
        latest_reviews := (
            WITH UserReviews := User.<author[IS Review]
            SELECT UserReviews {
                id,
                body,
                rating,
                movie: {
                    id,
                    image,
                    title,
                    avg_rating
                }
            }
            ORDER BY .creation_time DESC
            LIMIT 10
        )
    }
    FILTER .id = <uuid>$id

weakref.ref() will stop working on edgedb.Tuple objects.
@fantix fantix requested a review from 1st1 September 14, 2022 18:29
@i0bs
Copy link

i0bs commented Sep 14, 2022

EdgeArrayObject *obj = NULL;

Will other types from stdlib such as arrays inherit from builtin?

@fantix
Copy link
Member Author

fantix commented Sep 14, 2022

Will other types from stdlib such as arrays inherit from builtin?

Yes, but array may just be a subclass of tuple because it's immutable.

@i0bs
Copy link

i0bs commented Sep 14, 2022

Yes, but array may just be a subclass of tuple because it's immutable.

Has it always been immutable? I thought EdgeQL supported mutation to the shape and size of arrays.

@fantix
Copy link
Member Author

fantix commented Sep 14, 2022

Has it always been immutable?

Oh this PR is the edgedb-python construct of arrays only in query results, it was and will be immutable.

@1st1
Copy link
Member

1st1 commented Sep 15, 2022

Why can't we use the builtin tuple type for our tuples? The __repr__ is basically compatible, so maybe just drop our custom tuple implementation and use the builtin one? It has a freelist optimization.

edgedb/datatypes/tuple.c Outdated Show resolved Hide resolved
edgedb/datatypes/tuple.c Outdated Show resolved Hide resolved
edgedb.NamedTuple will now create a heap type for the names.
The heap type "DerivedNamedTuple" has a different __new__().
weakref.ref() will stop working on edgedb.NamedTuple objects.
edgedb/datatypes/namedtuple.c Outdated Show resolved Hide resolved
EDGE_NEW_WITH_FREELIST(EDGE_NAMED_TUPLE, EdgeNamedTupleObject,
&EdgeNamedTuple_Type, nt, size);
PyTupleObject *nt = NULL;
EDGE_NEW_WITH_FREELIST(EDGE_NAMED_TUPLE, PyTupleObject, type, nt, size);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot reuse the builtin freelist for tuple because the ob_type is not &PyTuple_Type.

edgedb/datatypes/namedtuple.c Outdated Show resolved Hide resolved
* bpo-35810: incref type in new for Python 3.7
* bpo-40217: don't traverse type for Python 3.9+
* Don't resize a type object - over-allocate with invisible members
* Moved slots to EdgeNamedTuple_Type for performance
* Use Py_TRASHCAN_BEGIN / END in Python 3.8+
@fantix
Copy link
Member Author

fantix commented Sep 23, 2022

Has it always been immutable?

Oh this PR is the edgedb-python construct of arrays only in query results, it was and will be immutable.

UPDATE: we decided to use plain Python list for edgedb.Set and edgedb.Array, so the query results will be partially mutable, but again that's just in user memory.

@fantix fantix marked this pull request as ready for review September 24, 2022 20:10
@fantix fantix force-pushed the datatypes branch 2 times, most recently from 45c1ba4 to b8aa067 Compare September 25, 2022 15:36
They way it is done is hacking the __mro__ to insert the enum.Enum
class, while duck-typing the enum.Enum class.
@i0bs
Copy link

i0bs commented Sep 26, 2022

UPDATE: we decided to use plain Python list for edgedb.Set and edgedb.Array, so the query results will be partially mutable, but again that's just in user memory.

Does the builtin list object support the same optimisation as before? Does this improve querying perf?

@fantix
Copy link
Member Author

fantix commented Sep 26, 2022

UPDATE: we decided to use plain Python list for edgedb.Set and edgedb.Array, so the query results will be partially mutable, but again that's just in user memory.

Does the builtin list object support the same optimisation as before? Does this improve querying perf?

The main optimization is still here, which is using the C API PyList_SET_ITEM for fast initialization, but we lost the (cached) hashing function, which is not a critical use. Also we lost the freelist of array, which may have slight performance impact when arrays are frequently (re-)allocated. Other than that, query performance should not be affected, I’ll run benchmarks to test it.

@i0bs
Copy link

i0bs commented Sep 26, 2022

I'd love to hear more about the benchmarks once they're available. This sounds like a great change if memory allocation doesn't prove to be too impactful for builtin inheritance. Great stuff @fantix ❤️

docs/api/types.rst Outdated Show resolved Hide resolved
edgedb/datatypes/hash.c Outdated Show resolved Hide resolved
docs/api/types.rst Outdated Show resolved Hide resolved
edgedb/datatypes/datatypes.pxd Outdated Show resolved Hide resolved
edgedb/datatypes/datatypes.pxd Show resolved Hide resolved
edgedb/datatypes/namedtuple.c Show resolved Hide resolved
edgedb/datatypes/namedtuple.c Show resolved Hide resolved
edgedb/datatypes/namedtuple.c Show resolved Hide resolved
fantix and others added 5 commits September 27, 2022 07:52
Co-authored-by: Yury Selivanov <[email protected]>
Co-authored-by: Yury Selivanov <[email protected]>
Now edgedb.Object instances will use default Python object.__hash__,
which is calculated baased on memory address. ``==`` compares memory
address too. ``<``, ``<=``, ``>``, ``>=`` are disabled.
Copy link
Member

@1st1 1st1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (Aside from one little nit, see the new comment)

@fantix
Copy link
Member Author

fantix commented Sep 28, 2022

@i0bs benchmark updated

@fantix fantix merged commit b11b991 into master Sep 28, 2022
@fantix fantix deleted the datatypes branch September 28, 2022 01:10
fantix added a commit that referenced this pull request Oct 21, 2022
Changes
=======

* Implement dataclass for EdgeObject (#359)
  (by @fantix in dfb8c8b for #359)

* Redo edgedb basic types to inherit from builtin types (#366)
  (by @fantix in b11b991 for #366)

* Officially drop 3.6 support (#373)
  (by @msullivan in 7b76bc7 for #373)

* Support Python 3.11 (#375)
  (by @msullivan in 04b0da2 for #375)

* Codegen with the describe_query() API (#363)
  (by @fantix in 361221d for #363)

* Add codegen docs (#380)
  (by @colinhacks in 23dd42e for #380)

* Use typing_extension.Literal in codegen for Python 3.7
  (by @fantix in 6d0d6ab)

* Add __all__ to edgedb/__init__.py
  (by @fmoor in d3ef6d9)

Fixes
=====

* Improve duration parsing
  (by @fmoor in 241c80d)

* Tweak wording in query_single() error messages (#369)
  (by @msullivan in e24bb53 for #369)

* Fix flake tests on python3.7 (#371)
  (by @msullivan in 583e1cb for #371)

* Don't reject tuple arguments on the client side (#370)
  (by @msullivan in 09c950f for #370)

* Correct edgedb.Client.close() timeout behavior
  (by @fantix in 33a912c)

* Ping first if conn is idle for too long (#365)
  (by @fantix in 99cf78a for #365)

Deprecations
============

* Deprecate object links and simplify link property access (#379)
  (by @fantix in 2c5dcc7 for #379)
@fantix fantix mentioned this pull request Oct 21, 2022
fantix added a commit that referenced this pull request Oct 21, 2022
Changes
=======

* Implement dataclass for EdgeObject (#359)
  (by @fantix in dfb8c8b for #359)

* Redo edgedb basic types to inherit from builtin types (#366)
  (by @fantix in b11b991 for #366)

* Officially drop 3.6 support (#373)
  (by @msullivan in 7b76bc7 for #373)

* Support Python 3.11 (#375)
  (by @msullivan in 04b0da2 for #375)

* Codegen with the describe_query() API (#363)
  (by @fantix in 361221d for #363)

* Add codegen docs (#380)
  (by @colinhacks in 23dd42e for #380)

* Use typing_extension.Literal in codegen for Python 3.7
  (by @fantix in 6d0d6ab)

* Add __all__ to edgedb/__init__.py
  (by @fmoor in d3ef6d9)

Fixes
=====

* Improve duration parsing
  (by @fmoor in 241c80d)

* Tweak wording in query_single() error messages (#369)
  (by @msullivan in e24bb53 for #369)

* Fix flake tests on python3.7 (#371)
  (by @msullivan in 583e1cb for #371)

* Don't reject tuple arguments on the client side (#370)
  (by @msullivan in 09c950f for #370)

* Correct edgedb.Client.close() timeout behavior
  (by @fantix in 33a912c)

* Ping first if conn is idle for too long (#365)
  (by @fantix in 99cf78a for #365)

* Use @-prefixed keys in object codec for link properties (#384)
  (by @fantix in 68480f9 for #377)

Deprecations
============

* Deprecate object links and simplify link property access (#379)
  (by @fantix in 2c5dcc7 for #379)
@fantix fantix mentioned this pull request Oct 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants