Allow referencing previously defined named types #51

xico42 · 2020-11-11T21:53:03Z

Once a named type is defined, the only way to reuse it is to reference
it by name (as with the primitive types).

Add support for NamedTypes, a way of referencing previously defined named
types, in both the schema builder and the schema generator.

As an example, the following snippet wouldn't work, because the record type is defined a second time instead of being referenced by name, so the schema cannot be parsed:

<?php

declare(strict_types=1);

namespace Acme\Types;

use FlixTech\AvroSerializer\Objects\Schema\Generation\Annotations as SerDe;

/**
 * @SerDe\AvroName("SomeDocument")
 * @SerDe\AvroType("record")
 */
class SomeDocument
{
    /**
     * @SerDe\AvroName("Status")
     * @SerDe\AvroType("enum", attributes={
     *     @SerDe\AvroSymbols({"Undefined", "NotAuthorized", "Canceled"}),
     *     @SerDe\AvroName("Status")
     * })
     */
    private $status;
}

<?php

declare(strict_types=1);

namespace Acme\Types;

use FlixTech\AvroSerializer\Objects\Schema\Generation\Annotations as SerDe;

/**
 * @SerDe\AvroType("record")
 * @SerDe\AvroName("SomeDocumentCdc")
 * @SerDe\AvroNamespace("org.acme.types")
 */
class SomeDocumentCdc
{
    /**
     * @SerDe\AvroName("NewDocument")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\Acme\Types\SomeDocument")
     * })
     */
    private $newDocument;

    /**
     * @SerDe\AvroName("OldDocument")
     * @SerDe\AvroType("null")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\Acme\Types\SomeDocument")
     * })
     */
    private $oldDocument;
}

With this PR we'll be able to do the following:

<?php

declare(strict_types=1);

namespace Acme\Types;

use FlixTech\AvroSerializer\Objects\Schema\Generation\Annotations as SerDe;

/**
 * @SerDe\AvroType("record")
 * @SerDe\AvroName("SomeDocumentCdc")
 * @SerDe\AvroNamespace("org.acme.types")
 */
class SomeDocumentCdc
{
    /**
     * @SerDe\AvroName("NewDocument")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\Acme\Types\SomeDocument")
     * })
     */
    private $newDocument;

    /**
     * @SerDe\AvroName("OldDocument")
     * @SerDe\AvroType("null")
     * @SerDe\AvroType("SomeDocument")
     */
    private $oldDocument;
}

xico42 · 2020-11-11T22:10:56Z

@tPl0ch the CI error seems to be something related to uploading code coverage report, would you have any clues on how to solve that?

tPl0ch

Hey @fcoedno,

when seeing your contributions I feel that this library is indeed very valuable to you and that sparks great joy in my heart! ❤️

Would you mind looking at some of the comments and questions I have left so that we can get this merged?

Regarding the CI failures, I believe they are either a glitch of GitHub Actions itself, or something is wrong with the action provider. I will take care of this issue.

tPl0ch · 2020-11-12T08:40:35Z

src/Objects/Schema/Generation/SchemaGenerator.php

@@ -123,7 +123,9 @@ private function schemaFromTypes(Type ...$types): Schema

                return $this->applyAttributes($schema, $attributes);
            default:
-                throw new \InvalidArgumentException('$type is not a valid avro type');
+                $schema = Schema::named($type->getTypeName());


Do you also have the feeling that this method is doing multiple things? It processes both the collection of types and a single type. I believe it makes sense to split these into schemaFromTypeCollection and schemaFromSingleType.

Additionally, I am generally not a huge fan of large switch/case blocks. One possible alternative that I have used is a map of closures:

<?php

$typeProcessorMap = [
    TypeName::ENUM => function (SchemaAttributes $attributes) {
        return $this->applyAttributes(Schema::enum(), $attributes);
    },
    // ... more types
];

This map can be defined in the constructor of the SchemaGenerator class.

Indeed. I've refactored it and split it into two methods. I've also extracted the mapping logic into a separate class, following your suggestion of an associative array of closures.
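For readers following the thread, here is a minimal sketch of the map-of-closures idea discussed above. It is not the code that was merged: TypeMapperSketch, toSchema, and the plain arrays standing in for Schema objects are all hypothetical, chosen only so the example runs on its own. The fallback for unknown type names mirrors the intent of the diff (treat the name as a reference to a previously defined named type), but its exact shape here is an assumption.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch: TypeMapperSketch and its array-based "schemas" are
// illustrative stand-ins, not the library's TypeMapper or Schema classes.
final class TypeMapperSketch
{
    /** @var array<string, callable> */
    private $processors;

    public function __construct()
    {
        // One closure per known Avro type name, replacing a switch/case block.
        // Closures created inside a method are bound to $this automatically.
        $this->processors = [
            'enum' => function (array $attributes): array {
                return $this->applyAttributes(['type' => 'enum'], $attributes);
            },
            'record' => function (array $attributes): array {
                return $this->applyAttributes(['type' => 'record'], $attributes);
            },
        ];
    }

    /**
     * @param array<string, mixed> $attributes
     * @return array<string, mixed>
     */
    public function toSchema(string $typeName, array $attributes = []): array
    {
        if (isset($this->processors[$typeName])) {
            return ($this->processors[$typeName])($attributes);
        }

        // Assumption: any unknown name is treated as a reference to a
        // previously defined named type instead of raising an exception.
        return ['type' => $typeName];
    }

    /**
     * @param array<string, mixed> $schema
     * @param array<string, mixed> $attributes
     * @return array<string, mixed>
     */
    private function applyAttributes(array $schema, array $attributes): array
    {
        return array_merge($schema, $attributes);
    }
}

$mapper = new TypeMapperSketch();
$enum = $mapper->toSchema('enum', ['name' => 'Status', 'symbols' => ['Undefined', 'Canceled']]);
$reference = $mapper->toSchema('SomeDocument');
```

Adding a new type then means adding one entry to the map rather than another case branch, which is presumably what makes the constructor a convenient place to define it.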

tPl0ch · 2020-11-12T08:41:33Z

test/Objects/Schema/Generation/SchemaGeneratorTest.php

@@ -45,6 +45,7 @@ public function it_should_generate_an_empty_record()
            ->namespace('org.acme');

        $this->assertEquals($expected, $schema);
+        $schema->parse();


Why do we need to parse the schema after the expectation? Is there maybe another test assumption/condition missing?

I wasn't sure either whether this was the best place to do it. My intention is just to make sure the generated schema object is parseable. Some checks and validations are applied only when we actually do the parsing.

So if anything is wrong, an exception would be raised at this point.

Another idea I had was to create a data provider that would provide the class with the annotations and the expected schema. Then we would have two tests: one to check that the generated schema matches the expected one, and another to make sure the generated schema parses without errors.

What do you think?

I believe that testing for the failure conditions should be a separate test. That test can use a data provider to provide invalid edge cases and make sure that an exception is thrown. Right now I believe the tests are mixing up different responsibilities - testing that no exception is thrown and the valid parsing of annotations.

Indeed. Also, in this case, testing for failures and expecting exceptions wouldn't make much sense, because parsing and validating the schema is the responsibility of the avro-php package. I'm going to remove these "parse" calls.

tPl0ch · 2020-11-12T08:42:44Z

test/Objects/Schema/Generation/SchemaGeneratorTest.php

@@ -156,9 +159,17 @@ public function it_should_generate_records_containing_records()
                    ->name('SimpleRecord')
                    ->namespace('org.acme')
                    ->field('intType', Schema::int(), Schema\Record\FieldOption::default(42))
+            )
+            ->field(
+                'union',


Nit: Should use the name constant.

I don't think so. Here "union" does not represent the union type itself; it is just the name of the field. Maybe it would be best to give the field another name to avoid confusion?

The name did indeed confuse me. I like the idea of renaming it to something that indicates an example field name rather than the type.

tPl0ch · 2020-11-12T15:07:15Z

@tPl0ch the CI error seems to be something related to uploading code coverage report, would you have any clues on how to solve that?

This seems to be a limitation of GitHub Actions: repository secrets are not shared with pipelines running from forks. I will refactor the actions at a later time and would accept this PR even with this step failing.

tPl0ch · 2020-11-15T07:03:23Z

src/Objects/Schema/Generation/TypeMapper.php

+use FlixTech\AvroSerializer\Objects\Schema\AttributeName;
+use FlixTech\AvroSerializer\Objects\Schema\TypeName;
+
+class TypeMapper


Nit: This could be a final class

tPl0ch · 2020-11-15T07:06:24Z

src/Objects/Schema/Generation/SchemaGenerator.php

+    /**
+     * @param array<Type> $types
+     */
+    private function schemaFromTypes(array $types): Schema


Could you explain to me why you changed the signature to a simple array over using the variadic type argument?

private function schemaFromTypes(Type ...$types): Schema

Simply to avoid unpacking the arrays when passing arguments to this function. But on second thought, we lose stricter type checking by doing so. Let me revert this change.
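The trade-off in this thread can be illustrated with a small, self-contained sketch. The Type class below is a hypothetical stand-in for the library's annotation type, and namesFromVariadic and countFromArray are invented for the example. The point is only that a variadic `Type ...$types` parameter is checked element by element at call time, while an `array` parameter documented as `array<Type>` is not.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the library's Type annotation class.
final class Type
{
    private $typeName;

    public function __construct(string $typeName)
    {
        $this->typeName = $typeName;
    }

    public function getTypeName(): string
    {
        return $this->typeName;
    }
}

// Variadic signature: PHP checks every argument against Type at call time.
function namesFromVariadic(Type ...$types): array
{
    return array_map(function (Type $type): string {
        return $type->getTypeName();
    }, $types);
}

/**
 * Array signature: only "is it an array?" is enforced at runtime;
 * the element type exists in the docblock alone.
 *
 * @param array<Type> $types
 */
function countFromArray(array $types): int
{
    return count($types);
}

$types = [new Type('null'), new Type('SomeDocument')];

// The caller has to unpack the array for the variadic version.
$names = namesFromVariadic(...$types);

// A wrong element slips through the array version unnoticed.
$accepted = countFromArray(['not a Type']);

// The variadic version rejects the same input with a TypeError.
$rejected = false;
try {
    namesFromVariadic(...['not a Type']);
} catch (\TypeError $e) {
    $rejected = true;
}
```

The cost of the variadic form is the `...` unpacking at each call site, which is the inconvenience mentioned above.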

xico42 · 2020-11-15T11:35:09Z

Besides fixing the named types problem itself, I've also found another bug: it was not possible to define extra custom attributes for a record type from the field definition:

<?php

    /**
     * @SerDe\AvroName("simpleField")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\FlixTech\AvroSerializer\Test\Objects\Schema\Generation\Fixture\SimpleRecord"),
     *     @SerDe\AvroDoc("This a simple record for testing purposes")
     * })
     */
    private $simpleRecord;

Once a named type is defined, the only way to reuse it is to reference it by name (as with the primitive types).

Add support for NamedTypes, a way of referencing previously defined named types, in both the schema builder and the schema generator.

Add support for extra attributes on records in the field's definition.

tPl0ch

LGTM

tPl0ch · 2020-11-15T19:22:47Z

@fcoedno Your work is awesome and much appreciated! ❤️

tPl0ch requested changes Nov 12, 2020

xico42 force-pushed the named-types branch from 563a74c to c37999c · November 13, 2020 18:11

tPl0ch requested changes Nov 15, 2020

xico42 force-pushed the named-types branch 2 times, most recently from 1c67fe5 to b0118da · November 15, 2020 11:31

xico42 force-pushed the named-types branch from b0118da to eff6efb · November 15, 2020 11:36

tPl0ch approved these changes Nov 15, 2020

tPl0ch merged commit 322fcd6 into flix-tech:master Nov 15, 2020
