Allow referencing previously defined named types #51

Merged on Nov 15, 2020 (1 commit)

@xico42 xico42 commented Nov 11, 2020

Once a named type has been defined, reusing it requires referencing it
by name (as with the primitive types).

Add support for NamedTypes, a way of referencing previously defined named
types, in both the schema builder and the schema generator.

As an example, the following snippet wouldn't work: the SomeDocument record ends up defined twice in the generated schema, so the schema cannot be parsed:

<?php

declare(strict_types=1);

namespace Acme\Types;

use FlixTech\AvroSerializer\Objects\Schema\Generation\Annotations as SerDe;

/**
 * @SerDe\AvroName("SomeDocument")
 * @SerDe\AvroType("record")
 */
class SomeDocument
{
    /**
     * @SerDe\AvroName("Status")
     * @SerDe\AvroType("enum", attributes={
     *     @SerDe\AvroSymbols({"Undefined", "NotAuthorized", "Canceled"}),
     *     @SerDe\AvroName("Status")
     * })
     */
    private $status;
}

<?php

declare(strict_types=1);

namespace Acme\Types;

use FlixTech\AvroSerializer\Objects\Schema\Generation\Annotations as SerDe;

/**
 * @SerDe\AvroType("record")
 * @SerDe\AvroName("SomeDocumentCdc")
 * @SerDe\AvroNamespace("org.acme.types")
 */
class SomeDocumentCdc
{
    /**
     * @SerDe\AvroName("NewDocument")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\Acme\Types\SomeDocument")
     * })
     */
    private $newDocument;

    /**
     * @SerDe\AvroName("OldDocument")
     * @SerDe\AvroType("null")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\Acme\Types\SomeDocument")
     * })
     */
    private $oldDocument;
}

With this PR we'll be able to do the following:

<?php

declare(strict_types=1);

namespace Acme\Types;

use FlixTech\AvroSerializer\Objects\Schema\Generation\Annotations as SerDe;

/**
 * @SerDe\AvroType("record")
 * @SerDe\AvroName("SomeDocumentCdc")
 * @SerDe\AvroNamespace("org.acme.types")
 */
class SomeDocumentCdc
{
    /**
     * @SerDe\AvroName("NewDocument")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\Acme\Types\SomeDocument")
     * })
     */
    private $newDocument;

    /**
     * @SerDe\AvroName("OldDocument")
     * @SerDe\AvroType("null")
     * @SerDe\AvroType("SomeDocument")
     */
    private $oldDocument;
}
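
To make the difference concrete, here is a minimal sketch of the schema shape the second version is aiming for, written as plain PHP arrays rather than with the library's builder. It assumes the full `SomeDocument` definition is emitted on first use and only its name afterwards; the variable names are illustrative and not part of the library.

```php
<?php

declare(strict_types=1);

// Full definition of the named record; it may appear only once in a schema.
$someDocument = [
    'type' => 'record',
    'name' => 'SomeDocument',
    'fields' => [
        [
            'name' => 'Status',
            'type' => [
                'type' => 'enum',
                'name' => 'Status',
                'symbols' => ['Undefined', 'NotAuthorized', 'Canceled'],
            ],
        ],
    ],
];

$cdc = [
    'type' => 'record',
    'name' => 'SomeDocumentCdc',
    'namespace' => 'org.acme.types',
    'fields' => [
        // First use: the record is defined inline.
        ['name' => 'NewDocument', 'type' => $someDocument],
        // Second use: a union of null and a reference to the record by name.
        ['name' => 'OldDocument', 'type' => ['null', 'SomeDocument']],
    ],
];

echo json_encode($cdc, JSON_PRETTY_PRINT), PHP_EOL;
```

Repeating the full `$someDocument` array in the `OldDocument` union would reproduce the failing case, because a named type cannot be defined a second time within one schema.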

xico42 commented Nov 11, 2020

@tPl0ch the CI error seems to be something related to uploading code coverage report, would you have any clues on how to solve that?

@tPl0ch tPl0ch left a comment

Hey @fcoedno,

when seeing your contributions I feel that this library is indeed very valuable to you and that sparks great joy in my heart! ❤️

Would you mind looking at some of the comments and questions I have left so that we can get this merged?

Regarding the CI failures, I believe that the failures are either a glitch of GitHub actions themselves, or something is wrong with the action provider. I will take care of this issue.

@@ -123,7 +123,9 @@ private function schemaFromTypes(Type ...$types): Schema

 return $this->applyAttributes($schema, $attributes);
 default:
-throw new \InvalidArgumentException('$type is not a valid avro type');
+$schema = Schema::named($type->getTypeName());

Collaborator

Do you also have the feeling that this method is doing multiple things? It processes both the collection of types and a single type. I believe it makes sense to split these into schemaFromTypeCollection and schemaFromSingleType.

Additionally, I am generally not a huge fan of large switch/case blocks. One possible alternative that I have used is a map of closures instead:

<?php

$typeProcessorMap = [
    TypeName::ENUM => function (SchemaAttributes $attributes) {
        return $this->applyAttributes(Schema::enum(), $attributes);
    },
   // ... more types
];

This map can be defined in the constructor of the SchemaGenerator class.
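
A self-contained sketch of that idea follows. It is not the library's code: `ClosureTypeMapper` is a hypothetical name, plain arrays stand in for the library's Schema objects, and the fallback for unknown names is an assumption modelled on the new default branch in the diff above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: plain arrays stand in for the library's Schema objects.
 */
final class ClosureTypeMapper
{
    /** @var array<string, \Closure> */
    private $processors;

    public function __construct()
    {
        // One closure per known type name replaces one switch/case branch.
        $this->processors = [
            'enum' => function (array $attributes): array {
                return ['type' => 'enum'] + $attributes;
            },
            'record' => function (array $attributes): array {
                return ['type' => 'record'] + $attributes;
            },
            // ... more types
        ];
    }

    /** Dispatches on the type name; unknown names become named references. */
    public function map(string $typeName, array $attributes = []): array
    {
        if (isset($this->processors[$typeName])) {
            return $this->processors[$typeName]($attributes);
        }

        // Assumed fallback: treat the name as a reference to a
        // previously defined named type instead of throwing.
        return ['type' => $typeName] + $attributes;
    }
}

$mapper = new ClosureTypeMapper();
print_r($mapper->map('enum', ['name' => 'Status']));
print_r($mapper->map('SomeDocument'));
```

Adding a new type then means adding one entry to the map rather than another case to a growing switch.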

Contributor Author

Indeed. I've refactored it and split it into two methods. I've also extracted the mapping logic into a separate class and followed your suggestion of using an associative array of closures.

@@ -45,6 +45,7 @@ public function it_should_generate_an_empty_record()
 ->namespace('org.acme');

 $this->assertEquals($expected, $schema);
+$schema->parse();

Collaborator

Why do we need to parse the schema after the expectation? Is there maybe another test assumption/condition missing?

Contributor Author

I wasn't sure either whether this was the best place to do it. My intention is just to make sure the generated schema object is parseable. Some checks and validations are applied only when we actually do the parsing.

So if anything is wrong, an exception would be raised at this point.

Another idea I had was to create a data provider that would provide the class with the annotations and the expected schema. Then we would have two tests: One to actually see if the generated schema matches the expected one and the other to make sure the generated schema parses without errors.

What do you think?

Collaborator

I believe that testing for the failure conditions should be a separate test. That test can use a data provider to provide invalid edge cases and make sure that an exception is thrown. Right now I believe the tests are mixing up different responsibilities - testing that no exception is thrown and the valid parsing of annotations.
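
A framework-free sketch of that separation is shown here. `parseRecord`, `invalidRecords`, and `failsToParse` are hypothetical names, and the duplicate-field rule is only a stand-in for whatever validation the real parser performs. In PHPUnit the provider would normally be attached to the test method with the `@dataProvider` annotation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for schema parsing: rejects a record that
 * declares the same field name twice.
 */
function parseRecord(array $record): array
{
    $seen = [];
    foreach ($record['fields'] as $field) {
        if (isset($seen[$field['name']])) {
            throw new InvalidArgumentException('Duplicate field: ' . $field['name']);
        }
        $seen[$field['name']] = true;
    }

    return $record;
}

/** Data provider: case name => list of arguments for the failure test. */
function invalidRecords(): array
{
    return [
        'duplicate field' => [
            ['name' => 'R', 'fields' => [['name' => 'a'], ['name' => 'a']]],
        ],
    ];
}

/** Returns true when parsing the record raises the expected exception. */
function failsToParse(array $record): bool
{
    try {
        parseRecord($record);
    } catch (InvalidArgumentException $e) {
        return true;
    }

    return false;
}

// The failure test only consumes invalid cases; equality checks on valid
// schemas live in a separate test.
foreach (invalidRecords() as $case => [$record]) {
    printf("%s: %s\n", $case, failsToParse($record) ? 'rejected' : 'accepted');
}
```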

Contributor Author

Indeed. Also, in this case testing for failures and expecting exceptions wouldn't make much sense, because parsing and validating the schema is the responsibility of the avro-php package. I'm going to remove these "parse" calls.

@@ -156,9 +159,17 @@ public function it_should_generate_records_containing_records()
->name('SimpleRecord')
->namespace('org.acme')
->field('intType', Schema::int(), Schema\Record\FieldOption::default(42))
)
->field(
'union',

Collaborator

Nit: Should use the name constant.

Contributor Author

I don't think so. Here 'union' does not represent the union type itself; it is just the name of the field. Maybe it would be best to give the field another name to avoid confusion?

Collaborator

The name did indeed confuse me. I like the idea of renaming it to something that indicates an example fieldname rather than the type.

tPl0ch commented Nov 12, 2020

> @tPl0ch the CI error seems to be something related to uploading code coverage report, would you have any clues on how to solve that?

This seems to be a limitation of GitHub Actions: repository secrets are not shared with pipelines running in forks. I will refactor the actions at a later time and would accept this PR even with this step failing.

use FlixTech\AvroSerializer\Objects\Schema\AttributeName;
use FlixTech\AvroSerializer\Objects\Schema\TypeName;

class TypeMapper

Collaborator

Nit: This could be a final class

/**
* @param array<Type> $types
*/
private function schemaFromTypes(array $types): Schema

Collaborator

Could you explain to me why you changed the signature to a simple array over using the variadic type argument?

private function schemaFromTypes(Type ...$types): Schema

Contributor Author

Simply to avoid unpacking the arrays when passing arguments to this function. But on second thought, we lose stricter type checking by doing so. Let me revert this change.
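
The trade-off can be sketched in isolation. The `Type` class and both functions here are hypothetical stand-ins rather than the generator's real code; the point is only that a variadic `Type ...$types` parameter is checked by PHP on every argument, while `array $types` documents the element type in a docblock that is never enforced.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for the generator's Type value object. */
final class Type
{
    /** @var string */
    private $typeName;

    public function __construct(string $typeName)
    {
        $this->typeName = $typeName;
    }

    public function getTypeName(): string
    {
        return $this->typeName;
    }
}

/**
 * Array signature: the element type lives only in the docblock,
 * so PHP accepts any array contents at call time.
 *
 * @param array<Type> $types
 */
function countFromArray(array $types): int
{
    return count($types);
}

/** Variadic signature: PHP checks every argument against Type. */
function countFromVariadic(Type ...$types): int
{
    return count($types);
}

$mixed = [new Type('null'), 'SomeDocument']; // second element is a plain string

echo countFromArray($mixed), PHP_EOL; // accepted without complaint

try {
    countFromVariadic(...$mixed); // the caller has to unpack the array
} catch (TypeError $e) {
    echo 'Rejected: ', get_class($e), PHP_EOL;
}
```

The cost of the variadic form is the `...` unpacking at each call site, which is the inconvenience the array signature was meant to avoid.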

@xico42 xico42 force-pushed the named-types branch 2 times, most recently from 1c67fe5 to b0118da, on November 15, 2020 11:31

xico42 commented Nov 15, 2020

Besides fixing the named types problem itself, I've also found another bug: it was not possible to add extra custom attributes to a record type from the field definition:

<?php

    /**
     * @SerDe\AvroName("simpleField")
     * @SerDe\AvroType("record", attributes={
     *     @SerDe\AvroTargetClass("\FlixTech\AvroSerializer\Test\Objects\Schema\Generation\Fixture\SimpleRecord"),
     *     @SerDe\AvroDoc("This a simple record for testing purposes")
     * })
     */
    private $simpleRecord;

Once a named type has been defined, reusing it requires referencing it
by name (as with the primitive types).

Add support for NamedTypes, a way of referencing previously defined named
types, in both the schema builder and the schema generator.

Add support for adding extra attributes to records in the field's
definition.

@tPl0ch tPl0ch left a comment

LGTM

@tPl0ch tPl0ch merged commit 322fcd6 into flix-tech:master Nov 15, 2020

tPl0ch commented Nov 15, 2020

@fcoedno Your work is awesome and much appreciated! ❤️
