Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider checking the type of a bad row before parsing directly with the right decoder #35

Closed
benjben opened this issue Apr 6, 2020 · 3 comments
Labels
duplicate This issue or pull request already exists enhancement New feature or request question Further information is requested

Comments

@benjben
Copy link
Contributor

benjben commented Apr 6, 2020

At the moment we use .as[BadRow] to parse a Json to its bad row case class.
This is based on ADTs encoding and decoding.
Given the number of classes and fields in the bad rows it might be more performant to first check the schema of the bad row (in "schema" field) and then use directly the right Decoder, e.g.

schema match {
  case Schemas.EnrichmentFailures.toSchemaUri =>
    json.as[EnrichmentFailures]
  ...
}
@benjben benjben added enhancement New feature or request question Further information is requested labels Apr 6, 2020
@chuwy
Copy link
Contributor

chuwy commented Apr 6, 2020

Yup, that's a good idea. But I believe it is duplicated ticket: #28

@chuwy
Copy link
Contributor

chuwy commented Apr 6, 2020

At the same time, I think it would be better to leave Decoder[BadRow] like it is now, because .as[A] decoding implies that structure is compatible with A, while what we really can decode it SelfDescribingData[BadRow]

@benjben
Copy link
Contributor Author

benjben commented Apr 6, 2020

But I believe it is duplicated ticket

Arf it is indeed, sorry.

because .as[A] decoding implies that structure is compatible with A, while what we really can decode it SelfDescribingData[BadRow]

Good point. So we would need a parse function as you suggested in #28

def parse(json: Json): Either[String, BadRow] =
  for {
    sdj <- SelfDescribingData
      .parse(json)
      .leftMap(e => s"Cannot parse ${json.noSpaces} as self-describing JSON, ${e.code}")
    br <-  sdj.schema match {
      case Schemas.EnrichmentFailures.toSchemaUri ->
        sdj.data.as[EnrichmentFailures].leftMap(_.getMessage)
     case Schemas.SchemaViolations.toSchemaUri ->
        sdj.data.as[SchemaViolations].leftMap(_.getMessage)
     ...
  } yield br

We should also add one to parse directly from a String.

@benjben benjben closed this as completed Apr 6, 2020
@chuwy chuwy added the duplicate This issue or pull request already exists label Apr 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants