Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: type variant information isn't propagated past scalar expressions #48613

Open
jordanlewis opened this issue May 8, 2020 · 4 comments
Open
Labels
A-tools-efcore C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@jordanlewis
Copy link
Member

jordanlewis commented May 8, 2020

Any time a query does a scalar expression that produces a variant of a particular type (for example, see #48563 in which we cast a string to a "char"), it loses the type variant information including its OID or width.

This is because many things that need the type of an expression use the ResolvedType method on Expr, which when given a Datum, returns the "base type" of the type family that the Datum belongs to. For example, a DString always returns types.String, even if it had been created with a cast to a string type with a width limit. However, we really would prefer that the resultant Datum keep its width and other information (like a custom OID) or we end up with problems like in #48563.

This is kind of tricky, though, because we like calling ResolvedType everywhere - and a DString doesn't have any 'information' in its struct, besides the string that it holds - no room to hold type information at all.

I think one way to fix this could be passing type information in a "side channel" everywhere we currently pass rows of data, and banning calls to ResolvedType, but this is kind of a big project. Alternatively, we use things like DOidWrapper more aggressively, or make fresh Datum types much more frequently. For the latter approach, though, you still can't represent polymorphic types that don't have a fixed number of variants - since you can't generate them up front. An example of this is strings with a fixed length.

Epic CRDB-2474
Jira issue: CRDB-4309

@jordanlewis jordanlewis added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label May 8, 2020
@rohany
Copy link
Contributor

rohany commented May 9, 2020

I've already made the enum types take in a T upon creation and hold on to it. Perhaps all the datums should too? In these cases we definitely have the type information before making the datum itself. Or is this your second suggestion already?

@jordanlewis
Copy link
Member Author

Not all the Datums should - it's just expensive for no reason IMO. ResultColumns should always be around with the correct types. I think we have the right infrastructure to solve this, just some missing pieces. I opened #48619 to try to solve a big missing piece here.

@jordanlewis
Copy link
Member Author

@RaduBerinde I think we can mark this as fixed with your PR

@RaduBerinde
Copy link
Member

I think we still need the rest of #48619, and also more work inside the optimizer (we are relying on ResolvedType() in too many places).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tools-efcore C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

5 participants