Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: collation support, phase one #10605

Merged
merged 1 commit into from
Nov 14, 2016
Merged

sql: collation support, phase one #10605

merged 1 commit into from
Nov 14, 2016

Conversation

eisenstatdavid
Copy link

@eisenstatdavid eisenstatdavid commented Nov 10, 2016

Introduces minimal support for collation: a collated string type, the
COLLATE operator, and overloads for casting and comparing collated
strings.

Significantly, other overloads for collated strings are not provided,
since adding them without upgrading the type checker will confuse
overload resolution (c.f. the discussion of timestamp vs. timestamptz in
#9627). As a temporary workaround, cast to a string, call the string
overload, and collate back.

Tracking issue for collation: #2473


This change is Reviewable

@knz
Copy link
Contributor

knz commented Nov 11, 2016

:lgtm: This is very good. Just minor nits left.


Reviewed 10 of 11 files at r1.
Review status: 10 of 11 files reviewed at latest revision, 3 unresolved discussions.


pkg/sql/parser/datum.go, line 581 at r1 (raw file):

// environment.
func NewDCollatedString(contents string, locale string, env *CollationEnvironment) *DCollatedString {
  entry, ok := env.cache[locale]

I'd move this and the condition below onto a method (*CollactionEnvironment).GetCachedEntry(string).


pkg/sql/parser/eval.go, line 1938 at r1 (raw file):

      }
  }

No cast to a collated string? Is COLLATE mandatory to achieve this?


pkg/sql/parser/eval.go, line 1959 at r1 (raw file):

      return NewDCollatedString(d.Contents, expr.Locale, &ctx.collationEnv), nil
  default:
      panic(fmt.Sprintf("invalid argument to COLLATE: %s", d))

I didn't know you could re-type the collation this way. In which use cases is this useful?


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Nov 11, 2016

Reviewed 1 of 11 files at r1.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


Comments from Reviewable

@nvanbenschoten
Copy link
Member

:lgtm: with some nits.

This change clearly demonstrates that we need to start thinking about overload resolution again, now that we're adding more "non-orthogonal" types. We may want to consider replacing preferred overloads with preferred types within "type categories". This would solve the same problem that preferred overloads solves, but would also promote consistency between these preferences in a way that is easier to understand and is less fragile to future changes. Postgres has a similar notion (see typispreferred in https://www.postgresql.org/docs/9.6/static/catalog-pg-type.html), and I think it comes with a lot of benefits.


Reviewed 4 of 11 files at r1.
Review status: all files reviewed at latest revision, 8 unresolved discussions.


pkg/sql/parser/datum.go, line 580 at r1 (raw file):

// if locale is invalid. Do not call this function concurrently with the same
// environment.
func NewDCollatedString(contents string, locale string, env *CollationEnvironment) *DCollatedString {

long line. Also, it's more common to see types say "not safe for concurrent use" than "Do not call this function concurrently with the same"


pkg/sql/parser/datum.go, line 651 at r1 (raw file):

// Size implements the Datum interface.
func (d *DCollatedString) Size() uintptr {
  return unsafe.Sizeof(*d) + uintptr(len(d.Contents)) + uintptr(len(d.Locale)) + uintptr(len(d.key))

long line again


pkg/sql/parser/eval.go, line 891 at r1 (raw file):

          RightType: TypeCollatedString,
          fn: func(_ *EvalContext, left Datum, right Datum) (DBool, error) {
              return DBool(bytes.Equal(left.(*DCollatedString).key, right.(*DCollatedString).key)), nil

Long lines throughout.


pkg/sql/parser/expr.go, line 1060 at r1 (raw file):

}

var _ TypedExpr = &CollateExpr{}

No need for this, the return from TypeCheck will statically assert it.


pkg/sql/pgwire/types.go, line 86 at r1 (raw file):

  switch {
  case istype(parser.TypeCollatedString):
      return pgType{oid.T_text, -1}

Is this what Postgres returns? I'm seeing oid.T_unknown


Comments from Reviewable

Introduces minimal support for collation: a collated string type, the
COLLATE operator, and overloads for casting and comparing collated
strings.

Significantly, other overloads for collated strings are not provided,
since adding them without upgrading the type checker will confuse
overload resolution (c.f. the discussion of timestamp vs. timestamptz in
#9627). As a temporary workaround, cast to a string, call the string
overload, and collate back.

Tracking issue for collation: #2473
@eisenstatdavid
Copy link
Author

Review status: all files reviewed at latest revision, 8 unresolved discussions.


pkg/sql/parser/datum.go, line 580 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

long line. Also, it's more common to see types say "not safe for concurrent use" than "Do not call this function concurrently with the same"

Changed the wording. I noticed the long line (101 characters), but crlfmt wants it as is. _shrug_

pkg/sql/parser/datum.go, line 581 at r1 (raw file):

Previously, knz (kena) wrote…

I'd move this and the condition below onto a method (*CollactionEnvironment).GetCachedEntry(string).

Done.

pkg/sql/parser/datum.go, line 651 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

long line again

100 characters even is fine.

pkg/sql/parser/eval.go, line 891 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Long lines throughout.

They're at or near 100 characters, and breaking them would harm legibility.

pkg/sql/parser/eval.go, line 1938 at r1 (raw file):

Previously, knz (kena) wrote…

No cast to a collated string? Is COLLATE mandatory to achieve this?

For now, yes. We'd need a new ColumnType, which I didn't want to bite off with this patch, and the syntax for casting to a collated string would be specific to CRDB, since pg doesn't really think of collated strings as a distinct type from string.

pkg/sql/parser/eval.go, line 1959 at r1 (raw file):

Previously, knz (kena) wrote…

I didn't know you could re-type the collation this way. In which use cases is this useful?

I have no idea, but pg allows it. _shrug_

pkg/sql/parser/expr.go, line 1060 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No need for this, the return from TypeCheck will statically assert it.

Done.

pkg/sql/pgwire/types.go, line 86 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this what Postgres returns? I'm seeing oid.T_unknown

Thanks for checking, done.

Comments from Reviewable

@eisenstatdavid eisenstatdavid merged commit 172ae88 into cockroachdb:master Nov 14, 2016
@knz
Copy link
Contributor

knz commented Nov 14, 2016

You haven't addressed the comment at the start of Nathan's review. If you do not plan to do this here at least either explain why or file a follow-up issue.

@eisenstatdavid
Copy link
Author

I'll file a follow-up issue.


Comments from Reviewable

@jseldess
Copy link
Contributor

jseldess commented Dec 1, 2016

@eisenstatdavid, we still need to doc the new type, but I just regenerated the operators docs, and the collate operator isn't listed. Any ideas why?

@eisenstatdavid
Copy link
Author

It has its own syntax: 'x' COLLATE en.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants