Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: unclear behavior if using a collation that's not present in pg_collation #54817

Open
timgraham opened this issue Sep 25, 2020 · 4 comments
Labels
A-sql-pgcatalog C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-blathers-triaged blathers was able to find an owner

Comments

@timgraham
Copy link
Contributor

timgraham commented Sep 25, 2020

Describe the problem

The behavior of using a collation (sv_x_icu in this example) that doesn't appear in pg_collation is unclear.

To Reproduce

CREATE TABLE schema_author (
    id INT8 NOT NULL DEFAULT unique_rowid(),
    name VARCHAR(255) NOT NULL,
    height INT8 NULL,
    weight INT8 NULL,
    uuid UUID NULL,
    alias VARCHAR(255) COLLATE sv_x_icu NOT NULL,
    CONSTRAINT "primary" PRIMARY KEY (id ASC),
    FAMILY "primary" (id, name, height, weight, uuid, alias),
    CONSTRAINT check_height CHECK (height >= 0)
)

The collation of the alias column is reported as NULL:

SELECT
    a.attname AS column_name,
    NOT (a.attnotnull OR (t.typtype = 'd' AND t.typnotnull)) AS is_nullable,
    pg_get_expr(ad.adbin, ad.adrelid) AS column_default,
    CASE WHEN collname = 'default' THEN NULL ELSE collname END AS collation
FROM pg_attribute a
LEFT JOIN pg_attrdef ad ON a.attrelid = ad.adrelid AND a.attnum = ad.adnum
LEFT JOIN pg_collation co ON a.attcollation = co.oid
JOIN pg_type t ON a.atttypid = t.oid
JOIN pg_class c ON a.attrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.relkind IN ('f', 'm', 'p', 'r', 'v')
    AND c.relname = 'schema_author'
    AND n.nspname NOT IN ('pg_catalog', 'pg_toast')
    AND pg_catalog.pg_table_is_visible(c.oid)

  column_name | is_nullable | column_default | collation
--------------+-------------+----------------+------------
  uuid        |    true     | NULL           | NULL
  id          |    false    | unique_rowid() | NULL
  height      |    true     | NULL           | NULL
  weight      |    true     | NULL           | NULL
  name        |    false    | NULL           | en-US
  alias       |    false    | NULL           | NULL

Expected behavior

The collation is reported correctly when instead using a collation that appears in pg_collation.collname such as sv.

If sv_x_icu is supported, should it be present in pg_collation? If it's not supported, perhaps an error should be reported when trying to use it in CREATE TABLE etc.

Incidentally, the CockroachDB collation docs aren't clear to me. It mentions "the BCP 47 language tag at the end of each line, immediately preceded by //" but the reference document seems to have changed since that was written as I don't see such lines. While researching this, I found a documentation issue to "Document special colations".

Environment:

  • CockroachDB version 20.2.0-alpha.1-3636-g3c81612122 @ 2020/09/24 20:37:23

Additional context

Django 3.2 adds support for column collations.

Jira issue: CRDB-3711

@blathers-crl
Copy link

blathers-crl bot commented Sep 25, 2020

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Sep 25, 2020
@otan
Copy link
Contributor

otan commented Nov 12, 2020

This collation compatibility needs a larger re-think than a quick fix I can apply here.

We do not normalise collation names in any way, making them impossible to join against the pg_collation table (#56352 makes this go further from making this work).

Furthermore, allowing any BCP47 extension (https://www.w3.org/International/multilingualweb/dublin/slides/23b-davis.pdf) means there's an infinite amount of possibilities which would make this impossible to join unless we start joining against a builtin which normalises these names and joining on that. Postgres seems to very deliberately only join a set list, which makes this hard to imitate.

@otan otan removed their assignment Nov 12, 2020
@otan
Copy link
Contributor

otan commented Nov 12, 2020

retriaging this as it's a bigger task that requires more thought.

@rafiss
Copy link
Collaborator

rafiss commented Nov 16, 2020

Maybe we should introduce a builtin that removes the extension from the collation name, and use that to join with.

@rafiss rafiss added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcatalog C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

No branches or pull requests

3 participants