Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: mysql order doesn't respec ASCII values #9439

Closed
1 task done
ncclementi opened this issue Jun 25, 2024 · 5 comments
Closed
1 task done

bug: mysql order doesn't respec ASCII values #9439

ncclementi opened this issue Jun 25, 2024 · 5 comments
Labels
mssql The Microsoft SQL Server backend mysql The MySQL backend not caught by tests hubris

Comments

@ncclementi
Copy link
Contributor

What happened?

In mysql when ordering a column that has uppercase and lowercase strings, the order doesn't follow ASCII orders.

Minimal reproducer:

  • just up mysql
import ibis
from ibis import _
import os

MYSQL_USER = os.environ.get("IBIS_TEST_MYSQL_USER", "ibis")
MYSQL_PASS = os.environ.get("IBIS_TEST_MYSQL_PASSWORD", "ibis")
MYSQL_HOST = os.environ.get("IBIS_TEST_MYSQL_HOST", "localhost")
MYSQL_PORT = int(os.environ.get("IBIS_TEST_MYSQL_PORT", 3306))
IBIS_TEST_MYSQL_DB = os.environ.get("IBIS_TEST_MYSQL_DATABASE", "ibis_testing")

con = ibis.mysql.connect(
user=MYSQL_USER,
password=MYSQL_PASS,
host=MYSQL_HOST,
port=MYSQL_PORT,
database=IBIS_TEST_MYSQL_DB,
)

 t = ibis.memtable(
 {
 "col1": [1, 3, 2, 1, 3, 1],
 "col2": ["a","a", "B", "c", "D", "a"],
 }
 )
 
expr = t.order_by([_.col2.desc()])

con.execute(expr)

   col1 col2
0     3    D
1     1    c
2     2    B
3     1    a
4     3    a
5     1    a
  

What version of ibis are you using?

main

What backend(s) are you using, if any?

mysql

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@ncclementi ncclementi added the bug Incorrect behavior inside of ibis label Jun 25, 2024
@cpcloud
Copy link
Member

cpcloud commented Jun 25, 2024

This is almost certainly a MySQL-ism and not something we can control. I vaguely recall some nonsense around collation being case-INsensitive.

@cpcloud
Copy link
Member

cpcloud commented Jun 25, 2024

Ok, well, it IS as MySQL-ism, but perhaps we can control it :)

https://dev.mysql.com/doc/refman/8.4/en/sorting-rows.html

image

Adding in the text of the above for searchability:

On character type columns, sorting—like all other comparison operations—is normally performed in a case-insensitive fashion. This means that the order is undefined for columns that are identical except for their case. You can force a case-sensitive sort for a column by using BINARY like so: ORDER BY BINARY col_name.

@ncclementi
Copy link
Contributor Author

Based on this #9385 (comment). Should we close this one? or Does mysql still deserves the chance to try to sort this out? (pun intended)

@gforsyth gforsyth added mysql The MySQL backend requires upstream support Feature or bug requires support from the upstream project mssql The Microsoft SQL Server backend not caught by tests hubris and removed bug Incorrect behavior inside of ibis requires upstream support Feature or bug requires support from the upstream project labels Jun 26, 2024
@gforsyth
Copy link
Member

Our test suite wasn't catching this because we don't have mixed-case strings in our test data.

@gforsyth
Copy link
Member

Closing this out as it is the defined behavior in MySQL -- to get ASCII sorting, first cast to binary.

@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mssql The Microsoft SQL Server backend mysql The MySQL backend not caught by tests hubris
Projects
Archived in project
Development

No branches or pull requests

3 participants