Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support 'vector' datatype of pgvector #9025

Closed
1 task done
jieguangzhou opened this issue Apr 19, 2024 · 11 comments · Fixed by #9037
Closed
1 task done

feat: Support 'vector' datatype of pgvector #9025

jieguangzhou opened this issue Apr 19, 2024 · 11 comments · Fixed by #9037
Assignees
Labels
datatypes Issues relating to ibis's datatypes (under `ibis.expr.datatypes`) feature Features or general enhancements postgres The PostgreSQL backend

Comments

@jieguangzhou
Copy link

jieguangzhou commented Apr 19, 2024

Is your feature request related to a problem?

Is there any way to create vector type data on PostgreSQL using ibis?

I have created a table using pgvector, but I cannot access it through ibis, although I can access it using raw SQL. However, I cannot access the table.

Raw SQL

In [18]: query = "SELECT * FROM items"

In [19]: list(conn.raw_sql(query))
Out[19]: [(1, '[1,2,3]')]

Accecc the table

In [20]: conn.table('items')
---------------------------------------------------------------------------
ParseError                                Traceback (most recent call last)
File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/sqlglot/parser.py:1073, in Parser.parse_into(self, expression_types, raw_tokens, sql)
   1072 try:
-> 1073     return self._parse(parser, raw_tokens, sql)
   1074 except ParseError as e:

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/sqlglot/parser.py:1112, in Parser._parse(self, parse_method, raw_tokens, sql)
   1111 if self._index < len(self._tokens):
-> 1112     self.raise_error("Invalid expression / Unexpected token")
   1114 self.check_errors()

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/sqlglot/parser.py:1153, in Parser.raise_error(self, message, token)
   1152 if self.error_level == ErrorLevel.IMMEDIATE:
-> 1153     raise error
   1155 self.errors.append(error)

ParseError: Invalid expression / Unexpected token. Line 1, Col: 6.
  vector(3)

The above exception was the direct cause of the following exception:

ParseError                                Traceback (most recent call last)
Cell In[20], line 1
----> 1 conn.table('items')

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:688, in BaseAlchemyBackend.table(self, name, database, schema)
    663 """Create a table expression from a table in the database.
    664
    665 Parameters
   (...)
    684     Table expression
    685 """
    686 namespace = ops.Namespace(schema=schema, database=database)
--> 688 sqla_table = self._get_sqla_table(name, namespace=namespace)
    690 schema = self._schema_from_sqla_table(
    691     sqla_table, schema=self._schemas.get(name)
    692 )
    693 node = ops.DatabaseTable(
    694     name=name, schema=schema, source=self, namespace=namespace
    695 )

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:533, in BaseAlchemyBackend._get_sqla_table(self, name, namespace, autoload, **_)
    531 if not nulltype_cols:
    532     return table
--> 533 return self._handle_failed_column_type_inference(table, nulltype_cols)

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/base/sql/alchemy/__init__.py:586, in BaseAlchemyBackend._handle_failed_column_type_inference(self, table, nulltype_cols)
    577 dialect = self.con.dialect
    579 quoted_name = ".".join(
    580     map(
    581         dialect.identifier_preparer.quote,
    582         filter(None, [table.schema, table.name]),
    583     )
    584 )
--> 586 for colname, dtype in self._metadata(quoted_name):
    587     if colname in nulltype_cols:
    588         # replace null types discovered by sqlalchemy with non null
    589         # types
    590         table.append_column(
    591             sa.Column(
    592                 colname,
   (...)
    597             replace_existing=True,
    598         )

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/postgres/__init__.py:278, in Backend._metadata(self, query)
    276 con.exec_driver_sql(f"CREATE TEMPORARY VIEW {name} AS {query}")
    277 try:
--> 278     yield from (
    279         (col, PostgresType.from_string(typestr))
    280         for col, typestr in con.execute(text)
    281     )
    282 finally:
    283     con.exec_driver_sql(f"DROP VIEW IF EXISTS {name}")

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/postgres/__init__.py:279, in <genexpr>(.0)
    276 con.exec_driver_sql(f"CREATE TEMPORARY VIEW {name} AS {query}")
    277 try:
    278     yield from (
--> 279         (col, PostgresType.from_string(typestr))
    280         for col, typestr in con.execute(text)
    281     )
    282 finally:
    283     con.exec_driver_sql(f"DROP VIEW IF EXISTS {name}")

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/postgres/datatypes.py:88, in PostgresType.from_string(cls, type_string)
     86 @classmethod
     87 def from_string(cls, type_string: str) -> PostgresType:
---> 88     return SqlglotPostgresType.from_string(type_string)

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/ibis/backends/base/sqlglot/datatypes.py:175, in SqlglotType.from_string(cls, text, nullable)
    172 if dtype := cls.unknown_type_strings.get(text.lower()):
    173     return dtype
--> 175 sgtype = sg.parse_one(text, into=sge.DataType, read=cls.dialect)
    176 return cls.to_ibis(sgtype, nullable=nullable)

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/sqlglot/__init__.py:123, in parse_one(sql, read, dialect, into, **opts)
    120 dialect = Dialect.get_or_raise(read or dialect)
    122 if into:
--> 123     result = dialect.parse_into(into, sql, **opts)
    124 else:
    125     result = dialect.parse(sql, **opts)

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/sqlglot/dialects/dialect.py:447, in Dialect.parse_into(self, expression_type, sql, **opts)
    444 def parse_into(
    445     self, expression_type: exp.IntoType, sql: str, **opts
    446 ) -> t.List[t.Optional[exp.Expression]]:
--> 447     return self.parser(**opts).parse_into(expression_type, self.tokenize(sql), sql)

File ~/workspace/SuperDuperDB/superduperdb/env/lib/python3.10/site-packages/sqlglot/parser.py:1078, in Parser.parse_into(self, expression_types, raw_tokens, sql)
   1075         e.errors[0]["into_expression"] = expression_type
   1076         errors.append(e)
-> 1078 raise ParseError(
   1079     f"Failed to parse '{sql or raw_tokens}' into {expression_types}",
   1080     errors=merge_errors(errors),
   1081 ) from errors[-1]

ParseError: Failed to parse 'vector(3)' into <class 'sqlglot.expressions.DataType'>

What is the motivation behind your request?

  1. I want to create a table with vector datatype in Postgresql
  2. I want to access the table with vector datatype in Postgresql

Describe the solution you'd like

I want to create a vector type using dtype("pgvector") or dtype('vector')

What version of ibis are you running?

❯ pip list |grep ibis
ibis                          3.3.0
ibis-framework                8.0.0

What backend(s) are you using, if any?

Postgresql

Code of Conduct

  • I agree to follow this project's Code of Conduct
@cpcloud
Copy link
Member

cpcloud commented Apr 19, 2024

What version of ibis are you running?

3.3.0

This doesn't seem correct. How did you compute that version number?

@jieguangzhou jieguangzhou changed the title feat: Support datatype 'vector' of pgvector feat: Support 'vector' datatype of pgvector Apr 19, 2024
@jieguangzhou
Copy link
Author

jieguangzhou commented Apr 19, 2024

What version of ibis are you running?

3.3.0

This doesn't seem correct. How did you compute that version number?

Hey @cpcloud Thank you for the quick reply, I have resent the version number.

❯ pip list |grep ibis
ibis                          3.3.0
ibis-framework                8.0.0

@cpcloud
Copy link
Member

cpcloud commented Apr 19, 2024

I'm surprised anything works at all!

Those two libraries cannot coexist in the same environment. You need to remove ibis if you want to use ibis-framework.

@jieguangzhou
Copy link
Author

I'm surprised anything works at all!

Those two libraries cannot coexist in the same environment. You need to remove ibis if you want to use ibis-framework.

Hmm, Thank you for your reminder

@gforsyth
Copy link
Member

Hey @jieguangzhou this should be fixed on main -- you can try out a pre-release install by running python -m pip install -U --pre 'ibis-framework[postgres]' (or just hang tight for another week or so and 9.0 will be released.

@makkarss929
Copy link

Hey @jieguangzhou this should be fixed on main -- you can try out a pre-release install by running python -m pip install -U --pre 'ibis-framework[postgres]' (or just hang tight for another week or so and 9.0 will be released.

Hi @gforsyth , I have tried, this is not working

@blythed
Copy link

blythed commented Apr 22, 2024

@makkarss929 @jieguangzhou you are saying you get the exact same error? Maybe also information about your systems?

@makkarss929
Copy link

Yes @blythed

@gforsyth
Copy link
Member

Thanks for the reports, all! We added support for parsing a bare vector dtype, but not for one with specified precision, like vector(3). I'll add this to the backlog.

@gforsyth
Copy link
Member

Hey all -- just a heads up that we now have support for variable precision vector types from pgvector on main. It will be available in the upcoming 9.0 release.

@gforsyth gforsyth added postgres The PostgreSQL backend datatypes Issues relating to ibis's datatypes (under `ibis.expr.datatypes`) labels Apr 25, 2024
@jieguangzhou
Copy link
Author

@gforsyth Great, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datatypes Issues relating to ibis's datatypes (under `ibis.expr.datatypes`) feature Features or general enhancements postgres The PostgreSQL backend
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants