What we could also think of: an option to format MonetDB string columns as NumPy fixed-width columns in the result. I know database people find this ugly, but this is just how NumPy works and what makes it fast (in the context of Python). A common use case in data-science pandas land is a short string column containing, for example, a dataset label. Fixed-width strings then suddenly start to make sense again. The issue now is that if you have a big table with ints and floats and you call `.fetchdf()`, monetdbe is fast, but as soon as one of the columns is a (short) string, performance plummets, since it falls back to the Python processing mode (with a warning). We could make this a configurable option for `.fetchdf()` and `.fetchnumpy()`, where you indicate whether you need speed or are memory-limited.
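The mechanics behind this are plain NumPy, independent of monetdbe: a fixed-width unicode dtype (`U<n>`) stores every entry in one contiguous buffer of `n` characters, while `object` dtype boxes each value as a separate Python string, which is what the slow fallback path produces. A minimal sketch of the two layouts (pure NumPy; no monetdbe API involved):

```python
import numpy as np

# Object-dtype column: each element is a separate Python str object,
# so operations go through the interpreter element by element.
obj_col = np.array(["train", "test", "validation"], dtype=object)

# Fixed-width column: NumPy picks the width of the longest entry
# ("validation" -> 10 chars), giving dtype '<U10' in one flat buffer.
fixed_col = np.array(["train", "test", "validation"])

# Converting an object column with an explicit width; values longer
# than the chosen width would be silently truncated, which is the
# memory/correctness trade-off such an option would expose.
labels = np.asarray(obj_col, dtype="U16")
```

The trade-off the proposed option would surface is exactly this: the fixed-width buffer is fast and vectorizable, but every entry pays for the width of the longest string, which is wasteful when lengths vary a lot.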