[C#] Column(string) method in RecordBatch is linear to the number of columns #44501

Open
vthemelis opened this issue Oct 22, 2024 · 2 comments · May be fixed by #44633
Comments

@vthemelis

Describe the enhancement requested

It looks like a column lookup by name is linear in the number of columns. This is not intuitive and can easily lead to performance regressions. Would it be possible to add a lookup table to turn this into an O(1) operation?
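
To make the request concrete, here is a minimal sketch of the two approaches. It does not use the Arrow API: `ColumnLookupSketch`, `FindLinear`, and `BuildIndex` are hypothetical names, the column names are modeled as a plain list of strings, and ordinal comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a batch's column names; not the Arrow API.
static class ColumnLookupSketch
{
    // Linear scan: every call walks the column list until it finds a match.
    public static int FindLinear(IReadOnlyList<string> names, string name)
    {
        for (int i = 0; i < names.Count; i++)
        {
            if (string.Equals(names[i], name, StringComparison.Ordinal))
            {
                return i;
            }
        }
        return -1; // not found
    }

    // Built once per schema: maps each name to the first position where it occurs.
    public static Dictionary<string, int> BuildIndex(IReadOnlyList<string> names)
    {
        var index = new Dictionary<string, int>(StringComparer.Ordinal);
        for (int i = 0; i < names.Count; i++)
        {
            // Keep the first occurrence so the result matches FindLinear.
            if (!index.ContainsKey(names[i]))
            {
                index[names[i]] = i;
            }
        }
        return index;
    }
}
```

With the index built up front, each later lookup is a single `TryGetValue` call, at the cost of one pass over the columns and some extra memory.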

Component(s)

C#

@CurtHagenlocher
Contributor

Column names are not actually required to be unique, which is why Schema.Fields is marked deprecated. If you know that the column names in your data are unique, you could still use it. Otherwise, we'd need something like a mapping of a string onto what's possibly a list of field positions.
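
A mapping from a name onto a list of positions could look roughly like the sketch below. This is an illustration only, not what the library does: `DuplicateAwareIndexSketch` and `Build` are hypothetical names, and ordinal comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of the Arrow API.
static class DuplicateAwareIndexSketch
{
    // Maps each column name to every position where it occurs, in order.
    public static Dictionary<string, List<int>> Build(IReadOnlyList<string> names)
    {
        var index = new Dictionary<string, List<int>>(StringComparer.Ordinal);
        for (int i = 0; i < names.Count; i++)
        {
            if (!index.TryGetValue(names[i], out var positions))
            {
                positions = new List<int>();
                index[names[i]] = positions;
            }
            positions.Add(i); // duplicates accumulate instead of overwriting
        }
        return index;
    }
}
```

A caller that wants "the" column for a name can take the first entry of the list, while a caller that cares about duplicates can inspect all of them.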

@vthemelis
Author

Hi @CurtHagenlocher and thanks for your reply! Very interesting that the column names don't need to be unique. I don't mind so much about that. I personally just want the existing retrieval functions to be faster than they are for their most common use cases.

I added #44633 to do exactly that. Note that I would like to also replace the existing Lookups with signature string -> Field but unfortunately those use StringComparer.Default instead of StringComparer.CurrentCulture. Not sure if this is intentional.
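
The comparer matters because two lookups built with different comparers can disagree on whether two names are the same key. The sketch below is an illustration under assumptions: it contrasts `StringComparer.Ordinal` (how .NET's default string equality behaves) with `StringComparer.CurrentCulture`, and `ComparerSketch` is a hypothetical name. The second result depends on the current culture and the runtime's globalization settings, so no output is claimed for it.

```csharp
using System;

static class ComparerSketch
{
    public static void Main()
    {
        string composed = "\u00E9";    // 'é' as a single code point
        string decomposed = "e\u0301"; // 'e' followed by a combining acute accent

        // Ordinal compares UTF-16 code units, so these are two different keys.
        Console.WriteLine(StringComparer.Ordinal.Equals(composed, decomposed));

        // Culture-sensitive comparison may treat them as equal, depending on
        // the current culture and the runtime's globalization settings.
        Console.WriteLine(StringComparer.CurrentCulture.Equals(composed, decomposed));
    }
}
```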
