Add strings_to_ctypes_array to convert a sequence of strings into a ctypes array #3137

Merged: 4 commits merged on Mar 26, 2024

@seisman seisman commented Mar 23, 2024

Description of proposed changes

To pass a list of strings into a ctypes function, we currently have code like the following, which is not very readable:

strings_pointer = (ctp.c_char_p * len(strings))()
strings_pointer[:] = np.char.encode(strings)

So it is better to have a strings_to_ctypes_array function that hides the technical details.

The two lines of code above can be shortened into a single line:

strings_pointer = (ctp.c_char_p * len(strings))(*np.char.encode(strings))

Actually, np.char.encode calls str.encode element-wise, so it can be further rewritten as:

strings_pointer = (ctp.c_char_p * len(strings))(*[s.encode() for s in strings])

The list comprehension version is faster than the np.char.encode version:

>>> import numpy as np
>>> import ctypes as ctp
>>> strings = ["ABC", "DEFGHI", "ABC123", "ABC1234566"]
>>> %timeit (ctp.c_char_p * 4)(*[s.encode() for s in strings])
1.37 µs ± 3.79 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

>>> %timeit (ctp.c_char_p * 4)(*np.char.encode(strings))
6.17 µs ± 22.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

This PR adds the strings_to_ctypes_array function to do the conversion. The function contains just one line of code and, to avoid extra overhead, doesn't check whether strings is an empty list.
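For illustration, here is a minimal sketch of what such a one-line helper could look like, built directly on the list-comprehension expression above. The type hints and docstring are assumptions made for this example and may differ from the merged code.

```python
import ctypes as ctp
from collections.abc import Sequence


def strings_to_ctypes_array(strings: Sequence[str]) -> ctp.Array:
    """Convert a sequence of strings into a ctypes array of char pointers.

    Sketch only: the signature and docstring here are assumed, not copied
    from the merged implementation.
    """
    # Create the array type (c_char_p * n), then fill it with the
    # encoded (UTF-8 by default) bytes of each string.
    return (ctp.c_char_p * len(strings))(*[s.encode() for s in strings])


strings = ["ABC", "DEFGHI", "ABC123"]
arr = strings_to_ctypes_array(strings)
print(list(arr))  # [b'ABC', b'DEFGHI', b'ABC123']
```

Reading an element back from a c_char_p array yields a bytes object, so the round trip can be checked by comparing against the encoded input strings.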

PR #3136 does similar work, but for converting a sequence of numbers to a ctypes array. The two functions share similar code, but I prefer not to combine them into a single function, to avoid too many if-else clauses in a low-level function. It is better to review these two PRs back-to-back.

@seisman seisman force-pushed the strings_to_ctypes_array branch from 32c97d5 to 2fe50a9 Compare March 23, 2024 08:42
@seisman seisman added the maintenance Boring but important stuff for the core devs label Mar 23, 2024
@seisman seisman added this to the 0.12.0 milestone Mar 23, 2024
@seisman seisman added the needs review This PR has higher priority and needs review. label Mar 23, 2024
@seisman seisman added run/benchmark Trigger the benchmark workflow in PRs and removed run/benchmark Trigger the benchmark workflow in PRs labels Mar 23, 2024
@seisman seisman changed the title Add a function strings_to_ctypes_array to convert a sequence of strings into a ctypes array Add strings_to_ctypes_array to convert a sequence of strings into a ctypes array Mar 23, 2024
@michaelgrund michaelgrund added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Mar 25, 2024
@seisman seisman removed the final review call This PR requires final review and approval from a second reviewer label Mar 26, 2024
@seisman seisman merged commit 62eb5d6 into main Mar 26, 2024
6 of 7 checks passed
@seisman seisman deleted the strings_to_ctypes_array branch March 26, 2024 01:28
2 participants