
SciPy 2022 NumPy Porting Discussion

Stepan Sindelar edited this page Aug 4, 2022 · 2 revisions

Present

Sebastian Berg (NumPy), Antonio Cuni (HPy), Simon Cross (HPy)

Minutes

We printed out Stepan Sindelar's notes on the blockers and concerns encountered during the draft port of NumPy to HPy and discussed each of the issues with Sebastian.

In summary, there appear to be good ways to resolve all of the issues.

We also discussed some broader questions regarding the port of NumPy to HPy, which are recorded at the end of these minutes.

Discussion of blockers and concerns

The following list goes point by point through the concerns outlined in Stepan's notes, with the heading of each concern followed by the discussion:

  1. NumPy uses the METH_FASTCALL | METH_KEYWORDS convention and has its own argument parser for that:

    Sebastian noted that NumPy has this parser in order to avoid having to use Argument Clinic, and suggested that we borrow NumPy's parser and use it in HPy.

  2. metaclass support for heap types is missing in CPython:

    This is being implemented for HPy in https://github.com/hpyproject/hpy/pull/335.

  3. tp_vectorcall is not supposed to be used for heap types:

    After some discussion, we realised that we can probably use tp_vectorcall for heap types in NumPy's specific case, since only the ufuncs need it and they cannot be subclassed or otherwise tampered with.

  4. NumPy accesses tp_ slots directly:

    The function pointer comparisons are mostly used for "is-forward" (i.e. forwarding a call directly to the right function if the pointers are the same). NumPy are looking to clean this up themselves. A solution to this needs to be found, but Sebastian wasn't worried about it in principle.

  5. NumPy API: expose second capsule and header(s) with HPy based APIs?

    Everyone agreed that, yes, we should expose a second capsule with the HPy based APIs.

  6. global (as in C level global variables) caches:

    Sebastian said that he thinks that just using an HPyGlobal would be fine. Note (ss): HPyGlobal cannot store anything other than an HPy handle. The caches I had in mind also hold primitive C types. We can solve that with the module state/context. However, my main fear was that the overhead of calling into HPy to retrieve the cache (either via module state or whatever else HPy provides for this) may diminish the advantage of the caching itself.

  7. PyArrayObject -> HPy removes type information and type checking:

    The consensus was that we need to make this nicer somehow for HPy users, but that it doesn't look like an insurmountable problem.

    We also discussed the need to decide whether modifying a struct retrieved from HPy_AsStruct immediately updates the value for everyone, and to document that decision.
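
To make concern 4 concrete, the pointer-comparison pattern described there can be sketched in plain C. This is only a toy model under stated assumptions: `toy_type`, `toy_object`, `base_length`, `doubled_length` and `get_length` are hypothetical names invented for this illustration, not NumPy, CPython or HPy APIs, and NumPy's real code will differ. The idea it shows is that the caller compares a slot's function pointer against a known implementation and, if they match, skips the indirect call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT real
 * NumPy, CPython or HPy types. */
typedef long (*length_func)(const void *self);

typedef struct {
    const char *name;
    length_func length;      /* plays the role of a tp_-style slot */
} toy_type;

typedef struct {
    const toy_type *type;
    long size;
} toy_object;

/* Base implementation: reports the stored size unchanged. */
long base_length(const void *self) {
    return ((const toy_object *)self)->size;
}

/* An override, as a subclass might install in the slot. */
long doubled_length(const void *self) {
    return 2 * ((const toy_object *)self)->size;
}

const toy_type base_type = {"base", base_length};
const toy_type derived_type = {"derived", doubled_length};

/* Returns the object's length. If the slot still points at the base
 * implementation, the result is computed directly (fast path);
 * otherwise the call goes through the slot. *took_fast_path, when
 * non-NULL, records which branch ran. */
long get_length(const toy_object *obj, int *took_fast_path) {
    assert(obj != NULL && obj->type != NULL);
    if (obj->type->length == base_length) {
        if (took_fast_path) *took_fast_path = 1;
        return obj->size;
    }
    if (took_fast_path) *took_fast_path = 0;
    return obj->type->length(obj);
}
```

In this sketch, an object of `base_type` takes the fast path, while an object of `derived_type` is dispatched through its slot. The comparison only works because the caller can read the slot's function pointer, which is the direct slot access that the concern is about.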

Discussion of broader questions

At the start of the meeting we also discussed some high-level questions about what is needed to officially port NumPy to HPy:

  • It would be good to have a PEP for supporting HPy in CPython. The idea is not to add HPy to CPython, but rather for HPy to be an officially supported option.

  • We need to version the HPy API.

  • We need some way to guarantee HPy's long term survival. This could take various forms -- support from the CPython core developers, adoption as part of some other well-supported effort, financial support from somewhere, etc.

    Sebastian mentioned that it would not really be the end of the world for NumPy if HPy didn't see a lot of new development after a few years. One could still build ordinary C extensions from an HPy-based NumPy, and the old Python C API itself cannot evolve particularly fast anyway because it exposes so much.

  • NumPy are currently doing some clean-up and refactoring of the dtype infrastructure, so it could be a good time to ask for changes that would make porting to HPy easier.
