Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scoped_interpreter. overloaded constructor: PyConfig param #4330

Merged
merged 19 commits into from
Dec 1, 2022

Conversation

arman-novikov
Copy link
Contributor

@arman-novikov arman-novikov commented Nov 13, 2022

Description

In some cases it is convenient to have an opportunity to construct a scoped_interpreter with a custom PyConfig instance supplied.

PyConfig my_config{};
//...
pybind11::scoped_interpreter{my_config}

Suggested changelog entry:

* feat: scoped_interpreter overloaded ctor: PyConfig param
* tests: Add program dir to path, Custom PyConfig, Custom PyConfig with argv

@arman-novikov arman-novikov marked this pull request as draft November 13, 2022 15:23
@arman-novikov arman-novikov marked this pull request as ready for review November 13, 2022 15:44
Copy link
Collaborator

@Skylion007 Skylion007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it would be great to have a test added to our test suite as well.

Comment on lines 269 to 273
PyStatus status = Py_InitializeFromConfig(&config);
if (PyStatus_Exception(status)) {
throw std::runtime_error(PyStatus_IsError(status) ? status.err_msg : "Failed to init CPython");
throw std::runtime_error(PyStatus_IsError(status) ? status.err_msg
: "Failed to init CPython");
}
Copy link
Collaborator

@Skylion007 Skylion007 Nov 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code appears to be entirely duplicated from the end on initialize. Furthermore, for consistency py::initialize should also be able to take a config on recent python versions. Therefore, I would suggest abstracting this into an overload for initialize_interpreter(const PyConfig & config) that is only built when the python is recent enough. That would reduce code duplicate and make this a bit cleaner without changing the API that much.

The scoped_interepreter is really just an RAII for these calls anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your feedback. I find your suggestions reasonable.

However, I would rather get a sign of approval from the maintainers that they [want/do not mind] to have the feature introduced into the project before I rush into refactoring and extending tests. Is it at all possible to find out their opinion?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seem reasonable to me, @rwgk @henryiii thoughts?

@rwgk
Copy link
Collaborator

rwgk commented Nov 13, 2022

I think it'll be great to support passing PyConfig, and ideally to clean up:

The current implementation of initialize_interpreter() was merged in a rush despite me being "really concerned": #3923 (comment)

EDIT OOPS this is NOT TRUE, I just made a mistake:

I stopped looking when I saw the merge commit back then while working on my feedback, but looking again now, I see that the argc and argv arguments are completely and silently ignored in the Python 3.11 branch, which is obviously error prone, especially because the function documentation was not updated when the Python 3.11 branch was added.

I think a good start for this PR would be to cleanly split the two implementations, which will make the argc / argv situation immediately obvious, then to figure out how to best design passing PyConfig

  • without duplicating code, and
  • without silently (runtime or documentation) ignoring arguments, depending on the Python version.

It's not immediately obvious to me how to provide an API that works intuitively for Python < 3.11 and >= 3.11. Can it be done in a reasonable way without expecting

#if PY_VERSION_HEX < 0x030B0000
...
#else
...
#endif

constructs in user code? @arman-novikov what do you think?

@rwgk
Copy link
Collaborator

rwgk commented Nov 13, 2022

Sorry our codesearch hover-over tooling failed me: it doesn't highlight the argc, argv in the 3.11 branch for some reason (I just double-checked). I only realized when searching in a normal editor.

That changes my thinking a lot:

Now I'm thinking we need all arguments for >= 3.11, rough start:

    explicit scoped_interpreter(bool init_signal_handlers = true,
                                int argc = 0,
                                const char *const *argv = nullptr,
                                bool add_program_dir_to_path = true
                                PyConfig *config = nullptr) {

Although init_signal_handlers is redundant if config is passed. Not sure how to best deal with that.

Also not sure what to do about PyConfig for < 3.11, maybe like this? (Note: that code was removed already when we dropped Python 2 support.)

#if PY_MAJOR_VERSION >= 3
using module_def = PyModuleDef;
#else
struct module_def {};
#endif

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the direction this is going.

config_guard(PyConfig& config): m_config{config}{}
~config_guard(){PyConfig_Clear(&m_config);}
private:
PyConfig& m_config;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if that's behind the Python 3.11 segfaults in the CI, but it's very error prone. The performance hit for making this safe is probably not measurable in any real world use, or even unit tests.

If you really want to pass in an existing config:

  • Just copy.

  • Or use std::unique_ptr, which would also make this RAII pattern safe.

  • If not using std::unique_ptr, the copy ctor and operator need to be deleted.

BUT

  • scoped_config would seem more fitting as a name.

How about this?

struct scoped_config {
    scoped_config() = default;
    ~scoped_config() { PyConfig_Clear(&config); };

    scoped_config(const scoped_config &) = delete;
    scoped_config &operator=(const scoped_config &) = delete;

    PyConfig config;
};

...

    scoped_config scoped_cfg;
    PyConfig_InitIsolatedConfig(&scoped_cfg.config);
    scoped_cfg.config.isolated = 0;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an excellent move. I will introduce it asap.


/*!
* \warning should not be called by user
* \todo put it into the scoped_interpreter's private section?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simply put this in the detail namespace, like a lot of existing code not intended for public use.

"if sys.argv and os.path.exists(sys.argv[0]) else '')");
}
precheck_interpreter();
_initialize_interpreter(config, add_program_dir_to_path);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbols with a leading underscore are reserved for the compiler + library.
In this codebase we don't have them anywhere AFAIK.
detail::initialize_interpreter() could work.

#if PY_VERSION_HEX >= 0x030B0000
class config_guard {
public:
config_guard(PyConfig &config) : m_config{config} {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete copy ctor, and copy assignment.

@@ -88,6 +88,42 @@ inline wchar_t *widen_chars(const char *safe_arg) {

PYBIND11_NAMESPACE_END(detail)

inline void precheck_interpreter() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should DEFINITELY be in detail namespace as well.

@rwgk
Copy link
Collaborator

rwgk commented Nov 16, 2022

BTW

While looking at the documentation for PyConfig I noticed "New in version 3.8."

I'm wondering, would it make sense / should we try to change the condition to #if PY_VERSION_HEX >= 0x03080000?

But what I'd probably do:

  • get it to pass with 3.11 first,
  • only then try how far back we can go.

@arman-novikov arman-novikov requested review from Skylion007 and removed request for henryiii November 18, 2022 22:41
@arman-novikov
Copy link
Contributor Author

@Skylion007, the test has been introduced in a commit. I was forced to set up a newer version of python (3.8) in CI. The CI change has been introduced in commit. Would you please review the change?

@arman-novikov
Copy link
Contributor Author

BTW

While looking at the documentation for PyConfig I noticed "New in version 3.8."

I'm wondering, would it make sense / should we try to change the condition to #if PY_VERSION_HEX >= 0x03080000?

But what I'd probably do:

* get it to pass with 3.11 first,

* only then try how far back we can go.

It seems python 3.8 was the first version with PyConfig support.

.appveyor.yml Outdated
@@ -9,7 +9,7 @@ platform:
- x86
environment:
matrix:
- PYTHON: 36
- PYTHON: 38
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was forced to set up a newer version of python (3.8) in CI.

This sounds like a misunderstanding. Whatever is done in this PR should not break the existing test (because it would also break users). Could you please undo this change?

(I think dropping Python 3.6 support is past due, but that's a separate step.)

@@ -55,6 +55,7 @@
PYBIND11_TOSTRING(name), PYBIND11_CONCAT(pybind11_init_impl_, name)); \
void PYBIND11_CONCAT(pybind11_init_, name)(::pybind11::module_ \
& variable) // NOLINT(bugprone-macro-parentheses)
#define PYBIND11_PYCONFIG_SUPPORT_PY_VERSION (0x03080000)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PYBIND11_PYCONFIG_SUPPORT_PY_VERSION_HEX (for clarity)

@@ -261,6 +283,13 @@ class scoped_interpreter {
initialize_interpreter(init_signal_handlers, argc, argv, add_program_dir_to_path);
}

#if PY_VERSION_HEX >= PYBIND11_PYCONFIG_SUPPORT_PY_VERSION
explicit scoped_interpreter(const PyConfig &config, bool add_program_dir_to_path = true) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to become:

    explicit scoped_interpreter(PyConfig *config,
                                int argc = 0,
                                const char *const *argv = nullptr,
                                bool add_program_dir_to_path = true) {

I'm not sure how this will shake out best.

I'm thinking the scoped_config will not actually be needed.

I'd definitely split out initialize_interpreter_pre_pyconfig() for clarity and safety (unused variables in one branch or another), in the detail namespace.

Then there would be two initialize_interpreter() overloads:

#if PY_VERSION_HEX >= PYBIND11_PYCONFIG_SUPPORT_PY_VERSION_HEX
inline void initialize_interpreter(PyConfig *config,
                                   int argc = 0,
                                   const char *const *argv = nullptr,
                                   bool add_program_dir_to_path = true) {
    ... implementation here ...
}
#endif

inline void initialize_interpreter(bool init_signal_handlers = true,
                                   int argc = 0,
                                   const char *const *argv = nullptr,
                                   bool add_program_dir_to_path = true) {
#if PY_VERSION_HEX < PYBIND11_PYCONFIG_SUPPORT_PY_VERSION_HEX
    detail::initialize_interpreter_pre_pyconfig(
        init_signal_handlers, argc, argv, add_program_dir_to_path);
#else
    PyConfig config;
    PyConfig_InitIsolatedConfig(&config);
    config.isolated = 0;
    config.use_environment = 1;
    config.install_signal_handlers = init_signal_handlers ? 1 : 0;
    initialize_interpreter(&config, argc, argv, add_program_dir_to_path);
#endif
}

Then the old scoped_interpreter constructor can stay as-is, and the new one will simply be:

    explicit scoped_interpreter(PyConfig *config,
                                int argc = 0,
                                const char *const *argv = nullptr,
                                bool add_program_dir_to_path = true) {
        initialize_interpreter(config, argc, argv, add_program_dir_to_path);
    }

Note that I'd choose to make this PyConfig *config instead of PyConfig &config, so that it is more obvious that the config is changed in place (PyConfig_Clear()). It seems very unlikely that anyone would want config to be const, although that's only a best guess. When you're done with this PR (CI passes), we should run it by Victor Stinner (not tagging him just now intentionally).

Comment on lines 289 to 290
detail::precheck_interpreter();
detail::initialize_interpreter(config, argc, argv, add_program_dir_to_path);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be most useful

  • to move this initialize_interpreter() overload out of the detail namespace,
  • and the detail::precheck_interpreter(); call to the very top of that overload, just like for the other overload.

The end result will be that the pair of constructors and the pair of initialize_interpreter() overloads each are almost identical, except for the first argument (bool vs PyCconfig).

It's very little extra work, it looks great already.

@rwgk rwgk force-pushed the scoped_interp_from_pyconfig branch from cc89470 to b94d6ba Compare November 23, 2022 16:35
@rwgk
Copy link
Collaborator

rwgk commented Nov 23, 2022

The force push was for git rebase master.

I added one simple commit, which was much easier than explaining here: b94d6ba

I just looked, our existing test coverage isn't complete, unless I'm overlooking something, add_program_dir_to_path is not exercised at all. Covering this systematically for both pre-PyConfig and PyConfig would be great, but I see that as optional, covering that just for the new PyConfig overload would be sufficient for this PR I think.

This existing test case looks like a good starting point:

TEST_CASE("sys.argv gets initialized properly") {
py::finalize_interpreter();
{
py::scoped_interpreter default_scope;
auto module = py::module::import("test_interpreter");
auto py_widget = module.attr("DerivedWidget")("The question");
const auto &cpp_widget = py_widget.cast<const Widget &>();
REQUIRE(cpp_widget.argv0().empty());
}
{
char *argv[] = {strdup("a.out")};
py::scoped_interpreter argv_scope(true, 1, argv);
std::free(argv[0]);
auto module = py::module::import("test_interpreter");
auto py_widget = module.attr("DerivedWidget")("The question");
const auto &cpp_widget = py_widget.cast<const Widget &>();
REQUIRE(cpp_widget.argv0() == "a.out");
}
py::initialize_interpreter();
}

Could you please try to morph that into an additional test case, with this at the core?

        py::scoped_interpreter argv_scope(&config, 1, argv, true);

I don't know how to tricky it is to assert that add_program_dir_to_path is handled properly. I hope it's within reason, I'd definitely try.

arman-novikov and others added 4 commits November 26, 2022 11:22
…ace.

Move the `PYBIND11_PYCONFIG_SUPPORT_PY_VERSION_HEX` define down to where it is used for the first time, and check if it is defined already, so that it is possible to customize from the compilation command line, just in case there is some unforeseen issue for Python 3.8, 3.9, 3.10.
@arman-novikov arman-novikov force-pushed the scoped_interp_from_pyconfig branch from 4794892 to e5e8c28 Compare November 26, 2022 09:22
@rwgk
Copy link
Collaborator

rwgk commented Nov 27, 2022

LGTM but

  • 🐍 3 • ICC latest • x64 (using Python 3.8)
  • 🐍 3 • almalinux:9 • x64 (using Python 3.9)

now produce similar errors. Do you have ideas already?

I think the segfaults could be side-effects of the failed REQUIRE( get_sys_path_size() == sys_path_default_size + 1 ) and are possibly resolved when the REQUIRE is fixed.

@rwgk
Copy link
Collaborator

rwgk commented Nov 27, 2022

Interesting, I see you changed the expecations from == default + 1 to > default.

Did you look why it is not default + 1 in the two failure cases?

@arman-novikov
Copy link
Contributor Author

it seems a value remains in sys path left by the previous call of scoped_interpreter ctor

@rwgk
Copy link
Collaborator

rwgk commented Nov 27, 2022

it seems a value remains in sys path left by the previous call of scoped_interpreter ctor

Not good ... but that sounds like a problem in CPython.

From the previous run it seems to be a problem only for Python 3.8 and 3.9. Maybe only some platforms?

  • But — I don't want to turn this into a science project.
  • But also — I want to keep the test as strict as possible.

What do you think about a simple "future-strict" approach? Something like:

#if PY_VERSION_HEX < 0x030A0000
    // It seems a value remains in sys.path left by the previous call of scoped_interpreter ctor.
    REQUIRE(get_sys_path_size() > sys_path_default_size);
#else
    REQUIRE(get_sys_path_size() == sys_path_default_size + 1);
#endif

IIRC the 1st REQUIRE(get_sys_path_size() == sys_path_default_size + 1); succeeded everywhere, only the 2nd failed and would need that #if.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the last commit!

@Skylion007 could you please take another look?

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2022

I forgot to mention: it would be great if you could update the PR description and changelog entry.

@arman-novikov
Copy link
Contributor Author

@Skylion007 would you please share your review? I've already covered new (and some current) features with tests

Copy link
Collaborator

@Skylion007 Skylion007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@rwgk
Copy link
Collaborator

rwgk commented Dec 1, 2022

Thanks @arman-novikov for all the work, thanks @Skylion007 for the review!

@rwgk rwgk merged commit 8869984 into pybind:master Dec 1, 2022
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Dec 1, 2022
rwgk pushed a commit to rwgk/pybind11 that referenced this pull request Dec 1, 2022
rwgk pushed a commit that referenced this pull request Dec 1, 2022
@henryiii henryiii removed the needs changelog Possibly needs a changelog entry label Dec 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants