[BUG]: __new__ does not initialize STL containers and resulting in undefined behavior #4549

Open
@XuehaiPan

What version (or hash if on master) of pybind11 are you using?

2.10.3

Problem description

I'm writing a C++ class that has some STL-like containers as members. I bind it with pybind11 and expose it to Python. However, the bound C++ class does not behave like a normal Python class.

In [1]: !pip3 install optree 
   ...: from optree import PyTreeSpec

In [2]: PyTreeSpec
Out[2]: <class 'optree.PyTreeSpec'>

In [3]: PyTreeSpec.mro()
Out[3]: [<class 'optree.PyTreeSpec'>, <class 'pybind11_builtins.pybind11_object'>, <class 'object'>]

In [4]: PyTreeSpec()  # an error raised as expected
TypeError: optree.PyTreeSpec: No constructor defined!

In [5]: spec = PyTreeSpec.__new__(PyTreeSpec)  # expect to raise an error

In [6]: repr(spec)  # segfault due to invalid memory access
[1]    31095 segmentation fault  ipython3

All bound types created by py::class_<CppClass> inherit from pybind11_builtins.pybind11_object. As the comments in the pybind11 source say:

/// Instance creation function for all pybind11 types. It allocates the internal instance layout
/// for holding C++ objects and holders. Allocation is done lazily (the first time the instance is
/// cast to a reference or pointer), and initialization is done by an `__init__` function.
inline PyObject *make_new_instance(PyTypeObject *type) {
#if defined(PYPY_VERSION)
    // PyPy gets tp_basicsize wrong (issue 2482) under multiple inheritance when the first
    // inherited object is a plain Python type (i.e. not derived from an extension type). Fix it.
    ssize_t instance_size = static_cast<ssize_t>(sizeof(instance));
    if (type->tp_basicsize < instance_size) {
        type->tp_basicsize = instance_size;
    }
#endif
    PyObject *self = type->tp_alloc(type, 0);
    auto *inst = reinterpret_cast<instance *>(self);
    // Allocate the value/holder internals:
    inst->allocate_layout();
    return self;
}

/// Instance creation function for all pybind11 types. It only allocates space for the
/// C++ object, but doesn't call the constructor -- an `__init__` function must do that.
extern "C" inline PyObject *pybind11_object_new(PyTypeObject *type, PyObject *, PyObject *) {
    return make_new_instance(type);
}

pybind11_object_new only allocates space for the C++ object; it does not call the constructor. So if someone calls BoundCppClass.__new__(BoundCppClass), the C++ object is never initialized, and any use of it is undefined behavior. Default member initializers in the C++ class definition are not applied either, even when the class has a default constructor.
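To make the allocate-versus-initialize split concrete, here is a minimal pure-Python analogue. PurePythonList is a hypothetical stand-in, not part of pybind11 or optree. It only illustrates the protocol: `__new__` allocates, `__init__` initializes, and calling `__new__` directly skips the second step.

```python
class PurePythonList:
    """Hypothetical pure-Python stand-in for a bound C++ class."""

    def __new__(cls):
        # Allocation only: no instance attributes exist yet.
        return super().__new__(cls)

    def __init__(self):
        # Initialization: the analogue of running the C++ constructor.
        self.data = [0, 1, 2, 3]

    def size(self):
        return len(self.data)


# Normal construction: type.__call__ runs __new__ and then __init__.
constructed = PurePythonList()
print(constructed.size())  # prints 4

# Calling __new__ directly skips __init__ entirely.
allocated_only = PurePythonList.__new__(PurePythonList)
try:
    allocated_only.size()
except AttributeError as exc:
    # Pure Python fails with a catchable exception here.
    print(f"AttributeError: {exc}")
```

In pure Python the half-built object fails safely with an AttributeError. The bound class in this report has no such check, so the same call sequence reads uninitialized C++ memory instead.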

Reproducible example code

  1. Clone https://github.com/pybind/cmake_example:
git clone https://github.com/pybind/cmake_example.git
cd cmake_example
  2. Paste the following content into src/main.cpp:
#include <string>
#include <sstream>
#include <vector>
#include <pybind11/pybind11.h>

namespace py = pybind11;

using ssize_t = py::ssize_t;

class MyList {
private:
    std::vector<int> data = {0, 1, 2, 3};

public:
    MyList() = default;
    ssize_t size() const { return data.size(); }
    std::string repr() const {
        std::ostringstream os;
        os << "MyList([";
        for (int i = 0; i < data.size(); ++i) {
            if (i != 0) {
                os << ", ";
            }
            os << data[i];
        }
        os << "], size=" << size() << ")";
        return os.str();
    }
};

PYBIND11_MODULE(cmake_example, m) {
    auto cls = py::class_<MyList>(m, "MyList");
    cls.def(py::init<>());
    cls.def("size", &MyList::size);
    cls.def("__repr__", &MyList::repr);
}
  3. Create a new virtual environment and install:
python3 -m venv venv
source venv/bin/activate
pip3 install -U pip setuptools ipython
pip3 install -e .
  4. Run the following code in ipython:
In [1]: from cmake_example import MyList

In [2]: l = MyList()

In [3]: l.size()
Out[3]: 4

In [4]: l  # calls repr()
Out[4]: MyList([0, 1, 2, 3], size=4)

In [5]: l = MyList.__new__(MyList)

In [6]: l.size()
Out[6]: -23599346664417

In [7]: l  # calls repr()
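[1]    31601 segmentation fault  ipython3

A Python subclass that overrides __new__ hits the same problem, because the C++ constructor has not yet run when super().__new__(cls) returns:
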
In [1]: from cmake_example import MyList

In [2]: class MyAnotherList(MyList):
   ...:     def __new__(cls):
   ...:         inst = super().__new__(cls)
   ...:         inst.default_size = inst.size()  # default size from the C++ default constructor
   ...:         return inst
   ...:         

In [3]: l = MyAnotherList()

In [4]: l  # calls repr()
Out[4]: MyList([0, 1, 2, 3], size=4)

In [5]: l.size()
Out[5]: 4

In [6]: l.default_size  # expect 4
Out[6]: -23607549586738

In [7]: MyAnotherList().default_size  # undefined
Out[7]: -5769946945

In [8]: MyAnotherList().default_size  # undefined
Out[8]: -5769947144

In [9]: MyAnotherList().default_size  # undefined
Out[9]: -23639255465217
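As a possible workaround for the subclass case only, the query can move from `__new__` into `__init__`, after the base `__init__` has run. The sketch below uses FakeMyList, a hypothetical pure-Python stand-in for cmake_example.MyList, so that it runs without the extension. With the real bound class, the assumption is that `super().__init__()` is what invokes the C++ default constructor registered via py::init<>().

```python
class FakeMyList:
    """Hypothetical stand-in for cmake_example.MyList (assumption, not the real binding)."""

    def __init__(self):
        # Analogue of the C++ default member initializer {0, 1, 2, 3}.
        self._data = [0, 1, 2, 3]

    def size(self):
        return len(self._data)


class MyAnotherList(FakeMyList):
    def __init__(self):
        # Run the base initializer first, so the underlying object is
        # fully constructed before any method is called on it.
        super().__init__()
        self.default_size = self.size()


l = MyAnotherList()
print(l.default_size)  # prints 4 with this stand-in
```

This avoids touching the instance before it is constructed, but it does not address the underlying report: a direct MyList.__new__(MyList) call still yields an uninitialized object.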

Is this a regression? Put the last known working version here if it is.

Not a regression
