pybind11 With Sklearn Pipelines - Core Requirements

Posted on Jun 23, 2025

Core requirements for building a Python binding:

  • Copyable
Original:                    Copy:
┌──────────────┐            ┌──────────────┐
│ Copyable     │            │ Copyable     │
│ data: 0x1000 │            │ data: 0x2000 │ Different!
│ size: 5      │            │ size: 5      │
└──────┬───────┘            └──────┬───────┘
       │                           │
       ▼                           ▼
   [1,2,3,4,5]                [1,2,3,4,5]
   at 0x1000                  at 0x2000
   
- Because each object owns its own copy of the data, modifying or destroying one cannot affect the other.

A demonstration class inside CLion:

#include <iostream>

class Copyable {
    int* value;
    
public:
    Copyable(int v) : value(new int(v)) {
        std::cout << "Constructor: Copyable(" << v << ") at " << this 
                  << ", value at " << value << std::endl;
    }
    
    // Copy constructor - MUST do deep copy
    Copyable(const Copyable& other) : value(new int(*other.value)) {
        std::cout << "Copy constructor: at " << this << " from " << &other
                  << ", new value at " << value << " = " << *value << std::endl;
    }
    
    // Copy assignment - MUST do deep copy
    Copyable& operator=(const Copyable& other) {
        std::cout << "Copy assignment: to " << this << " from " << &other << std::endl;
        if (this != &other) {
            *value = *other.value;  // Or: delete value; value = new int(*other.value);
        }
        return *this;
    }
    
    // Destructor - MUST clean up
    ~Copyable() {
        std::cout << "Destructor: at " << this << ", deleting value at " << value << std::endl;
        delete value;
    }
    
    int get() const { return *value; }
    void set(int v) { *value = v; }
    
    void print() const {
        std::cout << "Copyable at " << this << ": value = " << *value 
                  << " (stored at " << value << ")" << std::endl;
    }
};

int main() {
    std::cout << "1. Create object:" << std::endl;
    Copyable s1(42);
    s1.print();
    
    std::cout << "\n2. Copy construction:" << std::endl;
    Copyable s2(s1);
    s1.print();
    s2.print();
    
    std::cout << "\n3. Modify copy (should not affect original):" << std::endl;
    s2.set(100);
    s1.print();
    s2.print();
    
    std::cout << "\n4. Copy assignment:" << std::endl;
    Copyable s3(0);
    s3 = s1;
    s3.print();

    return 0;
}

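Running this program prints the following (addresses will differ between runs; the destructor messages printed when main returns are omitted):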

1. Create object:
Constructor: Copyable(42) at 0x7ffc9f231a08, value at 0x62c889ccb6c0
Copyable at 0x7ffc9f231a08: value = 42 (stored at 0x62c889ccb6c0)

2. Copy construction:
Copy constructor: at 0x7ffc9f231a10 from 0x7ffc9f231a08, new value at 0x62c889ccb6e0 = 42
Copyable at 0x7ffc9f231a08: value = 42 (stored at 0x62c889ccb6c0)
Copyable at 0x7ffc9f231a10: value = 42 (stored at 0x62c889ccb6e0)

3. Modify copy (should not affect original):
Copyable at 0x7ffc9f231a08: value = 42 (stored at 0x62c889ccb6c0)
Copyable at 0x7ffc9f231a10: value = 100 (stored at 0x62c889ccb6e0)

4. Copy assignment:
Constructor: Copyable(0) at 0x7ffc9f231a18, value at 0x62c889ccb700
Copy assignment: to 0x7ffc9f231a18 from 0x7ffc9f231a08
Copyable at 0x7ffc9f231a18: value = 42 (stored at 0x62c889ccb700)

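The object s1 holds the value 42 at address 0x62c889ccb6c0. When we copy it, the copy's value is stored at a different address: 0x62c889ccb6e0.

Because the two objects own separate storage, setting s2 to 100 in step 3 leaves s1 at 42. The copy assignment in step 4 behaves the same way: s3 keeps its own storage at 0x62c889ccb700 and only the value it holds is overwritten.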
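For contrast, here is a minimal sketch of what happens when the copy is shallow, that is, when only the pointer is copied. The names ShallowView and shallow_copy_aliases are hypothetical and not part of the demonstration class above; the pointer is deliberately non-owning so the sketch avoids a double delete. It assumes a C++14 or later compiler.

```cpp
// Hypothetical type: the compiler-generated copy copies the pointer, not the int.
struct ShallowView {
    int* value;  // non-owning, so this sketch has no double delete
};

// Returns true if a write through the copy is visible through the original.
constexpr bool shallow_copy_aliases() {
    int storage = 42;
    ShallowView a{&storage};
    ShallowView b = a;   // b.value holds the same address as a.value
    *b.value = 100;      // write through the copy...
    return *a.value == 100 && a.value == b.value;  // ...and the original sees it
}

static_assert(shallow_copy_aliases(),
              "a shallow copy shares storage with the original");
```

This is the situation the deep copy in Copyable prevents: with a shallow copy, the change to the copy shows up in the original, and if both objects owned the pointer, both destructors would delete the same address.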

  • Movable

  • Neither
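The three categories can be checked at compile time with the standard <type_traits> header. The sketch below assumes that "Neither" means a type that is neither copyable nor movable; the three struct names are hypothetical and exist only for illustration.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical example types, one per category.
struct CopyableType {            // copyable (and therefore also movable)
    int value = 0;
};

struct MoveOnlyType {            // movable but not copyable
    std::unique_ptr<int> value;  // unique_ptr has deleted copy operations
};

struct PinnedType {              // neither copyable nor movable
    PinnedType() = default;
    PinnedType(const PinnedType&) = delete;             // also suppresses the implicit move
    PinnedType& operator=(const PinnedType&) = delete;
};

static_assert(std::is_copy_constructible<CopyableType>::value, "copyable");
static_assert(std::is_move_constructible<CopyableType>::value, "copyable implies movable");

static_assert(!std::is_copy_constructible<MoveOnlyType>::value, "not copyable");
static_assert(std::is_move_constructible<MoveOnlyType>::value, "movable");

static_assert(!std::is_copy_constructible<PinnedType>::value, "not copyable");
static_assert(!std::is_move_constructible<PinnedType>::value, "not movable");
```

If the file compiles, every assertion holds. The same traits can be applied to the Copyable class above to confirm which category it falls into before writing a binding for it.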