C++ deleting destructors
Recently, during a code review, Benoit pointed out a strange linker error that neither of us expected. Here’s roughly what the situation looked like:
$ cat test.cpp
#include <new>

class Base {
public:
  virtual ~Base() {}
};

class Derived : public Base {
public:
  ~Derived() {}
private:
  void* operator new(size_t);
  void* operator new[](size_t);
  void operator delete(void*);
  void operator delete[](void*);
};

int main() {
  Derived d;
  return 0;
}
$ clang++ test.cpp
Undefined symbols for architecture x86_64:
  "Derived::operator delete(void*)", referenced from:
      Derived::~Derived() in test-LLdS56.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The goal here is to make sure that objects of type Derived cannot be allocated on the heap, by declaring the new and delete operators private so that users of Derived cannot use them. But the linker error is quite puzzling, since we did not expect the code generated for Derived’s destructor to call Derived::operator delete(). This looked rather suspicious, so we looked at the generated assembly, and indeed it contained a call to operator delete. The compiler generates two ~Derived functions. The first one does what you would expect:
Derived::~Derived():                    ## @_ZN7DerivedD2Ev
        .cfi_startproc
## BB#0:
        pushq   %rbp
Ltmp17:
        .cfi_def_cfa_offset 16
Ltmp18:
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
Ltmp19:
        .cfi_def_cfa_register %rbp
        subq    $16, %rsp
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rdi
        callq   Base::~Base()
        addq    $16, %rsp
        popq    %rbp
        ret
        .cfi_endproc
That is, it calls the base class destructor, Base::~Base(). The second destructor, however, was more interesting (note that this is the code generated with -fno-exceptions, which is a bit simpler, but that doesn’t matter for our purposes here):
Derived::~Derived():                    ## @_ZN7DerivedD0Ev
        .cfi_startproc
## BB#0:
        pushq   %rbp
Ltmp37:
        .cfi_def_cfa_offset 16
Ltmp38:
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
Ltmp39:
        .cfi_def_cfa_register %rbp
        subq    $16, %rsp
        movq    %rdi, -8(%rbp)
        movq    -8(%rbp), %rdi
        movq    %rdi, -16(%rbp)         ## 8-byte Spill
        callq   Derived::~Derived()
        movq    -16(%rbp), %rdi         ## 8-byte Reload
        callq   Derived::operator delete(void*)
        addq    $16, %rsp
        popq    %rbp
        ret
        .cfi_endproc
This function basically calls the other ~Derived() function first and then calls Derived::operator delete(). I could not explain why the compiler was generating two destructors here, or what the second one was actually trying to do. Also interesting was that the second destructor (__ZN7DerivedD0Ev) was not even called in main():
_main:                                  ## @main
        .cfi_startproc
## BB#0:
        pushq   %rbp
Ltmp2:
        .cfi_def_cfa_offset 16
Ltmp3:
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
Ltmp4:
        .cfi_def_cfa_register %rbp
        subq    $32, %rsp
        leaq    -16(%rbp), %rdi
        movl    $0, -4(%rbp)
        callq   __ZN7DerivedC1Ev
        leaq    -16(%rbp), %rdi
        movl    $0, -4(%rbp)
        movl    $1, -20(%rbp)
        callq   __ZN7DerivedD1Ev
        movl    -4(%rbp), %eax
        addq    $32, %rsp
        popq    %rbp
        ret
        .cfi_endproc
But both destructors did occur in the vtable for Derived:
__ZTV7Derived:
        .quad   0
        .quad   __ZTI7Derived
        .quad   __ZN7DerivedD1Ev
        .quad   __ZN7DerivedD0Ev
In fact, both the Base and Derived vtables had two destructor entries, which means that both functions can be called through the vtable.
After a bit of searching and thinking, we figured out why this happens. The answer lies in C++ deleting destructors.
C++ mandates that you pass operator delete exactly the same address that operator new returned. When you allocate an object using new, the compiler knows the concrete type of the object (which is how it passes the correct memory size to operator new, for example). However, if your class has a base class with a virtual destructor, and your object is deleted through a pointer to the base class, the compiler doesn’t know the concrete type at the call site, and therefore cannot compute the correct address to pass to operator delete(). Why, you may ask? Because in the presence of multiple inheritance, the address held in the base class pointer may differ from the address of the object in memory.
So when you delete an object which has a virtual destructor, the compiler calls what is known as a deleting destructor (which has a D0Ev suffix in the GNU toolchain, as opposed to D1Ev, which is the suffix for regular destructors) instead of the usual sequence of a call to the normal destructor followed by operator delete() to reclaim the memory. Since the deleting destructor is a virtual function, the implementation for the concrete type is called at runtime, and that implementation is capable of computing the correct address of the object in memory. It calls the regular destructor, computes the correct address of the object, and then calls operator delete() on that address. The correct address is computed by looking at an offset value stored in the vtable for the object. Looking at the generated code for that case is left as an exercise for the curious reader.
There is also the concept of vector deleting destructors, which are generated if you use operator delete[] in your program. They’re probably used for the same purpose, but I didn’t explore them. The whole concept of deleting destructors in C++ seems to be very poorly documented, so I’d appreciate it if someone could point me to a place where this is properly documented.
You make it sound as if this were a C++ feature. It isn’t. It’s just an implementation technique used by GCC and other compilers following the same C++ ABI.
There are actually two requirements that they need to satisfy here. One is obvious: operator delete needs to be given the same address that operator new returned. In the face of multiple and virtual inheritance, this can be a problem as you described. Deleting destructors are just one way to solve it, though; the other (and I think Microsoft’s compiler uses this) would be to simply have the destructor return the this pointer, then pass that to operator delete.
The other requirement is that the operator delete corresponding to the dynamic type of the object is used. That is, if you have a class Derived with an operator delete, and you delete it through a Base pointer, the compiler cannot statically determine that it should call Derived’s operator delete, but the spec says it has to. The deleting destructor is a simple way to achieve this.
Anyway, if you want more info, look at the C++ ABI: http://mentorembedded.github.com/cxx-abi/
Offtopic here, but do you have plans to update your “Form Control Context Menu” extension? Since Firefox 20.0 (or 20.0.1) it litters context menu with lots of elements.
No sorry, I don’t have any plans to update that extension.
Ow, too bad. Well, anyway, thanks for the quick response.
I’m not sure this is such a great strategy for preventing heap allocation. First, you are still vulnerable to someone doing a placement new/delete, but more than that, you should be removing just the new operators. Removing the deletes isn’t necessary and seems like it creates a lot of room for mistakes. I’d do something like this:
#include <cstddef>

// Use "= delete" when compiling as C++11 or later.
#if __cplusplus > 199711L
#define DELETE_METHOD = delete
#else
#define DELETE_METHOD
#endif

class no_heap_alloc {
protected:
  static void* operator new(std::size_t) DELETE_METHOD;
  static void* operator new[](std::size_t) DELETE_METHOD;
  // maybe think about adding something magic to operator new(size_t, void*) (or to the constructor) to check if the address is on the heap, but really….
};

class Derived : public Base, public no_heap_alloc {
  // stuff
};