Thread Rating:
  • 0 Votes - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Pure virtual function error in CpDeviceCUdn function
30-04-2013, 11:20 AM
Post: #1
Pure virtual function error in CpDeviceCUdn function
I'm getting occasional "pure virtual function" errors from the CpDeviceCUdn function in OpenHome/Net/Bindings/C/ControlPoint/CpDeviceC.cpp.

Here's the disassembly of the relevant code:
Code:
61097E30 56                   push        esi  
61097E31 8B 74 24 08          mov         esi,dword ptr [esp+8]  
61097E35 85 F6                test        esi,esi  
61097E37 75 0F                jne         _CpDeviceCUdn@4+18h (61097E48h)  
61097E39 6A 0D                push        0Dh  
61097E3B 68 EC 04 12 61       push        offset OpenHome::Net::CpDeviceList::`vftable'+4 (611204ECh)  
61097E40 E8 7B DB 02 00       call        OpenHome::CallAssertHandler (610C59C0h)  
61097E45 83 C4 08             add         esp,8  
61097E48 8B CE                mov         ecx,esi  
61097E4A E8 B1 26 00 00       call        OpenHome::Net::CpiDevice::Udn (6109A500h)  
61097E4F 8B 10                mov         edx,dword ptr [eax]  
61097E51 8B C8                mov         ecx,eax  
61097E53 8B 02                mov         eax,dword ptr [edx]  
61097E55 FF D0                call        eax  
61097E57 5E                   pop         esi  
61097E58 C2 04 00             ret         4

Visual Studio shows the instruction pointer as 61097E57 and the EAX value as 0x0, so I think the failure is on this line of code:

return (const char*)device->Udn().Ptr();

It looks like the device->Udn() call has completed successfully and the Ptr() call causes the error message because it's trying to call address 0x0.

The only calls from my code to CpDeviceCUdn() are after a deviceAdded() or deviceRemoved() event. Unfortunately, the "pure virtual function" error doesn't result in ohNet's fatal error handler being called, so I don't have a Java thread dump that would show me exactly which line of MinimServer code was making the call to CpDeviceCUdn(). For me to get this next time it happens, would it be suitable for me to replace the above line by the following code?

Brn udn(device->Udn());
ASSERT(udn != NULL);
return (const char*)udn.Ptr();

Any other suggestions would be much appreciated.
Find all posts by this user
30-04-2013, 06:36 PM
Post: #2
RE: Pure virtual function error in CpDeviceCUdn function
(30-04-2013 11:20 AM)simoncn Wrote:  I'm getting occasional "pure virtual function" errors from the CpDeviceCUdn function in OpenHome/Net/Bindings/C/ControlPoint/CpDeviceC.cpp.

The only calls from my code to CpDeviceCUdn() are after a deviceAdded() or deviceRemoved() event. Unfortunately, the "pure virtual function" error doesn't result in ohNet's fatal error handler being called, so I don't have a Java thread dump that would show me exactly which line of MinimServer code was making the call to CpDeviceCUdn(). For me to get this next time it happens, would it be suitable for me to replace the above line by the following code?

Brn udn(device->Udn());
ASSERT(udn != NULL);
return (const char*)udn.Ptr();

Any other suggestions would be much appreciated.

That looks like the best place to add logging. Off the top of my head, this could be the result of code (either ohNet or your app) removing references they hadn't previously added.

I'll look through the code and will post here if I spot anything useful.
Find all posts by this user
30-04-2013, 10:16 PM
Post: #3
RE: Pure virtual function error in CpDeviceCUdn function
(30-04-2013 06:36 PM)simonc Wrote:  That looks like the best place to add logging. Off the top of my head, this could be the result of code (either ohNet or your app) removing references they hadn't previously added.

I'll look through the code and will post here if I spot anything useful.

My suggested code change doesn't compile. I'm getting the following messages:

OpenHome/Net/Bindings/C/ControlPoint/CpDeviceC.cpp(15) : error C2679: binary '!=' : no operator found which takes a right-hand operand of type 'int' (or there is no acceptable conversion)
Build\Include\OpenHome/Buffer.h(22): could be 'OpenHome::TBool OpenHome::Brx::operator !=(const OpenHome::Brx &) const'
while trying to match the argument list '(OpenHome::Brn, int)'
NMAKE : fatal error U1077: '"D:\Program Files\Microsoft Visual Studio 10.0\VC\BI

Can you suggest a way to fix this? It seems I don't understand what the Ptr() method of Brn does.
Find all posts by this user
01-05-2013, 07:44 AM
Post: #4
RE: Pure virtual function error in CpDeviceCUdn function
I don't think you can catch this with a null-check:

* device is clearly not null.
* device->Udn() returns a const ref to a Brhz embedded in the device object. This can't be null either.
* It doesn't matter if the pointer inside the Brhz is null, because we haven't even gotten as far as dereferencing it.

The only thing that does appear to be null is a slot in a block of memory that we thought was a Brx vtable. As Simon suggests, it seems most likely this is because it's *not* actually a Brx vtable and this is a use-after-free bug.

Code:
const TByte *udn = device->Udn();
ASSERT(udn != NULL);
return (const char*)udn;
Visit this user's website Find all posts by this user
01-05-2013, 11:34 AM
Post: #5
RE: Pure virtual function error in CpDeviceCUdn function
(01-05-2013 07:44 AM)andreww Wrote:  I don't think you can catch this with a null-check:

* device is clearly not null.
* device->Udn() returns a const ref to a Brhz embedded in the device object. This can't be null either.
* It doesn't matter if the pointer inside the Brhz is null, because we haven't even gotten as far as dereferencing it.

The only thing that does appear to be null is a slot in a block of memory that we thought was a Brx vtable. As Simon suggests, it seems most likely this is because it's *not* actually a Brx vtable and this is a use-after-free bug.

Thanks for clarifying this. Does the incorrect vtable mean the CpiDevice has been freed, or the Brhz for the Udn has been freed, or both?
Find all posts by this user
01-05-2013, 03:03 PM
Post: #6
RE: Pure virtual function error in CpDeviceCUdn function
The Brhz is a member of the CpiDevice. The get freed at the same time - the storage of the Brhz is a subset of the storage for the CpiDevice. (Well, technically the CpiDevice's destructor will be executed first, but they are both destroyed in quick succession.) If the Brhz has a screwed up vtable, I think it means either both have been destroyed, something has scribbled over memory, or the device pointer points to random garbage. The likely reason that device->Udn() didn't explode is that Udn() isn't a virtual method - it can still run even if the device pointer isn't to a valid device, it just has undefined results.
Visit this user's website Find all posts by this user
02-05-2013, 09:17 AM
Post: #7
RE: Pure virtual function error in CpDeviceCUdn function
(01-05-2013 03:03 PM)andreww Wrote:  The Brhz is a member of the CpiDevice. The get freed at the same time - the storage of the Brhz is a subset of the storage for the CpiDevice. (Well, technically the CpiDevice's destructor will be executed first, but they are both destroyed in quick succession.) If the Brhz has a screwed up vtable, I think it means either both have been destroyed, something has scribbled over memory, or the device pointer points to random garbage. The likely reason that device->Udn() didn't explode is that Udn() isn't a virtual method - it can still run even if the device pointer isn't to a valid device, it just has undefined results.

Thanks again! I now understand exactly why this fails in the way it does.

By adding extra tracing, I've been able to confirm that the problem is caused by a bug in MinimServer, and I'm working on a fix for this.

This error also raises a slight issue about the design of the Java bindings for this. The Java object CpDevice contains a field iHandle which points to the C++ CpiDevice object. If Java application code calls addRef() followed by removeRef() on a Java CpDevice object, the iHandle might point to freed memory and any subsequent call to this CpDevice object from Java code is likely to cause a "pure virtual function" abort instead of throwing a Java exception. The Java CpDevice object should ideally be able to detect that its iHandle isn't valid any more and throw a Java exception, but there's currently no mechanism for the C++ code that deletes the CpiDevice object to know that the CpiDevice object is currently in use by a Java CpDevice object and inform the Java bindings that the CpiDevice object has been deleted.
Find all posts by this user
02-05-2013, 10:02 AM
Post: #8
RE: Pure virtual function error in CpDeviceCUdn function
(02-05-2013 09:17 AM)simoncn Wrote:  Thanks again! I now understand exactly why this fails in the way it does.

By adding extra tracing, I've been able to confirm that the problem is caused by a bug in MinimServer, and I'm working on a fix for this.

This error also raises a slight issue about the design of the Java bindings for this. The Java object CpDevice contains a field iHandle which points to the C++ CpiDevice object. If Java application code calls addRef() followed by removeRef() on a Java CpDevice object, the iHandle might point to freed memory and any subsequent call to this CpDevice object from Java code is likely to cause a "pure virtual function" abort instead of throwing a Java exception. The Java CpDevice object should ideally be able to detect that its iHandle isn't valid any more and throw a Java exception, but there's currently no mechanism for the C++ code that deletes the CpiDevice object to know that the CpiDevice object is currently in use by a Java CpDevice object and inform the Java bindings that the CpiDevice object has been deleted.

It isn't safe for Java code to act on a CpDevice it doesn't hold a reference to (the one exception to this is inside device added/removed callbacks where a device handle is guaranteed valid). We could tighten up behaviour slightly here by nulling the Java handle inside removeRef(). What do you think?
Find all posts by this user
02-05-2013, 12:37 PM (This post was last modified: 02-05-2013 12:43 PM by simoncn.)
Post: #9
RE: Pure virtual function error in CpDeviceCUdn function
(02-05-2013 10:02 AM)simonc Wrote:  It isn't safe for Java code to act on a CpDevice it doesn't hold a reference to (the one exception to this is inside device added/removed callbacks where a device handle is guaranteed valid). We could tighten up behaviour slightly here by nulling the Java handle inside removeRef(). What do you think?

Thanks for this suggestion. Unfortunately, it wouldn't help with the scenario I'm fixing at the moment.

I've got multiple device lists listening for new devices (CpDeviceListUpnpDeviceType and CpDeviceListUpnpServiceType). It's possible for a new device to match more than one of these device lists. In this case, the Java bindings will create multiple Java CpDevice objects to pass to the deviceAdded() or deviceRemoved() callback of each list. All these Java CpDevice objects will have the same native handle. As soon as the last active reference is removed by calling removeRef() on any of these Java CpDevice objects, all other Java CpDevice objects with the same handle will become unsafe for method calls even though no removeRef() call has been made on these other Java CpDevice objects.

It would also be possible for this nulling to cause a problem if multiple addRef() and removeRef() calls are made via the same Java CpDevice object. (I presume this is valid.) In this case, the handle would be nulled by the first removeRef() call, and subsequent method calls on the Java CpDevice object would fail.

For these reasons, I don't think it would be useful to attempt to "help" the Java programmer by doing this nulling.
Find all posts by this user
02-05-2013, 12:38 PM
Post: #10
RE: Pure virtual function error in CpDeviceCUdn function
(02-05-2013 12:37 PM)simoncn Wrote:  It would also be possible for this nulling to cause a problem if multiple addRef() and removeRef() calls are made via the same Java CpDevice object. (I presume this is valid.) In this case, the handle would be nulled by the first removeRef() call, and subsequent method calls on the Java CpDevice object would fail.

Very good point. In hindsight, that was a terrible suggestion from me, consider it withdrawn.
Find all posts by this user


Forum Jump: