Race conditions in EventServerUpnp
19-02-2013, 03:30 PM (This post was last modified: 19-02-2013 04:17 PM by simonc.)
Post: #21
RE: Race conditions in EventServerUpnp
(19-02-2013 03:16 PM)simoncn Wrote:  I have a custom service that sends regular updates to a custom control point. When the control point receives an update, it refreshes its status. I'm using an evented property to do this, as discussed in this thread, which gives more details about the application use case.

The updates and refreshes need to be fairly frequent to ensure acceptable responsiveness for the control point. In the referenced discussion, you suggested that ohNet could handle sending an event message every 5ms. You also said that no level of traffic would overload either the device or control point stacks.

After experimenting with various values for the interval between event messages, I have arrived at a figure of 50 ms as an acceptable compromise between network overload and application responsiveness. This seemed a very conservative figure given the comment about ohNet being able to handle sending messages every 5ms.

My application requirement is for these events to be delivered on an established subscription every 50ms, without events being lost because of race conditions or resubscribes. If an update is missing or delayed, the control point status won't be refreshed in a timely manner. If the ohNet control point stack is unable to do this, I'm surprised and disappointed, and I'll need to implement another way to transmit these events that doesn't use ohNet.

Thanks, this is very helpful.

I don't believe you'd see many out-of-order events when updates only happen every 50ms. Are you seeing this happen frequently, or are you still at the planning stage and worried that it might go wrong?

Can you tell me more about your UPnP service, please? I'm struggling to see how it would behave with multiple clients. If I'm not misunderstanding (which is pretty likely, I admit), it might be more appropriate to use UPnP to discover this network service and request a connection address, then use a regular, persistent TCP connection thereafter.

In summary, I believe ohNet is capable of doing what you want. However, I have some questions about whether a UPnP stack is the most appropriate technology for your features.
19-02-2013, 07:54 PM
Post: #22
RE: Race conditions in EventServerUpnp
(19-02-2013 03:30 PM)simonc Wrote:  Thanks, this is very helpful.

I don't believe you'd see many out-of-order events when updates only happen every 50ms. Are you seeing this happen frequently, or are you still at the planning stage and worried that it might go wrong?

This is based on real trace logs which show that events for a subscription are sometimes reordered by different event server threads between arriving at the control point and being allocated to a subscription. A trace log showing this is attached to my original post. I don't know whether this is an isolated or frequent occurrence.

Quote:Can you tell me more about your UPnP service please? I'm struggling to think how it'd behave with multiple clients.

The log service maintains an ever-growing log, and the clients (custom UPnP control points) continuously monitor this log and display any updates to the viewer in near-real-time.

The current length of the log is a UPnP service property. When new data is added to the log, the service updates this property (but no sooner than 50 ms after its previous update), and each client (a custom UPnP control point) receives the new length as an event message. The client compares the new length with the previously received length. If the new value is larger (the normal case), the client sends a UPnP action request to the service to get the new portion of log data, which it displays to the user and appends to its shadow copy of the log. The oldest portion of log data is automatically discarded when the shadow copy reaches a certain size.
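As a rough sketch of the client-side logic just described (not the real application code): `LogFollower`, its `Fetch` callback and `maxShadowBytes` are hypothetical names, and `Fetch` merely stands in for whatever UPnP action returns a range of log data.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical client-side follower; none of these names come from ohNet.
class LogFollower {
public:
    // fetch(offset, count) stands in for the UPnP action that returns log data.
    using Fetch = std::function<std::string(std::size_t, std::size_t)>;

    LogFollower(Fetch fetch, std::size_t maxShadowBytes)
        : iFetch(std::move(fetch)), iMaxShadowBytes(maxShadowBytes) {
        assert(iFetch != nullptr);
    }

    // Called for each event message carrying the log length.
    // Returns the number of new bytes requested from the service.
    std::size_t OnLengthEvent(std::size_t newLength) {
        if (newLength <= iLastLength) {
            return 0; // nothing new (or a stale value): no action request
        }
        const std::size_t count = newLength - iLastLength;
        iShadow += iFetch(iLastLength, count); // "pull" the new portion
        iLastLength = newLength;
        if (iShadow.size() > iMaxShadowBytes) {
            // Discard the oldest data once the shadow copy grows too large.
            iShadow.erase(0, iShadow.size() - iMaxShadowBytes);
        }
        return count;
    }

    const std::string& Shadow() const { return iShadow; }

private:
    Fetch iFetch;
    std::size_t iMaxShadowBytes;
    std::size_t iLastLength = 0;
    std::string iShadow;
};
```

Because the client only ever asks for the range between the last length it saw and the newest one, a delayed event costs latency rather than data: the next event that does arrive pulls everything that was missed.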

Quote:If I'm not misunderstanding (which is pretty likely, I admit), it might be more appropriate to use UPnP to discover this network service and request a connection address, then use a regular, persistent TCP connection thereafter.

I will need to do something like this if the issue we're discussing can't be resolved. However, this introduces extra complexity and some difficult issues. Apart from the effort to reimplement my current code, this approach would require the server to maintain a list of client connections and use a "push" approach to write each log update to every connection in its list. This is problematic if some clients are unresponsive for any reason, because the server would need to buffer the data destined for these clients and terminate a client's connection if its buffered data grows beyond a certain limit. The current approach uses "pull" requests from the clients, so the server simply responds to client requests for log data and isn't affected if any client gets into difficulties.

Quote:In summary, I believe ohNet is capable of doing what you want. However, I have some questions about whether a UPnP stack is the most appropriate technology for your features.

UPnP seems ideally suited to doing what I want. It has built-in facilities for defining server-side properties, an event mechanism for sending server-side property updates to clients, and a mechanism for making client requests that are triggered by these event updates. I'd much prefer to use these standard features of UPnP than roll my own custom protocol to provide a similar set of capabilities.
19-02-2013, 10:34 PM
Post: #23
RE: Race conditions in EventServerUpnp
That sounds like exactly the way to implement such a service in UPnP. I think that's roughly what we did in the home control app. May I ask what symptoms you see when the aforementioned reordering occurs? I'd expect ohNet to automatically resubscribe, and at that point receive an update on all the evented state variables. So the client shouldn't be without events for more than a second or so (and I'd have thought less), and your architecture sounds like it should cope with that fine, unless it's getting constant resubscribes. Is that what you see? On the other hand, it would be a much bigger problem if the resubscribe wasn't working, if no event was triggered after the resubscribe, or if resubscribes were happening in quick succession.
20-02-2013, 07:21 AM (This post was last modified: 20-02-2013 07:40 AM by simoncn.)
Post: #24
RE: Race conditions in EventServerUpnp
(19-02-2013 10:34 PM)andreww Wrote:  That sounds like exactly the way to implement such a service in UPnP. I think that's roughly what we did in the home control app. May I ask what symptoms you see when the aforementioned reordering occurs? I'd expect ohNet to automatically resubscribe, and at that point receive an update on all the evented state variables. So the client shouldn't be without events for more than a second or so (and I'd have thought less), and your architecture sounds like it should cope with that fine, unless it's getting constant resubscribes. Is that what you see? On the other hand, it would be a much bigger problem if the resubscribe wasn't working, if no event was triggered after the resubscribe, or if resubscribes were happening in quick succession.

You're exactly right. ohNet would resubscribe and my client would miss events for a short time, and then my client would recover. I don't know how often this would occur. From my perspective, this situation doesn't seem acceptable, bearing in mind that a small change to the ohNet control point stack would prevent this problem from occurring.
20-02-2013, 09:24 AM
Post: #25
RE: Race conditions in EventServerUpnp
(19-02-2013 10:34 PM)andreww Wrote:  I'd expect ohNet to automatically resubscribe, and at that point receive an update on all the evented state variables. So it shouldn't be without events for more than a second or so (and I'd have thought less)

That turns out to be a very pessimistic estimate. I measured resubscribe times for 10 Linn Songbox instances on a very congested network. Averages varied from 4ms to 25ms with the median value under 10ms.
20-02-2013, 10:11 AM
Post: #26
RE: Race conditions in EventServerUpnp
(19-02-2013 03:16 PM)simoncn Wrote:  I have a custom service that sends regular updates to a custom control point. When the control point receives an update, it refreshes its status. I'm using an evented property to do this, as discussed in this thread, which gives more details about the application use case.

The updates and refreshes need to be fairly frequent to ensure acceptable responsiveness for the control point. In the referenced discussion, you suggested that ohNet could handle sending an event message every 5ms. You also said that no level of traffic would overload either the device or control point stacks.

I've committed a change to the device stack to serialise updates to a given subscriber. I believe this removes the possibility of out-of-order events for your particular use case.

I don't plan to address the more general possibility of out-of-order events from other device stacks. There is no longer an immediate client for the changes. More importantly, while the control point patch takes an interesting approach I hadn't previously considered, it adds complexity to an area that has been a bit fragile and feels like a premature optimisation.
20-02-2013, 11:20 AM
Post: #27
RE: Race conditions in EventServerUpnp
(20-02-2013 10:11 AM)simonc Wrote:  I've committed a change to the device stack to serialise updates to a given subscriber. I believe this removes the possibility of out-of-order events for your particular use case.

I don't plan to address the more general possibility of out-of-order events from other device stacks. There is no longer an immediate client for the changes. More importantly, while the control point patch takes an interesting approach I hadn't previously considered, it adds complexity to an area that has been a bit fragile and feels like a premature optimisation.

I presume the device stack will do this serialization by waiting for the HTTP 200 response before sending another NOTIFY request for the same subscription. As you say, this should solve the problem for my use case. Thanks for doing this.
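To make that presumption concrete, here is a minimal sketch of one way to serialise NOTIFYs per subscription. It is not ohNet's actual implementation: `SerialisedSubscription` and the blocking `Send` callback (which is assumed to return the HTTP status code of the response) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical per-subscription writer; not taken from the ohNet sources.
class SerialisedSubscription {
public:
    // send(seq, body) is assumed to block until the HTTP response arrives
    // and to return its status code.
    using Send = std::function<int(std::uint32_t, const std::string&)>;

    explicit SerialisedSubscription(Send send) : iSend(std::move(send)) {
        assert(iSend != nullptr);
    }

    // May be called from several publisher threads. The lock is held until
    // the response is back, so a later NOTIFY for this subscription cannot
    // be sent before an earlier one has been acknowledged.
    bool WriteChanges(const std::string& body) {
        std::lock_guard<std::mutex> lock(iLock);
        const std::uint32_t seq = iSeq++; // one sequence number per message sent
        return iSend(seq, body) == 200;
    }

private:
    Send iSend;
    std::mutex iLock;
    std::uint32_t iSeq = 0;
};
```

The trade-off of a scheme like this is that one slow subscriber holds up its own later updates, but since each subscription has its own lock it does not delay anyone else.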
13-03-2013, 10:50 PM (This post was last modified: 14-03-2013 08:00 AM by simoncn.)
Post: #28
RE: Race conditions in EventServerUpnp
(20-02-2013 10:11 AM)simonc Wrote:  I've committed a change to the device stack to serialise updates to a given subscriber. I believe this removes the possibility of out-of-order events for your particular use case.

I've been looking at a problem with the device stack complaining about leaked objects when it's destroyed. Eventually I tracked it down to the changes in this commit.

In DviSubscription::Remove, there's an added line:
iService = NULL;

This is the cause of the leakage problem. When DviSubscription::Remove is called from PropertyWriterFactory::Disable, the service reference count isn't being decremented as it should be. This is because:

1) DviSubscription::Remove calls DviService::RemoveSubscription

2) DviService::RemoveSubscription calls DviSubscription::Stop

3) DviSubscription::Stop checks that iService is non-null before calling iService->RemoveRef() and setting iService to NULL. If iService has previously been set to NULL, the RemoveRef() call won't happen.

4) When the device stack is eventually destroyed, the DviService object has a refCount of 1, so the device stack aborts.
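The sequence above can be reproduced with a stripped-down model. This is only a sketch: `Service` and `Subscription` are hypothetical stand-ins for DviService and DviSubscription, and the `clearPointerFirst` flag is invented here to switch the extra line on and off. It also assumes Remove() keeps a local copy of the pointer so it can still call RemoveSubscription.

```cpp
#include <cassert>

class Subscription;

// Hypothetical stand-in for DviService: just a reference count.
class Service {
public:
    void AddRef() { ++iRefCount; }
    void RemoveRef() { assert(iRefCount > 0); --iRefCount; }
    int RefCount() const { return iRefCount; }
    void RemoveSubscription(Subscription& sub); // step 2
private:
    int iRefCount = 1; // the owner's reference
};

// Hypothetical stand-in for DviSubscription.
class Subscription {
public:
    explicit Subscription(Service& service) : iService(&service) {
        iService->AddRef();
    }
    // Step 1. clearPointerFirst models the added "iService = NULL;" line.
    void Remove(bool clearPointerFirst) {
        Service* service = iService;
        if (service == nullptr) {
            return;
        }
        if (clearPointerFirst) {
            iService = nullptr;
        }
        service->RemoveSubscription(*this);
    }
    // Step 3: the reference is only released if iService is still set.
    void Stop() {
        if (iService != nullptr) {
            iService->RemoveRef();
            iService = nullptr;
        }
    }
private:
    Service* iService;
};

inline void Service::RemoveSubscription(Subscription& sub) { sub.Stop(); }
```

With `clearPointerFirst` set, Stop() finds a null pointer and skips RemoveRef(), so the service is still holding one reference after its owner has released its own, which matches step 4.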

I've tried to work out how to patch this, but I don't understand why moving the AutoMutex creation from CreateWriter() to WriteChanges() required these changes to be made in Remove() and Expired().
14-03-2013, 08:59 AM
Post: #29
RE: Race conditions in EventServerUpnp
(13-03-2013 10:50 PM)simoncn Wrote:  I've been looking at a problem with the device stack complaining about leaked objects when it's destroyed. Eventually I tracked it down to the changes in this commit.

Thanks for letting me know and for all the work diagnosing the problem.

I'll have a look at it.
14-03-2013, 05:35 PM
Post: #30
RE: Race conditions in EventServerUpnp
(13-03-2013 10:50 PM)simoncn Wrote:  I've been looking at a problem with the device stack complaining about leaked objects when it's destroyed. Eventually I tracked it down to the changes in this commit.

In DviSubscription::Remove, there's an added line:
iService = NULL;

...

I've tried to work out how to patch this, but I don't understand why moving the AutoMutex creation from CreateWriter() to WriteChanges() required these changes to be made in Remove() and Expired().

I think I slipped an unrelated "improvement" in with the main change. I helpfully didn't document the reason for nulling iService. I can't think why this was necessary so we might as well try removing that line. This'll hopefully either fix all problems or throw up a new bug that'll demonstrate what I was trying to fix :)

I've committed this locally so it'll hopefully be available to you this evening.