Interval between ssdp:alive notifications
10-01-2012, 07:06 PM
Post: #1
Interval between ssdp:alive notifications
I'm concerned that the interval between ssdp:alive notifications sent by the ohNet device stack isn't sufficient to cope with the possibility of occasional non-delivery of these notifications (which can happen, as UDP multicast is an unreliable protocol).

The default max-age value is 1800 seconds, and ohNet sends ssdp:alive messages at a random interval between 600 and 1200 seconds (1/3 and 2/3 of max-age). This means that a single missed notification could cause a control point to assume that the device has gone away. For example, two consecutive notifications might randomly happen at intervals of 1000 seconds each, and the first of these might not be delivered.
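To make the timing concrete, here is a minimal Python sketch of the scheduling described above (the function name is illustrative, not ohNet's API): the next announcement delay is drawn uniformly between 1/3 and 2/3 of max-age.

```python
import random

MAX_AGE_SECS = 1800  # default max-age advertised by the device

def next_alive_interval(max_age: int = MAX_AGE_SECS) -> float:
    """Delay before the next ssdp:alive burst: a uniform random
    value between 1/3 and 2/3 of max-age (600-1200s by default)."""
    return random.uniform(max_age / 3, max_age * 2 / 3)

# Worst case: two consecutive ~1000s intervals sum to ~2000s, which
# exceeds the 1800s max-age, so losing the first burst would let the
# control point's cache entry for the device expire.
interval = next_alive_interval()
assert 600 <= interval <= 1200
```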

As a comparison, Twonky and Asset send these notifications at intervals of approximately one minute, so it would require over 25 consecutive missed notifications before a control point removes the device.

The interval used by ohNet doesn't seem to be configurable. It's always a random number between 1/3 and 2/3 of the max-age value. I'd like to use a shorter interval to guard against the possibility of UDP multicast unreliability, though I wouldn't go as far in this direction as Twonky and Asset do. Would it be possible to either decrease the default interval or add a configuration parameter so that a device could decrease the interval that it uses?

Simon
11-01-2012, 09:48 AM (This post was last modified: 11-01-2012 10:32 AM by simonc.)
Post: #2
RE: Interval between ssdp:alive notifications
(10-01-2012 07:06 PM)simoncn Wrote:  I'm concerned that the interval between ssdp:alive notifications sent by the ohNet device stack isn't sufficient to cope with the possibility of occasional non-delivery of these notifications (which can happen, as UDP multicast is an unreliable protocol).

Are you worried about this because you're aware of local networks where UDP packets frequently go missing? Or is your concern prompted by having read that UDP is unreliable? I'm not aware of any evidence that UDP is inherently unreliable but would be interested to hear if you're finding real world problems.

When I started working on ohNet I was concerned by the UPnP arch docs' repeated mentions of UDP's unreliability. I wrote some simple tests to measure how often messages failed to be delivered. I got bored and gave up several million successful deliveries later.
Later: Songcast gives a better indication of UDP reliability on a local network.

The only way I can ever reproduce a problem is to prompt so much traffic that a control point can't read the messages as quickly as they appear. The cp's socket recv buffer overflows and some messages are lost. Even then, each device sends several alive messages (see below) so the chances of losing all of them are very slim. (I've included a note of this for completeness but wouldn't worry about it. I could only prompt problems by doing an msearch for all UPnP devices on our very heavily populated network. I've never seen problems using a CpDeviceListServiceType, which issues a more targeted msearch and only generates one response per device.)

(10-01-2012 07:06 PM)simoncn Wrote:  The default max-age value is 1800 seconds, and ohNet sends ssdp:alive messages at a random interval between 600 and 1200 seconds (1/3 and 2/3 of max-age). This means that a single missed notification could cause a control point to assume that the device has gone away. For example, two consecutive notifications might randomly happen at intervals of 1000 seconds each, and the first of these might not be delivered.

Not quite. There are (3 + numServices) alive messages generated over a client configured time (1-5s). All of these would have to be missed before a device list would get out of synch.
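A quick sketch of the burst size described above (a hypothetical helper, not ohNet's API) makes the redundancy explicit:

```python
def alive_message_count(num_services: int) -> int:
    """Number of ssdp:alive messages per announcement burst, as
    described above: 3 fixed messages plus one per service."""
    return 3 + num_services

# e.g. a device with 4 services sends 7 alive messages per burst,
# spread over a client-configured window of 1-5 seconds; all 7
# would have to be lost before a device list got out of synch.
```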

(10-01-2012 07:06 PM)simoncn Wrote:  As a comparison, Twonky and Asset send these notifications at intervals of approximately one minute, so it would require over 25 consecutive missed notifications before a control point removes the device.

This seems like a really bad idea. I've seen some devices that send out very frequent alives to mitigate incomplete msearch support. It's also possible both servers use a device stack which misunderstood the UPnP arch docs.

If you have the time, I'd consider reporting this as a bug against each product.

(10-01-2012 07:06 PM)simoncn Wrote:  The interval used by ohNet doesn't seem to be configurable. It's always a random number between 1/3 and 2/3 of the max-age value. I'd like to use a shorter interval to guard against the possibility of UDP multicast unreliability, though I wouldn't go as far in this direction as Twonky and Asset do. Would it be possible to either decrease the default interval or add a configuration parameter so that a device could decrease the interval that it uses?

Simon

I don't think a config value to determine frequency of alive notifications would be useful. You could however set the MsearchTimeSecs config value to 5 to spread alive messages over their maximum extent. This will reduce the (already very small) chance of the cp's recv buffer overflowing.

We'll also discuss whether a slight increase in alive notification frequency (to 1/3 - 1/2) would be sensible. I'll let you know what happens here.
11-01-2012, 02:41 PM
Post: #3
RE: Interval between ssdp:alive notifications
(11-01-2012 09:48 AM)simonc Wrote:  Are you worried about this because you're aware of local networks where UDP packets frequently go missing? Or is your concern prompted by having read that UDP is unreliable? I'm not aware of any evidence that UDP is inherently unreliable but would be interested to hear if you're finding real world problems.

Yes, I have seen this on my home wireless network. As part of a recent debugging exercise to find out why my ohNet control point was losing contact with my ohNet server, I ran a Wireshark trace on my laptop, which is attached via wireless. Asset was running at the same time, and it normally sends out 12 alive messages every 60 seconds. Here's what happened:

time=60: received 11
time=120: received 12
time=180: received 4
time=240: received 12
time=300: received 12
time=360: received 0
time=420: received 12
time=480: received 12
time=540: received 9
time=600: received 12
time=660: received 0
time=720: received 1
time=780: received 4
time=840: received 11
time=900: received 11
time=960: received 0
time=1020: received 7
time=1080: received 11
time=1140: received 12
time=1200: received 12
time=1260: received 12
etc.

The pattern continues to repeat in the same way, with 0/12 being received every few transmissions, and 12/12 being received fairly infrequently.
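Tallying the trace above (an expected 12 messages per 60-second round) quantifies the loss:

```python
# Received counts per 60s round, transcribed from the trace above.
received = [11, 12, 4, 12, 12, 0, 12, 12, 9, 12, 0,
            1, 4, 11, 11, 0, 7, 11, 12, 12, 12]

EXPECTED_PER_ROUND = 12
rounds = len(received)
total_expected = rounds * EXPECTED_PER_ROUND
total_lost = total_expected - sum(received)
fully_lost_rounds = sum(1 for n in received if n == 0)

print(f"{total_lost} of {total_expected} messages lost "
      f"({100 * total_lost / total_expected:.0f}%)")
print(f"{fully_lost_rounds} of {rounds} rounds lost entirely")
```

So roughly 30% of messages were lost overall, and 3 of the 21 rounds lost all 12 messages.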

Quote:When I started working on ohNet I was concerned by the UPnP arch docs' repeated mentions of UDP's unreliability. I wrote some simple tests to measure how often messages failed to be delivered. I got bored and gave up several million successful deliveries later.
Later: Songcast gives a better indication of UDP reliability on a local network.

Perhaps this is network dependent. It may be better on a wired network than a wireless network, because (as I understand it) multicast transmissions over wireless use a different protocol involving beacons. It may also be better on a corporate network than a home network with equipment of questionable quality (although equipment quality shouldn't be an issue for my home network where I ran the above test).

Quote:The only way I can ever reproduce a problem is to prompt so much traffic that a control point can't read the messages as quickly as they appear. The cp's socket recv buffer overflows and some messages are lost. Even then, each device sends several alive messages (see below) so the chances of losing all of them are very slim. (I've included a note of this for completeness but wouldn't worry about it. I could only prompt problems by doing an msearch for all UPnP devices on our very heavily populated network. I've never seen problems using a CpDeviceListServiceType, which issues a more targeted msearch and only generates one response per device.)

The evidence from my test is that it's quite common for 12 out of 12 messages to be lost if they happen to be sent at a bad time.

Quote:Not quite. There are (3 + numServices) alive messages generated over a client configured time (1-5s). All of these would have to be missed before a device list would get out of synch.

OK, that's a fair point. However, my evidence is that even when multiple messages are sent, it's quite common for all of them to fail to arrive.

Quote:This seems like a really bad idea. I've seen some devices that send out very frequent alives to mitigate incomplete msearch support. It's also possible both servers use a device stack which misunderstood the UPnP arch docs.

If you have the time, I'd consider reporting this as a bug against each product.

As I said, I wouldn't want to go as far as Asset and Twonky do in terms of frequency of transmissions. However, it's possible that the developers of these products were aware of real-world multicast reception problems and chose to increase the alive frequency to (over)compensate for this.

Quote:I don't think a config value to determine frequency of alive notifications would be useful.

I'm sorry to hear that. Many other aspects of the ohNet runtime are configurable, and I don't see why it would be a bad idea to let the device developer have a configuration option for this setting.

Quote:You could however set the MsearchTimeSecs config value to 5 to spread alive messages over their maximum extent. This will reduce the (already very small) chance of the cp's recv buffer overflowing.

Does this also increase the time lag for getting responses back to M-SEARCH requests? If so, that would be an unwelcome side effect.

Quote:We'll also discuss whether a slight increase in alive notification frequency (to 1/3 - 1/2) would be sensible. I'll let you know what happens here.

That seems very sensible to me, as it would mean that at least two consecutive sets of alive notifications would need to go missing in order to cause a problem. This didn't happen in my test, though on one occasion there was a sequence of 0 and 1 messages received.

Simon
11-01-2012, 03:05 PM (This post was last modified: 11-01-2012 03:07 PM by simonc.)
Post: #4
RE: Interval between ssdp:alive notifications
Thanks for all the detail, it's very helpful!

As a start, I've changed the frequency of alive notifications to 1/4 - 1/2 of advertised duration (the arch doc turns out to recommend a max of 1/2). I'm still not keen on exposing this as a config option but am open to making further changes - if a client can think of better rules for frequency of advertisement, we should just adopt these in ohNet.
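A sketch of the revised scheduling (same illustrative shape as before, not ohNet's actual API):

```python
import random

def next_alive_interval(max_age: int = 1800) -> float:
    """Revised scheduling described above: uniform between 1/4 and
    1/2 of the advertised max-age (the arch doc's recommended max)."""
    return random.uniform(max_age / 4, max_age / 2)

# With max-age 1800s this gives 450-900s, so even two back-to-back
# worst-case intervals (~900s each) stay within the 1800s lifetime:
# a single lost burst no longer expires the device.
```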

Rather than sending out more frequent alive notifications, I wonder whether we should spread the notifications over a much longer period. We currently send a message on average every 40ms; it's possible sending a message every 10s instead would prevent all messages being sent in a window where wireless propagation of multicast messages appears to be iffy.
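The effect of spreading a burst is easy to check with a little arithmetic. This sketch (hypothetical function, evenly spaced messages for simplicity) asks whether a single outage window of a given length could swallow an entire burst:

```python
def burst_survives_outage(num_messages: int, spacing_s: float,
                          outage_s: float) -> bool:
    """True if an evenly spaced burst spans longer than a single
    outage window, so at least one message must fall outside it."""
    burst_span = (num_messages - 1) * spacing_s
    return burst_span > outage_s

# 7 messages 40ms apart span ~0.24s: a 5s wireless dropout can
# swallow the whole burst.
assert not burst_survives_outage(7, 0.040, 5.0)
# 7 messages 10s apart span 60s: no single 5s dropout can.
assert burst_survives_outage(7, 10.0, 5.0)
```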

Ignore my suggestion about changing MsearchTimeSecs - I was confused about which announcements it influenced; it won't have any effect here.
11-01-2012, 03:18 PM
Post: #5
RE: Interval between ssdp:alive notifications
(11-01-2012 03:05 PM)simonc Wrote:  Thanks for all the detail, it's very helpful!

As a start, I've changed the frequency of alive notifications to 1/4 - 1/2 of advertised duration (the arch doc turns out to recommend a max of 1/2). I'm still not keen on exposing this as a config option but am open to making further changes - if a client can think of better rules for frequency of advertisement, we should just adopt these in ohNet.

OK, that's fair enough. The 1/4 to 1/2 change should definitely help.

Quote:Rather than sending out more frequent alive notifications, I wonder whether we should spread the notifications over a much longer period. We currently send a message on average every 40ms; it's possible sending a message every 10s instead would prevent all messages being sent in a window where wireless propagation of multicast messages appears to be iffy.

A 40ms interval doesn't seem long enough. I've observed dropouts of about 5 seconds on my wireless network, so it would be good to spread the series of announcements over at least that interval. Sending them every 10s should be very safe.

Quote:Ignore my suggestion about changing MsearchTimeSecs - I was confused about which announcements it influenced; it won't have any effect here.

OK, no problem!