Race condition in Timer::FireAt
|
09-01-2015, 07:52 AM
(This post was last modified: 09-01-2015 07:52 AM by simoncn.)
Post: #1
|
|||
|
|||
Race condition in Timer::FireAt
A user has reported an ohNet crash that turns out to be caused by a race condition in Timer::FireAt.
This method can be called by multiple device threads in rapid succession when a subscription is being renewed. If a thread switch occurs after iMgr.Remove and before iMgr.Add, the following can happen: 1) Thread A removes the Timer object from the queue 2) Thread B removes the same Timer object from the queue. This is allowed by the code in QueueSortedBase::DoRemove (line 204). 3) Thread B modifies this Timer object and adds it back to the queue 4) Thread A modifies this Timer object and adds it back to the queue. This is not allowed by the code in QueueSortedBase::DoAdd (lines 177-179). The result is an assertion failure in line 179 of Queue.cpp. I think there are two possible fixes for the problem: a) Protect the code in Timer::FireAt with a new mutex b) Modify QueueSortedBase::DoAdd to not throw an assertion error if the added item is already in the queue I think a) is more robust and I presume the overhead of adding a mutex is acceptable. If you agree, I would be happy to submit a patch for this change. |
|||
12-01-2015, 03:41 PM
Post: #2
|
|||
|
|||
RE: Race condition in Timer::FireAt
(09-01-2015 07:52 AM)simoncn Wrote: I think there are two possible fixes for the problem: Thanks for the report/diagnosis. I've gone with a variant of your approach (a) above - Timer::FireAt() now calls onto TimerManager::FireAt() which is protected by a mutex. This is committed locally so should be on github this evening. |
|||
15-01-2015, 09:06 PM
Post: #3
|
|||
|
|||
RE: Race condition in Timer::FireAt
(12-01-2015 03:41 PM)simonc Wrote: Thanks for the report/diagnosis. Thanks for this. I've released a MinimServer update with this fix and it seems to be working well so far. |
|||
« Next Oldest | Next Newest »
|