Re: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Konstantin Baydarov
 

Classification: Public

Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility during debugging tick42rmds bridge crash on unsubscribe, will be interested knowing the solution as well.

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your question in random order :)
2. It looks like a designed behavior of RMDS bridge - callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.
1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest but concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it processes is initial message from a server.
3. Taking in account p.2 we cannot process mute request asynchronously as MAMA starts freeing subscription resources immediately after mute is called. Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? Thus the contract for this function would be:
- This call should be synchronous and no events should be processed after it returned (like before)
- It should be reentrant and synchronize it's operations by itself (quite sensible requirement)

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query, I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked prior to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?
2. Should the thread which processes events from RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be to match the application's expected concurrency behaviour.
3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff and you should avoid it where possible to avoid race conditions.

Cheers,
Frank



-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've faced the following issue when using OpenMAMA and tick42rmds bridge.
The bridge internally creates a thread to process events from RMDS server, once a message is received that thread invokes mamaSubscription_processMsg. While the message is processed user may want to destroy the subscription (obviously in other thread). To avoid corruption of mamaSubscription object, mamaSubscription_destroy() function calls bridge->mute for the bridge to stop calling mamaSubscription_processMsg() and only then deallocates mamaSubscription. The problem with this approach is the following: here's the pseudo code for RMDS dispatching thread


if(muted) {
// Do not dispatch
return;
}

// Do some other checks <-- mute() may be invoked here
mamaSubscription_processMsg() // processMsg for muted subscription, may crash


The solution for that is to change RMDS bridge to block in bridge->mute() call until mamaSubscription_processMsg() returns but there's another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:
1. RMDS bridge thread invokes mamaSubscription_processMsg() for message of type initial
2. User thread invokes mamaSubscription_destroy() which acquires wombat throttle lock:
if (impl->mTransport)
throttle = mamaTransportImpl_getThrottle(impl->mTransport,
MAMA_THROTTLE_DEFAULT);

if(NULL != throttle)
{
wombatThrottle_lock(throttle);
}
3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal which calls our new version of bridge->mute() which waits for RMDS bridge thread to finish message processing
if (impl->mSubscBridge)
{
impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
}
4. RMDS bridge handles initial message and tries to acquire the same throttle:
#5 0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
#6 0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
#7 0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
#8 0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
#9 0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
#10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?


---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to this message and deleting it from your computer. Thank you. Vela Trading Technologies LLC


---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

Join Openmama-users@lists.openmama.org to automatically receive all group messages.