Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Konstantin Baydarov
Classification: Public
Hi guys,

I'm working on this issue with Yury. I spotted the possible deadlock while debugging the tick42rmds bridge crash on unsubscribe, and I would be interested in knowing the solution as well.

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your questions out of order :)

2. It looks like designed behavior of the RMDS bridge: callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.

1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it processes is an initial message from a server.

3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called.

Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? The contract for this function would then be:
- This call should be synchronous, and no events should be processed after it has returned (as before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query. I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However, if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked beforehand, to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?

2. Should the thread which processes events from the RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be, to match the application's expected concurrency behaviour.

3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff, and you should avoid it where possible to avoid race conditions.

Cheers,
Frank

-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've faced the following issue when using OpenMAMA and the tick42rmds bridge. The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously in another thread). To avoid corruption of the mamaSubscription object, the mamaSubscription_destroy() function calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates mamaSubscription.
The problem with this approach is the following. Here's the pseudo code for the RMDS dispatching thread:

    if (muted) {
        // Do not dispatch
        return;
    }
    // Do some other checks  <-- mute() may be invoked here
    mamaSubscription_processMsg()  // processMsg for muted subscription, may crash

The solution for that is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but there's another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:

1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial.

2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:

    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport, MAMA_THROTTLE_DEFAULT);
    if (NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }

3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish message processing:

    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }

4. The RMDS bridge handles the initial message and tries to acquire the same throttle:

    #5  0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
    #6  0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
    #7  0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
    #8  0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
    #9  0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
    #10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?

---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to this message and deleting it from your computer. Thank you. Vela Trading Technologies LLC
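The race in the pseudo code above is a check-then-act gap: the muted flag is tested, and mute() can complete before processMsg runs. A minimal sketch of one way a bridge might close that gap while keeping mute() synchronous is shown below, in C with POSIX threads. Every name here (DemoSub, demoSub_dispatch, demoSub_mute) is hypothetical and is not taken from OpenMAMA or the tick42rmds bridge; the callback is replaced by a counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical bridge-side state; NOT an OpenMAMA or tick42rmds type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  idle;      /* signalled when inFlight drops to zero */
    bool            muted;
    int             inFlight;  /* dispatches currently inside the callback */
    int             delivered; /* demonstration counter only */
} DemoSub;

void demoSub_init(DemoSub *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->idle, NULL);
    s->muted = false;
    s->inFlight = 0;
    s->delivered = 0;
}

/* Dispatch thread: test the flag and register as in-flight under ONE lock,
 * so mute cannot slip in between the check and the callback.
 * Returns true if the message was delivered. */
bool demoSub_dispatch(DemoSub *s)
{
    pthread_mutex_lock(&s->lock);
    if (s->muted) {
        pthread_mutex_unlock(&s->lock);
        return false;
    }
    s->inFlight++;
    pthread_mutex_unlock(&s->lock);

    /* Stand-in for mamaSubscription_processMsg(); runs without our lock. */

    pthread_mutex_lock(&s->lock);
    assert(s->inFlight > 0);
    s->delivered++;
    if (--s->inFlight == 0) {
        pthread_cond_broadcast(&s->idle);
    }
    pthread_mutex_unlock(&s->lock);
    return true;
}

/* User thread: set the flag, then wait until no callback is running.
 * The caller must not hold any lock that the callback may acquire. */
void demoSub_mute(DemoSub *s)
{
    pthread_mutex_lock(&s->lock);
    s->muted = true;
    while (s->inFlight > 0) {
        pthread_cond_wait(&s->idle, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
}
```

Because demoSub_mute() blocks until in-flight callbacks return, it deadlocks if the caller holds a lock the callback needs. In the scenario above that lock is the wombatThrottle lock taken in mamaSubscription_destroy(), which is why Yury asks whether bridgeMamaSubscriptionMute() could be invoked outside MAMA's locks.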
Tom Doust
Hi Yury, Konstantin
Are you using the current GitHub version of the bridge code? We looked at and fixed some of the issues around locking in subscription destroy some time back. It would be good to know if we have missed something.

Best regards,
Tom
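For reference, Frank's third suggestion earlier in the thread (enqueue the mute as an event on the subscription's own thread instead of applying it immediately) can be illustrated with a small sketch. It is deliberately single-threaded and uses a fixed-size FIFO; the names DemoQueue, demoQueue_enqueue and demoQueue_dispatchAll are hypothetical and do not correspond to the MAMA queue API. A queue shared between a bridge thread and a user thread would need its own synchronization, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types; NOT part of OpenMAMA. */
typedef enum { EVT_MSG, EVT_MUTE } DemoEventType;

typedef struct {
    DemoEventType type;
    int           payload; /* stand-in for a message */
} DemoEvent;

#define DEMO_QUEUE_CAP 16

typedef struct {
    DemoEvent events[DEMO_QUEUE_CAP];
    size_t    head;      /* index of the next event to read */
    size_t    tail;      /* index of the next free slot */
    bool      muted;
    int       delivered; /* demonstration counter only */
} DemoQueue;

void demoQueue_init(DemoQueue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->tail = 0;
    q->muted = false;
    q->delivered = 0;
}

/* Append an event; returns false if the queue is full. */
bool demoQueue_enqueue(DemoQueue *q, DemoEventType type, int payload)
{
    if (q->tail - q->head == DEMO_QUEUE_CAP) {
        return false;
    }
    q->events[q->tail % DEMO_QUEUE_CAP] = (DemoEvent){ type, payload };
    q->tail++;
    return true;
}

/* Runs on the one thread that owns the subscription. Mute is just another
 * event, so it is ordered with respect to messages without any extra lock. */
void demoQueue_dispatchAll(DemoQueue *q)
{
    while (q->head != q->tail) {
        DemoEvent e = q->events[q->head % DEMO_QUEUE_CAP];
        q->head++;
        if (e.type == EVT_MUTE) {
            q->muted = true;
        } else if (!q->muted) {
            q->delivered++; /* stand-in for invoking the user callback */
        }
    }
}
```

Messages queued before the mute event are delivered and those queued after it are dropped. As Yury points out in his reply, this ordering only helps if the subscription's resources are not freed until the mute event has actually been processed.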