[Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.


Konstantin Baydarov
 

Classification: Public

Hi, Tom.

I noticed that, compared to the qpid bridge (which comes with the OpenMAMA sources), tick42rmds calls mamaSubscription_processMsg() from a separate thread and not from mamaQueue_dispatch(). I wonder whether that is correct. Could it be one of the reasons for the issue that we are facing?
Qpid bridge call stack:
#0 mamaSubscription_processMsg (subscription=0x76e150, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2226
#1 0x00007ffff7b4c580 in imageRequestImpl_onInitialMessage (msg=0x7aebb0, closure=0x772710) at mama/c_cpp/src/c/imagerequest.c:225
#2 0x00007ffff648bede in qpidBridgeMamaInboxImpl_onMsg (subscription=0x772900, msg=0x7aebb0, closure=0x7727c0, itemClosure=0x0) at mama/c_cpp/src/c/bridge/qpid/inbox.c:298
#3 0x00007ffff7b76ab4 in mamaSubscription_forwardMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:1422
#4 0x00007ffff7b781ef in mamaSubscription_processMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2315
#5 0x00007ffff6490818 in qpidBridgeMamaTransportImpl_queueCallback (queue=0x60eb50, closure=0x61df80) at mama/c_cpp/src/c/bridge/qpid/transport.c:1083
#6 0x00007ffff7b90a1f in wombatQueue_dispatchInt (queue=0x60ecb0, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319
#7 0x00007ffff7b90aa2 in wombatQueue_timedDispatch (queue=0x60ecb0, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335
#8 0x00007ffff648e720 in qpidBridgeMamaQueue_dispatch (queue=0x60ec40) at mama/c_cpp/src/c/bridge/qpid/queue.c:265
#9 0x00007ffff7b6e1de in mamaQueue_dispatch (queue=0x60eb50) at mama/c_cpp/src/c/queue.c:824
#10 0x00007ffff648a8c3 in qpidBridge_start (defaultEventQueue=0x60eb50) at mama/c_cpp/src/c/bridge/qpid/bridge.c:196
#11 0x00007ffff7b52976 in mama_start (bridgeImpl=0x60e750) at mama/c_cpp/src/c/mama.c:1659
#12 0x0000000000403e61 in buildDataDictionary () at mama/c_cpp/src/examples/c/mamalistenc.c:647
#13 0x000000000040366f in main (argc=9, argv=0x7fffffffd728) at mama/c_cpp/src/examples/c/mamalistenc.c:335

tick42rmds bridge call stack:
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff78cc259 in mamaSubscription_forwardMsg (subscription=0x7fffe9757c50, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:1426
#2 0x00007ffff78a38ec in processPointToPointMessage (callback=0x7fffe9768fb0, msg=0x641d60, msgType=6, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:174
#3 0x00007ffff78a3f37 in listenerMsgCallback_processMsg (callback=0x7fffe9768fb0, msg=0x641d60, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:471
#4 0x00007ffff78cd7e5 in mamaSubscription_processMsg (subscription=0x7fffe939b2e0, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:2260
#5 0x00007ffff51fdaf3 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#6 0x00007ffff5258492 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#7 0x00007ffff5259032 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#8 0x00007ffff525e785 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#9 0x00007ffff522ccc8 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#10 0x00007ffff522d2b0 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff522e62d in UPAConsumer::Run() () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#12 0x00007ffff52146cc in RMDSSubscriber::threadFunc(void*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff5af59bd in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 8:12 PM
To: Tom Doust <tom.doust@...>; Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Tom,

We are using version 1.3.
As far as I can see from the latest GitHub code, the problem still exists. See the RMDSBridgeSubscription::OnMessage method:

if (isShutdown_ || ((0 != source_) && source_->IsPausedUpdates()))
{
    return;
}

// ... Shutdown() may be called here
// And then MAMA can start destroying subscription_'s fields
try
{
    status = mamaSubscription_processMsg(subscription_, msg); // This function will be examining subscription_'s fields being destroyed
}


-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Tom Doust
Sent: Tuesday, June 20, 2017 4:09 PM
To: Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury, Konstantin

Are you using the current github version of the bridge code? We looked at and fixed some of the issues around locking the subscription destroy some time back.

It would be good to know if we have missed something.

Best regards

Tom

-----Original Message-----
From: openmama-users-bounces@... [mailto:openmama-users-bounces@...] On Behalf Of Konstantin Baydarov
Sent: 20 June 2017 11:32
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility while debugging the tick42rmds bridge crash on unsubscribe, and would be interested in knowing the solution as well.

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your questions in random order :)
2. It looks like designed behaviour of the RMDS bridge: callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.
1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it processes is an initial message from a server.
3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called. Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? The contract for this function would then be:
- This call should be synchronous, and no events should be processed after it has returned (as before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)
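
A minimal sketch of a mute that would meet this contract, assuming a hypothetical DispatchGuard helper owned by the bridge subscription. The class and its method names are illustrative only and are not part of OpenMAMA or the tick42rmds bridge. The dispatching thread brackets each callback with tryEnter()/leave(), and mute() blocks until no callback is in flight:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical helper (not an OpenMAMA or tick42rmds type).
class DispatchGuard {
public:
    // Dispatching thread: call before invoking the MAMA callback.
    // Returns false once muted; the caller must then drop the message.
    bool tryEnter() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (muted_) return false;
        ++inFlight_;
        return true;
    }

    // Dispatching thread: call after the MAMA callback has returned.
    void leave() {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(inFlight_ > 0);
        if (--inFlight_ == 0) idle_.notify_all();
    }

    // Destroying thread: returns only when no callback is running and
    // none can start. Safe to call more than once.
    void mute() {
        std::unique_lock<std::mutex> lock(mutex_);
        muted_ = true;
        idle_.wait(lock, [this] { return inFlight_ == 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable idle_;
    int inFlight_ = 0;
    bool muted_ = false;
};
```

Because the muted check and the in-flight count are updated under the same mutex, there is no window between "check muted" and "call the callback". The sketch still hangs if mute() is called while the caller holds a lock that the in-flight callback needs, or from inside the callback itself, which is why the question about not calling mute under mamaSubscription locks matters.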

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query, I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked prior to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?
2. Should the thread which processes events from RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be to match the application's expected concurrency behaviour.
3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff and you should avoid it where possible to avoid race conditions.
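
To make point 3 concrete, here is a rough sketch under the assumption that both message delivery and the mute request travel through one queue that is drained by a single thread. EventQueue, Subscription, postMessage and postMute are hypothetical names used for illustration; they are not OpenMAMA APIs:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical single-consumer event queue (not an OpenMAMA type).
class EventQueue {
public:
    // May be called from any thread.
    void enqueue(std::function<void()> ev) {
        assert(ev);  // reject empty events
        std::lock_guard<std::mutex> lock(mutex_);
        events_.push_back(std::move(ev));
    }

    // Runs all queued events on the calling thread, in FIFO order.
    // Returns the number of events that were run.
    std::size_t dispatchAll() {
        std::size_t count = 0;
        for (;;) {
            std::function<void()> ev;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (events_.empty()) return count;
                ev = std::move(events_.front());
                events_.pop_front();
            }
            ev();  // run outside the lock so events may enqueue more work
            ++count;
        }
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> events_;
};

// Hypothetical subscription state, touched only by the dispatching thread.
struct Subscription {
    bool muted = false;
    int delivered = 0;
};

// The subscription must outlive every event that refers to it.
void postMessage(EventQueue& q, Subscription& s) {
    q.enqueue([&s] { if (!s.muted) ++s.delivered; });
}

void postMute(EventQueue& q, Subscription& s) {
    q.enqueue([&s] { s.muted = true; });
}
```

Since the mute event is ordered with the message events on one thread, no lock on the subscription state is needed. This only helps when messages are delivered through the same queue; a bridge that calls back directly on its own transport thread would not get this ordering.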

Cheers,
Frank



-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've run into the following issue when using OpenMAMA and the tick42rmds bridge.
The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously in another thread). To avoid corruption of the mamaSubscription object, mamaSubscription_destroy() calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates the mamaSubscription. The problem with this approach is the following. Here is the pseudo code for the RMDS dispatching thread:


if(muted) {
// Do not dispatch
return;
}

// Do some other checks <-- mute() may be invoked here
mamaSubscription_processMsg() // processMsg for muted subscription, may crash


The solution for that is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but there is another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:
1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial
2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:
    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport,
                                                 MAMA_THROTTLE_DEFAULT);

    if(NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }
3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish message processing:
    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }
4. The RMDS bridge thread handles the initial message and tries to acquire the same throttle:
#5 0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
#6 0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
#7 0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
#8 0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
#9 0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
#10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259
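
The cycle in steps 2 to 4 can be modelled with a small sketch. It assumes a plain std::mutex as a stand-in for the wombat throttle lock, and the function names are hypothetical. The bridge side uses try_lock so that the sketch reports the conflict instead of hanging:

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Stand-in for the wombat throttle lock (hypothetical, not the OpenMAMA type).
std::mutex throttleLock;

// Models step 4: the bridge thread, inside processMsg, needs the throttle.
// Returns true if the bridge thread could take the lock right now.
bool bridgeThreadCanLockThrottle() {
    bool acquired = false;
    std::thread bridge([&acquired] {
        if (throttleLock.try_lock()) {
            acquired = true;
            throttleLock.unlock();
        }
    });
    bridge.join();  // join makes the write to 'acquired' visible here
    return acquired;
}

// Models steps 2 and 3: destroy takes the throttle and then, with a
// blocking mute(), would wait for the bridge thread to finish.
bool destroyWouldDeadlock() {
    std::lock_guard<std::mutex> held(throttleLock);
    // A blocking mute() would wait at this point. The bridge thread can
    // only finish if it gets the throttle, which this thread still holds.
    return !bridgeThreadCanLockThrottle();
}
```

Each side waits for something only the other side can release: the user thread holds the throttle and waits for the callback to finish, while the callback waits for the throttle. Calling mute before taking the throttle, or not taking the throttle around mute at all, would break the cycle in this model; whether that is acceptable inside MAMA is the open question.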

What do you think is the best way to avoid this?


---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to this message and deleting it from your computer. Thank you. Vela Trading Technologies LLC


_______________________________________________
Openmama-users mailing list
Openmama-users@...
https://lists.openmama.org/mailman/listinfo/openmama-users
_______________________________________________
Openmama-dev mailing list
Openmama-dev@...
https://lists.openmama.org/mailman/listinfo/openmama-dev




Tom Doust
 

Yes we call back on the bridge’s thread. We don’t queue the message, we leave the client code to do that if it wants to. This is by design.

On 20/06/2017, 18:18, "Konstantin Baydarov" <konstantin.baydarov@...> wrote:

Classification: Public

Hi, Tom.

I noticed that, compared to the qpid bridge (which comes with the OpenMAMA sources), tick42rmds calls mamaSubscription_processMsg() from a separate thread and not from mamaQueue_dispatch(). I wonder whether that is correct. Could it be one of the reasons for the issue that we are facing?

Konstantin Baydarov
 

Classification: Public

Hi, Tom.

As Frank suggested, I switched our app from destroy() to destroyEx() (MamaSubscription), but I still get a segmentation fault. I think the reason for the segfault is that, while RMDSBridgeSubscription::OnMessage() calls mamaSubscription_processMsg() in the tick42rmds thread, the subscription is destroyed in mamaQueue_dispatch() by the mama_start() thread. The problem is that the tick42rmds processing thread, which calls RMDSBridgeSubscription::OnMessage(), is not synchronized with the mama_start() thread, so the mama_start() thread can destroy the MAMA subscription at any time, including while mamaSubscription_processMsg() is running.
Here are the stacks of both threads, which operate on the same subscription concurrently without any synchronization:
1) mama_start() thread that destroys subscription:
#0 mamaSubscription_destroy (subscription=0x0) at mama/c_cpp/src/c/subscription.c:2929
#1 0x00007ffff51f259a in tick42rmdsBridgeMamaInbox_destroy () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#2 0x00007ffff78b836b in mamaInbox_destroy (inbox=0x1065f10) at mama/c_cpp/src/c/inbox.c:249
#3 0x00007ffff78a2aec in imageRequest_destroy (request=0x104a300) at mama/c_cpp/src/c/imagerequest.c:425
#4 0x00007ffff78a1425 in dqContext_cleanup (ctx=0x7fffec3add10) at mama/c_cpp/src/c/dqstrategy.c:621
#5 0x00007ffff78cc4d3 in mamaSubscription_cleanup (subscription=0x7fffec3adae0) at mama/c_cpp/src/c/subscription.c:1510
#6 0x00007ffff78ce5df in mamaSubscription_destroy (subscription=0x7fffec3adae0) at mama/c_cpp/src/c/subscription.c:2948
#7 0x00007ffff78cc7b6 in mamaSubscription_DestroyThroughQueueCB (Queue=0x61e340, closure=0x7fffec3adae0) at mama/c_cpp/src/c/subscription.c:1604
#8 0x00007ffff78e634f in wombatQueue_dispatchInt (queue=0x61e430, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319
#9 0x00007ffff78e63d2 in wombatQueue_timedDispatch (queue=0x61e430, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335
#10 0x00007ffff51fa6d9 in tick42rmdsBridgeMamaQueue_dispatch () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff78c3a8b in mamaQueue_dispatch (queue=0x61e340) at mama/c_cpp/src/c/queue.c:819
#12 0x00007ffff51f20a8 in tick42rmdsBridge_start () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff78a8810 in mama_start (bridgeImpl=0x61d660) at mama/c_cpp/src/c/mama.c:1591
#14 0x00007ffff7b8ee53 in Wombat::Mama::start (bridgeImpl=0x61d660) at mama/c_cpp/src/cpp/mamacpp.cpp:198
#15 0x000000000040c53a in MamaListen::start (this=0x7fffffffd690) at mamalistencpp_mt.cpp:1072
#16 0x000000000040ebaf in main (argc=9, argv=0x7fffffffd948) at mamalistencpp_mt.cpp:1915

2) tick42rmds thread - moment of Segfault:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff38ff700 (LWP 25476)]
0x00007ffff78a3508 in imageRequest_stopWaitForResponse (request=0x0) at mama/c_cpp/src/c/imagerequest.c:772
772 mamaSubscription_getTransport ( impl->mSubscription, &tport );
(gdb) bt
#0 0x00007ffff78a3508 in imageRequest_stopWaitForResponse (request=0x0) at mama/c_cpp/src/c/imagerequest.c:772
#1 0x00007ffff78cbe53 in mamaSubscription_stopWaitForResponse (subscription=0x7fffe0587200, ctx=0x7fffe0587420) at mama/c_cpp/src/c/subscription.c:1264
#2 0x00007ffff78a393e in processPointToPointMessage (callback=0x7fffe0587a00, msg=0x63eda0, msgType=6, ctx=0x7fffe0587420) at mama/c_cpp/src/c/listenermsgcallback.c:171
#3 0x00007ffff78a3fdc in listenerMsgCallback_processMsg (callback=0x7fffe0587a00, msg=0x63eda0, ctx=0x7fffe0587420) at mama/c_cpp/src/c/listenermsgcallback.c:482
#4 0x00007ffff78cd9a7 in _mamaSubscription_processMsg (subscription=0x7fffe0587200, msg=0x63eda0) at mama/c_cpp/src/c/subscription.c:2322
#5 0x00007ffff78cd7ff in mamaSubscription_processMsg (subscription=0x7fffe0587200, msg=0x63eda0) at mama/c_cpp/src/c/subscription.c:2272
#6 0x00007ffff51fae33 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#7 0x00007ffff5257632 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) ()
from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#8 0x00007ffff525bad2 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#9 0x00007ffff52611d5 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#10 0x00007ffff522a268 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff522a850 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#12 0x00007ffff522bb1d in UPAConsumer::Run() () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff5211b1c in RMDSSubscriber::threadFunc(void*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#14 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff5af59bd in clone () from /lib64/libc.so.6
#16 0x0000000000000000 in ?? ()

BR,
Konstantin Baydarov

-----Original Message-----
From: Tom Doust [mailto:tom.doust@...]
Sent: Tuesday, June 20, 2017 8:40 PM
To: Konstantin Baydarov <konstantin.baydarov@...>; Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Yes we call back on the bridge’s thread. We don’t queue the message, we leave the client code to do that if it wants to. This is by design.


On 20/06/2017, 18:18, "Konstantin Baydarov" <konstantin.baydarov@...> wrote:

Classification: Public

Hi, Tom.

I noticed that, compared to the qpid bridge (which comes with the OpenMAMA sources), tick42rmds calls mamaSubscription_processMsg() from a separate thread and not from mamaQueue_dispatch(). I wonder whether that is correct. Could it be one of the reasons for the issue we are facing?
Qpid bridge call stack:
#0 mamaSubscription_processMsg (subscription=0x76e150, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2226
#1 0x00007ffff7b4c580 in imageRequestImpl_onInitialMessage (msg=0x7aebb0, closure=0x772710) at mama/c_cpp/src/c/imagerequest.c:225
#2 0x00007ffff648bede in qpidBridgeMamaInboxImpl_onMsg (subscription=0x772900, msg=0x7aebb0, closure=0x7727c0, itemClosure=0x0) at mama/c_cpp/src/c/bridge/qpid/inbox.c:298
#3 0x00007ffff7b76ab4 in mamaSubscription_forwardMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:1422
#4 0x00007ffff7b781ef in mamaSubscription_processMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2315
#5 0x00007ffff6490818 in qpidBridgeMamaTransportImpl_queueCallback (queue=0x60eb50, closure=0x61df80) at mama/c_cpp/src/c/bridge/qpid/transport.c:1083
#6 0x00007ffff7b90a1f in wombatQueue_dispatchInt (queue=0x60ecb0, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319
#7 0x00007ffff7b90aa2 in wombatQueue_timedDispatch (queue=0x60ecb0, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335
#8 0x00007ffff648e720 in qpidBridgeMamaQueue_dispatch (queue=0x60ec40) at mama/c_cpp/src/c/bridge/qpid/queue.c:265
#9 0x00007ffff7b6e1de in mamaQueue_dispatch (queue=0x60eb50) at mama/c_cpp/src/c/queue.c:824
#10 0x00007ffff648a8c3 in qpidBridge_start (defaultEventQueue=0x60eb50) at mama/c_cpp/src/c/bridge/qpid/bridge.c:196
#11 0x00007ffff7b52976 in mama_start (bridgeImpl=0x60e750) at mama/c_cpp/src/c/mama.c:1659
#12 0x0000000000403e61 in buildDataDictionary () at mama/c_cpp/src/examples/c/mamalistenc.c:647
#13 0x000000000040366f in main (argc=9, argv=0x7fffffffd728) at mama/c_cpp/src/examples/c/mamalistenc.c:335

tick42rmds bridge call stack:
#0 0x0000000000000000 in ?? ()
#1 0x00007ffff78cc259 in mamaSubscription_forwardMsg (subscription=0x7fffe9757c50, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:1426
#2 0x00007ffff78a38ec in processPointToPointMessage (callback=0x7fffe9768fb0, msg=0x641d60, msgType=6, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:174
#3 0x00007ffff78a3f37 in listenerMsgCallback_processMsg (callback=0x7fffe9768fb0, msg=0x641d60, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:471
#4 0x00007ffff78cd7e5 in mamaSubscription_processMsg (subscription=0x7fffe939b2e0, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:2260
#5 0x00007ffff51fdaf3 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#6 0x00007ffff5258492 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#7 0x00007ffff5259032 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#8 0x00007ffff525e785 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#9 0x00007ffff522ccc8 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) ()
from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#10 0x00007ffff522d2b0 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff522e62d in UPAConsumer::Run() () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#12 0x00007ffff52146cc in RMDSSubscriber::threadFunc(void*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff5af59bd in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 8:12 PM
To: Tom Doust <tom.doust@...>; Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Tom,

We are using version 1.3.
As far as I can see from the latest GitHub code, the problem still exists. See the RMDSBridgeSubscription::OnMessage method:

if (isShutdown_ || ((0 != source_) && source_->IsPausedUpdates()))
{
    return;
}

// ... Shutdown() may be called here
// And then MAMA can start destroying subscription_'s fields
try
{
    status = mamaSubscription_processMsg(subscription_, msg); // This function will be examining subscription_'s fields being destroyed
}
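The window described in the comments above can be reproduced deterministically in a single thread. The following is a minimal sketch in C, not the bridge's code: `fake_subscription`, `on_message`, and `shutdown_and_cleanup` are hypothetical names, and the "other thread" is modelled as a hook that runs between the check and the call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bridge subscription; not the tick42rmds type. */
typedef struct {
    int is_shutdown;     /* set by Shutdown() on the destroying thread */
    int resources_valid; /* cleared once cleanup has started freeing fields */
    int stale_calls;     /* processing attempts made after the fields were freed */
} fake_subscription;

/* Models what the other thread does inside the window. */
typedef void (*interleave_hook)(fake_subscription *);

void shutdown_and_cleanup(fake_subscription *s)
{
    s->is_shutdown = 1;
    s->resources_valid = 0;
}

/* Same shape as the snippet above: check the flag first, process afterwards. */
void on_message(fake_subscription *s, interleave_hook window)
{
    assert(s != NULL);
    if (s->is_shutdown) {
        return; /* do not dispatch */
    }
    if (window != NULL) {
        window(s); /* the other thread runs here, after the check has passed */
    }
    if (!s->resources_valid) {
        s->stale_calls++; /* in the real code this would be the use-after-free */
    }
}
```

Calling `on_message(&s, shutdown_and_cleanup)` on a live subscription passes the flag check and still reaches the processing step with freed resources, which is the check-then-act problem the snippet points at.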


-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Tom Doust
Sent: Tuesday, June 20, 2017 4:09 PM
To: Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury, Konstantin

Are you using the current github version of the bridge code? We looked at and fixed some of the issues around locking the subscription destroy some time back.

It would be good to know if we have missed something.

Best regards

Tom

-----Original Message-----
From: openmama-users-bounces@... [mailto:openmama-users-bounces@...] On Behalf Of Konstantin Baydarov
Sent: 20 June 2017 11:32
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility while debugging the tick42rmds bridge crash on unsubscribe, and I would be interested in knowing the solution as well.

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your questions in random order :)
2. It looks like designed behavior of the RMDS bridge: callbacks are invoked on a thread that services transport events from the server. One thread is created per mamaTransport.
1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it is processing is the initial message from the server.
3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called. Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? The contract for this function would then be:
- The call should be synchronous, and no events should be processed after it has returned (as before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)
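One way to picture the contract above is a small per-subscription gate built on a mutex and a condition variable. This is only a sketch under assumptions: `dispatch_gate` and the `gate_*` functions are hypothetical and exist in neither OpenMAMA nor the bridge, and the sketch presumes the mute call is made while the caller holds no lock that the in-flight callback still needs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical gate; one instance per subscription. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  idle;      /* signalled when in_flight drops to zero */
    int             muted;
    int             in_flight; /* callbacks currently inside processMsg */
} dispatch_gate;

void gate_init(dispatch_gate *g)
{
    assert(g != NULL);
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->idle, NULL);
    g->muted = 0;
    g->in_flight = 0;
}

/* Dispatch thread: returns 1 if the message may be delivered, 0 if muted.
 * The muted check and the in_flight increment share one critical section,
 * so a mute cannot slip in between them. */
int gate_enter(dispatch_gate *g)
{
    int allowed;
    pthread_mutex_lock(&g->lock);
    allowed = !g->muted;
    if (allowed) {
        g->in_flight++;
    }
    pthread_mutex_unlock(&g->lock);
    return allowed;
}

/* Dispatch thread: call once the callback has returned. */
void gate_leave(dispatch_gate *g)
{
    pthread_mutex_lock(&g->lock);
    if (--g->in_flight == 0) {
        pthread_cond_broadcast(&g->idle);
    }
    pthread_mutex_unlock(&g->lock);
}

/* Destroying thread: returns only when no callback is in flight.
 * May be called repeatedly and from several threads. */
void gate_mute(dispatch_gate *g)
{
    pthread_mutex_lock(&g->lock);
    g->muted = 1;
    while (g->in_flight > 0) {
        pthread_cond_wait(&g->idle, &g->lock);
    }
    pthread_mutex_unlock(&g->lock);
}
```

In this sketch the dispatch thread would wrap each delivery as `if (gate_enter(&g)) { /* process message */ gate_leave(&g); }`. If `gate_mute` were called while holding a lock that the callback also takes, it would wait forever, which is why the contract asks for mute to be invoked outside the subscription locks.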

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query, I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked prior to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?
2. Should the thread which processes events from RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be to match the application's expected concurrency behaviour.
3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff and you should avoid it where possible to avoid race conditions.
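Suggestion 3 can be sketched as a queue in which the mute request is just another event, so it is ordered with the messages and handled by the same dispatcher. The code below is an illustrative single-threaded model in C: `sub_queue`, `queue_push`, and `queue_dispatch_all` are hypothetical names, not OpenMAMA queue calls, and a real queue fed from another thread would need its own synchronization.

```c
#include <assert.h>
#include <stddef.h>

enum event_kind { EVENT_MESSAGE, EVENT_MUTE };

#define QUEUE_CAPACITY 16

/* Hypothetical FIFO consumed by a single dispatcher thread. */
typedef struct {
    enum event_kind events[QUEUE_CAPACITY];
    size_t head;      /* index of the next event to dispatch */
    size_t tail;      /* index of the next free slot */
    int    muted;
    int    delivered; /* messages handed to the application */
    int    dropped;   /* messages that arrived behind a mute event */
} sub_queue;

void queue_init(sub_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->tail = 0;
    q->muted = 0;
    q->delivered = 0;
    q->dropped = 0;
}

/* Returns 1 on success, 0 if the queue is full. */
int queue_push(sub_queue *q, enum event_kind kind)
{
    if (q->tail - q->head == QUEUE_CAPACITY) {
        return 0;
    }
    q->events[q->tail % QUEUE_CAPACITY] = kind;
    q->tail++;
    return 1;
}

/* Dispatcher: events are handled strictly in arrival order, so the mute
 * takes effect between two messages and never in the middle of one. */
void queue_dispatch_all(sub_queue *q)
{
    while (q->head != q->tail) {
        enum event_kind kind = q->events[q->head % QUEUE_CAPACITY];
        q->head++;
        if (kind == EVENT_MUTE) {
            q->muted = 1;
        } else if (q->muted) {
            q->dropped++;
        } else {
            q->delivered++;
        }
    }
}
```

Pushing a message, a mute, and another message, then dispatching, delivers the first message and drops the second. The trade-off is that the mute is no longer synchronous for the caller: it only takes effect when the dispatcher reaches it.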

Cheers,
Frank



-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've faced the following issue when using OpenMAMA with the tick42rmds bridge.
The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously from another thread). To avoid corrupting the mamaSubscription object, mamaSubscription_destroy() calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates the mamaSubscription. The problem with this approach is shown by the following pseudocode for the RMDS dispatching thread:


if (muted) {
    // Do not dispatch
    return;
}

// Do some other checks <-- mute() may be invoked here
mamaSubscription_processMsg() // processMsg for muted subscription, may crash


One solution is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but that introduces another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on the wombatThrottle. Consider the following scenario:
1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial.
2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:
    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport,
                                                 MAMA_THROTTLE_DEFAULT);

    if (NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }
3. mamaSubscription_destroy then calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish processing the message:
    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }
4. The RMDS bridge thread handles the initial message and tries to acquire the same throttle lock:
#5 0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
#6 0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
#7 0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
#8 0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
#9 0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
#10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?
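The four-step scenario above can be modelled as a wait-for graph between the two threads: a deadlock exists exactly when the graph contains a cycle. The sketch below is a simplified model in C, not a reproduction of the MAMA code; `has_deadlock` and `destroy_deadlocks` are hypothetical helpers, and the throttle lock is reduced to "who owns it".

```c
#include <assert.h>
#include <stddef.h>

enum { USER_THREAD, BRIDGE_THREAD, THREAD_COUNT };
#define NOBODY (-1)

/* waits_for[t] is the thread that t is blocked on, or NOBODY.
 * Returns 1 if following the edges from any thread leads back to it. */
int has_deadlock(const int waits_for[THREAD_COUNT])
{
    assert(waits_for != NULL);
    for (int start = 0; start < THREAD_COUNT; start++) {
        int t = waits_for[start];
        for (int steps = 0; t != NOBODY && steps < THREAD_COUNT; steps++) {
            if (t == start) {
                return 1;
            }
            t = waits_for[t];
        }
    }
    return 0;
}

/* Builds the wait-for edges of the destroy path and checks them. */
int destroy_deadlocks(int mute_called_under_throttle_lock)
{
    /* Step 2: the user thread owns the throttle lock only in the current ordering. */
    int throttle_owner = mute_called_under_throttle_lock ? USER_THREAD : NOBODY;
    int waits_for[THREAD_COUNT];

    /* Step 3: the blocking mute() waits for the bridge thread to leave processMsg. */
    waits_for[USER_THREAD] = BRIDGE_THREAD;
    /* Step 4: the bridge thread waits for whoever owns the throttle lock. */
    waits_for[BRIDGE_THREAD] = throttle_owner;

    return has_deadlock(waits_for);
}
```

With the ordering described in steps 2 and 3, `destroy_deadlocks(1)` reports a cycle. If the blocking mute were invoked without the throttle lock held, the bridge thread would have nothing to wait on and the cycle would disappear in this model.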


---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to this message and deleting it from your computer. Thank you. Vela Trading Technologies LLC


_______________________________________________
Openmama-users mailing list
Openmama-users@...
https://lists.openmama.org/mailman/listinfo/openmama-users
_______________________________________________
Openmama-dev mailing list
Openmama-dev@...
https://lists.openmama.org/mailman/listinfo/openmama-dev

