Re: Crash in subscription destroy: error in mamaSubscription_getSubjectContext
Frank Quinn <fquinn@...>
Thanks Yury,
Raised https://github.com/OpenMAMA/OpenMAMA/issues/300 so we don't forget to get this sorted. Cheers, Frank
-----Original Message-----
From: Yury Batrakov [mailto:yury.batrakov@...]
Sent: 26 June 2017 09:31
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Frank Quinn <fquinn@...>
Subject: Crash in subscription destroy: error in mamaSubscription_getSubjectContext

Classification: Public

Hi guys,

There is a copy-paste error in the mamaSubscription_getSubjectContext function that leads to a crash on subscription destroy when the issue symbol differs from the symbol for the subscription.

The problem is in this line: mama/c_cpp/src/c/subscription.c:1242 (link to GitHub: https://github.com/OpenMAMA/OpenMAMA/blob/master/mama/c_cpp/src/c/subscription.c#L1242):

    entBridge->createSubscription (entBridge, &(self->mSubjectContext));

The last argument to createSubscription should be context, not &(self->mSubjectContext). Could you please make the appropriate change in the code?

---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

The information contained in this message may be privileged and confidential and protected from disclosure. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to this message and deleting it from your computer. Thank you. Vela Trading Technologies LLC
|
|
Code change(s) just landed on origin/next (Successful)
jenkins@...
Some changes have just been added to the origin/next branch!
[fquinn.ni] [MAMA] Add comments to the DQ Publisher Manager headers (#299) mama/c_cpp/src/c/mama/dqpublishermanager.h
Results for OpenMAMA_Snapshot_Linux CI run with latest changes:
You may also check CI console output to view the full results.
|
|
Code change(s) just landed on origin/next (Still Failing)
jenkins@...
Some changes have just been added to the origin/next branch!
No changes
Results for OpenMAMA_Snapshot_Windows CI run with latest changes:
You may also check CI console output to view the full results.
|
|
Crash in subscription destroy: error in mamaSubscription_getSubjectContext
Yury Batrakov
Classification: Public
Hi guys,

There is a copy-paste error in the mamaSubscription_getSubjectContext function that leads to a crash on subscription destroy when the issue symbol differs from the symbol for the subscription.

The problem is in this line: mama/c_cpp/src/c/subscription.c:1242 (link to GitHub: https://github.com/OpenMAMA/OpenMAMA/blob/master/mama/c_cpp/src/c/subscription.c#L1242):

    entBridge->createSubscription (entBridge, &(self->mSubjectContext));

The last argument to createSubscription should be context, not &(self->mSubjectContext). Could you please make the appropriate change in the code?
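To make the shape of this slip concrete, here is a minimal, self-contained C sketch. It does not reproduce the OpenMAMA source: Subscription, SubjectContext, mock_createSubscription and the entHandle field are hypothetical stand-ins chosen for illustration. Only the pattern is taken from the report, namely that a context is looked up for the issue symbol but the subscription's embedded context is the one handed to the create hook.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the OpenMAMA types. */
typedef struct
{
    void* entHandle;   /* state that a bridge-style hook is expected to fill in */
} SubjectContext;

typedef struct
{
    SubjectContext mSubjectContext;   /* context for the subscribed symbol */
} Subscription;

static int gDummyState;   /* stands in for whatever the hook would allocate */

/* Mock of a create hook: initialises exactly the context it is handed. */
static void mock_createSubscription (SubjectContext* ctx)
{
    assert (ctx != NULL);
    ctx->entHandle = &gDummyState;
}

/* Buggy shape: a context for a different issue symbol was looked up,
 * but the subscription's own embedded context is passed instead. */
static void initContext_buggy (Subscription* self, SubjectContext* context)
{
    (void) context;   /* never initialised, so its entHandle stays NULL */
    mock_createSubscription (&(self->mSubjectContext));
}

/* Fixed shape: pass the context that was actually looked up. */
static void initContext_fixed (Subscription* self, SubjectContext* context)
{
    (void) self;
    mock_createSubscription (context);
}
```

With the buggy variant the per-symbol context is left without its handle, so any later teardown that assumes the handle is valid would be working on uninitialised state. Whether that is the exact crash path in OpenMAMA is not something this sketch can confirm.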
|
|
Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Konstantin Baydarov
Classification: Public
Hi, Tom.

As Frank suggested, I switched our app from destroy() to destroyEx() (MamaSubscription), but I still get a segmentation fault. I think the reason for the segfault is that while RMDSBridgeSubscription::OnMessage() calls mamaSubscription_processMsg() in the tick42rmds thread, the subscription is destroyed in mamaQueue_dispatch() by the mama_start() thread. The problem is that the tick42rmds processing thread, which calls RMDSBridgeSubscription::OnMessage(), is not synchronized with the mama_start() thread, so the mama_start() thread can destroy the MAMA subscription at any time, including while mamaSubscription_processMsg() is running.

Here are the stacks of both threads, which operate on the same subscription concurrently without any synchronization:

1) mama_start() thread that destroys the subscription:

#0 mamaSubscription_destroy (subscription=0x0) at mama/c_cpp/src/c/subscription.c:2929
#1 0x00007ffff51f259a in tick42rmdsBridgeMamaInbox_destroy () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#2 0x00007ffff78b836b in mamaInbox_destroy (inbox=0x1065f10) at mama/c_cpp/src/c/inbox.c:249
#3 0x00007ffff78a2aec in imageRequest_destroy (request=0x104a300) at mama/c_cpp/src/c/imagerequest.c:425
#4 0x00007ffff78a1425 in dqContext_cleanup (ctx=0x7fffec3add10) at mama/c_cpp/src/c/dqstrategy.c:621
#5 0x00007ffff78cc4d3 in mamaSubscription_cleanup (subscription=0x7fffec3adae0) at mama/c_cpp/src/c/subscription.c:1510
#6 0x00007ffff78ce5df in mamaSubscription_destroy (subscription=0x7fffec3adae0) at mama/c_cpp/src/c/subscription.c:2948
#7 0x00007ffff78cc7b6 in mamaSubscription_DestroyThroughQueueCB (Queue=0x61e340, closure=0x7fffec3adae0) at mama/c_cpp/src/c/subscription.c:1604
#8 0x00007ffff78e634f in wombatQueue_dispatchInt (queue=0x61e430, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319
#9 0x00007ffff78e63d2 in wombatQueue_timedDispatch (queue=0x61e430, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335
#10 0x00007ffff51fa6d9 in tick42rmdsBridgeMamaQueue_dispatch () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff78c3a8b in mamaQueue_dispatch (queue=0x61e340) at mama/c_cpp/src/c/queue.c:819
#12 0x00007ffff51f20a8 in tick42rmdsBridge_start () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff78a8810 in mama_start (bridgeImpl=0x61d660) at mama/c_cpp/src/c/mama.c:1591
#14 0x00007ffff7b8ee53 in Wombat::Mama::start (bridgeImpl=0x61d660) at mama/c_cpp/src/cpp/mamacpp.cpp:198
#15 0x000000000040c53a in MamaListen::start (this=0x7fffffffd690) at mamalistencpp_mt.cpp:1072
#16 0x000000000040ebaf in main (argc=9, argv=0x7fffffffd948) at mamalistencpp_mt.cpp:1915

2) tick42rmds thread at the moment of the segfault:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff38ff700 (LWP 25476)]
0x00007ffff78a3508 in imageRequest_stopWaitForResponse (request=0x0) at mama/c_cpp/src/c/imagerequest.c:772
772         mamaSubscription_getTransport ( impl->mSubscription, &tport );
(gdb) bt
#0 0x00007ffff78a3508 in imageRequest_stopWaitForResponse (request=0x0) at mama/c_cpp/src/c/imagerequest.c:772
#1 0x00007ffff78cbe53 in mamaSubscription_stopWaitForResponse (subscription=0x7fffe0587200, ctx=0x7fffe0587420) at mama/c_cpp/src/c/subscription.c:1264
#2 0x00007ffff78a393e in processPointToPointMessage (callback=0x7fffe0587a00, msg=0x63eda0, msgType=6, ctx=0x7fffe0587420) at mama/c_cpp/src/c/listenermsgcallback.c:171
#3 0x00007ffff78a3fdc in listenerMsgCallback_processMsg (callback=0x7fffe0587a00, msg=0x63eda0, ctx=0x7fffe0587420) at mama/c_cpp/src/c/listenermsgcallback.c:482
#4 0x00007ffff78cd9a7 in _mamaSubscription_processMsg (subscription=0x7fffe0587200, msg=0x63eda0) at mama/c_cpp/src/c/subscription.c:2322
#5 0x00007ffff78cd7ff in mamaSubscription_processMsg (subscription=0x7fffe0587200, msg=0x63eda0) at mama/c_cpp/src/c/subscription.c:2272
#6 0x00007ffff51fae33 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#7 0x00007ffff5257632 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#8 0x00007ffff525bad2 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#9 0x00007ffff52611d5 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#10 0x00007ffff522a268 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff522a850 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#12 0x00007ffff522bb1d in UPAConsumer::Run() () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff5211b1c in RMDSSubscriber::threadFunc(void*) () from /export/home/dnauat/komodo/UAT/UK/kb/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#14 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ffff5af59bd in clone () from /lib64/libc.so.6
#16 0x0000000000000000 in ?? ()

BR, Konstantin Baydarov
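The unsynchronised check-then-dispatch window described above can be closed, in principle, by doing the "is it muted?" check and the callback under one lock that mute also takes. The following is a minimal sketch of that idea only. BridgeSub, bridgeSub_dispatch, bridgeSub_mute and deliver are hypothetical names, not part of the tick42rmds bridge or of OpenMAMA, and deliver merely stands in for the call into mamaSubscription_processMsg().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical bridge-side subscription state; not the tick42rmds classes. */
typedef struct
{
    pthread_mutex_t lock;        /* serialises dispatch against mute */
    bool            muted;
    int             delivered;   /* counts callbacks, for demonstration only */
} BridgeSub;

static void bridgeSub_init (BridgeSub* sub)
{
    pthread_mutex_init (&sub->lock, NULL);
    sub->muted     = false;
    sub->delivered = 0;
}

/* Stand-in for the call into mamaSubscription_processMsg(). */
static void deliver (BridgeSub* sub)
{
    sub->delivered++;
}

/* Called on the bridge thread. The check and the callback happen under
 * the same lock, so mute cannot slip in between them. */
static bool bridgeSub_dispatch (BridgeSub* sub)
{
    bool dispatched = false;
    pthread_mutex_lock (&sub->lock);
    if (!sub->muted)
    {
        deliver (sub);
        dispatched = true;
    }
    pthread_mutex_unlock (&sub->lock);
    return dispatched;
}

/* Called on the destroying thread. Blocks while a dispatch is in flight;
 * once it returns, no further callback will start. */
static void bridgeSub_mute (BridgeSub* sub)
{
    pthread_mutex_lock (&sub->lock);
    sub->muted = true;
    pthread_mutex_unlock (&sub->lock);
}
```

The trade-off is that the lock is held across the callback. If the destroying thread already holds another lock that the callback needs, a blocking mute of this kind can deadlock, which is the concern raised in the original report in this thread.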
-----Original Message-----
From: Tom Doust [mailto:tom.doust@...]
Sent: Tuesday, June 20, 2017 8:40 PM
To: Konstantin Baydarov <konstantin.baydarov@...>; Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Yes we call back on the bridge’s thread. We don’t queue the message, we leave the client code to do that if it wants to. This is by design.

On 20/06/2017, 18:18, "Konstantin Baydarov" <konstantin.baydarov@...> wrote:

Classification: Public

Hi, Tom.

I noticed that, compared to the qpid bridge (which comes with the OpenMAMA sources), tick42rmds calls mamaSubscription_processMsg() from a separate thread and not from mamaQueue_dispatch(). I am wondering if that is correct. Is it perhaps one of the reasons for the issue we are facing?

Qpid bridge call stack:

#0 mamaSubscription_processMsg (subscription=0x76e150, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2226
#1 0x00007ffff7b4c580 in imageRequestImpl_onInitialMessage (msg=0x7aebb0, closure=0x772710) at mama/c_cpp/src/c/imagerequest.c:225
#2 0x00007ffff648bede in qpidBridgeMamaInboxImpl_onMsg (subscription=0x772900, msg=0x7aebb0, closure=0x7727c0, itemClosure=0x0) at mama/c_cpp/src/c/bridge/qpid/inbox.c:298
#3 0x00007ffff7b76ab4 in mamaSubscription_forwardMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:1422
#4 0x00007ffff7b781ef in mamaSubscription_processMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2315
#5 0x00007ffff6490818 in qpidBridgeMamaTransportImpl_queueCallback (queue=0x60eb50, closure=0x61df80) at mama/c_cpp/src/c/bridge/qpid/transport.c:1083
#6 0x00007ffff7b90a1f in wombatQueue_dispatchInt (queue=0x60ecb0, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319
#7 0x00007ffff7b90aa2 in wombatQueue_timedDispatch (queue=0x60ecb0, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335
#8 0x00007ffff648e720 in qpidBridgeMamaQueue_dispatch (queue=0x60ec40) at mama/c_cpp/src/c/bridge/qpid/queue.c:265
#9 0x00007ffff7b6e1de in mamaQueue_dispatch (queue=0x60eb50) at mama/c_cpp/src/c/queue.c:824
#10 0x00007ffff648a8c3 in qpidBridge_start (defaultEventQueue=0x60eb50) at mama/c_cpp/src/c/bridge/qpid/bridge.c:196
#11 0x00007ffff7b52976 in mama_start (bridgeImpl=0x60e750) at mama/c_cpp/src/c/mama.c:1659
#12 0x0000000000403e61 in buildDataDictionary () at mama/c_cpp/src/examples/c/mamalistenc.c:647
#13 0x000000000040366f in main (argc=9, argv=0x7fffffffd728) at mama/c_cpp/src/examples/c/mamalistenc.c:335

tick42rmds bridge call stack:

#0 0x0000000000000000 in ?? ()
#1 0x00007ffff78cc259 in mamaSubscription_forwardMsg (subscription=0x7fffe9757c50, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:1426
#2 0x00007ffff78a38ec in processPointToPointMessage (callback=0x7fffe9768fb0, msg=0x641d60, msgType=6, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:174
#3 0x00007ffff78a3f37 in listenerMsgCallback_processMsg (callback=0x7fffe9768fb0, msg=0x641d60, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:471
#4 0x00007ffff78cd7e5 in mamaSubscription_processMsg (subscription=0x7fffe939b2e0, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:2260
#5 0x00007ffff51fdaf3 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#6 0x00007ffff5258492 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#7 0x00007ffff5259032 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#8 0x00007ffff525e785 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#9 0x00007ffff522ccc8 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#10 0x00007ffff522d2b0 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff522e62d in UPAConsumer::Run() () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#12 0x00007ffff52146cc in RMDSSubscriber::threadFunc(void*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff5af59bd in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

BR, Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 8:12 PM
To: Tom Doust <tom.doust@...>; Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Tom,

We are using version 1.3. As far as I can see from the latest GitHub code, the problem still exists. See the RMDSBridgeSubscription::OnMessage method:

    if (isShutdown_ || ((0 != source_) && source_->IsPausedUpdates()))
    {
        return;
    }
    // ... Shutdown() may be called here
    // And then MAMA can start destroying subscription_'s fields
    try
    {
        status = mamaSubscription_processMsg(subscription_, msg);
        // This function will be examining subscription_'s fields being destroyed
    }

-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Tom Doust
Sent: Tuesday, June 20, 2017 4:09 PM
To: Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury, Konstantin

Are you using the current github version of the bridge code? We looked at and fixed some of the issues around locking the subscription destroy some time back. It would be good to know if we have missed something.

Best regards
Tom

-----Original Message-----
From: openmama-users-bounces@... [mailto:openmama-users-bounces@...] On Behalf Of Konstantin Baydarov
Sent: 20 June 2017 11:32
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility while debugging the tick42rmds bridge crash on unsubscribe, and I would be interested in knowing the solution as well.

BR, Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your questions in random order :)

2. It looks like a designed behavior of the RMDS bridge: callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.

1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it processes is an initial message from a server.

3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called.

Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? Thus the contract for this function would be:
- This call should be synchronous and no events should be processed after it has returned (like before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query, I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked prior to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?

2. Should the thread which processes events from the RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be, to match the application's expected concurrency behaviour.

3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff and you should avoid it where possible to avoid race conditions.

Cheers,
Frank

-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've faced the following issue when using OpenMAMA and the tick42rmds bridge. The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously in another thread). To avoid corruption of the mamaSubscription object, the mamaSubscription_destroy() function calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates the mamaSubscription.

The problem with this approach is the following. Here is the pseudo code for the RMDS dispatching thread:

    if (muted)
    {
        // Do not dispatch
        return;
    }
    // Do some other checks  <-- mute() may be invoked here
    mamaSubscription_processMsg()  // processMsg for muted subscription, may crash

The solution for that is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but there's another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:

1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial.

2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:

    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport, MAMA_THROTTLE_DEFAULT);
    if(NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }

3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish message processing:

    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }

4. The RMDS bridge handles the initial message and tries to acquire the same throttle:

#5 0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
#6 0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
#7 0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
#8 0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
#9 0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
#10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?

_______________________________________________
Openmama-users mailing list
Openmama-users@...
https://lists.openmama.org/mailman/listinfo/openmama-users
_______________________________________________
Openmama-dev mailing list
Openmama-dev@...
https://lists.openmama.org/mailman/listinfo/openmama-dev
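The deadlock scenario in the original report quoted above (steps 1 to 4) comes down to lock ordering: the destroying thread holds the throttle lock while waiting in a blocking mute, and the in-flight message handler needs that same lock to finish. The sketch below is a single-threaded simulation of that ordering problem, not OpenMAMA code. DemoLock, mock_blockingMute, destroy_lockThenMute and destroy_muteThenLock are hypothetical names, and a try-acquire is used so the demo reports "would block" instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of a lock; it only records whether
 * the lock is currently held so the ordering problem can be shown safely. */
typedef struct
{
    bool held;
} DemoLock;

static bool demoLock_tryAcquire (DemoLock* l)
{
    if (l->held)
    {
        return false;
    }
    l->held = true;
    return true;
}

static void demoLock_release (DemoLock* l)
{
    l->held = false;
}

/* Models a blocking mute: it can only complete once the in-flight message
 * handler has taken and released the throttle lock. Returns false where
 * real code would wait forever. */
static bool mock_blockingMute (DemoLock* throttle)
{
    if (!demoLock_tryAcquire (throttle))
    {
        return false;   /* handler can never finish: deadlock in real code */
    }
    demoLock_release (throttle);
    return true;
}

/* Order from the report: take the throttle lock, then call mute. */
static bool destroy_lockThenMute (DemoLock* throttle)
{
    bool ok;
    demoLock_tryAcquire (throttle);
    ok = mock_blockingMute (throttle);
    demoLock_release (throttle);
    return ok;
}

/* Alternative order asked about in the thread: mute outside the lock,
 * then take the throttle lock for the remaining teardown. */
static bool destroy_muteThenLock (DemoLock* throttle)
{
    bool ok = mock_blockingMute (throttle);
    demoLock_tryAcquire (throttle);
    /* ... teardown that needs the throttle would go here ... */
    demoLock_release (throttle);
    return ok;
}
```

In this model the first ordering cannot complete and the second can. Whether MAMA can safely call the bridge's mute outside its own locks is the open question in the thread; the sketch does not answer it.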
|
|
Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Tom Doust
Yes we call back on the bridge’s thread. We don’t queue the message, we leave the client code to do that if it wants to. This is by design.
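If client code does choose to queue, one possible shape is a small FIFO that the bridge-thread callback pushes into and a single application thread drains, with teardown requests travelling through the same queue so they are ordered relative to messages. This is a minimal sketch under that assumption. EventQueue, ClientSub, queue_push, queue_pop and queue_drain are hypothetical names and do not correspond to any OpenMAMA or tick42rmds API. A real client would also carry a copy of each message in the event rather than just a tag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EV_MSG, EV_DESTROY } EventType;

#define QUEUE_CAP 16

/* Hypothetical bounded FIFO shared between producer threads and one consumer. */
typedef struct
{
    pthread_mutex_t lock;
    EventType       items[QUEUE_CAP];
    size_t          head;
    size_t          count;
} EventQueue;

/* Hypothetical application-side view of a subscription. */
typedef struct
{
    bool destroyed;
    int  processed;
} ClientSub;

static void queue_init (EventQueue* q)
{
    pthread_mutex_init (&q->lock, NULL);
    q->head  = 0;
    q->count = 0;
}

/* Producer side: called from the bridge callback (EV_MSG) or from the
 * thread that wants to tear down (EV_DESTROY). Returns false when full. */
static bool queue_push (EventQueue* q, EventType ev)
{
    bool ok = false;
    pthread_mutex_lock (&q->lock);
    if (q->count < QUEUE_CAP)
    {
        q->items[(q->head + q->count) % QUEUE_CAP] = ev;
        q->count++;
        ok = true;
    }
    pthread_mutex_unlock (&q->lock);
    return ok;
}

/* Removes the oldest event, if any. */
static bool queue_pop (EventQueue* q, EventType* ev)
{
    bool ok = false;
    pthread_mutex_lock (&q->lock);
    if (q->count > 0)
    {
        *ev     = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        ok = true;
    }
    pthread_mutex_unlock (&q->lock);
    return ok;
}

/* Consumer side: intended to run on one thread only, so message handling
 * and teardown never overlap. */
static void queue_drain (EventQueue* q, ClientSub* sub)
{
    EventType ev;
    while (queue_pop (q, &ev))
    {
        if (ev == EV_DESTROY)
        {
            sub->destroyed = true;   /* release application resources here */
        }
        else if (!sub->destroyed)
        {
            sub->processed++;        /* handle the message */
        }
        /* messages queued behind a destroy request are dropped */
    }
}
```

This only serialises the application's own handling. It does not protect anything the library touches on the bridge thread before the client callback is invoked, so it should not be read as a fix for the crash discussed in this thread.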
On 20/06/2017, 18:18, "Konstantin Baydarov" <konstantin.baydarov@...> wrote:
Classification: Public Hi, Tom. I noticed, that comparing to qpid bridge(that comes with openmama sources), tick42rmds calls mamaSubscription_processMsg() method from separate thread and not from mamaQueue_dispatch(), wondering if it's correct. Probably it's one of the reasons of the issue that we facing? Qpid bridge call stack: #0 mamaSubscription_processMsg (subscription=0x76e150, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2226 #1 0x00007ffff7b4c580 in imageRequestImpl_onInitialMessage (msg=0x7aebb0, closure=0x772710) at mama/c_cpp/src/c/imagerequest.c:225 #2 0x00007ffff648bede in qpidBridgeMamaInboxImpl_onMsg (subscription=0x772900, msg=0x7aebb0, closure=0x7727c0, itemClosure=0x0) at mama/c_cpp/src/c/bridge/qpid/inbox.c:298 #3 0x00007ffff7b76ab4 in mamaSubscription_forwardMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:1422 #4 0x00007ffff7b781ef in mamaSubscription_processMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2315 #5 0x00007ffff6490818 in qpidBridgeMamaTransportImpl_queueCallback (queue=0x60eb50, closure=0x61df80) at mama/c_cpp/src/c/bridge/qpid/transport.c:1083 #6 0x00007ffff7b90a1f in wombatQueue_dispatchInt (queue=0x60ecb0, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319 #7 0x00007ffff7b90aa2 in wombatQueue_timedDispatch (queue=0x60ecb0, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335 #8 0x00007ffff648e720 in qpidBridgeMamaQueue_dispatch (queue=0x60ec40) at mama/c_cpp/src/c/bridge/qpid/queue.c:265 #9 0x00007ffff7b6e1de in mamaQueue_dispatch (queue=0x60eb50) at mama/c_cpp/src/c/queue.c:824 #10 0x00007ffff648a8c3 in qpidBridge_start (defaultEventQueue=0x60eb50) at mama/c_cpp/src/c/bridge/qpid/bridge.c:196 #11 0x00007ffff7b52976 in mama_start (bridgeImpl=0x60e750) at mama/c_cpp/src/c/mama.c:1659 #12 0x0000000000403e61 in buildDataDictionary () at mama/c_cpp/src/examples/c/mamalistenc.c:647 #13 0x000000000040366f in main (argc=9, 
argv=0x7fffffffd728) at mama/c_cpp/src/examples/c/mamalistenc.c:335 tick42rmds bridge call stack: #0 0x0000000000000000 in ?? () #1 0x00007ffff78cc259 in mamaSubscription_forwardMsg (subscription=0x7fffe9757c50, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:1426 #2 0x00007ffff78a38ec in processPointToPointMessage (callback=0x7fffe9768fb0, msg=0x641d60, msgType=6, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:174 #3 0x00007ffff78a3f37 in listenerMsgCallback_processMsg (callback=0x7fffe9768fb0, msg=0x641d60, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:471 #4 0x00007ffff78cd7e5 in mamaSubscription_processMsg (subscription=0x7fffe939b2e0, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:2260 #5 0x00007ffff51fdaf3 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #6 0x00007ffff5258492 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #7 0x00007ffff5259032 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #8 0x00007ffff525e785 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #9 0x00007ffff522ccc8 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #10 0x00007ffff522d2b0 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #11 0x00007ffff522e62d in UPAConsumer::Run() () from 
/home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #12 0x00007ffff52146cc in RMDSSubscriber::threadFunc(void*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so #13 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0 #14 0x00007ffff5af59bd in clone () from /lib64/libc.so.6 #15 0x0000000000000000 in ?? () BR, Konstantin Baydarov -----Original Message----- From: Yury Batrakov Sent: Tuesday, June 20, 2017 8:12 PM To: Tom Doust <tom.doust@...>; Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...> Subject: RE: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport. Classification: Public Hi Tom, We are using version 1.3. As I see from latest github code the problem still exists. See RMDSBridgeSubscription::OnMessage method: if (isShutdown_ || ((0 != source_) && source_->IsPausedUpdates())) { return; } // ... Shutdown() may be called here // And then MAMA can start destroying subscription_'s fields try { status = mamaSubscription_processMsg(subscription_, msg); // This function will be examining subscription_'s fields being destroyed } -----Original Message----- From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Tom Doust Sent: Tuesday, June 20, 2017 4:09 PM To: Konstantin Baydarov <konstantin.baydarov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...> Subject: Re: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport. Hi Yury, Konstantin Are you using the current github version of the bridge code? We looked at and fixed some of the issues around locking the subscription destroy some time back. It would be good to know if we have missed something. Best regards Tom -----Original Message----- From: openmama-users-bounces@... [mailto:openmama-users-bounces@...] 
On Behalf Of Konstantin Baydarov
Sent: 20 June 2017 11:32
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility while debugging the tick42rmds bridge crash on unsubscribe, and would be interested in knowing the solution as well.

BR, Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your questions in random order :)

2. It looks like designed behavior of the RMDS bridge - callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.

1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it processes is the initial message from a server.

3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called.

Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? Then the contract for this function would be:
- This call should be synchronous and no events should be processed after it has returned (as before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query. I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However, if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked prior to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?

2. Should the thread which processes events from the RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be, to match the application's expected concurrency behaviour.

3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff and you should avoid it where possible to avoid race conditions.

Cheers,
Frank

-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've faced the following issue when using OpenMAMA and the tick42rmds bridge.
The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously in another thread). To avoid corruption of the mamaSubscription object, the mamaSubscription_destroy() function calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates the mamaSubscription.

The problem with this approach is the following. Here is the pseudo code for the RMDS dispatching thread:

    if (muted)
    {
        // Do not dispatch
        return;
    }
    // Do some other checks <-- mute() may be invoked here
    mamaSubscription_processMsg() // processMsg for muted subscription, may crash

The solution for that is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but there's another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:

1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial.

2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:

    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport, MAMA_THROTTLE_DEFAULT);
    if (NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }

3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish message processing:

    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }

4. The RMDS bridge handles the initial message and tries to acquire the same throttle:

    #5 0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
    #6 0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
    #7 0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
    #8 0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
    #9 0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
    #10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?
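The deadlock scenario in the original message can be read as a lock-ordering problem: the user thread waits inside mute() while holding the throttle, and the bridge thread needs that throttle to finish. Below is a minimal sketch of the alternative ordering proposed in the thread (mute first, take the throttle afterwards). Everything in it is a hypothetical stand-in: `Bridge`, `throttle`, `on_message` and `destroy` are illustrative names, not OpenMAMA or tick42rmds code.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a bridge subscription; not an OpenMAMA type.
struct Bridge {
    std::mutex state;                 // guards muted and in_flight
    std::condition_variable idle;
    bool muted = false;
    int in_flight = 0;

    // Dispatch side: returns false if the subscription is already muted.
    bool begin_dispatch() {
        std::lock_guard<std::mutex> g(state);
        if (muted) return false;
        ++in_flight;
        return true;
    }
    void end_dispatch() {
        std::lock_guard<std::mutex> g(state);
        --in_flight;
        idle.notify_all();
    }
    // Blocking mute: returns only when no callback is still running.
    void mute() {
        std::unique_lock<std::mutex> g(state);
        muted = true;
        idle.wait(g, [this] { return in_flight == 0; });
    }
};

std::mutex throttle;  // stands in for the wombatThrottle lock

// Bridge thread: handles one "initial" message, which needs the throttle.
void on_message(Bridge& b, int& processed) {
    if (!b.begin_dispatch()) return;
    {
        std::lock_guard<std::mutex> t(throttle);
        ++processed;
    }
    b.end_dispatch();
}

// User thread: mute BEFORE taking the throttle, so the wait inside mute()
// can never block a callback that still needs the throttle.
void destroy(Bridge& b, bool& cleaned) {
    b.mute();
    std::lock_guard<std::mutex> t(throttle);
    cleaned = true;  // stands in for freeing subscription resources
}

int run_destroy_demo() {
    Bridge b;
    int processed = 0;
    bool cleaned = false;
    std::thread bridge([&] { on_message(b, processed); });
    std::thread user([&] { destroy(b, cleaned); });
    bridge.join();
    user.join();
    assert(cleaned);
    return processed;  // 0 or 1, depending on which thread won the race
}
```

Calling `run_destroy_demo()` terminates whichever thread runs first. Swapping the two statements at the top of `destroy` (throttle first, then `mute()`) would reproduce the hang described in steps 2 to 4.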
Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Konstantin Baydarov
Classification: Public
Hi, Tom.

I noticed that, compared to the qpid bridge (which comes with the OpenMAMA sources), tick42rmds calls mamaSubscription_processMsg() from a separate thread and not from mamaQueue_dispatch(). I am wondering whether that is correct. Perhaps it is one of the reasons for the issue that we are facing?

Qpid bridge call stack:

#0 mamaSubscription_processMsg (subscription=0x76e150, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2226
#1 0x00007ffff7b4c580 in imageRequestImpl_onInitialMessage (msg=0x7aebb0, closure=0x772710) at mama/c_cpp/src/c/imagerequest.c:225
#2 0x00007ffff648bede in qpidBridgeMamaInboxImpl_onMsg (subscription=0x772900, msg=0x7aebb0, closure=0x7727c0, itemClosure=0x0) at mama/c_cpp/src/c/bridge/qpid/inbox.c:298
#3 0x00007ffff7b76ab4 in mamaSubscription_forwardMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:1422
#4 0x00007ffff7b781ef in mamaSubscription_processMsg (subscription=0x772900, msg=0x7aebb0) at mama/c_cpp/src/c/subscription.c:2315
#5 0x00007ffff6490818 in qpidBridgeMamaTransportImpl_queueCallback (queue=0x60eb50, closure=0x61df80) at mama/c_cpp/src/c/bridge/qpid/transport.c:1083
#6 0x00007ffff7b90a1f in wombatQueue_dispatchInt (queue=0x60ecb0, data=0x0, closure=0x0, isTimed=1 '\001', timout=500) at common/c_cpp/src/c/queue.c:319
#7 0x00007ffff7b90aa2 in wombatQueue_timedDispatch (queue=0x60ecb0, data=0x0, closure=0x0, timeout=500) at common/c_cpp/src/c/queue.c:335
#8 0x00007ffff648e720 in qpidBridgeMamaQueue_dispatch (queue=0x60ec40) at mama/c_cpp/src/c/bridge/qpid/queue.c:265
#9 0x00007ffff7b6e1de in mamaQueue_dispatch (queue=0x60eb50) at mama/c_cpp/src/c/queue.c:824
#10 0x00007ffff648a8c3 in qpidBridge_start (defaultEventQueue=0x60eb50) at mama/c_cpp/src/c/bridge/qpid/bridge.c:196
#11 0x00007ffff7b52976 in mama_start (bridgeImpl=0x60e750) at mama/c_cpp/src/c/mama.c:1659
#12 0x0000000000403e61 in buildDataDictionary () at mama/c_cpp/src/examples/c/mamalistenc.c:647
#13 0x000000000040366f in main (argc=9, argv=0x7fffffffd728) at mama/c_cpp/src/examples/c/mamalistenc.c:335

tick42rmds bridge call stack:

#0 0x0000000000000000 in ?? ()
#1 0x00007ffff78cc259 in mamaSubscription_forwardMsg (subscription=0x7fffe9757c50, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:1426
#2 0x00007ffff78a38ec in processPointToPointMessage (callback=0x7fffe9768fb0, msg=0x641d60, msgType=6, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:174
#3 0x00007ffff78a3f37 in listenerMsgCallback_processMsg (callback=0x7fffe9768fb0, msg=0x641d60, ctx=0x7fffe939b500) at mama/c_cpp/src/c/listenermsgcallback.c:471
#4 0x00007ffff78cd7e5 in mamaSubscription_processMsg (subscription=0x7fffe939b2e0, msg=0x641d60) at mama/c_cpp/src/c/subscription.c:2260
#5 0x00007ffff51fdaf3 in RMDSBridgeSubscription::OnMessage(mamaMsgImpl_*, mamaMsgType) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#6 0x00007ffff5258492 in UPASubscription::NotifyListenersRefreshMessage(mamaMsgImpl_*, boost::shared_ptr<RMDSBridgeSubscription>, bool) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#7 0x00007ffff5259032 in UPASubscription::InternalProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#8 0x00007ffff525e785 in UPASubscription::ProcessMarketPriceResponse(RsslMsg*, RsslDecIterator*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#9 0x00007ffff522ccc8 in UPAConsumer::ProcessResponse(RsslChannel*, RwfBuffer*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#10 0x00007ffff522d2b0 in UPAConsumer::ReadFromChannel(RsslChannel*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#11 0x00007ffff522e62d in UPAConsumer::Run() () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#12 0x00007ffff52146cc in RMDSSubscriber::threadFunc(void*) () from /home/gedeapp/baydkon/apps/dbmama/dbmama-api-1.7.1462_dev/lib/libmamatick42rmdsimpl.so
#13 0x00007ffff6733806 in start_thread () from /lib64/libpthread.so.0
#14 0x00007ffff5af59bd in clone () from /lib64/libc.so.6
#15 0x0000000000000000 in ?? ()

BR, Konstantin Baydarov
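The difference between the two call stacks is that the qpid bridge reaches mamaSubscription_processMsg() from a queue dispatch loop, while tick42rmds calls it straight from its transport thread. A minimal sketch of the queue-based pattern follows, assuming a single dispatch thread and treating mute as just another queued event (as suggested earlier in the thread). `EventQueue`, `deliver` and `run_queue_demo` are hypothetical names for illustration and are not the MAMA queue API.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical single-consumer event queue; not the MAMA queue API.
class EventQueue {
public:
    void push(std::function<void()> ev) {
        {
            std::lock_guard<std::mutex> g(m_);
            q_.push_back(std::move(ev));
        }
        cv_.notify_one();
    }
    // Runs events in FIFO order until one of them calls stop().
    void dispatch() {
        for (;;) {
            std::function<void()> ev;
            {
                std::unique_lock<std::mutex> g(m_);
                cv_.wait(g, [this] { return !q_.empty(); });
                ev = std::move(q_.front());
                q_.pop_front();
            }
            ev();  // executed outside the queue lock
            if (stopped_) return;
        }
    }
    void stop() { stopped_ = true; }  // call only from a queued event
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> q_;
    bool stopped_ = false;            // touched only on the dispatch thread
};

std::vector<int> run_queue_demo() {
    EventQueue q;
    bool muted = false;               // touched only on the dispatch thread
    std::vector<int> delivered;       // likewise

    std::thread dispatcher([&] { q.dispatch(); });

    // "Transport" side: enqueue instead of invoking the callback inline.
    auto deliver = [&](int id) {
        q.push([&, id] { if (!muted) delivered.push_back(id); });
    };
    deliver(1);
    deliver(2);
    q.push([&] { muted = true; });    // mute is just another queued event
    deliver(3);                       // queued after mute, so it is dropped
    q.push([&] { q.stop(); });

    dispatcher.join();
    return delivered;
}
```

Because the mute flag and the callbacks are only ever touched on the dispatch thread, no lock is needed around the subscription state itself; the only lock is the one protecting the queue.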
Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Yury Batrakov
Classification: Public
Hi Tom,

We are using version 1.3. As far as I can see from the latest GitHub code, the problem still exists. See the RMDSBridgeSubscription::OnMessage method:

    if (isShutdown_ || ((0 != source_) && source_->IsPausedUpdates()))
    {
        return;
    }
    // ... Shutdown() may be called here
    // And then MAMA can start destroying subscription_'s fields
    try
    {
        status = mamaSubscription_processMsg(subscription_, msg); // This function will be examining subscription_'s fields being destroyed
    }
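The window described above is a check-then-act race: the shutdown flag is tested, and only later is the subscription used. One way to close it, sketched below under the assumption that the bridge may hold a per-subscription mutex across both steps, is to put the flag check and the callback in the same critical section that Shutdown() takes. `GuardedSubscription` and its members are hypothetical and only model the idea; they are not the tick42rmds classes.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical model of a bridge subscription; names are illustrative only.
class GuardedSubscription {
public:
    // Called on the transport thread for each incoming message.
    // The flag check and the callback share one critical section, so
    // Shutdown() cannot slip in between them.
    bool OnMessage() {
        std::lock_guard<std::mutex> g(m_);
        if (shutdown_) return false;
        assert(alive_ && "callback ran on a destroyed subscription");
        ++delivered_;     // stands in for the processMsg call
        return true;
    }
    // Called on the user thread. Once it returns, no callback is running
    // and none will start.
    void Shutdown() {
        std::lock_guard<std::mutex> g(m_);
        shutdown_ = true;
        alive_ = false;   // stands in for MAMA freeing subscription fields
    }
    int delivered() {
        std::lock_guard<std::mutex> g(m_);
        return delivered_;
    }
private:
    std::mutex m_;
    bool shutdown_ = false;
    bool alive_ = true;
    int delivered_ = 0;
};

int run_shutdown_demo() {
    GuardedSubscription sub;
    std::thread transport([&] {
        for (int i = 0; i < 1000; ++i) sub.OnMessage();
    });
    sub.Shutdown();       // races with the transport thread on purpose
    transport.join();
    bool accepted_after = sub.OnMessage();
    assert(!accepted_after);
    return sub.delivered();
}
```

The trade-off is the one raised in the original report: a Shutdown() that blocks on this mutex must not be called while the caller holds any lock the callback may need (the throttle, in the reported case), otherwise the race is traded for a deadlock.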
Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Tom Doust
Hi Yury, Konstantin
Are you using the current github version of the bridge code? We looked at and fixed some of the issues around locking the subscription destroy some time back. It would be good to know if we have missed something. Best regards Tom
-----Original Message-----
From: openmama-users-bounces@... [mailto:openmama-users-bounces@...] On Behalf Of Konstantin Baydarov
Sent: 20 June 2017 11:32
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: Re: [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility while debugging the tick42rmds bridge crash on unsubscribe, and would be interested in knowing the solution as well.

BR,
Konstantin Baydarov

-----Original Message-----
From: Yury Batrakov
Sent: Tuesday, June 20, 2017 12:38 PM
To: Frank Quinn <fquinn@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Cc: Konstantin Baydarov <konstantin.baydarov@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi Frank,

Let me answer your questions in random order :)

2. It looks like designed behavior of the RMDS bridge: callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.

1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it is processing is the initial message from the server.

3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called.

Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? The contract for this function would then be:

- The call should be synchronous, and no events should be processed after it has returned (as before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)

-----Original Message-----
From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, June 20, 2017 10:05 AM
To: Yury Batrakov <yury.batrakov@...>; openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: RE: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Hi Yury,

Thanks for the detailed query. I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However, if you're in the dispatcher thread and you just received an object, it's too late this time: mute should only be invoked beforehand, to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?

2. Should the thread which processes events from the RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be, to match the application's expected concurrency behaviour.

3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff, and you should avoid it where possible to avoid race conditions.

Cheers,
Frank

-----Original Message-----
From: openmama-dev-bounces@... [mailto:openmama-dev-bounces@...] On Behalf Of Yury Batrakov
Sent: 19 June 2017 17:42
To: openmama-dev <openmama-dev@...>; openmama-users <openmama-users@...>
Subject: [Openmama-dev] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

Classification: Public

Hi guys,

I've faced the following issue when using OpenMAMA and the tick42rmds bridge.

The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously in another thread). To avoid corruption of the mamaSubscription object, mamaSubscription_destroy() calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates the mamaSubscription.

The problem with this approach is the following. Here's the pseudo code for the RMDS dispatching thread:

    if(muted) {
        // Do not dispatch
        return;
    }
    // Do some other checks <-- mute() may be invoked here
    mamaSubscription_processMsg() // processMsg for muted subscription, may crash

The solution for that is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but there's another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:

1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial.

2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:

    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport, MAMA_THROTTLE_DEFAULT);
    if(NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }

3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish message processing:

    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }

4. The RMDS bridge handles the initial message and tries to acquire the same throttle:

    #5  0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
    #6  0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
    #7  0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
    #8  0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
    #9  0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
    #10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?
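The lock ordering in the four-step scenario of the quoted report can be modeled with plain pthreads. What follows is only a sketch under stated assumptions, not OpenMAMA or tick42rmds code: the mutex `throttle` stands in for the wombatThrottle lock, every function name is hypothetical, and the bridge thread uses pthread_mutex_trylock so that the example reports the contention (EBUSY) where a blocking lock would simply hang.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in for the wombatThrottle lock (assumption, not the real type). */
static pthread_mutex_t throttle = PTHREAD_MUTEX_INITIALIZER;

/* Bridge-thread side: handling an initial message needs the throttle.
 * trylock is used so the sketch can report contention instead of hanging. */
static void *bridge_needs_throttle(void *result)
{
    int rc = pthread_mutex_trylock(&throttle);
    if (rc == 0)
    {
        pthread_mutex_unlock(&throttle);
    }
    *(int *) result = rc;
    return NULL;
}

/* Runs the bridge-side step on its own thread and waits for it, which is
 * what a blocking mute() would effectively do. Returns the trylock result. */
static int wait_for_bridge_thread(void)
{
    pthread_t thread;
    int       rc = -1;

    if (pthread_create(&thread, NULL, bridge_needs_throttle, &rc) != 0)
    {
        return -1;
    }
    pthread_join(thread, NULL);
    return rc;
}

/* Ordering from the report: throttle is held while mute waits for the bridge. */
static int destroy_mute_under_lock(void)
{
    int rc;

    pthread_mutex_lock(&throttle);
    rc = wait_for_bridge_thread();   /* bridge cannot get the throttle */
    pthread_mutex_unlock(&throttle);
    return rc;
}

/* Alternative ordering: wait for the bridge first, then take the throttle. */
static int destroy_mute_before_lock(void)
{
    int rc = wait_for_bridge_thread();

    pthread_mutex_lock(&throttle);
    /* ... cleanup that needs the throttle would go here ... */
    pthread_mutex_unlock(&throttle);
    return rc;
}
```

In this model destroy_mute_under_lock() returns EBUSY, which is the point where a blocking wombatThrottle_lock would wait forever, while destroy_mute_before_lock() returns 0. That mirrors the idea raised in the thread of not invoking mute while MAMA's locks are held.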
|
|
Re: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Frank Quinn <fquinn@...>
Apologies, I misread the part in the first email where it mentions mamaSubscription_destroy from arbitrary threads.
If you're calling mamaSubscription_destroy from a thread other than the one that was associated with the subscription during mamaSubscription_create(), that goes against the threading requirements of that method:

https://openmama.github.io/reference/mama/c/subscription_8h.html#a93d77987c1c97d0dd6cadd34320b501d

If you really want to destroy the subscription from an arbitrary thread, you need to call mamaSubscription_destroyEx:

https://openmama.github.io/reference/mama/c/subscription_8h.html#ad40c51d2f15e9440581d6bb23cfa5b4f

Hope this helps,

Cheers,
Frank

FRANK QUINN
Principal Engineer - EMEA
Vela Trading Technologies
O. +44 289 568 0209 ext. 3592
fquinn@...
Adelaide Exchange Building, 2nd Floor, 24-26 Adelaide Street, Belfast, BT2 8GD
velatradingtech.com | @vela_tt
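The difference between an owner-thread-only destroy and an any-thread variant can be illustrated with a small pthread model. This is a sketch only: `sub_t`, `sub_destroy`, `sub_destroy_ex` and `sub_owner_poll` are hypothetical names, and nothing here claims to show how mamaSubscription_destroy or mamaSubscription_destroyEx are implemented inside OpenMAMA. The assumption is simply that the any-thread call records a request which the owning thread carries out later.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct
{
    pthread_t       owner;            /* thread that created the object      */
    pthread_mutex_t lock;             /* guards destroy_pending only         */
    bool            destroy_pending;  /* set by any thread, read by owner    */
    bool            destroyed;        /* touched only on the owner thread    */
} sub_t;

static void sub_create(sub_t *s)
{
    s->owner = pthread_self();
    pthread_mutex_init(&s->lock, NULL);
    s->destroy_pending = false;
    s->destroyed       = false;
}

/* Owner-only destroy: refuses to run on any other thread. */
static int sub_destroy(sub_t *s)
{
    if (!pthread_equal(pthread_self(), s->owner))
    {
        return -1;
    }
    s->destroyed = true;
    return 0;
}

/* Any-thread destroy: only records the request for the owner. */
static void sub_destroy_ex(sub_t *s)
{
    pthread_mutex_lock(&s->lock);
    s->destroy_pending = true;
    pthread_mutex_unlock(&s->lock);
}

/* Called by the owner thread between events: runs a deferred destroy. */
static void sub_owner_poll(sub_t *s)
{
    bool pending;

    pthread_mutex_lock(&s->lock);
    pending = s->destroy_pending;
    s->destroy_pending = false;
    pthread_mutex_unlock(&s->lock);

    if (pending)
    {
        (void) sub_destroy(s);
    }
}

typedef struct { sub_t *sub; int rc; } attempt_t;

static void *foreign_thread(void *arg)
{
    attempt_t *a = arg;
    a->rc = sub_destroy(a->sub);   /* rejected: wrong thread              */
    sub_destroy_ex(a->sub);        /* accepted: deferred to the owner     */
    return NULL;
}

/* Tries both entry points from a second thread; returns sub_destroy's result. */
static int destroy_from_other_thread(sub_t *s)
{
    attempt_t a = { s, 0 };
    pthread_t thread;

    if (pthread_create(&thread, NULL, foreign_thread, &a) != 0)
    {
        return -2;
    }
    pthread_join(thread, NULL);
    return a.rc;
}
```

The point of the model is that the object's state is only ever changed on the thread that owns it, so a message being processed on that thread can never see the object torn down underneath it.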
|
|
Re: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Konstantin Baydarov
Classification: Public
Hi, guys.

I'm working on the issue with Yury. I spotted the deadlock possibility while debugging the tick42rmds bridge crash on unsubscribe, and would be interested in knowing the solution as well.

BR,
Konstantin Baydarov
|
|
Re: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Yury Batrakov
Classification: Public
Hi Frank,

Let me answer your questions in random order :)

2. It looks like designed behavior of the RMDS bridge: callbacks are invoked in a thread servicing transport events from a server. One thread per mamaTransport is created.

1. Therefore race conditions are possible. Our case is: mute is called for a subscription, then mamaSubscription_cleanup frees self->mInitialRequest, but a concurrent mamaSubscription_processMsg call tries to access self->mInitialRequest because the message it is processing is the initial message from the server.

3. Taking point 2 into account, we cannot process the mute request asynchronously, as MAMA starts freeing subscription resources immediately after mute is called.

Do you think it is possible for MAMA not to invoke bridgeMamaSubscriptionMute() under mamaSubscription locks? The contract for this function would then be:

- The call should be synchronous, and no events should be processed after it has returned (as before)
- It should be reentrant and synchronize its operations by itself (quite a sensible requirement)
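A mute that satisfies the proposed contract (synchronous, nothing dispatched after it returns, synchronizing by itself) might look roughly like the following. This is a minimal sketch with hypothetical names (`bridge_sub_t`, `bridge_sub_dispatch`, `bridge_sub_mute`), not the tick42rmds implementation; it assumes the bridge keeps an in-flight counter and a condition variable per subscription, and the `delivered` counter merely stands in for the call to mamaSubscription_processMsg.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct
{
    pthread_mutex_t lock;
    pthread_cond_t  idle;       /* signalled when in_flight drops to zero */
    int             in_flight;  /* callbacks currently running            */
    bool            muted;
    int             delivered;  /* stand-in for real message processing   */
} bridge_sub_t;

static void bridge_sub_init(bridge_sub_t *b)
{
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->idle, NULL);
    b->in_flight = 0;
    b->muted     = false;
    b->delivered = 0;
}

/* Dispatch thread: the muted check and the "enter callback" step happen
 * under one lock, so mute() cannot slip in between them.
 * Returns true if the message was delivered. */
static bool bridge_sub_dispatch(bridge_sub_t *b)
{
    pthread_mutex_lock(&b->lock);
    if (b->muted)
    {
        pthread_mutex_unlock(&b->lock);
        return false;
    }
    b->in_flight++;
    pthread_mutex_unlock(&b->lock);

    /* The real callback would run here, without the bridge lock held. */

    pthread_mutex_lock(&b->lock);
    b->delivered++;
    if (--b->in_flight == 0)
    {
        pthread_cond_broadcast(&b->idle);
    }
    pthread_mutex_unlock(&b->lock);
    return true;
}

/* Synchronous mute: returns only once no callback is running, and no
 * callback can start afterwards. */
static void bridge_sub_mute(bridge_sub_t *b)
{
    pthread_mutex_lock(&b->lock);
    b->muted = true;
    while (b->in_flight > 0)
    {
        pthread_cond_wait(&b->idle, &b->lock);
    }
    pthread_mutex_unlock(&b->lock);
}
```

Because bridge_sub_mute() blocks until the running callback has finished, its caller must not hold any lock that the callback may need, which is exactly why the question of calling mute outside the MAMA locks matters. It must also not be called from inside the callback itself, since it would then wait on its own in-flight count.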
|
|
Re: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Frank Quinn <fquinn@...>
Hi Yury,
Thanks for the detailed query, I have a few outstanding questions and suggestions on this one:

1. I would question whether or not mamaSubscription_processMsg should crash for a muted subscription. Muting is a state which exists to attempt to stop new events coming in. However if you're in the dispatcher thread and you just received an object, it's too late this time - mute should only be invoked prior to prevent the *next* read. So perhaps (knowing nothing about the RMDS bridge) the straightforward solution would be simply to do the checks which may cause muting after the callback?

2. Should the thread which processes events from RMDS server invoke the callback method directly (inline)? Is it on the same thread as is assigned to the MAMA Subscription object? It should be to match the application's expected concurrency behaviour.

3. Rather than muting immediately, you could consider creating a muting callback event which gets enqueued onto your subscription thread. That way the mute event will always be synchronous with the subscription thread and you don't need to worry about locking any associated resources. Locking with subscription objects (particularly MAMA core objects) is hairy stuff and you should avoid it where possible to avoid race conditions.

Cheers,
Frank
|
|
Re: Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Yury Batrakov
Classification: Public
+ Konstantin
|
|
Concurrent subscription.destroy() ? Crash when using tick42rmds transport.
Yury Batrakov
Classification: Public
Hi guys,

I've faced the following issue when using OpenMAMA and the tick42rmds bridge.

The bridge internally creates a thread to process events from the RMDS server; once a message is received, that thread invokes mamaSubscription_processMsg. While the message is being processed, the user may want to destroy the subscription (obviously in another thread). To avoid corruption of the mamaSubscription object, the mamaSubscription_destroy() function calls bridge->mute so that the bridge stops calling mamaSubscription_processMsg(), and only then deallocates the mamaSubscription.

The problem with this approach is the following. Here's the pseudo code for the RMDS dispatching thread:

    if (muted)
    {
        // Do not dispatch
        return;
    }
    // Do some other checks   <-- mute() may be invoked here
    mamaSubscription_processMsg()   // processMsg for muted subscription, may crash

The solution for that is to change the RMDS bridge to block in the bridge->mute() call until mamaSubscription_processMsg() returns, but there's another problem: mamaSubscription_processMsg and mamaSubscription_deactivate may deadlock on wombatThrottle. Consider the following scenario:

1. The RMDS bridge thread invokes mamaSubscription_processMsg() for a message of type initial.

2. The user thread invokes mamaSubscription_destroy(), which acquires the wombat throttle lock:

    if (impl->mTransport)
        throttle = mamaTransportImpl_getThrottle(impl->mTransport, MAMA_THROTTLE_DEFAULT);
    if (NULL != throttle)
    {
        wombatThrottle_lock(throttle);
    }

3. Then mamaSubscription_destroy calls mamaSubscription_deactivate_internal, which calls our new version of bridge->mute(), which waits for the RMDS bridge thread to finish message processing:

    if (impl->mSubscBridge)
    {
        impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge);
    }

4. The RMDS bridge handles the initial message and tries to acquire the same throttle:

    #5  0x00007ffff78d4f32 in wombatThrottle_lock (throttle=0x6298e0) at mama/c_cpp/src/c/throttle.c:441
    #6  0x00007ffff78a34e2 in imageRequest_stopWaitForResponse (request=0x14d1a20) at mama/c_cpp/src/c/imagerequest.c:774
    #7  0x00007ffff78cbe06 in mamaSubscription_stopWaitForResponse (subscription=0xe36280, ctx=0xe364a0) at mama/c_cpp/src/c/subscription.c:1262
    #8  0x00007ffff78a38fe in processPointToPointMessage (callback=0x1527e50, msg=0x642460, msgType=6, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:169
    #9  0x00007ffff78a3f9c in listenerMsgCallback_processMsg (callback=0x1527e50, msg=0x642460, ctx=0xe364a0) at mama/c_cpp/src/c/listenermsgcallback.c:480
    #10 0x00007ffff78cd825 in mamaSubscription_processMsg (subscription=0xe36280, msg=0x642460) at mama/c_cpp/src/c/subscription.c:2259

What do you think is the best way to avoid this?
|
|
Code change(s) just landed on origin/next (Successful)
jenkins@...
Some changes have just been added to the origin/next branch!
[Frank Quinn] Updated version information to 6.2.1
    mama/c_cpp/src/c/generateMamaSourceFiles.bat
    mama/jni/build.xml
    mamda/c_cpp/src/cpp/generateMamdaVersion.bat
    mama/dotnet/src/cs/MamaVersion.cs
    mamda/VERSION.scons
    mamda/java/build.xml
    mama/VERSION.scons
[Frank Quinn] Fixed build issue with windows scons builds
    mama/c_cpp/SConscript.win
[Frank Quinn] Extended the MamaDateTime C++ class with timespec get/set. (#282)
    mama/c_cpp/src/cpp/mama/MamaDateTime.h
    mama/c_cpp/src/cpp/datetime.cpp
[Frank Quinn] Linux 32 bit heap corruption in TEST_F (FieldPriceTestsC,
    mama/c_cpp/src/gunittest/c/mamamsg/msgfieldcompositetests.cpp
[Frank Quinn] Fixed unit test not implementing publisher success (#288)
    mama/jni/src/junittests/MamaPublisherTest.java
[Frank Quinn] New extended epoch, hints and precision methods (#287)
    mama/c_cpp/src/cpp/mama/MamaDateTime.h
    mama/c_cpp/src/gunittest/c/mamadatetime/datetimetest.cpp
    mama/c_cpp/src/c/datetime.c
    mama/c_cpp/src/gunittest/cpp/MamaDateTimeTest.cpp
    mama/c_cpp/src/c/datetimeimpl.h
    mama/c_cpp/src/c/mama/datetime.h
    mama/c_cpp/src/cpp/datetime.cpp
[Frank Quinn] Getting out of range date value now returns error (#289)
    mama/c_cpp/src/c/datetime.c
[Frank Quinn] Updated licenses and installation files (#290)
    README.md
    SConstruct
    site_scons/community/command_line.py
    release_scripts/openmama.spec
    release_scripts/openmama-rpm.sh
    LICENSE-3RD-PARTY.txt
    mama/c_cpp/src/c/SConscript
    mama/c_cpp/src/c/bridge/qpid/SConscript
[Frank Quinn] Fixed issue with crash for years prior to 1601 (#292)
    mama/c_cpp/src/c/datetime.c
    mama/c_cpp/src/c/mama/datetime.h
    mama/c_cpp/src/gunittest/c/mamadatetime/datetimetest.cpp
Results for OpenMAMA_Snapshot_Linux CI run with latest changes:
You may also check CI console output to view the full results.
|
|
OpenMAMA 6.2.1 Released
Frank Quinn <fquinn.ni@...>
Hi Folks,

We are pleased to announce that the final release of OpenMAMA 6.2.1 is now available:

This release exists mainly to address several issues coming out of the recent MAMA Datetime changes:

NB: This release includes the removal of the legacy _USE_32BIT_TIME_T compile time macro for 32 bit windows. Please ensure that third party applications and bridges are not compiled using this macro to avoid potential corruption of data.

For a complete list of all 17 issues included in this release, please see here: https://github.com/OpenMAMA/OpenMAMA/milestone/7?closed=1

A special thanks to all developers, contributors and testers who helped in getting this out the door.

Cheers,
Frank
|
|
Code change(s) just landed on origin/next (Failure)
jenkins@...
Some changes have just been added to the origin/next branch!
No changes
Results for OpenMAMA_Snapshot_Windows CI run with latest changes:
You may also check CI console output to view the full results.
|
|
Code change just landed on origin/master (Successful)
jenkins@...
Some changes have just been added to the origin/master branch!
[Frank Quinn] Added fix for spec file for fresh RPM builds
    release_scripts/openmama.spec
[Frank Quinn] Fixed build issue with mamda testtools on windows
    mamda/c_cpp/src/testtools/SConscript.win
[noreply] Fixed another issue with rpm generation
    release_scripts/openmama-rpm.sh
[fquinn.ni] Added support for onSuccess publisher events
    mama/dotnet/src/examples/MamaPublisher/MamaPublisherCS.cs
    mamda/dotnet/src/examples/MamdaTradeTicker/MamdaTradeTicker.cs
    mama/c_cpp/src/examples/cpp/mamapublishercpp.cpp
    mama/c_cpp/src/c/mama/publisher.h
    mamda/c_cpp/src/examples/orderbooks/listenerBookPublisher.cpp
    mama/c_cpp/src/gunittest/cpp/MamaPublisherTest.cpp
    mama/c_cpp/src/c/publisher.c
    mama/dotnet/src/cs/MamaPublisher.cs
    mama/dotnet/src/cs/MamaPublisherCallback.cs
    mama/c_cpp/src/cpp/MamaPublisherImpl.h
    mama/c_cpp/src/cpp/mama/MamaPublisherCallback.h
    mama/c_cpp/src/cpp/MamaPublisher.cpp
    mamda/c_cpp/src/examples/mamdapublisher.cpp
    mama/c_cpp/src/examples/c/mamapublisherc.c
    mama/jni/src/c/mamapublisherjni.c
    mama/c_cpp/src/gunittest/c/publishertest.cpp
[fquinn.ni] [PLAT-888] - New feature: process during conflation timer when there's
    mamda/c_cpp/src/cpp/orderbooks/mamda/MamdaOrderBookListener.h
    mamda/c_cpp/src/cpp/orderbooks/MamdaOrderBookListener.cpp
[fquinn.ni] [PLAT-888] - Fixed core and bug by adding code logic to handle when
    mamda/c_cpp/src/cpp/orderbooks/MamdaOrderBookListener.cpp
[fquinn.ni] MAMACPP: Remove mCvectorMsg tracking in MamaMsg. - Add unit test to
    mama/c_cpp/src/gunittest/cpp/MamaMsgTest.cpp
    mama/c_cpp/src/cpp/MamaMsg.cpp
    mama/c_cpp/src/cpp/mama/MamaMsg.h
[fquinn.ni] UNITTEST: Run memory leak pattern once Previously, the intent of this
    mama/c_cpp/src/gunittest/cpp/MamaMsgTest.cpp
[fquinn.ni] Fixed issue with multiple subscribers for same topic in qpid
    common/c_cpp/src/c/mempool.c
    mama/c_cpp/src/c/bridge/qpid/transport.c
[Frank Quinn] fixes #269: MamdaSubscription redundantly creates Exception instance in
    mamda/java/com/wombat/mamda/MamdaSubscription.java
[fquinn.ni] Removed package option from linux builds
    SConstruct
    site_scons/community/command_line.py
[fquinn.ni] Bugfix mamdatetime 32 bit windows (#274)
    mama/c_cpp/src/c/SConscript.win
    mama/c_cpp/src/cpp/mamacpp.vcxproj
    msvc/PropertySheetAPRWin32Release.props
    site_scons/community/darwin.py
    mama/c_cpp/src/c/SConscript
    site_scons/community/command_line.py
    mama/c_cpp/src/c/datetime.c
    .travis.yml
    mama/c_cpp/src/gunittest/c/mamadatetime/datetimetest.cpp
    site_scons/community/windows.py
    mama/c_cpp/src/c/mamac.vcxproj
    msvc/PropertySheetAPRWin64Release.props
[Frank Quinn] Fixed build issue with latest RPM builds
    release_scripts/openmama.spec
    release_scripts/openmama-rpm.sh
    mamda/java/com/wombat/mamda/MamdaSubscription.java
[noreply] Fixed issue with date time on 32 bit linux (#275)
    mama/c_cpp/src/c/datetimeimpl.h
    mama/c_cpp/src/gunittest/c/mamadatetime/datetimetest.cpp
    mama/c_cpp/src/c/datetime.c
[fquinn.ni] Fix Win32 MamaPublisherTestC.EventSendWithCallbacks SEH exception.
    mama/c_cpp/src/gunittest/c/publishertest.cpp
[Frank Quinn] Updated version information to 6.2.1
    mama/VERSION.scons
    mama/jni/build.xml
    mama/dotnet/src/cs/MamaVersion.cs
    mama/c_cpp/src/c/generateMamaSourceFiles.bat
    mamda/VERSION.scons
    mamda/c_cpp/src/cpp/generateMamdaVersion.bat
    mamda/java/build.xml
[Frank Quinn] Fixed build issue with windows scons builds
    mama/c_cpp/SConscript.win
[Frank Quinn] Extended the MamaDateTime C++ class with timespec get/set. (#282)
    mama/c_cpp/src/cpp/datetime.cpp
    mama/c_cpp/src/cpp/mama/MamaDateTime.h
[Frank Quinn] Linux 32 bit heap corruption in TEST_F (FieldPriceTestsC,
    mama/c_cpp/src/gunittest/c/mamamsg/msgfieldcompositetests.cpp
[Frank Quinn] Fixed unit test not implementing publisher success (#288)
    mama/jni/src/junittests/MamaPublisherTest.java
[Frank Quinn] New extended epoch, hints and precision methods (#287)
    mama/c_cpp/src/c/mama/datetime.h
    mama/c_cpp/src/cpp/datetime.cpp
    mama/c_cpp/src/gunittest/c/mamadatetime/datetimetest.cpp
    mama/c_cpp/src/c/datetime.c
    mama/c_cpp/src/cpp/mama/MamaDateTime.h
    mama/c_cpp/src/gunittest/cpp/MamaDateTimeTest.cpp
    mama/c_cpp/src/c/datetimeimpl.h
[Frank Quinn] Getting out of range date value now returns error (#289)
    mama/c_cpp/src/c/datetime.c
[Frank Quinn] Updated licenses and installation files (#290)
    LICENSE-3RD-PARTY.txt
    release_scripts/openmama-rpm.sh
    SConstruct
    README.md
    release_scripts/openmama.spec
    site_scons/community/command_line.py
    mama/c_cpp/src/c/SConscript
    mama/c_cpp/src/c/bridge/qpid/SConscript
[Frank Quinn] Fixed issue with crash for years prior to 1601 (#292)
    mama/c_cpp/src/c/mama/datetime.h
    mama/c_cpp/src/c/datetime.c
    mama/c_cpp/src/gunittest/c/mamadatetime/datetimetest.cpp
Results for OpenMAMA_Stable_Linux CI run with latest changes:
You may also check CI console output to view the full results.
|
|
Code change(s) just landed on OpenMAMA-6.2.1-rc2 (Failure)
jenkins@...
Some changes have just been added to the OpenMAMA-6.2.1-rc2 branch!
No changes
Results for OpenMAMA_ReleaseCandidate_Windows CI run with latest changes:
You may also check CI console output to view the full results.
|
|