Deadlock in mamaSubscription and mamaTransport destroy logic
Slade, Michael J
Hi OpenMAMA Dev,
We have encountered a deadlock situation in mamaSubscription’s and mamaTransport’s teardown logic due to lock ordering when destroying their underlying mamaPublisher. We are able to reliably reproduce this with the tick42 bridge. Could someone take a look at this for us please?
Mike
Deadlock – note that subscription and transport attempt to destroy same publisher:
1. transport.c: mamaTransport_destroy()
2. transport.c: mamaTransportImpl_clearTransportWithPublishers()
3. list.c: list_for_each() – Acquires list lock (1)
4. transport.c: mamaTransportImpl_clearTransportPublisherCallback()
5. publisher.c: mamaPublisherImpl_clearTransport()
6. publisher.c: mamaPublisherImpl_destroy() – Attempts to acquire publisher lock (2)
mamaSubscription teardown
1. subscription.c: mamaSubscriptionImpl_onSubscriptionDestroyed()
2. subscription.c: mamaSubscription_cleanup()
3. publisher.c: mamaPublisherImpl_destroy() – Acquires publisher lock (2)
4. transport.c: mamaTransport_removePublisher()
5. list.c: list_remove_element() – Attempts to acquire list lock (1)
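The lock-order inversion in the two traces above can be sketched in isolation. This is a minimal illustration, not OpenMAMA code: list_lock, publisher_lock, teardown_path, run_path and count_blocked_paths are hypothetical names standing in for lock (1), lock (2) and the two teardown paths. It assumes a POSIX system with pthread barriers (e.g. Linux). The second acquisition uses pthread_mutex_trylock() so the sketch reports the conflict instead of hanging; with a blocking pthread_mutex_lock() in its place, both threads would wait on each other forever.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in the traces. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;      /* lock (1) */
static pthread_mutex_t publisher_lock = PTHREAD_MUTEX_INITIALIZER; /* lock (2) */

static pthread_barrier_t both_hold_first;   /* both threads own their first lock */
static pthread_barrier_t both_tried_second; /* both threads tried their second lock */

typedef struct {
    pthread_mutex_t *first;   /* lock taken first on this teardown path */
    pthread_mutex_t *second;  /* lock attempted while still holding the first */
    int second_was_busy;      /* 1 if the second lock was held by the other path */
} teardown_path;

static void *run_path(void *arg)
{
    teardown_path *p = arg;
    int rc;

    pthread_mutex_lock(p->first);
    /* Wait until the other path also holds its first lock. */
    pthread_barrier_wait(&both_hold_first);

    /* A blocking lock here would deadlock; trylock only reports the conflict. */
    rc = pthread_mutex_trylock(p->second);
    p->second_was_busy = (rc == EBUSY);

    /* Keep the first lock until the other path has made its attempt too. */
    pthread_barrier_wait(&both_tried_second);
    if (rc == 0)
        pthread_mutex_unlock(p->second);
    pthread_mutex_unlock(p->first);
    return NULL;
}

/* Returns how many of the two paths found their second lock already held. */
int count_blocked_paths(void)
{
    /* Transport path: list lock (1), then publisher lock (2). */
    teardown_path transport = { &list_lock, &publisher_lock, 0 };
    /* Subscription path: publisher lock (2), then list lock (1). */
    teardown_path subscription = { &publisher_lock, &list_lock, 0 };
    pthread_t t1, t2;
    int rc;

    rc = pthread_barrier_init(&both_hold_first, NULL, 2);
    assert(rc == 0);
    rc = pthread_barrier_init(&both_tried_second, NULL, 2);
    assert(rc == 0);

    rc = pthread_create(&t1, NULL, run_path, &transport);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, run_path, &subscription);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_barrier_destroy(&both_hold_first);
    pthread_barrier_destroy(&both_tried_second);
    return transport.second_was_busy + subscription.second_was_busy;
}
```

Calling count_blocked_paths() from a small main() and linking with -pthread should report that both paths were blocked on their second lock, which is the condition under which the real teardown would hang.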
This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.
Frank Quinn
Hi Mike,
I have raised https://github.com/OpenMAMA/OpenMAMA/issues/411 to follow up on this one. Let’s track it there for paper trail / release note reference etc.
Could you add some details on the execution environment? Particularly the shutdown sequence in the code, and whether or not this is JNI etc to see if anything unusual is compounding the issue?
The app is not supposed to attempt the transport destroy until after the subscriptions have all been destroyed and the queues have been drained and destroyed, so it sounds like the bridge is still firing callbacks after the subscription has been “destroy”ed to free up the memory.
Cheers, Frank
Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io
Slade, Michael J
Hi Frank,
Thanks for raising the issue in Github.
We were able to reproduce this issue on a Linux (RH6) server using mamalistenc.c to subscribe to ~5000 symbols with the tick42rmds middleware bridge. The only modification made to mamalistenc was to reverse the order in which the subscriptions were destroyed. The deadlock would then occur on shutdown.
Kind regards, Mike