Date   

Deadlock in mamaSubscription and mamaTransport destroy logic

Slade, Michael J
 

Hi OpenMAMA Dev,

 

We have encountered a deadlock situation in mamaSubscription’s and mamaTransport’s teardown logic due to lock ordering when destroying their underlying mamaPublisher. We are able to reliably reproduce this with the tick42 bridge. Could someone take a look at this for us please?

Thanks in advance,

Mike

Deadlock – note that subscription and transport attempt to destroy same publisher:
mamaTransport teardown:

1.      transport.c:         mamaTransport_destroy()

2.      transport.c:         mamaTransportImpl_clearTransportWithPublishers()

3.      list.c:                    list_for_each()Acquires list lock (1)

4.  transport.c:         mamaTransportImpl_clearTransportPublisherCallback()

5.  publisher.c:         mamaPublisherImpl_clearTransport()

6.      publisher.c:         mamaPublisherImpl_destroy()Attempts to acquire publisher lock (2)

 

mamaSubscription teardown

1.      subscription.c:    mamaSubscriptionImpl_onSubscriptionDestroyed()

2.      subscription.c:    mamaSubscription_cleanup()

3.      publisher.c:         mamaPublisherImpl_destroy()Acquires publisher lock (2)

4.      transport.c:         mamaTransport_removePublisher()

5.      list.c:                    list_remove_element()Attempts to acquire list lock (1)

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Igor Kovalenko
 

Classification: Public

 

Hi Frank,

 

I confirm that with your change I can reproduce the deadlock I described earlier:

 

(gdb) info threads

  Id   Target Id         Frame

  6    Thread 0x7f51b9fcd700 (LWP 26406) "mamalistencpp_c" 0x00007f51bb6a9993 in select () from /lib64/libc.so.6

  5    Thread 0x7f51b903b700 (LWP 26407) "mamalistencpp_c" 0x00007f51bb6a9993 in select () from /lib64/libc.so.6

  4    Thread 0x7f51b3433700 (LWP 26408) "mamalistencpp_c" 0x00007f51bb6a9993 in select () from /lib64/libc.so.6

  3    Thread 0x7f51b2c32700 (LWP 26414) "mamalistencpp_c" 0x00007f51bb99054d in __lll_lock_wait () from /lib64/libpthread.so.0

  2    Thread 0x7f51b2431700 (LWP 26415) "mamalistencpp_c" 0x00007f51bb67984d in nanosleep () from /lib64/libc.so.6

* 1    Thread 0x7f51bd02d7c0 (LWP 26405) "mamalistencpp_c" 0x00007f51bb99054d in __lll_lock_wait () from /lib64/libpthread.so.0

 

(gdb) thread 1

[Switching to thread 1 (Thread 0x7f51bd02d7c0 (LWP 26405))]

#0  0x00007f51bb99054d in __lll_lock_wait () from /lib64/libpthread.so.0

(gdb) frame 3

#3  0x00007f51bc0912e4 in wlock_lock (lock=0x7f519dda0490) at /xxx/openmama/src/common/c_cpp/src/c/wlock.c:64

64          wthread_mutex_lock( &impl->mMutex );

(gdb) print impl->mMutex

$4 = {__data = {__lock = 2, __count = 1, __owner = 26414, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},

  __size = "\002\000\000\000\001\000\000\000.g\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}

 

(gdb) thread 3

[Switching to thread 3 (Thread 0x7f51b2c32700 (LWP 26414))]

#0  0x00007f51bb99054d in __lll_lock_wait () from /lib64/libpthread.so.0

(gdb) frame 3

#3  0x00007f51bc0912e4 in wlock_lock (lock=0xa3f6a0) at /xxx/openmama/src/common/c_cpp/src/c/wlock.c:64

64          wthread_mutex_lock( &impl->mMutex );

(gdb) print impl->mMutex

$3 = {__data = {__lock = 2, __count = 1, __owner = 26405, __nusers = 1, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},

  __size = "\002\000\000\000\001\000\000\000%g\000\000\001\000\000\000\001", '\000' <repeats 22 times>, __align = 4294967298}

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Wednesday, May 13, 2020 14:43
To: Frank Quinn <fquinn@...>; openmama-dev@...
Subject: Re: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi Frank,

 

Thank you for looking into this issue.

Testing this change will take some time, but I think this change would introduce a deadlock due to change in locking order:

 

wombatThrottle_sendQueuedMessage

  [A] acquires list_lock (self->mMsgQueue)

      and calls info->mActionCb() which is wrapped mamaSubscriptionImpl_completeMarketDataInitialisation()

  [B]    which acquires impl->mCreateDestroyLock

 

.. while your implementation does locking in B-A order.

 

My implementation of this reduced locked scope preserves AB locking order. In fact it looks very much like the following pseudo-code

 

{

  acquire throttle lock

  acquire mCreateDestroyLock

  /* protecting impl from changes via throttle sendQueuedMessage callback invocation */

  check if impl state is MAMA_SUBSCRIPTION_ACTIVATING

  if so remove throttle action and set impl state to MAMA_SUBSCRIPTION_DEACTIVATED, so throttle sendQueuedMessage callback invocation would skip this impl

  drop mCreateDestroyLock

  drop throttle lock

}

/* do not acquire throttle lock, since throttle sendQueuedMessage will already skip impl not in MAMA_SUBSCRIPTION_ACTIVATING state*/

acquire mCreateDestroyLock

unconditionally call mute subscription *before mamaSubscription_deactivate_internal has any chance*

here goes original code to handle impl states (now the case of MAMA_SUBSCRIPTION_ACTIVATING  is a no-op since we handled it above)

drop mCreateDestroyLock

 

From: Frank Quinn [mailto:fquinn@...]
Sent: Monday, May 11, 2020 22:24
To: Igor Kovalenko <igor.kovalenko@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Hi Igor, can you try out this branch please?

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-throttle-lock-scope-reduction

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Igor Kovalenko <igor.kovalenko@...>
Sent: 28 April 2020 08:15
To: Frank Quinn <fquinn@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi Frank,

 

Yes I will be able to review and test the PR.

 

From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, April 28, 2020 00:21
To: Igor Kovalenko <igor.kovalenko@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Hi Igor,

 

Erm, I think I understand the issue… If I put a PR together, can you review and verify if this fixes the issue?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Igor Kovalenko via lists.openmama.org
Sent: 27 April 2020 18:21
To: openmama-dev@...
Subject: Re: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi

 

I now seem to have successfully tested a proper fix for the issue which requires calling bridgeMamaSubscriptionMute outside of throttle locked context.

 

All it takes is to handle a subscription in state MAMA_SUBSCRIPTION_ACTIVATING as special case in mamaSubscription_deactivate with throttle lock still held to drain related event from throttle queue.

Following that all other cases do not require throttle lock anymore.

Now when throttle lock is dropped but with mCreateDestroyLock acquired it is safe to call bridgeMamaSubscriptionMute outside of throttle locked context.

 

In my local testing I can confirm this (+ a couple of obvious lines to tick42rmds code) fixes the old issue reported back in July, 2017 regarding crash with tick42rmds bridge wrt concurrent subscription.destroy() call.

As a bonus the queueing solution in tick42rmds bridge is no longer required, which reduces memory consumption and improves latency.

 

I think that history regarding old issue is still available if one searches for this subject in mail archives: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

 

Frank, I hope you will be able to have a look into this to possibly land this change into the openmama code.

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Tuesday, March 31, 2020 13:33
To: openmama-dev@...
Subject: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Igor Kovalenko
 

Classification: Public

 

Hi Frank,

 

Thank you for looking into this issue.

Testing this change will take some time, but I think this change would introduce a deadlock due to change in locking order:

 

wombatThrottle_sendQueuedMessage

  [A] acquires list_lock (self->mMsgQueue)

      and calls info->mActionCb() which is wrapped mamaSubscriptionImpl_completeMarketDataInitialisation()

  [B]    which acquires impl->mCreateDestroyLock

 

.. while your implementation does locking in B-A order.

 

My implementation of this reduced locked scope preserves AB locking order. In fact it looks very much like the following pseudo-code

 

{

  acquire throttle lock

  acquire mCreateDestroyLock

  /* protecting impl from changes via throttle sendQueuedMessage callback invocation */

  check if impl state is MAMA_SUBSCRIPTION_ACTIVATING

  if so remove throttle action and set impl state to MAMA_SUBSCRIPTION_DEACTIVATED, so throttle sendQueuedMessage callback invocation would skip this impl

  drop mCreateDestroyLock

  drop throttle lock

}

/* do not acquire throttle lock, since throttle sendQueuedMessage will already skip impl not in MAMA_SUBSCRIPTION_ACTIVATING state*/

acquire mCreateDestroyLock

unconditionally call mute subscription *before mamaSubscription_deactivate_internal has any chance*

here goes original code to handle impl states (now the case of MAMA_SUBSCRIPTION_ACTIVATING  is a no-op since we handled it above)

drop mCreateDestroyLock

 

From: Frank Quinn [mailto:fquinn@...]
Sent: Monday, May 11, 2020 22:24
To: Igor Kovalenko <igor.kovalenko@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Hi Igor, can you try out this branch please?

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-throttle-lock-scope-reduction

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Igor Kovalenko <igor.kovalenko@...>
Sent: 28 April 2020 08:15
To: Frank Quinn <fquinn@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi Frank,

 

Yes I will be able to review and test the PR.

 

From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, April 28, 2020 00:21
To: Igor Kovalenko <igor.kovalenko@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Hi Igor,

 

Erm, I think I understand the issue… If I put a PR together, can you review and verify if this fixes the issue?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Igor Kovalenko via lists.openmama.org
Sent: 27 April 2020 18:21
To: openmama-dev@...
Subject: Re: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi

 

I now seem to have successfully tested a proper fix for the issue which requires calling bridgeMamaSubscriptionMute outside of throttle locked context.

 

All it takes is to handle a subscription in state MAMA_SUBSCRIPTION_ACTIVATING as special case in mamaSubscription_deactivate with throttle lock still held to drain related event from throttle queue.

Following that all other cases do not require throttle lock anymore.

Now when throttle lock is dropped but with mCreateDestroyLock acquired it is safe to call bridgeMamaSubscriptionMute outside of throttle locked context.

 

In my local testing I can confirm this (+ a couple of obvious lines to tick42rmds code) fixes the old issue reported back in July, 2017 regarding crash with tick42rmds bridge wrt concurrent subscription.destroy() call.

As a bonus the queueing solution in tick42rmds bridge is no longer required, which reduces memory consumption and improves latency.

 

I think that history regarding old issue is still available if one searches for this subject in mail archives: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

 

Frank, I hope you will be able to have a look into this to possibly land this change into the openmama code.

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Tuesday, March 31, 2020 13:33
To: openmama-dev@...
Subject: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Frank Quinn
 

Hi Igor, can you try out this branch please?

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-throttle-lock-scope-reduction

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Igor Kovalenko <igor.kovalenko@...>
Sent: 28 April 2020 08:15
To: Frank Quinn <fquinn@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi Frank,

 

Yes I will be able to review and test the PR.

 

From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, April 28, 2020 00:21
To: Igor Kovalenko <igor.kovalenko@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Hi Igor,

 

Erm, I think I understand the issue… If I put a PR together, can you review and verify if this fixes the issue?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Igor Kovalenko via lists.openmama.org
Sent: 27 April 2020 18:21
To: openmama-dev@...
Subject: Re: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi

 

I now seem to have successfully tested a proper fix for the issue which requires calling bridgeMamaSubscriptionMute outside of throttle locked context.

 

All it takes is to handle a subscription in state MAMA_SUBSCRIPTION_ACTIVATING as special case in mamaSubscription_deactivate with throttle lock still held to drain related event from throttle queue.

Following that all other cases do not require throttle lock anymore.

Now when throttle lock is dropped but with mCreateDestroyLock acquired it is safe to call bridgeMamaSubscriptionMute outside of throttle locked context.

 

In my local testing I can confirm this (+ a couple of obvious lines to tick42rmds code) fixes the old issue reported back in July, 2017 regarding crash with tick42rmds bridge wrt concurrent subscription.destroy() call.

As a bonus the queueing solution in tick42rmds bridge is no longer required, which reduces memory consumption and improves latency.

 

I think that history regarding old issue is still available if one searches for this subject in mail archives: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

 

Frank, I hope you will be able to have a look into this to possibly land this change into the openmama code.

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Tuesday, March 31, 2020 13:33
To: openmama-dev@...
Subject: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Frank Quinn
 

Hi Mike,

 

I have put together something that looks like it address this (I modified qpid and MamaListen locally with various sleeps to verify behaviour). Turned out to be a little trickier than I had anticipated with the different paths through the state machine.

 

It works by assigning an atomic reference counter to increment for the bridge, and the Java wrapper, such that only when receiving a statement of disinterest from both will it release the memory. Since calling deallocate is an admission that you will no longer be using that memory any more, operating on a first past the post system should be safe.

 

There are many paths through the MAMA subscription state machine though so I'd appreciate if you could give it a test in your target environment before I merge it in. You can find my development branch for this change on:

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-reference-count

 

Let me know if this works for you.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 29 April 2020 10:01
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Thanks Frank.

 

If you could take a look at implementing this, that would be much appreciated.

 

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 28 April 2020 19:52
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

OK thanks Mike I thought user interaction was the primary problem but sounds like it's the finalizers.

 

This is where the subscription stuff gets hairy because there are a few different but similar areas at play here.

 

You should ideally unlock before the mamaSubscriptionImpl_deallocate to avoid undefined behaviour in the destroy yes. However...

 

In Java, when the GC kicks in, it fires off the JNI Java_com_wombat_mama_MamaSubscription_deallocate method which in turn calls mamaSubscription_deallocate which does acquire the subscription's mCreateDestroyLock

which could be (already) held onto while mamaSubscriptionImpl_deallocate is then called (!). So we actually already have undefined behaviour on that path when you look at it so that's effectively the same bug.

 

The path with Java destroy goes JNI Java_com_wombat_mama_MamaSubscription_destroy which calls mamaSubscription_destroy which will usually deactivate the subscription, then go on to invoke the destroyed callback. I was suggesting at this point, we *do* hang onto the lock until that callback is completed to protect the subscription object.

 

The path mamaSubscriptionImpl_onSubscriptionDestroyed is a different beast when the middleware is letting MAMA know that a subscription has been destroyed which may be via mamaSubscription_destroy -> mamaSubscription -> deactivate. In this case, we currently unlock before the callback. I was suggesting this should be moved until after the callback, but before the deallocate. If we hung onto the lock until after the deallocate, then we'd just be emulating the buggy behaviour already present in mamaSubscription_deallocate.

 

But yes when I look this through there is a more subtle issue afoot - because we have onSubscriptionDestroyed which will be called here (already inside the lock)

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

     bridge_destroy

        mamaSubscriptionImpl_onSubscriptionDestroyed

          lock mCreateDestroyLock

          unlock mCreateDestroyLock

          impldeallocate()

    unlock

  lock

 

Now, since wlocks are recursive and these are from the same thread, this won't deadlock and should already be protected from the finalizer, but it also may not happen in this order depending on how the ondestroy callback is implemented in the bridge, so you could get something more like this:

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

      bridge_destroy

        ... bridge takes this under consideration ...

    unlock mCreateDestroyLock

  lock

 

... time passes...

 

mamaSubscriptionImpl_onSubscriptionDestroyed

  lock mCreateDestroyLock

  unlock mCreateDestroyLock

  impldeallocate() <-- this is where it looks like the GC has an opportunity to cause trouble?

 

If we are going to have multiple threads coming in like this, then yes I think an atomic reference counter in the subscription object to track when each resource depending on the subscription object has gained and lost interest would be preferable to holding onto the lock. Would fix the existing bug already in the finalizer too.

 

Is this something you're looking clarification on before implementing or you want me to have a look at implementing this?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 28 April 2020 12:37
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 

 

On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Slade, Michael J
 

Thanks Frank.

 

If you could take a look at implementing this, that would be much appreciated.

 

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 28 April 2020 19:52
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

OK thanks Mike I thought user interaction was the primary problem but sounds like it's the finalizers.

 

This is where the subscription stuff gets hairy because there are a few different but similar areas at play here.

 

You should ideally unlock before the mamaSubscriptionImpl_deallocate to avoid undefined behaviour in the destroy yes. However...

 

In Java, when the GC kicks in, it fires off the JNI Java_com_wombat_mama_MamaSubscription_deallocate method which in turn calls mamaSubscription_deallocate which does acquire the subscription's mCreateDestroyLock

which could be (already) held onto while mamaSubscriptionImpl_deallocate is then called (!). So we actually already have undefined behaviour on that path when you look at it so that's effectively the same bug.

 

The path with Java destroy goes JNI Java_com_wombat_mama_MamaSubscription_destroy which calls mamaSubscription_destroy which will usually deactivate the subscription, then go on to invoke the destroyed callback. I was suggesting at this point, we *do* hang onto the lock until that callback is completed to protect the subscription object.

 

The path mamaSubscriptionImpl_onSubscriptionDestroyed is a different beast when the middleware is letting MAMA know that a subscription has been destroyed which may be via mamaSubscription_destroy -> mamaSubscription -> deactivate. In this case, we currently unlock before the callback. I was suggesting this should be moved until after the callback, but before the deallocate. If we hung onto the lock until after the deallocate, then we'd just be emulating the buggy behaviour already present in mamaSubscription_deallocate.

 

But yes when I look this through there is a more subtle issue afoot - because we have onSubscriptionDestroyed which will be called here (already inside the lock)

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

     bridge_destroy

        mamaSubscriptionImpl_onSubscriptionDestroyed

          lock mCreateDestroyLock

          unlock mCreateDestroyLock

          impldeallocate()

    unlock

  lock

 

Now, since wlocks are recursive and these are from the same thread, this won't deadlock and should already be protected from the finalizer, but it also may not happen in this order depending on how the ondestroy callback is implemented in the bridge, so you could get something more like this:

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

      bridge_destroy

        ... bridge takes this under consideration ...

    unlock mCreateDestroyLock

  lock

 

... time passes...

 

mamaSubscriptionImpl_onSubscriptionDestroyed

  lock mCreateDestroyLock

  unlock mCreateDestroyLock

  impldeallocate() <-- this is where it looks like the GC has an opportunity to cause trouble?

 

If we are going to have multiple threads coming in like this, then yes I think an atomic reference counter in the subscription object to track when each resource depending on the subscription object has gained and lost interest would be preferable to holding onto the lock. Would fix the existing bug already in the finalizer too.

 

Is this something you're looking clarification on before implementing or you want me to have a look at implementing this?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 28 April 2020 12:37
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 

 

On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Frank Quinn
 

OK thanks Mike I thought user interaction was the primary problem but sounds like it's the finalizers.

 

This is where the subscription stuff gets hairy because there are a few different but similar areas at play here.

 

You should ideally unlock before the mamaSubscriptionImpl_deallocate to avoid undefined behaviour in the destroy yes. However...

 

In Java, when the GC kicks in, it fires off the JNI Java_com_wombat_mama_MamaSubscription_deallocate method which in turn calls mamaSubscription_deallocate which does acquire the subscription's mCreateDestroyLock

which could be (already) held onto while mamaSubscriptionImpl_deallocate is then called (!). So we actually already have undefined behaviour on that path when you look at it so that's effectively the same bug.

 

The path with Java destroy goes JNI Java_com_wombat_mama_MamaSubscription_destroy which calls mamaSubscription_destroy which will usually deactivate the subscription, then go on to invoke the destroyed callback. I was suggesting at this point, we *do* hang onto the lock until that callback is completed to protect the subscription object.

 

The path mamaSubscriptionImpl_onSubscriptionDestroyed is a different beast when the middleware is letting MAMA know that a subscription has been destroyed which may be via mamaSubscription_destroy -> mamaSubscription -> deactivate. In this case, we currently unlock before the callback. I was suggesting this should be moved until after the callback, but before the deallocate. If we hung onto the lock until after the deallocate, then we'd just be emulating the buggy behaviour already present in mamaSubscription_deallocate.

 

But yes when I look this through there is a more subtle issue afoot - because we have onSubscriptionDestroyed which will be called here (already inside the lock)

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

     bridge_destroy

        mamaSubscriptionImpl_onSubscriptionDestroyed

          lock mCreateDestroyLock

          unlock mCreateDestroyLock

          impldeallocate()

    unlock

  lock

 

Now, since wlocks are recursive and these are from the same thread, this won't deadlock and should already be protected from the finalizer, but it also may not happen in this order depending on how the ondestroy callback is implemented in the bridge, so you could get something more like this:

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

      bridge_destroy

        ... bridge takes this under consideration ...

    unlock mCreateDestroyLock

  lock

 

... time passes...

 

mamaSubscriptionImpl_onSubscriptionDestroyed

  lock mCreateDestroyLock

  unlock mCreateDestroyLock

  impldeallocate() <-- this is where it looks like the GC has an opportunity to cause trouble?

 

If we are going to have multiple threads coming in like this, then yes I think an atomic reference counter in the subscription object to track when each resource depending on the subscription object has gained and lost interest would be preferable to holding onto the lock. Would fix the existing bug already in the finalizer too.

 

Is this something you're looking clarification on before implementing or you want me to have a look at implementing this?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 28 April 2020 12:37
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 

 

On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Slade, Michael J
 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 



On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Igor Kovalenko
 

Classification: Public

 

Hi Frank,

 

Yes I will be able to review and test the PR.

 

From: Frank Quinn [mailto:fquinn@...]
Sent: Tuesday, April 28, 2020 00:21
To: Igor Kovalenko <igor.kovalenko@...>; openmama-dev@...
Subject: RE: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Hi Igor,

 

Erm, I think I understand the issue… If I put a PR together, can you review and verify if this fixes the issue?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Igor Kovalenko via lists.openmama.org
Sent: 27 April 2020 18:21
To: openmama-dev@...
Subject: Re: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi

 

I now seem to have successfully tested a proper fix for the issue which requires calling bridgeMamaSubscriptionMute outside of throttle locked context.

 

All it takes is to handle a subscription in state MAMA_SUBSCRIPTION_ACTIVATING as special case in mamaSubscription_deactivate with throttle lock still held to drain related event from throttle queue.

Following that all other cases do not require throttle lock anymore.

Now when throttle lock is dropped but with mCreateDestroyLock acquired it is safe to call bridgeMamaSubscriptionMute outside of throttle locked context.

 

In my local testing I can confirm this (+ a couple of obvious lines to tick42rmds code) fixes the old issue reported back in July, 2017 regarding crash with tick42rmds bridge wrt concurrent subscription.destroy() call.

As a bonus the queueing solution in tick42rmds bridge is no longer required, which reduces memory consumption and improves latency.

 

I think that history regarding old issue is still available if one searches for this subject in mail archives: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

 

Frank, I hope you will be able to have a look into this to possibly land this change into the openmama code.

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Tuesday, March 31, 2020 13:33
To: openmama-dev@...
Subject: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Frank Quinn
 

Hi Igor,

 

Erm, I think I understand the issue… If I put a PR together, can you review and verify if this fixes the issue?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Igor Kovalenko via lists.openmama.org
Sent: 27 April 2020 18:21
To: openmama-dev@...
Subject: Re: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi

 

I now seem to have successfully tested a proper fix for the issue which requires calling bridgeMamaSubscriptionMute outside of throttle locked context.

 

All it takes is to handle a subscription in state MAMA_SUBSCRIPTION_ACTIVATING as special case in mamaSubscription_deactivate with throttle lock still held to drain related event from throttle queue.

Following that all other cases do not require throttle lock anymore.

Now when throttle lock is dropped but with mCreateDestroyLock acquired it is safe to call bridgeMamaSubscriptionMute outside of throttle locked context.

 

In my local testing I can confirm this (+ a couple of obvious lines to tick42rmds code) fixes the old issue reported back in July, 2017 regarding crash with tick42rmds bridge wrt concurrent subscription.destroy() call.

As a bonus the queueing solution in tick42rmds bridge is no longer required, which reduces memory consumption and improves latency.

 

I think that history regarding old issue is still available if one searches for this subject in mail archives: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

 

Frank, I hope you will be able to have a look into this to possibly land this change into the openmama code.

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Tuesday, March 31, 2020 13:33
To: openmama-dev@...
Subject: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Frank Quinn
 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.



On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 




On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 


Re: let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Igor Kovalenko
 

Classification: Public

 

Hi

 

I now seem to have successfully tested a proper fix for the issue which requires calling bridgeMamaSubscriptionMute outside of throttle locked context.

 

All it takes is to handle a subscription in state MAMA_SUBSCRIPTION_ACTIVATING as special case in mamaSubscription_deactivate with throttle lock still held to drain related event from throttle queue.

Following that all other cases do not require throttle lock anymore.

Now when throttle lock is dropped but with mCreateDestroyLock acquired it is safe to call bridgeMamaSubscriptionMute outside of throttle locked context.

 

In my local testing I can confirm this (+ a couple of obvious lines to tick42rmds code) fixes the old issue reported back in July, 2017 regarding crash with tick42rmds bridge wrt concurrent subscription.destroy() call.

As a bonus the queueing solution in tick42rmds bridge is no longer required, which reduces memory consumption and improves latency.

 

I think that history regarding old issue is still available if one searches for this subject in mail archives: [Openmama-dev] [Openmama-users] Concurrent subscription.destroy() ? Crash when using tick42rmds transport.

 

Frank, I hope you will be able to have a look into this to possibly land this change into the openmama code.

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Igor Kovalenko
Sent: Tuesday, March 31, 2020 13:33
To: openmama-dev@...
Subject: [Openmama-dev] let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

 

Classification: Public

 

Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Bill Torpey
 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

Thank you for your reply Bill.
 
To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:
1.      Keep a count of how many locks have been performed by current thread.
2.      Defer destroy to unlock call if mutex currently locked.
 
This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.
 
Mike
 
From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic
 
As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 
 
Attempting to destroy a locked mutex results in undefined behavior.
 
So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  
 
You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.
 


On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:
 
Hi OpenMAMA Devs,
 
In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)
{
    /* Returns. */
    mama_status ret = MAMA_STATUS_NULL_ARG;
    if(NULL != subscription)
    {
        /* Get the impl. */
        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
 
        mamaSubscription_deactivate(subscription);
 
         wlock_lock(impl->mCreateDestroyLock);
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* Otherwise the subscription has been correctly removed from the throttle. */
                /* Fall through to perform the remaining clean-up. */
 
                /* For the following states the subscription is not active, simply perform clean-up. */
            case MAMA_SUBSCRIPTION_SETUP:
            case MAMA_SUBSCRIPTION_DEACTIVATED:
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);
                 return MAMA_STATUS_OK;
                break;
 
mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303
 
void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;
This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.
 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Slade, Michael J
 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...]
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 



On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Bill Torpey
 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

Attempting to destroy a locked mutex results in undefined behavior.

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.


On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

Hi OpenMAMA Devs,
 
In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)
{
    /* Returns. */
    mama_status ret = MAMA_STATUS_NULL_ARG;
    if(NULL != subscription)
    {
        /* Get the impl. */
        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
 
        mamaSubscription_deactivate(subscription);
 
         wlock_lock(impl->mCreateDestroyLock);
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* Otherwise the subscription has been correctly removed from the throttle. */
                /* Fall through to perform the remaining clean-up. */
 
                /* For the following states the subscription is not active, simply perform clean-up. */
            case MAMA_SUBSCRIPTION_SETUP:
            case MAMA_SUBSCRIPTION_DEACTIVATED:
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);
                 return MAMA_STATUS_OK;
                break;
 
mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303
 
void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



Early releases of lock in mamaSubscription destroy/deallocate logic

Slade, Michael J
 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


let's move bridgeMamaSubscriptionMute call up in hierarchy out of throttle locked context

Igor Kovalenko
 

Classification: Public


Hi openmama team,

 

I'd like to propose moving bridgeMamaSubscriptionMute call from mamaSubscription_deactivate_internal up in the call hierarchy to mamaSubscription_deactivate so Mute() is called before any throttle lock is acquired (that is, just after checking if subscription is not NULL in mamaSubscription_deactivate).

 

Since in this case the client is removing subscription anyway, it should not matter that transport bridge Mute() method will be called a bit earlier in subscription lifecycle.

 

I'll try to explain why this change is needed for tick42rmds bridge to operate reliably. A required change to tick42rmds is outlined below as well.

Locally I tested a combination of this proposed change to openmama with tick42rmds bridge modified as outlined below, and also with other transport bridges (unmodified) that we use, and found no negative side effects.

 

A couple of years back there was a discussion regarding tick42rmds transport issue.

My understanding is that tick42rmds design choice to process subscriptions on bridge thread as opposed to Mama.start() thread violates a few openmama threading assumptions.

As a consequence we can reproduce a crash during moderately fast subscribe/unsubscribe sequence where tick42rmds bridge is still trying to deliver data to callback of already destroyed mama subscription.

This crash is caused by a race condition detailed below.

 

Tick42rmds bridge implements tick42rmdsBridgeMamaSubscription_mute method which does call Shutdown() on bridge thread subscription object.

Intention there appears to be to mark bridge thread subscription object as shut down so that bridge thread ignores it and is not trying to deliver a callbacks on mama subscription objects being destroyed.

Unfortunately there is no proper locking there so that seldom a Mute() call returns before Shutdown() takes effect.

If this happens, bridge thread proceeds to deliver next message to mama subscription callback while mama subscription object is being destroyed by the thread returned from Mute().

 

Without significant redesign of tick42rmds this race can be closed by adding proper locking between tick42rmdsBridgeMamaSubscription_mute and bridge thread, so that bridge thread is guaranteed to see result of Shutdown() before Mute() call returns. Simple implementation would be to add a mutex shared between Shutdown() call and bridge thread methods issuing callbacks to mama subscriptions.

 

So far so good. Yet unfortunately openmama locking scheme needs to be modified for this to work due to a problem with locking order of MAMA_THROTTLE_DEFAULT throttle.

There are many places where MAMA_THROTTLE_DEFAULT lock is acquired. According to my analysis with valgrind/helgrind/drd acquisition of corresponding mutex must always be a leaf call in locking hierarchy to prevent ABBA-style deadlock.

 

If Mute() and Shutdown() are synchronized with a mutex, then the following would happen:

 

client thread destroying mama subscription eventually calls mamaSubscription_destroy

       which calls mamaSubscription_deactivate()

    [A]  .. mamaSubscription_deactivate acquires lock [A] on throttle object

            then calls mamaSubscription_deactivate_internal()

              .. which calls impl->mBridgeImpl->bridgeMamaSubscriptionMute (impl->mSubscBridge)

                 which is tick42rmdsBridgeMamaSubscription_mute

    [B]          .. tick42rmdsBridgeMamaSubscription_mute acquires bridge thread subscription object lock [B]

XX client thread gets suspended here for some reason so it is not calling Shutdown() yet

 

tick42rmds bridge thread calls RMDSBridgeSubscription::OnMessage (or anything else which requires calling mamaSubscription_processMsg later)

    [B]  .. bridge thread acquires bridge thread subscription object lock [B]

                bridge thread checks if Shutdown() is called on this object

XX          .. since client thread is suspended, Shutdown() is not in effect

            bridge thread calls mamaSubscription_processMsg() to deliver message to mama subscription

            ..

    [A]     imageRequest_stopWaitForResponse() is called which tries to acquire throttle lock [A]

 

If my proposal to move bridgeMamaSubscriptionMute call out of locked scope is implemented, this deadlock would not occur.

 

I will not be able to share any code change due to our organizational restrictions but will try to answer other questions regarding this proposal.

 

--

 

Kind regards,
Igor Kovalenko

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


OpenMAMA plugin enhancement to allow source/symbol-mapping plugin

Keith Rudd
 

I took an action from recent OpenMAMA Steering Committee meeting to look into MAMA plugin interface enhancement, specifically with a view to providing for a pre-subscription plugin hook that would allow for a custom symbol mapper and/or subscription logger plugin.

 

I've had a look at the OpenMAMA 6.3.0 code with this in mind and have concluded the following.

 

There is now a 'subscriptionPostCreate' hook defined:

typedef mama_status (*mamaPlugin_subscriptionPostCreateHook) (mamaPluginInfo pluginInfo, mamaSubscription subscription);

It gets called once and only from mamaSubscription_setupBasic() in subscription.c

This could be used for subscription logging, but would not be useful for symbol mapping for a couple of reasons.

One is that it occurs after some initializations for activating the subscription have taken place, namely calls to mamaPublisherImpl_createByIndex() and image_request(), both of which use the originally specified source and symbol.

The other is that there is currently no way of resetting the subscription source after mamaSubscription_create() anyway.

 

Two changes are needed for a plugin hook that does source/symbol mapping (changes source and/or symbol) to be practical I think.

 

1) A plugin hook at a slightly earlier point in mamaSubscription_setupBasic().

I suggest just after line 605 in subscription.c, where a call to setSubscInfo (self, transport, root, source, symbol) is made.

After this point, source and symbol are stored in the mamaSubscriptionImpl struct and only used from the stored values,  so if set (changed) then changes will be effective.

Could implement either by moving the 'subscriptionPostCreateHook' to this slightly earlier point, or by just creating a new additional plugin, say called 'subscriptionPostSetUpHook', inserted at this point.

I slightly prefer the second choice, but OK either way.

 

2) Add an additional accessor function for mamaSubscription,

mama_status mamaSubscription_setSourceName (mamaSubscription subscription, const char* sourceName)

This would be very similar to the mamaSubscription_setSymbol() function, just changing sourceName (mSubscSource in mamaSubscriptionImpl struct) instead.

 

These are the changes I propose.

 

Regards,

Keith



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.


Re: Adding Blank data representation to MAMA API data types

Frank Quinn
 

Yes that’s why I was suggesting this approach – it’s backwards compatible with existing bridges (we can make the bridge methods optional now where no implementation exists) and it doesn’t require adding a new MAMA field type (since it’s a special case here common to all existing field types which the payload bridge may optionally handle).

 

The interface could be modified to suit of course but just wanted to raise a proposed solution which won’t involve having to do all the fun we currently have with trying to parse a date / price several different ways (string, datetime etc) to try and make this fit.

 

But you’re right I’ll await further definition…

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Phelan, Nigel <nigel.phelan@...>
Sent: 11 December 2019 09:02
To: Frank Quinn <fquinn@...>; Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

I’d be cautious about that one, Frank – you’d need to think about the implications for the payload bridges because you’d need valid representations for NULL for all field types (or possibly you’d need a new “null field” type and numerous tweaks to handle the possibility that any field could contain content of the type specified in the dictionary or a null field representation).  I’d probably wait until the problem was better defined before proposing an implementation.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 11 December 2019 07:48
To: Phelan, Nigel (CIB Tech, GBR) <nigel.phelan@...>; Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

For the record, I’m not opposed to a “first class” null data type in principle. Json and msgpack for example have it fine. It does raise a bit of ambiguity though on how to handle a NULL value in terms of caching and data type safety. I think we would need to make it clear that a NULL value is indicative of a “this field no longer has a valid value” scenario rather than “this field has not been updated” when being applied to any caches.

 

I think the cleanest way to do this is something like

 

mamaMsgField result;

mamaMsg_getField(msg, name, fid, &result);

mama_bool_t isNull = mamaMsgField_isNull(field); // <-- New Method

 

That way we could apply across all fields without having to have separate implementations per data type. That would then expose a similar bridge method to handle that. This is opposed to actually mapping the null type to a proper mamaMsg_getNull() type approach which I think would be abrasive for most OpenMAMA developers who are using strongly typed languages.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Phelan, Nigel via Lists.Openmama.Org
Sent: 10 December 2019 17:34
To: Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Cc: Openmama-dev@...
Subject: Re: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

For integers, yes, I agree.  For dateTimes, as I said, you can indicate if the date part is valid or the time part is valid (hasDate()/hasTime()/setHints()).  If neither flag is set, we interpret that as an invalid dateTime and tell people to render as a blank.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Keith Rudd
Sent: 10 December 2019 16:35
To: Phelan, Nigel (CIB Tech, GBR) <nigel.phelan@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: Re: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

Thanks Nigel – I am with you now.

Was not aware of the ‘setIsValidPrice()’ method. Our version of the tick42rmds bridge does not use it when translating messages received, but it could be updated to do that.

This only covers the price field case as there do not appear to be any equivalents methods for other field types, like times or integers, but perhaps this will be enough to cover the most important case.

Will refer back to our users for more analysis to see if using this is enough.

 

From: Phelan, Nigel [mailto:nigel.phelan@...]
Sent: 10 December 2019 12:19
To: Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: Adding Blank data representation to MAMA API data types

 

Still not sure I see the problem, Keith.  If you receive the data as a string field and, when the transport bridge tries to map it to a mamaPrice field, the conversion fails, you should call setIsValidPrice() to mark the field as containing an “invalid” price and the consuming client should honour this and display it as a blank (in exactly the same way you would if someone decided to publish “NOT QUOTING” in a bid/ask price field because the underlying middleware allowed free text entries there).  That’s why I was looking for an example of when this breaks.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Keith Rudd [mailto:keith.rudd@...]
Sent: 10 December 2019 11:47
To: Phelan, Nigel (CIB Tech, GBR) <nigel.phelan@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: Adding Blank data representation to MAMA API data types

 

 

Hi Nigel,

 

I don’t have a RIC to quote here, but the problem exists for certain data delivered over the Elektron feed, with data from contributors where we have no control over what conventions they use.

For most data sets you only ever see blank values in initial data which is not really a problem. It’s when they are used to update a previously valid field value, distinguishing between a ‘blank’ and a zero (possibly valid) becomes important. The tick42rmds (or any other) adaptor can only really deliver a zero in the MAMA field it renders from the received ‘blank’ as there is no blank representation in MAMA and this is indistinguishable from a valid zero update, even though they are semantically different.

 

Regards,

Keith

 

 

From: Phelan, Nigel [mailto:nigel.phelan@...]
Sent: 09 December 2019 17:03
To: Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: Adding Blank data representation to MAMA API data types

 

Hi Keith – what we normally do here is use the relevant getters/setters on the Price and DateTime fields (getIsValidPrice, setIsValidPrice, HasDate, HasTime).  We tell people to render blanks for invalid prices and for DateTimes with no date or time part.  I believe the tick42rmds adaptor for TREP maps these correctly from the native TREP RWF encoding.  Do you have counter examples?

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Keith Rudd
Sent: 09 December 2019 16:49
To: 'openmama-dev@...' <openmama-dev@...>
Subject: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

Certain other market data platforms / APIs (well, TREP) support the concept of ‘blank’ field values.

The price and time field types actually have special blank data representations defined.

This is used in the context where a publisher wants to include a field in a message but with no assigned value, possibly for example to indicate that a price that was previously offered and valid is now no longer available.

(Granted there are other ways to do this, with another price status field etc. but we do see this method used sometimes and with no other indicator of validity)

 

This introduces a conundrum when translating such values to the MAMA API as prices, times, etc. have no such equivalent blank representation in MAMA.

 

So, my proposal is that we might extend MAMA field data types to include such ‘blank’ values.

This would need to be done while preserving backwards-compatibility.

 

A discussion point for now

-  Is this a feature which others see value in?

-  Any feasible suggestions for how it might be achieved and what values would be used for ‘blank’ ?

 

Regards,

Keith

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Adding Blank data representation to MAMA API data types

Phelan, Nigel
 

I’d be cautious about that one, Frank – you’d need to think about the implications for the payload bridges because you’d need valid representations for NULL for all field types (or possibly you’d need a new “null field” type and numerous tweaks to handle the possibility that any field could contain content of the type specified in the dictionary or a null field representation).  I’d probably wait until the problem was better defined before proposing an implementation.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 11 December 2019 07:48
To: Phelan, Nigel (CIB Tech, GBR) <nigel.phelan@...>; Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

For the record, I’m not opposed to a “first class” null data type in principle. Json and msgpack for example have it fine. It does raise a bit of ambiguity though on how to handle a NULL value in terms of caching and data type safety. I think we would need to make it clear that a NULL value is indicative of a “this field no longer has a valid value” scenario rather than “this field has not been updated” when being applied to any caches.

 

I think the cleanest way to do this is something like

 

mamaMsgField result;

mamaMsg_getField(msg, name, fid, &result);

mama_bool_t isNull = mamaMsgField_isNull(field); // <-- New Method

 

That way we could apply across all fields without having to have separate implementations per data type. That would then expose a similar bridge method to handle that. This is opposed to actually mapping the null type to a proper mamaMsg_getNull() type approach which I think would be abrasive for most OpenMAMA developers who are using strongly typed languages.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Phelan, Nigel via Lists.Openmama.Org
Sent: 10 December 2019 17:34
To: Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Cc: Openmama-dev@...
Subject: Re: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

For integers, yes, I agree.  For dateTimes, as I said, you can indicate if the date part is valid or the time part is valid (hasDate()/hasTime()/setHints()).  If neither flag is set, we interpret that as an invalid dateTime and tell people to render as a blank.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Keith Rudd
Sent: 10 December 2019 16:35
To: Phelan, Nigel (CIB Tech, GBR) <nigel.phelan@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: Re: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

Thanks Nigel – I am with you now.

Was not aware of the ‘setIsValidPrice()’ method. Our version of the tick42rmds bridge does not use it when translating messages received, but it could be updated to do that.

This only covers the price field case as there do not appear to be any equivalents methods for other field types, like times or integers, but perhaps this will be enough to cover the most important case.

Will refer back to our users for more analysis to see if using this is enough.

 

From: Phelan, Nigel [mailto:nigel.phelan@...]
Sent: 10 December 2019 12:19
To: Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: Adding Blank data representation to MAMA API data types

 

Still not sure I see the problem, Keith.  If you receive the data as a string field and, when the transport bridge tries to map it to a mamaPrice field, the conversion fails, you should call setIsValidPrice() to mark the field as containing an “invalid” price and the consuming client should honour this and display it as a blank (in exactly the same way you would if someone decided to publish “NOT QUOTING” in a bid/ask price field because the underlying middleware allowed free text entries there).  That’s why I was looking for an example of when this breaks.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Keith Rudd [mailto:keith.rudd@...]
Sent: 10 December 2019 11:47
To: Phelan, Nigel (CIB Tech, GBR) <nigel.phelan@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: Adding Blank data representation to MAMA API data types

 

 

Hi Nigel,

 

I don’t have a RIC to quote here, but the problem exists for certain data delivered over the Elektron feed, with data from contributors where we have no control over what conventions they use.

For most data sets you only ever see blank values in initial data which is not really a problem. It’s when they are used to update a previously valid field value, distinguishing between a ‘blank’ and a zero (possibly valid) becomes important. The tick42rmds (or any other) adaptor can only really deliver a zero in the MAMA field it renders from the received ‘blank’ as there is no blank representation in MAMA and this is indistinguishable from a valid zero update, even though they are semantically different.

 

Regards,

Keith

 

 

From: Phelan, Nigel [mailto:nigel.phelan@...]
Sent: 09 December 2019 17:03
To: Keith Rudd <keith.rudd@...>; 'openmama-dev@...' <openmama-dev@...>
Subject: RE: Adding Blank data representation to MAMA API data types

 

Hi Keith – what we normally do here is use the relevant getters/setters on the Price and DateTime fields (getIsValidPrice, setIsValidPrice, HasDate, HasTime).  We tell people to render blanks for invalid prices and for DateTimes with no date or time part.  I believe the tick42rmds adaptor for TREP maps these correctly from the native TREP RWF encoding.  Do you have counter examples?

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Keith Rudd
Sent: 09 December 2019 16:49
To: 'openmama-dev@...' <openmama-dev@...>
Subject: [Openmama-dev] Adding Blank data representation to MAMA API data types

 

Certain other market data platforms / APIs (well, TREP) support the concept of ‘blank’ field values.

The price and time field types actually have special blank data representations defined.

This is used in the context where a publisher wants to include a field in a message but with no assigned value, possibly for example to indicate that a price that was previously offered and valid is now no longer available.

(Granted there are other ways to do this, with another price status field etc. but we do see this method used sometimes and with no other indicator of validity)

 

This introduces a conundrum when translating such values to the MAMA API as prices, times, etc. have no such equivalent blank representation in MAMA.

 

So, my proposal is that we might extend MAMA field data types to include such ‘blank’ values.

This would need to be done while preserving backwards-compatibility.

 

A discussion point for now

-  Is this a feature which others see value in?

-  Any feasible suggestions for how it might be achieved and what values would be used for ‘blank’ ?

 

Regards,

Keith

 



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and delete this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Please refer to https://www.db.com/disclosures for additional EU corporate and regulatory disclosures and to http://www.db.com/unitedkingdom/content/privacy.htm for information about privacy.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

21 - 40 of 2295