Date   

CentOS 6 / RHEL 6 Support (Reply Needed)

Frank Quinn
 

Hi Folks,

 

CentOS 6 is reaching its main EOL on the 30th November (10 days time), so I’d like to ask the community where they are in respect to migration.

 

Please feel free to reply to me either directly or via list with one of the below:

 

  1. CentOS 6? How quaint. Kill it with fire.
  2. We are moving off CentOS 6 and expect to be migrated away from it before the next OpenMAMA release in Spring.
  3. I think we’d need CentOS support for at least another year or so until all our systems migrate.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


OpenMAMA contribution to FINOS

Frank Quinn
 

Hi Folks,

 

I am excited to announce that OpenMAMA was successfully contributed to FINOS and it's now available as a FINOS hosted project at https://github.com/finos/OpenMAMA. Any old links and git clones should automatically redirect, but if there are any issues with this please let us know.

 

Our website is now hosted at openmama.finos.org and you can reach out to the project team using any of the channels on our support page.

 

Being part of FINOS gives OpenMAMA a specialist target audience to tap into, and helps raise the visibility and profile of the project directly to where it’s likely to be most relevant. We are hoping it will help continue to drive forward with our core goals of adoption, interoperability and visibility.

 

Many thanks to the FINOS team and to the wider FINOS Community for their warm welcome and support in moving OpenMAMA to its new home!

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Announcing OZ: a production-quality, open-source transport for OpenMAMA using ZeroMQ

Bill Torpey
 

Everyone on this mailing list is already well acquainted with OpenMAMA's awesome-ness, but I want to let you know about something that makes OpenMAMA even more awesome.

One thing that has always been a trouble spot for OpenMAMA is the lack of a reliable, high-performance, open-source transport library. The AVIS bridge was little more than a toy, and while the Qpid bridge was an improvement, there has never been a production-quality, open-source transport bridge for OpenMAMA.

Until now.

NYFIX, a division of Itiviti AB, has recently released an open-source transport for OpenMAMA based on ZeroMQ which we are calling OZ. OZ has been in live production use supporting the NYFIX Marketplace since early March in our data centers in Europe, Asia and the U.S., processing roughly 50 million messages per day.

OZ was designed to support many of the most popular features of typical MOM's:

  • publish/subscribe messaging using topic-based addressing, supporting hierarchical topic namespaces and "wildcard" subscriptions;
  • request/reply (inbox) messaging for transactional interactions;
  • dynamic service discovery with minimal configuration;
  • broker-less architecture for reduced latency and optimum throughput;
  • self-describing messages (thanks to OpenMAMA-omnm).

We recently published a whitepaper describing OZ, but the more technical-minded will probably prefer to check out the docs and/or take a look at the sample code. (In particular, the examples use modern C++ constructs to provide a kinder, gentler introduction to OpenMAMA).

We're hopeful that the OpenMAMA community will find OZ helpful, and we look forward to working with everyone in the community to make OZ, and OpenMAMA itself, even better. We encourage everyone to check out OZ at https://github.com/nyfix/OZ.

Please contact me directly or raise an issue with any comments, suggestions, etc.


OpenMAMA 6.3.1 Released

Frank Quinn
 

Hi Folks,

 

We are pleased to announce OpenMAMA 6.3.1 is finally here and has now propagated to Cloudsmith, Maven central etc!

 

This is a maintenance release which fixes several outstanding bugs and introduces some new functionality.

 

  • Added experimental .NET core support (including C# on Linux)
  • Updated supported platforms to include CentOS 8, Ubuntu 18.04 LTS and Ubuntu 20.04 LTS
  • Added quickstart / start of tutorial code for all languages
  • Modernized JNI implementation to use javac -h rather than javah
  • Added json and normalized string methods for all language bindings (formerly reserved for only C)
  • Fix multithreaded builds in OpenMAMA and add "experimental" cloudsmith repository automatic upload
  • Added support for clang sanitizers
  • Added code for creating your own Vagrant box for OpenMAMA (and we uploaded our own)
  • Added ansible code for creating a demo environment
  • Fixed a few race conditions, leaks and bugs
  • Improved Mac Support

 

For a complete list of all 21 issues included in this release, please see here: https://github.com/OpenMAMA/OpenMAMA/milestone/11?closed=1

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Re: OpenMAMA 6.3.1 RC1 Released

Frank Quinn
 

Hi folks - a reminder that we are going into the final week of this RC -if any major issues are spotted please raise as a priority!


On Thu, 24 Sep 2020, 18:32 Frank Quinn, <fquinn@...> wrote:

Hi Folks,

 

We (finally) have a new RC ready for testing with compiled artifacts for supported platforms – check it out:

 

https://github.com/OpenMAMA/OpenMAMA/releases/tag/OpenMAMA-6.3.1-rc1

This is a maintenance release which fixes several outstanding bugs and introduces some new functionality.

  • Added .NET core support (includig C# on Linux)
  • Updated supported platforms to include CentOS 8, Ubuntu 18.04 LTS and Ubuntu 20.04 LTS
  • Added quickstart / start of tutorial code for all languages
  • Modernized JNI implementation to use javac -h rather than javah
  • Added json and normalized string methods for all language bindings (formerly reserved for only C)
  • Fix multithreaded builds in OpenMAMA and add "experimental" cloudsmith repository automatic upload
  • Added support for clang sanitizers
  • Added code for creating your own Vagrant box for OpenMAMA (and we uploaded our own)
  • Added ansible code for creating a demo environment
  • Fixed a few race conditions, leaks and bugs
  • Improved Mac Support

For a complete list of all 21 issues included in this release, please see here: https://github.com/OpenMAMA/OpenMAMA/milestone/11?closed=1

The first RC stage will last for 2 weeks, so if no major issues are found, we will provide the GA release on Monday 12th October 2020.

 

Good hunting.

 

Cheers.

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


New OpenMAMA Website is Live!

Frank Quinn
 

Hi Folks,

 

After a successful testing period (thanks to all who were involved), we are delighted to announce the official launch the new OpenMAMA website which went live yesterday evening: https://openmama.org

 

We will be getting the word out on social media so please like / share our posts to extend our reach! Also please update any existing links etc. to the OpenMAMA website where appropriate.

 

The site is a complete overhaul and has several key improvements over the former site beyond aesthetics:

 

  • It is responsive
  • It is a static, bloat-free website so it should load very quickly
  • You can submit your own suggested changes to the site via pull request because it's all on github
  • It merges the documentation and main marketing website into one
  • It generally tries to de-duplicate information and the content has had a general refresh

 

If there are any issues please raise them at https://github.com/OpenMAMA/openmama.github.io/issues and we hope you enjoy the new site! As usual, we’ll be keeping an eye on gitter if you have any immediate feedback / concerns.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


OpenMAMA 6.3.1 RC1 Released

Frank Quinn
 

Hi Folks,

 

We (finally) have a new RC ready for testing with compiled artifacts for supported platforms – check it out:

 

https://github.com/OpenMAMA/OpenMAMA/releases/tag/OpenMAMA-6.3.1-rc1

This is a maintenance release which fixes several outstanding bugs and introduces some new functionality.

  • Added .NET core support (includig C# on Linux)
  • Updated supported platforms to include CentOS 8, Ubuntu 18.04 LTS and Ubuntu 20.04 LTS
  • Added quickstart / start of tutorial code for all languages
  • Modernized JNI implementation to use javac -h rather than javah
  • Added json and normalized string methods for all language bindings (formerly reserved for only C)
  • Fix multithreaded builds in OpenMAMA and add "experimental" cloudsmith repository automatic upload
  • Added support for clang sanitizers
  • Added code for creating your own Vagrant box for OpenMAMA (and we uploaded our own)
  • Added ansible code for creating a demo environment
  • Fixed a few race conditions, leaks and bugs
  • Improved Mac Support

For a complete list of all 21 issues included in this release, please see here: https://github.com/OpenMAMA/OpenMAMA/milestone/11?closed=1

The first RC stage will last for 2 weeks, so if no major issues are found, we will provide the GA release on Monday 12th October 2020.

 

Good hunting.

 

Cheers.

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Re: OpenMAMA Preview Website Launch

Frank Quinn
 

Hi Folks,

 

Quick reminder about the invitation below, and to thank everyone who has checked it out already (I can see a noticeable increase in traffic since sending this out so I know you folks are looking).

 

Particular thanks to our friends at Solace who submitted the first ever pull request to the new website! Others are welcome if appropriate!

 

Cheers,

Frank

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Frank Quinn via lists.openmama.org
Sent: 21 September 2020 09:40
To: openmama-dev@...
Subject: [Openmama-dev] OpenMAMA Preview Website Launch

 

Hi Folks,

 

It is with great pleasure that I can declare that a new website is on its way.

 

It is currently hosted at https://openmama.github.io as a preview version (the former documentation website address) and therefore I invite the members of the mailing list to have a look, and report any issues back directly to be before its "proper" launch to openmama.org on Monday 28th September.

 

The site is a complete overhaul and has several key improvements over the former site beyond aesthetics:

 

  • It is responsive
  • It is a static, bloat-free website so it should load very quickly
  • You can submit your own suggested changes to the site via pull request because it's all on github
  • It merges the documentation and main marketing website into one
  • It generally tries to de-duplicate information and the content has had a general refresh

 

Anyway if there are any issues please raise them at https://github.com/OpenMAMA/openmama.github.io/issues and I hope you enjoy the new site! As usual, I’ll be keeping an eye on gitter if you have any immediate feedback / concerns.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


OpenMAMA Preview Website Launch

Frank Quinn
 

Hi Folks,

 

It is with great pleasure that I can declare that a new website is on its way.

 

It is currently hosted at https://openmama.github.io as a preview version (the former documentation website address) and therefore I invite the members of the mailing list to have a look, and report any issues back directly to be before its "proper" launch to openmama.org on Monday 28th September.

 

The site is a complete overhaul and has several key improvements over the former site beyond aesthetics:

 

  • It is responsive
  • It is a static, bloat-free website so it should load very quickly
  • You can submit your own suggested changes to the site via pull request because it's all on github
  • It merges the documentation and main marketing website into one
  • It generally tries to de-duplicate information and the content has had a general refresh

 

Anyway if there are any issues please raise them at https://github.com/OpenMAMA/openmama.github.io/issues and I hope you enjoy the new site! As usual, I’ll be keeping an eye on gitter if you have any immediate feedback / concerns.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Re: Proposal: OpenMAMA Default Branch change to "next"

Frank Quinn
 

Hi Folks,

 

The default branch for OpenMAMA has now been set to “next”. Looks a little more lively already 😊

 

If anyone runs into any issues, let me know. Hopefully overall this will make things a little easier.

 

Cheers,

Frank

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Phelan, Nigel via lists.openmama.org
Sent: 25 August 2020 12:49
To: openmama-dev@...
Subject: Re: [Openmama-dev] Proposal: OpenMAMA Default Branch change to "next"

 

Sounds quite sensible to me, Frank.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Frank Quinn
Sent: 25 August 2020 12:35
To: openmama-dev@...
Subject: [Openmama-dev] Proposal: OpenMAMA Default Branch change to "next"

 

Hi Folks,

 

I would like to propose that we change the OpenMAMA github default git branch to “next” instead of “master”.

 

This would mean that:

 

  • When you visit our github page, you will see the latest development code and documentation
  • When you clone the repository, you no longer need to change branch before making feature branches which will eventually become pull requests.

 

Note all existing clones etc would remain unchanged so this won’t actually “break” anything for existing OpenMAMA developers.

 

The rationale being that “master” is reflective of the last stable release, but it’s also the source of the main README output when visitors come to the github page which makes it look like the project is less active than it actually is, repo documentation effectively slows down to match the biannual release cycle etc.

 

The industry in general is also gradually moving away from the “master” nomenclature for branches with linux and github both making noises about moving away from defaulting to that branch name, so this would pave the way for eventual retirement of the branch name (which is actually fine - we can just use our release branches instead as we already do with minimal impact).

 

If anyone has any comments or feedback on this, please respond on this thread.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, malicious content and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Proposal: OpenMAMA Default Branch change to "next"

Phelan, Nigel
 

Sounds quite sensible to me, Frank.

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Frank Quinn
Sent: 25 August 2020 12:35
To: openmama-dev@...
Subject: [Openmama-dev] Proposal: OpenMAMA Default Branch change to "next"

 

Hi Folks,

 

I would like to propose that we change the OpenMAMA github default git branch to “next” instead of “master”.

 

This would mean that:

 

  • When you visit our github page, you will see the latest development code and documentation
  • When you clone the repository, you no longer need to change branch before making feature branches which will eventually become pull requests.

 

Note all existing clones etc would remain unchanged so this won’t actually “break” anything for existing OpenMAMA developers.

 

The rationale being that “master” is reflective of the last stable release, but it’s also the source of the main README output when visitors come to the github page which makes it look like the project is less active than it actually is, repo documentation effectively slows down to match the biannual release cycle etc.

 

The industry in general is also gradually moving away from the “master” nomenclature for branches with linux and github both making noises about moving away from defaulting to that branch name, so this would pave the way for eventual retirement of the branch name (which is actually fine - we can just use our release branches instead as we already do with minimal impact).

 

If anyone has any comments or feedback on this, please respond on this thread.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, malicious content and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Proposal: OpenMAMA Default Branch change to "next"

Frank Quinn
 

Hi Folks,

 

I would like to propose that we change the OpenMAMA github default git branch to “next” instead of “master”.

 

This would mean that:

 

  • When you visit our github page, you will see the latest development code and documentation
  • When you clone the repository, you no longer need to change branch before making feature branches which will eventually become pull requests.

 

Note all existing clones etc would remain unchanged so this won’t actually “break” anything for existing OpenMAMA developers.

 

The rationale being that “master” is reflective of the last stable release, but it’s also the source of the main README output when visitors come to the github page which makes it look like the project is less active than it actually is, repo documentation effectively slows down to match the biannual release cycle etc.

 

The industry in general is also gradually moving away from the “master” nomenclature for branches with linux and github both making noises about moving away from defaulting to that branch name, so this would pave the way for eventual retirement of the branch name (which is actually fine - we can just use our release branches instead as we already do with minimal impact).

 

If anyone has any comments or feedback on this, please respond on this thread.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Re: Last call for OpenMAMA 6.3.1 release candidate

Frank Quinn
 

Hi Folks,

 

Looks like a number of issues are still coming in and I have identified a few things that I want to get in myself, so I’m going to push this back 1 week to the weekend commencing 28-Aug 2020.

 

Cheers,

Frank

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Frank Quinn
Sent: 18 August 2020 23:12
To: openmama-users@...
Subject: Last call for OpenMAMA 6.3.1 release candidate

 

Hi Folks,

 

I plan on taking a cut for OpenMAMA 6.3.1 RC1 over the coming weekend (21-Aug 2020). If anyone has anything they want to submit before then please let me know so I can delay the cut if necessary, otherwise I will continue as planned.

 

This release includes some changes to support:

 

  • CentOS 8
  • Fedora 31
  • Fedora 32
  • Ubuntu 20.04 LTS

 

It also effectively drops support for the now EOL:

 

  • Fedora 28
  • Fedora 29
  • Ubuntu 14.04 LTS

 

This is in preparation for a RC that will be cut within the next few weeks which is now set up to include RPMs and Debian packages for these modern distros to make installation easy, take care of dependencies etc.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Last call for OpenMAMA 6.3.1 release candidate

Frank Quinn
 

Hi Folks,

 

I plan on taking a cut for OpenMAMA 6.3.1 RC1 over the coming weekend (21-Aug 2020). If anyone has anything they want to submit before then please let me know so I can delay the cut if necessary, otherwise I will continue as planned.

 

This release includes some changes to support:

 

  • CentOS 8
  • Fedora 31
  • Fedora 32
  • Ubuntu 20.04 LTS

 

It also effectively drops support for the now EOL:

 

  • Fedora 28
  • Fedora 29
  • Ubuntu 14.04 LTS

 

This is in preparation for a RC that will be cut within the next few weeks which is now set up to include RPMs and Debian packages for these modern distros to make installation easy, take care of dependencies etc.

 

Cheers,

Frank

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


CentOS 8 + Ubuntu 20.04 LTS Support

Frank Quinn
 

Hi Folks,

 

I have just pushed some changes into the next branch which add some changes to support:

 

  • CentOS 8
  • Ubuntu 20.04

 

It also effectively drops support for the now EOL:

 

  • Fedora 28
  • Fedora 29
  • Ubuntu 14.04 LTS

 

This is in preparation for a RC that will be cut within the next few weeks which is now set up to include RPMs and Debian packages for these modern distros to make installation easy, take care of dependencies etc.

 

If you're a user of these new operating system versions, please It a whirl.

 

Cheers,

Frank

 

 

Frank Quinn

Cascadium

T: +44 (0) 28 8678 8015

E: fquinn@...

W: http://cascadium.io

 


Re: Deadlock in mamaSubscription and mamaTransport destroy logic

Slade, Michael J
 

Hi Frank,

 

Thanks for raising the issue in Github.

 

We were able to reproduce this issue on a Linux (RH6) server using mamalistenc.c to subscribe to ~5000 symbols with the tick42rmds middleware bridge. The only modification made to mamalistenc was to reverse the order in which the subscriptions were destroyed. The deadlock would then occur on shutdown.

 

Kind regards,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 01 June 2020 22:11
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>; openmama-dev@...
Subject: RE: Deadlock in mamaSubscription and mamaTransport destroy logic

 

Hi Mike,

 

Have raised https://github.com/OpenMAMA/OpenMAMA/issues/411 For follow up on this one – let’s track there for paper trail / release note reference etc.

 

Could you add some details on the execution environment? Particularly the shutdown sequence in the code, and whether or not this is JNI etc to see if anything unusual is compounding the issue?

 

The transport destroy is not supposed to be attempted by the app until after the subscriptions have all been destroyed and the queues have been drained and destroyed so it sounds like the bridge is still firing callbacks after the subscription has been “destroy”ed to free up the memory.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Slade, Michael J via lists.openmama.org
Sent: 27 May 2020 15:41
To: openmama-dev@...
Subject: [Openmama-dev] Deadlock in mamaSubscription and mamaTransport destroy logic

 

Hi OpenMAMA Dev,

 

We have encountered a deadlock situation in mamaSubscription’s and mamaTransport’s teardown logic due to lock ordering when destroying their underlying mamaPublisher. We are able to reliably reproduce this with the tick42 bridge. Could someone take a look at this for us please?

Thanks in advance,

Mike

Deadlock – note that subscription and transport attempt to destroy same publisher:
mamaTransport teardown:

  1. transport.c:         mamaTransport_destroy()
  2. transport.c:         mamaTransportImpl_clearTransportWithPublishers()
  3. list.c:                    list_for_each()Acquires list lock (1)
  4. transport.c:         mamaTransportImpl_clearTransportPublisherCallback()
  5. publisher.c:         mamaPublisherImpl_clearTransport()
  6. publisher.c:         mamaPublisherImpl_destroy()Attempts to acquire publisher lock (2)

 

mamaSubscription teardown

  1. subscription.c:    mamaSubscriptionImpl_onSubscriptionDestroyed()
  2. subscription.c:    mamaSubscription_cleanup()
  3. publisher.c:         mamaPublisherImpl_destroy()Acquires publisher lock (2)
  4. transport.c:         mamaTransport_removePublisher()
  5. list.c:                    list_remove_element()Attempts to acquire list lock (1)

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Frank Quinn
 

That's what the change should do - its a reference count that defers cleanup until the last reference has been released (since both the jvm GC and the bridge thread could both hit this code). It also moves the lock release until a little later in the cycle. Mike, feel free to reach out to me directly or on Gitter to see if we can get lined up here.

I think the other deadlock reported is a different issue to this one (though a similar cause with asynchronous destroy callbacks from the middleware causing a stir somewhere else).

Cheers,
Frank



On 4 Jun 2020 15:05, "Phelan, Nigel" <nigel.phelan@...> wrote:

Hi Frank

 

Does this actually address the issue with prematurely releasing the lock?  Wouldn’t it be safer, as Mike suggested, to allow the thread that holds the lock to mark the object for destruction, but to defer the memory clean up until the unlock occurs?

 

Do you want to have a side conversation with Mike about his concerns – I think it might help?

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Frank Quinn
Sent: 01 June 2020 22:16
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike – did you have any joy with testing this one?

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Frank Quinn
Sent: 11 May 2020 20:19
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

I have put together something that looks like it address this (I modified qpid and MamaListen locally with various sleeps to verify behaviour). Turned out to be a little trickier than I had anticipated with the different paths through the state machine.

 

It works by assigning an atomic reference counter to increment for the bridge, and the Java wrapper, such that only when receiving a statement of disinterest from both will it release the memory. Since calling deallocate is an admission that you will no longer be using that memory any more, operating on a first past the post system should be safe.

 

There are many paths through the MAMA subscription state machine though so I'd appreciate if you could give it a test in your target environment before I merge it in. You can find my development branch for this change on:

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-reference-count

 

Let me know if this works for you.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 29 April 2020 10:01
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Thanks Frank.

 

If you could take a look at implementing this, that would be much appreciated.

 

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 28 April 2020 19:52
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

OK thanks Mike I thought user interaction was the primary problem but sounds like it's the finalizers.

 

This is where the subscription stuff gets hairy because there are a few different but similar areas at play here.

 

You should ideally unlock before the mamaSubscriptionImpl_deallocate to avoid undefined behaviour in the destroy yes. However...

 

In Java, when the GC kicks in, it fires off the JNI Java_com_wombat_mama_MamaSubscription_deallocate method which in turn calls mamaSubscription_deallocate which does acquire the subscription's mCreateDestroyLock

which could be (already) held onto while mamaSubscriptionImpl_deallocate is then called (!). So we actually already have undefined behaviour on that path when you look at it so that's effectively the same bug.

 

The path with Java destroy goes JNI Java_com_wombat_mama_MamaSubscription_destroy which calls mamaSubscription_destroy which will usually deactivate the subscription, then go on to invoke the destroyed callback. I was suggesting at this point, we *do* hang onto the lock until that callback is completed to protect the subscription object.

 

The path mamaSubscriptionImpl_onSubscriptionDestroyed is a different beast when the middleware is letting MAMA know that a subscription has been destroyed which may be via mamaSubscription_destroy -> mamaSubscription -> deactivate. In this case, we currently unlock before the callback. I was suggesting this should be moved until after the callback, but before the deallocate. If we hung onto the lock until after the deallocate, then we'd just be emulating the buggy behaviour already present in mamaSubscription_deallocate.

 

But yes when I look this through there is a more subtle issue afoot - because we have onSubscriptionDestroyed which will be called here (already inside the lock)

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

     bridge_destroy

        mamaSubscriptionImpl_onSubscriptionDestroyed

          lock mCreateDestroyLock

          unlock mCreateDestroyLock

          impldeallocate()

    unlock

  lock

 

Now, since wlocks are recursive and these are from the same thread, this won't deadlock and should already be protected from the finalizer, but it also may not happen in this order depending on how the ondestroy callback is implemented in the bridge, so you could get something more like this:

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

      bridge_destroy

        ... bridge takes this under consideration ...

    unlock mCreateDestroyLock

  lock

 

... time passes...

 

mamaSubscriptionImpl_onSubscriptionDestroyed

  lock mCreateDestroyLock

  unlock mCreateDestroyLock

  impldeallocate() <-- this is where it looks like the GC has an opportunity to cause trouble?

 

If we are going to have multiple threads coming in like this, then yes I think an atomic reference counter in the subscription object to track when each resource depending on the subscription object has gained and lost interest would be preferable to holding onto the lock. Would fix the existing bug already in the finalizer too.

 

Is this something you're looking clarification on before implementing or you want me to have a look at implementing this?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 28 April 2020 12:37
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 

 

On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.



Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Phelan, Nigel
 

Hi Frank

 

Does this actually address the issue with prematurely releasing the lock?  Wouldn’t it be safer, as Mike suggested, to allow the thread that holds the lock to mark the object for destruction, but to defer the memory clean up until the unlock occurs?

 

Do you want to have a side conversation with Mike about his concerns – I think it might help?

 

Nigel

 


Nigel Phelan | Corporate & Investment Bank | Market Data Services | J.P. Morgan

 

From: Openmama-dev@... [mailto:Openmama-dev@...] On Behalf Of Frank Quinn
Sent: 01 June 2020 22:16
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike – did you have any joy with testing this one?

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Frank Quinn
Sent: 11 May 2020 20:19
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

I have put together something that looks like it address this (I modified qpid and MamaListen locally with various sleeps to verify behaviour). Turned out to be a little trickier than I had anticipated with the different paths through the state machine.

 

It works by assigning an atomic reference counter to increment for the bridge, and the Java wrapper, such that only when receiving a statement of disinterest from both will it release the memory. Since calling deallocate is an admission that you will no longer be using that memory any more, operating on a first past the post system should be safe.

 

There are many paths through the MAMA subscription state machine though so I'd appreciate if you could give it a test in your target environment before I merge it in. You can find my development branch for this change on:

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-reference-count

 

Let me know if this works for you.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 29 April 2020 10:01
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Thanks Frank.

 

If you could take a look at implementing this, that would be much appreciated.

 

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 28 April 2020 19:52
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

OK thanks Mike I thought user interaction was the primary problem but sounds like it's the finalizers.

 

This is where the subscription stuff gets hairy because there are a few different but similar areas at play here.

 

You should ideally unlock before the mamaSubscriptionImpl_deallocate to avoid undefined behaviour in the destroy yes. However...

 

In Java, when the GC kicks in, it fires off the JNI Java_com_wombat_mama_MamaSubscription_deallocate method which in turn calls mamaSubscription_deallocate which does acquire the subscription's mCreateDestroyLock

which could be (already) held onto while mamaSubscriptionImpl_deallocate is then called (!). So we actually already have undefined behaviour on that path when you look at it so that's effectively the same bug.

 

The path with Java destroy goes JNI Java_com_wombat_mama_MamaSubscription_destroy which calls mamaSubscription_destroy which will usually deactivate the subscription, then go on to invoke the destroyed callback. I was suggesting at this point, we *do* hang onto the lock until that callback is completed to protect the subscription object.

 

The path mamaSubscriptionImpl_onSubscriptionDestroyed is a different beast when the middleware is letting MAMA know that a subscription has been destroyed which may be via mamaSubscription_destroy -> mamaSubscription -> deactivate. In this case, we currently unlock before the callback. I was suggesting this should be moved until after the callback, but before the deallocate. If we hung onto the lock until after the deallocate, then we'd just be emulating the buggy behaviour already present in mamaSubscription_deallocate.

 

But yes when I look this through there is a more subtle issue afoot - because we have onSubscriptionDestroyed which will be called here (already inside the lock)

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

     bridge_destroy

        mamaSubscriptionImpl_onSubscriptionDestroyed

          lock mCreateDestroyLock

          unlock mCreateDestroyLock

          impldeallocate()

    unlock

  lock

 

Now, since wlocks are recursive and these are from the same thread, this won't deadlock and should already be protected from the finalizer, but it also may not happen in this order depending on how the ondestroy callback is implemented in the bridge, so you could get something more like this:

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

      bridge_destroy

        ... bridge takes this under consideration ...

    unlock mCreateDestroyLock

  lock

 

... time passes...

 

mamaSubscriptionImpl_onSubscriptionDestroyed

  lock mCreateDestroyLock

  unlock mCreateDestroyLock

  impldeallocate() <-- this is where it looks like the GC has an opportunity to cause trouble?

 

If we are going to have multiple threads coming in like this, then yes I think an atomic reference counter in the subscription object to track when each resource depending on the subscription object has gained and lost interest would be preferable to holding onto the lock. Would fix the existing bug already in the finalizer too.

 

Is this something you're looking clarification on before implementing or you want me to have a look at implementing this?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 28 April 2020 12:37
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 

 

On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Early releases of lock in mamaSubscription destroy/deallocate logic

Frank Quinn
 

Hi Mike – did you have any joy with testing this one?

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Frank Quinn
Sent: 11 May 2020 20:19
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

I have put together something that looks like it address this (I modified qpid and MamaListen locally with various sleeps to verify behaviour). Turned out to be a little trickier than I had anticipated with the different paths through the state machine.

 

It works by assigning an atomic reference counter to increment for the bridge, and the Java wrapper, such that only when receiving a statement of disinterest from both will it release the memory. Since calling deallocate is an admission that you will no longer be using that memory any more, operating on a first past the post system should be safe.

 

There are many paths through the MAMA subscription state machine though so I'd appreciate if you could give it a test in your target environment before I merge it in. You can find my development branch for this change on:

 

https://github.com/fquinner/OpenMAMA/tree/feature/subscription-reference-count

 

Let me know if this works for you.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 29 April 2020 10:01
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Thanks Frank.

 

If you could take a look at implementing this, that would be much appreciated.

 

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 28 April 2020 19:52
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

OK thanks Mike I thought user interaction was the primary problem but sounds like it's the finalizers.

 

This is where the subscription stuff gets hairy because there are a few different but similar areas at play here.

 

You should ideally unlock before the mamaSubscriptionImpl_deallocate to avoid undefined behaviour in the destroy yes. However...

 

In Java, when the GC kicks in, it fires off the JNI Java_com_wombat_mama_MamaSubscription_deallocate method which in turn calls mamaSubscription_deallocate which does acquire the subscription's mCreateDestroyLock

which could be (already) held onto while mamaSubscriptionImpl_deallocate is then called (!). So we actually already have undefined behaviour on that path when you look at it so that's effectively the same bug.

 

The path with Java destroy goes JNI Java_com_wombat_mama_MamaSubscription_destroy which calls mamaSubscription_destroy which will usually deactivate the subscription, then go on to invoke the destroyed callback. I was suggesting at this point, we *do* hang onto the lock until that callback is completed to protect the subscription object.

 

The path mamaSubscriptionImpl_onSubscriptionDestroyed is a different beast when the middleware is letting MAMA know that a subscription has been destroyed which may be via mamaSubscription_destroy -> mamaSubscription -> deactivate. In this case, we currently unlock before the callback. I was suggesting this should be moved until after the callback, but before the deallocate. If we hung onto the lock until after the deallocate, then we'd just be emulating the buggy behaviour already present in mamaSubscription_deallocate.

 

But yes when I look this through there is a more subtle issue afoot - because we have onSubscriptionDestroyed which will be called here (already inside the lock)

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

     bridge_destroy

        mamaSubscriptionImpl_onSubscriptionDestroyed

          lock mCreateDestroyLock

          unlock mCreateDestroyLock

          impldeallocate()

    unlock

  lock

 

Now, since wlocks are recursive and these are from the same thread, this won't deadlock and should already be protected from the finalizer, but it also may not happen in this order depending on how the ondestroy callback is implemented in the bridge, so you could get something more like this:

 

destroy

  deactivate

    lock mCreateDestroyLock

    mamaSubscription_deactivate_internal

      bridge_destroy

        ... bridge takes this under consideration ...

    unlock mCreateDestroyLock

  lock

 

... time passes...

 

mamaSubscriptionImpl_onSubscriptionDestroyed

  lock mCreateDestroyLock

  unlock mCreateDestroyLock

  impldeallocate() <-- this is where it looks like the GC has an opportunity to cause trouble?

 

If we are going to have multiple threads coming in like this, then yes I think an atomic reference counter in the subscription object to track when each resource depending on the subscription object has gained and lost interest would be preferable to holding onto the lock. Would fix the existing bug already in the finalizer too.

 

Is this something you're looking clarification on before implementing or you want me to have a look at implementing this?

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Slade, Michael J <michael.j.slade@...>
Sent: 28 April 2020 12:37
To: Frank Quinn <fquinn@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Frank,

 

Why do we need to unlock the mutex before the deallocate call? I understand that this call destroys the lock, but with the proposed wrapper around the lock the destroy call can defer the actual release of memory to the unlock call.

 

The early release could be exploited because the MamaSubscription Java wrapper calls deallocate in its finalize method. Due to this, the GC thread could acquire the lock and release memory in mamaSubscriptionImpl_deallocate while the lock-releasing thread is trying to use this memory to invoke the user callback.

 

Thanks,

Mike

 

From: Frank Quinn [mailto:fquinn@...]
Sent: 27 April 2020 22:04
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: RE: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

Hi Mike,

 

Could you shed some more light on what exactly the problem is here? I have had a refresher in that area, and from what I can see, I can’t think of any reason to unlock that particular mutex before the user callback so I’m not opposed to moving until after that callback if that will resolve whatever issue you’re seeing? It will need to be before the deallocate though.

 

The only reason I could think of for why this was done in the first place is to avoid deadlock if the user calls something like mamaSubscription_deallocate or something else that uses that mutex from the callback, but then I guess they’d learn pretty quickly if they had fallen foul of that. Then I thought this might be somehow exploited in one of the Java or .NET wrappers but couldn’t find any evidence of that either. Plus that callback is more informative than anything else for the application, so if they are attempting to deallocate, and that’s causing funny business (or something like that?), we just need to make it clear in the callback documentation that they shouldn’t be doing that.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Bill Torpey via lists.openmama.org
Sent: 05 April 2020 15:02
To: Slade, Michael J <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

At first glance, that sounds like just “kicking the can down the road” — i.e., you’re still left with the problem of what to do with these wrappers such that you can tear them down in a thread-safe manner.

 

Having said that, if you have an implementation that works, I’m sure that others would be interested, so maybe put it up on GitHub or something?

 

My personal preference would be to try to find something on the intertubes that has been tested up the wazoo — concurrency is hard, and the hardest part IMO is tear-down of event sources/sinks.

 

On Apr 3, 2020, at 10:39 AM, Slade, Michael J <michael.j.slade@...> wrote:

 

Thank you for your reply Bill.

 

To get around the problem with attempting to destroy the mutex while it is locked, could we instead define our own wrapper around a recursive mutex which handles the following:

1.      Keep a count of how many locks have been performed by current thread.

2.      Defer destroy to unlock call if mutex currently locked.

 

This will keep all destroy/deallocate logic thread safe and stop the need for early unlocks.

 

Mike

 

From: Bill Torpey [mailto:wallstprog@...] 
Sent: 01 April 2020 18:10
To: Slade, Michael J (CIB Tech, GBR) <michael.j.slade@...>
Cc: openmama-dev@...
Subject: Re: [Openmama-dev] Early releases of lock in mamaSubscription destroy/deallocate logic

 

As far as I can tell, there’s no perfect solution here.    From https://linux.die.net/man/3/pthread_mutex_destroy 

 

Attempting to destroy a locked mutex results in undefined behavior.

 

So you either destroy the locked mutex, which is UB, or you unlock it first.  In the second case, another thread could grab it, which would likely cause a crash at some point.  

 

You can make the second problem go away by doing something like queueing the actual destroy in such a way that you know for certain that no other threads are waiting on the mutex, but that’s a fair bit of work, and tricky in its own right.  We took that approach in our application, and it generally works well, and avoids a whole host of lifetime-related problems in OpenMAMA.

 

 

On Mar 31, 2020, at 11:50 AM, Slade, Michael J via Lists.Openmama.Org <michael.j.slade=jpmorgan.com@...> wrote:

 

Hi OpenMAMA Devs,

 

In the subscription destroy/deallocate logic, there are a couple of functions which seem to release the lock used to protect access to the subscription structure early – i.e. before the user callback ‘mamaSubscriptionImpl_invokeDestroyedCallback’ or ‘mamaSubscriptionImpl_deallocate’ function is invoked. Is anyone able to explain why the lock is released in this way, and not held until the end of the function?
 
Thanks in advance,
Mike

mamaSubscription_destroy:2848

mama_status mamaSubscription_destroy(mamaSubscription subscription)

{

    /* Returns. */

    mama_status ret = MAMA_STATUS_NULL_ARG;

    if(NULL != subscription)

    {

        /* Get the impl. */

        mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;

 

        mamaSubscription_deactivate(subscription);

 

         wlock_lock(impl->mCreateDestroyLock);

 

        /* The next action will depend on the current state of the subscription. */

        switch(wInterlocked_read(&impl->mState))

        {

                /* Otherwise the subscription has been correctly removed from the throttle. */

                /* Fall through to perform the remaining clean-up. */

 

                /* For the following states the subscription is not active, simply perform clean-up. */

            case MAMA_SUBSCRIPTION_SETUP:

            case MAMA_SUBSCRIPTION_DEACTIVATED:

                 mamaSubscription_cleanup(subscription);

                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);

                  wlock_unlock(impl->mCreateDestroyLock);

                 mamaSubscriptionImpl_invokeDestroyedCallback(subscription);

                 return MAMA_STATUS_OK;

                break;

 

mamaSubscriptionImpl_onSubscriptionDestroyed:3292+3303

 

void MAMACALLTYPE mamaSubscriptionImpl_onSubscriptionDestroyed(mamaSubscription subscription, void *closure)
{
    /* Obtain the impl from the subscription object. */
    mamaSubscriptionImpl *impl = (mamaSubscriptionImpl *)subscription;
    if(NULL != impl)
    {
 
        if(NULL != impl->mQueue)
            mamaQueue_decrementObjectCount(&impl->mLockHandle, impl->mQueue);
 
        /* Lock the mutex. */
        wlock_lock(impl->mCreateDestroyLock);
 
 
        /* The next action will depend on the current state of the subscription. */
        switch(wInterlocked_read(&impl->mState))
        {
                /* The subscription is being deactivated. */
            case MAMA_SUBSCRIPTION_DEACTIVATING:
                /* Change the state. */
                mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DEACTIVATED);
                break;
 
                /* The subscription is being deallocated, i.e. mamaSubscription_deallocate has been called
                 * before the destroy callback has come in from the bridge.
                 */
            case MAMA_SUBSCRIPTION_DEALLOCATING :
                 mamaSubscription_cleanup(subscription);
                 wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                /* Delete the subscription. */
                mamaSubscriptionImpl_deallocate(impl);
                return;
                break;
 
                /* The subscription is being destroyed. */
            case MAMA_SUBSCRIPTION_DESTROYING :
                 mamaSubscription_cleanup(subscription);
                 mamaSubscriptionImpl_setState(impl, MAMA_SUBSCRIPTION_DESTROYED);
                  wlock_unlock(impl->mCreateDestroyLock);
                 mamaSubscriptionImpl_invokeDestroyedCallback(impl);
                 return;
                break;

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.


Re: Deadlock in mamaSubscription and mamaTransport destroy logic

Frank Quinn
 

Hi Mike,

 

Have raised https://github.com/OpenMAMA/OpenMAMA/issues/411 For follow up on this one – let’s track there for paper trail / release note reference etc.

 

Could you add some details on the execution environment? Particularly the shutdown sequence in the code, and whether or not this is JNI etc to see if anything unusual is compounding the issue?

 

The transport destroy is not supposed to be attempted by the app until after the subscriptions have all been destroyed and the queues have been drained and destroyed so it sounds like the bridge is still firing callbacks after the subscription has been “destroy”ed to free up the memory.

 

Cheers,

Frank

 

Frank Quinn, Cascadium | +44 (0) 28 8678 8015 | http://cascadium.io

 

From: Openmama-dev@... <Openmama-dev@...> On Behalf Of Slade, Michael J via lists.openmama.org
Sent: 27 May 2020 15:41
To: openmama-dev@...
Subject: [Openmama-dev] Deadlock in mamaSubscription and mamaTransport destroy logic

 

Hi OpenMAMA Dev,

 

We have encountered a deadlock situation in mamaSubscription’s and mamaTransport’s teardown logic due to lock ordering when destroying their underlying mamaPublisher. We are able to reliably reproduce this with the tick42 bridge. Could someone take a look at this for us please?

Thanks in advance,

Mike

Deadlock – note that subscription and transport attempt to destroy same publisher:
mamaTransport teardown:

  1. transport.c:         mamaTransport_destroy()
  2. transport.c:         mamaTransportImpl_clearTransportWithPublishers()
  3. list.c:                    list_for_each() Acquires list lock (1)
  4. transport.c:         mamaTransportImpl_clearTransportPublisherCallback()
  5. publisher.c:         mamaPublisherImpl_clearTransport()
  6. publisher.c:         mamaPublisherImpl_destroy() Attempts to acquire publisher lock (2)

 

mamaSubscription teardown

  1. subscription.c:    mamaSubscriptionImpl_onSubscriptionDestroyed()
  2. subscription.c:    mamaSubscription_cleanup()
  3. publisher.c:         mamaPublisherImpl_destroy() Acquires publisher lock (2)
  4. transport.c:         mamaTransport_removePublisher()
  5. list.c:                    list_remove_element() Attempts to acquire list lock (1)

 

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidential, privileged or legal entity information, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.