It could be (or could have been) a compiler/optimizer bug.
Other LE developers noticed similar issues on other devices as well, so we stuck with cortex-a53, which works fine.
It's hard to tell whether optimizing for cortex-a72 would give any real-world performance improvement, and since digging into possible compiler bugs isn't fun, we haven't investigated it further.