I have a custom board running code (RA6M5/FSP4.5.0/FreeRTOS) which can occasionally generate a hard-fault.
The hard-fault is related to memory access performed inside a critical section. When the system uses only taskENTER_CRITICAL() or only taskENTER_CRITICAL_FROM_ISR(), everything works fine. But mixing the two eventually causes a hard fault.
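For reference, here is a minimal sketch of how the two macro pairs are normally kept apart: the plain pair only from task context, the FROM_ISR pair only from interrupt context, with the saved mask handed back on exit. The buffer layout, RB_SIZE, the function names and the USE_FREERTOS switch are all hypothetical (not taken from my project); the host-side stand-ins exist only so the sketch compiles without FreeRTOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef USE_FREERTOS            /* hypothetical build switch */
#include "FreeRTOS.h"
#include "task.h"
#else
/* Host-side stand-ins so the sketch compiles without FreeRTOS. */
typedef unsigned long UBaseType_t;
#define taskENTER_CRITICAL()            ((void)0)
#define taskEXIT_CRITICAL()             ((void)0)
#define taskENTER_CRITICAL_FROM_ISR()   (0UL)
#define taskEXIT_CRITICAL_FROM_ISR(x)   ((void)(x))
#endif

#define RB_SIZE 8u             /* hypothetical capacity */

typedef struct {
    uint8_t  data[RB_SIZE];
    uint32_t head;             /* next write index */
    uint32_t tail;             /* next read index  */
    uint32_t count;            /* bytes currently stored */
} ring_t;

static ring_t g_ring;

/* Unprotected helper: returns 1 on success, 0 if the buffer is full. */
static int ring_put(ring_t *r, uint8_t b)
{
    assert(r != NULL);
    if (r->count == RB_SIZE) { return 0; }
    r->data[r->head] = b;
    r->head = (r->head + 1u) % RB_SIZE;
    r->count++;
    return 1;
}

/* Unprotected helper: returns 1 on success, 0 if the buffer is empty. */
static int ring_get(ring_t *r, uint8_t *out)
{
    assert(r != NULL && out != NULL);
    if (r->count == 0u) { return 0; }
    *out = r->data[r->tail];
    r->tail = (r->tail + 1u) % RB_SIZE;
    r->count--;
    return 1;
}

/* Task context only: plain ENTER/EXIT pair. */
int ring_get_from_task(uint8_t *out)
{
    int ok;
    taskENTER_CRITICAL();
    ok = ring_get(&g_ring, out);
    taskEXIT_CRITICAL();
    return ok;
}

/* Interrupt context only: the FROM_ISR pair returns the previous
 * interrupt mask, which must be passed back on exit. */
int ring_put_from_isr(uint8_t b)
{
    int ok;
    UBaseType_t saved = taskENTER_CRITICAL_FROM_ISR();
    ok = ring_put(&g_ring, b);
    taskEXIT_CRITICAL_FROM_ISR(saved);
    return ok;
}
```

Calling the plain pair from an ISR, or dropping the saved value from the FROM_ISR pair, are the two mix-ups this split is meant to rule out.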
When the hard fault occurs, the call stack shown by the debugger looks nonsensical. It's hard to get any insight once you are in the hard fault handler.
Investigating taskENTER_CRITICAL, I saw this note on line 198: ```NOTE: This may alter the stack (depending on the portable implementation) so must be used with care!```
This led me to assume that this was the cause, so I tried methods that might fix the issue. In particular, I added calls to __DSB() at the beginning of each critical section. These seem to work, but only when I run the program using the debugger.
I'm open to trying a different primitive instead of a critical section. The critical section protects access to a simple circular buffer.
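One alternative primitive that needs no critical section at all is a single-producer/single-consumer ring buffer built on C11 atomics. The sketch below assumes exactly one writer (for example an ISR) and exactly one reader (for example a task), which I have not confirmed holds for my buffer; spsc_t, SPSC_SIZE and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPSC_SIZE 16u   /* hypothetical capacity; must be a power of two */
_Static_assert((SPSC_SIZE & (SPSC_SIZE - 1u)) == 0u,
               "SPSC_SIZE must be a power of two");

typedef struct {
    uint8_t     data[SPSC_SIZE];
    atomic_uint head;   /* free-running write index, written only by the producer */
    atomic_uint tail;   /* free-running read index, written only by the consumer  */
} spsc_t;

/* Producer side only. Returns false if the buffer is full. */
static bool spsc_push(spsc_t *q, uint8_t b)
{
    assert(q != NULL);
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if ((head - tail) == SPSC_SIZE) {
        return false;                       /* full */
    }
    q->data[head & (SPSC_SIZE - 1u)] = b;
    /* Release: the data write above becomes visible before the new head. */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side only. Returns false if the buffer is empty. */
static bool spsc_pop(spsc_t *q, uint8_t *out)
{
    assert(q != NULL && out != NULL);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (head == tail) {
        return false;                       /* empty */
    }
    *out = q->data[tail & (SPSC_SIZE - 1u)];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}
```

Each index has a single writer, so neither side ever has to mask interrupts. With more than one producer or more than one consumer this scheme is no longer safe and a critical section (or a FreeRTOS queue or stream buffer) would still be needed.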
David said: These seem to work but only when I run the program using the debugger.
It seems more related to my use of a bootloader. With no bootloader, the __DSB() workaround works. When I start the application via the bootloader, I see the issue.
I've aligned the stack size in the bootloader (BSP\RA Common\Main Stack Size) to be the same as the application. It was 0x800; now it is 0x400.
David said: I've aligned the stack size in the bootloader (BSP\RA Common\Main Stack Size) to be the same as the application. It was 0x800; now it is 0x400.
This did not help.
I've switched to running both the bootloader and the application with 0x800; the bootloader/tinycrypt library warns when you build with a stack smaller than this. I've been running for a while without issue. I'll report back when a number of hours have passed. Usually, in my setup, it fails in < 1000 seconds.
I succeeded in running for 26K seconds. Having said that, a post elsewhere (https://community.renesas.com/mcu-mpu/ra/f/forum/34955/the-firmware-has-hard-fault-and-hung-some-times/125093#125093) makes me think I should revisit the priorities of my interrupts. I'll post a reply higher up to clarify some things.
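As a note for that priority review, one general rule for the FreeRTOS Cortex-M ports is that an ISR may only call "FromISR" API functions (including taskENTER_CRITICAL_FROM_ISR()) if its priority value is numerically equal to or greater than configMAX_SYSCALL_INTERRUPT_PRIORITY; interrupts with a lower number are also not masked by a critical section. The sketch below just encodes that check. The 4 implemented priority bits and the threshold of 5 are assumptions for illustration, not values read from my FSP configuration, and the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRIO_BITS                4u  /* assumption: 4 implemented priority bits */
#define MAX_SYSCALL_LOGICAL_PRIO 5u  /* hypothetical threshold (0 = most urgent) */

/* Convert a logical priority (0..15 here) to the raw 8-bit form in which
 * configMAX_SYSCALL_INTERRUPT_PRIORITY is expressed. */
static uint8_t prio_to_raw(uint8_t logical)
{
    assert(logical < (1u << PRIO_BITS));
    return (uint8_t)(logical << (8u - PRIO_BITS));
}

/* True if an ISR at this logical priority may call FromISR APIs and is
 * masked while a critical section is active. */
static bool isr_may_use_from_isr_api(uint8_t logical)
{
    return prio_to_raw(logical) >= prio_to_raw(MAX_SYSCALL_LOGICAL_PRIO);
}
```

Any ISR that touches the circular buffer and fails this check would be able to preempt a critical section, so it is one of the first things worth ruling out.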