I have a custom board running code (RA6M5/FSP4.5.0/FreeRTOS) which can occasionally generate a hard-fault.
The hard-fault is related to memory access performed inside a critical section. When the system operates solely using taskENTER_CRITICAL() or taskENTER_CRITICAL_FROM_ISR() everything works fine. But mixing the two will eventually cause a hard fault.
When the hard fault occurs, the call stack shown by the debugger looks nonsensical. It's hard to get insight once you are in the hard fault handler.
Investigating taskENTER_CRITICAL, I saw this on line 198: ```NOTE: This may alter the stack (depending on the portable implementation) so must be used with care!```
This led me to assume this was the case and to try methods that might fix the issue. In particular, I added calls to __DSB() at the beginning of each critical section. These seem to work, but only when I run the program using the debugger.
I'm open to trying a different primitive instead of a critical section. Inside that critical section is an access to a simple circular buffer.
David said: These seem to work but only when I run the program using the debugger.
It seems more related to my use of a bootloader. Without the bootloader, the __DSB() workaround works. When I start the application via the bootloader, I see the issue.
I've aligned the stack size in the bootloader (BSP\RA Common\Main Stack Size) to be the same as the application. It was 0x800; now it is 0x400.
David said: I've aligned the stack size in the boot loader BSP\RA Common\Main Stack Size to be the same as the application, it was 0x800. Now it is 0x400
This did not help.
Hello,
What kind of hard fault do you get?
Also, how do you mix taskENTER_CRITICAL and taskENTER_CRITICAL_FROM_ISR?
taskENTER_CRITICAL_FROM_ISR should be called in an interrupt service routine (ISR) only.
AZ_Renesas said: What kind of hard fault do you get ?
I broke out the handlers below and the HardFault_Handler was the one being called.
from startup.c:
void NMI_Handler(void); // NMI has many sources and is handled by BSP
void HardFault_Handler(void) WEAK_REF_ATTRIBUTE;
void MemManage_Handler(void) WEAK_REF_ATTRIBUTE;
void BusFault_Handler(void) WEAK_REF_ATTRIBUTE;
void UsageFault_Handler(void) WEAK_REF_ATTRIBUTE;
void SecureFault_Handler(void) WEAK_REF_ATTRIBUTE;
void SVC_Handler(void) WEAK_REF_ATTRIBUTE;
void DebugMon_Handler(void) WEAK_REF_ATTRIBUTE;
void PendSV_Handler(void) WEAK_REF_ATTRIBUTE;
void SysTick_Handler(void) WEAK_REF_ATTRIBUTE;
AZ_Renesas said: Also how do you mix taskENTER_CRITICAL and taskENTER_CRITICAL_FROM_ISR ?
I would say that the shared resources protected by these calls are accessed from both contexts, tasks and ISRs.
A timer interrupt is a producer of data, a timer interrupt is a consumer of data, and sporadically a task is a producer.
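For reference, the access pattern described here can be sketched roughly as below. This is not the code from my project: ring_buffer_t, RB_CAPACITY and the rb_* functions are made-up names, and the stand-in macros at the top exist only so the sketch builds on a PC. In the real firmware the macros come from FreeRTOS.h / task.h. The point it illustrates is that the task-side call pairs taskENTER_CRITICAL() with taskEXIT_CRITICAL(), while the ISR-side call keeps the value returned by taskENTER_CRITICAL_FROM_ISR() and hands it back to taskEXIT_CRITICAL_FROM_ISR().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-build stand-ins (assumption): on target these macros come from
 * FreeRTOS.h / task.h and this block is skipped. */
#ifndef taskENTER_CRITICAL
typedef unsigned long UBaseType_t;
#define taskENTER_CRITICAL()              ((void)0)
#define taskEXIT_CRITICAL()               ((void)0)
#define taskENTER_CRITICAL_FROM_ISR()     (0UL)
#define taskEXIT_CRITICAL_FROM_ISR(saved) ((void)(saved))
#endif

#define RB_CAPACITY 8u /* hypothetical size */
static_assert((RB_CAPACITY & (RB_CAPACITY - 1u)) == 0u,
              "RB_CAPACITY must be a power of two so index wrap stays consistent");

typedef struct {
    uint8_t  data[RB_CAPACITY];
    uint32_t head; /* total number of bytes written */
    uint32_t tail; /* total number of bytes read */
} ring_buffer_t;

/* Unprotected helpers: the caller must already be inside a critical section. */
static bool rb_put_unlocked(ring_buffer_t *rb, uint8_t byte)
{
    if ((rb->head - rb->tail) == RB_CAPACITY) {
        return false; /* full */
    }
    rb->data[rb->head % RB_CAPACITY] = byte;
    rb->head++;
    return true;
}

static bool rb_get_unlocked(ring_buffer_t *rb, uint8_t *out)
{
    if (rb->head == rb->tail) {
        return false; /* empty */
    }
    *out = rb->data[rb->tail % RB_CAPACITY];
    rb->tail++;
    return true;
}

/* Producer running in a task: task-level critical section. */
bool rb_put_from_task(ring_buffer_t *rb, uint8_t byte)
{
    taskENTER_CRITICAL();
    bool ok = rb_put_unlocked(rb, byte);
    taskEXIT_CRITICAL();
    return ok;
}

/* Producer running in an ISR: save and restore the previous mask state. */
bool rb_put_from_isr(ring_buffer_t *rb, uint8_t byte)
{
    UBaseType_t saved = taskENTER_CRITICAL_FROM_ISR();
    bool ok = rb_put_unlocked(rb, byte);
    taskEXIT_CRITICAL_FROM_ISR(saved);
    return ok;
}

/* Consumer running in an ISR. */
bool rb_get_from_isr(ring_buffer_t *rb, uint8_t *out)
{
    UBaseType_t saved = taskENTER_CRITICAL_FROM_ISR();
    bool ok = rb_get_unlocked(rb, out);
    taskEXIT_CRITICAL_FROM_ISR(saved);
    return ok;
}
```

This only protects the buffer if every interrupt that touches it runs at a priority that the critical section actually masks.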
I've switched to running both the bootloader and the application with 0x800. The bootloader/tinycrypt library warns when you build with a stack smaller than this. I've been running for a while without issue. I'll report back when a number of hours has passed. Usually, in the setup I have, it fails in < 1000 seconds.
I succeeded in running for 26K seconds. Having said that, a post elsewhere (https://community.renesas.com/mcu-mpu/ra/f/forum/34955/the-firmware-has-hard-fault-and-hung-some-times/125093#125093) makes me think I should revisit the priorities of my interrupts. I'll post a reply higher up to clarify some things.
I've been reading https://www.freertos.org/RTOS-Cortex-M3-M4.html. I'm currently using a CM33 (RA6M5), so I believe it applies.
#ifndef configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY ((1))
#endif
#ifndef configMAX_SYSCALL_INTERRUPT_PRIORITY
#define configMAX_SYSCALL_INTERRUPT_PRIORITY (configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - __NVIC_PRIO_BITS))
#endif
R7FA6M5BH.h defines __NVIC_PRIO_BITS as 4, which would mean that configMAX_SYSCALL_INTERRUPT_PRIORITY = 1 << (8 - 4) = 16.
If I read the documentation correctly, this means that I would have to specify any interrupt that uses a critical section at 16 or above... The trouble is that 15 is the highest priority I can set with the FSP.
Am I missing something here?
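A possible reading, going by the FreeRTOS page linked above: the 16 is the raw value loaded into the BASEPRI register, with the priority shifted into the top bits of a byte, and not the number entered in the FSP configurator. In configurator terms the threshold stays at 1. On Cortex-M a numerically lower value means a more urgent interrupt, so 15 is the least urgent setting, not the highest. Under that reading, interrupts configured at 1 to 15 are masked by a critical section and may use the FROM_ISR calls, while an interrupt at 0 is never masked and should not touch FreeRTOS APIs or the shared buffer. The arithmetic as a small sketch (PRIO_BITS, LIB_MAX_SYSCALL_PRIO and both function names are illustrative, not FSP or FreeRTOS identifiers):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRIO_BITS            4u /* __NVIC_PRIO_BITS from R7FA6M5BH.h */
#define LIB_MAX_SYSCALL_PRIO 1u /* configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY */

/* Raw register form: the logical priority sits in the top PRIO_BITS of a byte. */
uint32_t raw_priority(uint32_t logical_prio)
{
    return logical_prio << (8u - PRIO_BITS);
}

/* Hypothetical check: an ISR may use FROM_ISR APIs only if its logical
 * priority is numerically equal to or greater than the syscall threshold,
 * i.e. if it is no more urgent than the threshold. */
bool isr_may_use_from_isr_api(uint32_t logical_prio)
{
    return logical_prio >= LIB_MAX_SYSCALL_PRIO;
}
```

So the thing to check in the FSP configuration would be that none of the timer interrupts touching the buffer is set to priority 0.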
If the program runs for a long time and then hangs, it could be because of a stack overflow. Please try to increase the stack size.
Also, when the hard fault occurs, what information do you get from the Fault Stats window on e2studio?
AZ_Renesas said: If the program runs for a long time and then hangs, it could be because a stack overflow. Please try to increase the stack size
I have tried increasing the stack; it did not change the behaviour.
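Rather than guessing at sizes, one common way to see how much of a stack is really used is to fill it with a known pattern early on and later count how many words are still untouched. The sketch below shows the idea on a plain array. STACK_FILL and both function names are made up for illustration. On the target the base address and length would have to come from the linker symbols for the main stack, which are BSP-specific and not shown here, and only memory below the current stack pointer may be painted. For task stacks, FreeRTOS already offers uxTaskGetStackHighWaterMark() and configCHECK_FOR_STACK_OVERFLOW, but as far as I understand those do not cover the main stack used by interrupts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u /* arbitrary marker pattern (assumption) */

/* Fill a stack region with the marker pattern. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Cortex-M stacks grow downward, so untouched words remain at the low end.
 * Count them starting from the lowest address. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

A result of zero unused words means the stack was used right down to its lowest address, so it has very likely overflowed.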
AZ_Renesas said: Also when the hard fault occurs, what information do you get from the Fault Stats window on e2studio ?
I don't think it is available to me using the RA6M5/FSP4.5.0. At least I can't find it.
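Without that window, the same kind of information can be read from the core's fault status registers. A value captured from SCB->CFSR (CMSIS name, address 0xE000ED28) at the top of HardFault_Handler says which configurable fault was escalated, and bit 30 (FORCED) of SCB->HFSR indicates that an escalation happened. Below is a minimal decoder sketch that runs on a PC. The bit names follow the Armv8-M Mainline register layout as I know it, and cfsr_decode and fault_bit_t are hypothetical names, not part of FSP or CMSIS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} fault_bit_t;

/* CFSR = MemManage (bits 0-7) | BusFault (bits 8-15) | UsageFault (bits 16-31) */
static const fault_bit_t cfsr_bits[] = {
    {1u << 0,  "IACCVIOL"},  {1u << 1,  "DACCVIOL"},   {1u << 3,  "MUNSTKERR"},
    {1u << 4,  "MSTKERR"},   {1u << 5,  "MLSPERR"},    {1u << 7,  "MMARVALID"},
    {1u << 8,  "IBUSERR"},   {1u << 9,  "PRECISERR"},  {1u << 10, "IMPRECISERR"},
    {1u << 11, "UNSTKERR"},  {1u << 12, "STKERR"},     {1u << 13, "LSPERR"},
    {1u << 15, "BFARVALID"}, {1u << 16, "UNDEFINSTR"}, {1u << 17, "INVSTATE"},
    {1u << 18, "INVPC"},     {1u << 19, "NOCP"},       {1u << 20, "STKOF"},
    {1u << 24, "UNALIGNED"}, {1u << 25, "DIVBYZERO"},
};

/* Write the names of all set CFSR flags into out, separated by spaces.
 * Output is cut short (but stays NUL-terminated) if out is too small. */
void cfsr_decode(uint32_t cfsr, char *out, size_t out_len)
{
    size_t used = 0;

    if (out_len == 0u) {
        return;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if ((cfsr & cfsr_bits[i].mask) == 0u) {
            continue;
        }
        int n = snprintf(out + used, out_len - used, "%s%s",
                         (used != 0u) ? " " : "", cfsr_bits[i].name);
        if (n < 0 || (size_t)n >= out_len - used) {
            break; /* buffer full */
        }
        used += (size_t)n;
    }
}
```

STKOF, STKERR or MSTKERR would point toward a stack problem, while IMPRECISERR would suggest a bad write that was reported after the instruction that caused it, which could explain a call stack that makes no sense.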