About the instruction execution speed

Hello ~

I came across the following remark in the document R01US0015EJ0230 (Rev. 2.30, page 40).

"Remark This number of clocks is for when the program is in the internal ROM (flash memory) area. When fetching an
instruction from the internal RAM area, the number of clocks is four times the number of clocks plus 6,
maximum."
In general, I understand that code executed from RAM runs faster than code executed in place (XIP) from flash.
However, on the Renesas RL78, I am confused to see that executing instructions from RAM is four times slower than executing them from flash.

Did I misunderstand the document?

  • Code execution from RAM is faster for MPUs that use external non-volatile memory to store the application. In such cases the application is copied from external memory to RAM during the boot process. On the RL78, code is typically executed directly from internal flash memory and no copying is necessary. Code execution from RAM is only necessary in special use cases, e.g. self-programming of the flash memory. The RL78 RAM is optimized for fast data access and the flash memory for fast code execution.
    Your understanding of the manual is correct: code execution from RAM is slower than from flash on the RL78.
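
    As a rough illustration of the quoted remark, the worst case can be written as a one-line calculation. This is a minimal sketch in C, assuming the remark is read as 4 × n + 6 clocks for an instruction that takes n clocks from code flash. The function name `rl78_ram_fetch_clocks_max` is made up for this example and is not part of any Renesas library.

```c
#include <stdint.h>

/* Worst-case clock count for one instruction fetched from internal RAM.
 * Assumption: the manual's remark means (4 * flash_clocks) + 6, where
 * flash_clocks is the clock count listed for execution from code flash. */
uint32_t rl78_ram_fetch_clocks_max(uint32_t flash_clocks)
{
    return 4u * flash_clocks + 6u;
}
```

    Under that reading, a 1-clock instruction could take up to 10 clocks from RAM and a 2-clock instruction up to 14, so for short instructions the worst-case slowdown is larger than a flat factor of four.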

  • Hello Fragero.

    First of all, thank you for your kind explanation.


    On almost all devices whose flash supports XIP, code runs dramatically faster from RAM.

    This is because RAM can be read faster than flash.

    From this point of view, I do not understand why execution from RAM is four times slower than from flash.

    I think there must be a reason, such as a difference in the bus clock used for flash and RAM, or a limitation of the RL78 RAM.

  • I also assume that this is caused by the bus clock and additional bus interfaces. But as I explained, on the RL78 code is executed from flash in normal operation and not from RAM. Therefore slower code execution from RAM is not a disadvantage.

  • I think slow RAM execution is an obvious disadvantage.
    As you know, when writing non-volatile data to flash or performing a self-update, the code has to run from RAM.

    It is clearly a disadvantage that RAM execution is four times slower than flash execution, and this is a point to consider when designing a system.

    Anyway, thank you Fragero for checking that code execution is slow in RAM.

    Renesas, if you know the root cause, please reply.

  • Just to add to what Fragero has already mentioned:

    The optimization of instruction fetch from code flash has to do with how the internal bus was designed into the CPU core.  As pointed out, there is little need to execute from RAM on RL78, only when programming code flash (note that no RAM execution is necessary to program / erase the data flash).  Typically this scenario is not time-critical, as the device must wait for the flash operation to complete.  Note that it is not even possible to fetch instructions from the data flash.

    Many developers coming from other devices that have generic bus access to all memories are confused by this concept, since it is completely backwards from what they are used to.  They don't realize that RL78 has zero wait-state instruction fetch from code flash - it doesn't get any faster than that.

    Some of the new RL78 devices support dual-bank code flash to allow for OTA programming, so they can execute from one bank while programming the other.

    If you have a time-critical operation that must for some strange reason run while programming the code-flash, perhaps another solution would be warranted.
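
    To put the "not time-critical" point in numbers, the sketch below estimates how much extra CPU time the RAM fetch penalty adds to a routine and what share of the whole flash operation that is. It assumes the remark is read as 4 × n + 6 clocks per instruction, so a routine of `instructions` instructions totalling `flash_clocks` clocks gains at most 3 × flash_clocks + 6 × instructions clocks. The function names and every number in the usage note are hypothetical; real flash erase and write times must come from the device data sheet.

```c
#include <stdint.h>

/* Extra CPU time in microseconds (rounded up) when a routine runs from RAM
 * instead of code flash. Assumption: worst case of 4 * n + 6 clocks per
 * instruction, i.e. an added 3 * flash_clocks + 6 * instructions clocks. */
uint32_t ram_penalty_us(uint32_t flash_clocks, uint32_t instructions,
                        uint32_t fclk_mhz)
{
    if (fclk_mhz == 0u) {
        return 0u; /* no valid clock given; nothing to estimate */
    }
    uint64_t extra_clocks = 3ull * flash_clocks + 6ull * instructions;
    return (uint32_t)((extra_clocks + fclk_mhz - 1u) / fclk_mhz);
}

/* Share of the total operation time, in percent (rounded down), that is
 * due to the RAM fetch penalty rather than waiting for the flash. */
uint32_t ram_penalty_percent(uint32_t penalty_us, uint32_t flash_wait_us)
{
    uint64_t total = (uint64_t)penalty_us + flash_wait_us;
    if (total == 0u) {
        return 0u;
    }
    return (uint32_t)((100ull * penalty_us) / total);
}
```

    For example, with made-up values of a 50-instruction polling routine costing 100 clocks from flash, a 32 MHz CPU clock, and a 1000 µs flash operation, `ram_penalty_us(100, 50, 32)` gives 19 and `ram_penalty_percent(19, 1000)` gives 1, so the penalty would be a small share of the total.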

  • Hello JimB

    Thank you for your kind explanation.
    Your reply helped me a lot.

    Are the F23 and F24 not among the RL78 dual-bank products you mentioned?

    I ask because I have never seen any mention of dual bank or RWW in the F23/F24 specifications.

    Thank you.

  • I believe only the RL78/I1C device (non-automotive 512KB for meter applications) has the dual-bank flash feature.  Unfortunately this has not been incorporated into the automotive devices yet.

    Perhaps the coming 512KB F25 auto device would have this feature, but that is just a guess since it is not yet released and I don't have access to such early information.

    One other point to note about the F23/F24 devices, for some very strange reason the RFD T02 code-flash programming does not support the feature to deactivate all interrupt vectors and use a single common ISR located in RAM for all interrupts (for the Gen1 X1x devices this was the FSL_ChangeInterruptTable function).  The somewhat similar flash technology RL78 Gen-2 X2x non-auto devices do support this with the T01 version of RFD (with function R_RFD_ChangeInterruptVector).

    This means that it is impossible to service interrupts during code-flash erase / programming operations with the F2x devices.  I cannot imagine why this feature is missing, but I would certainly want that ability if I were writing a boot-loader or needed to otherwise program the code-flash.

  • JimB

    Your kind reply and opinion helped me a lot.
    I totally agree with you.
    Chips that support dual banks should also support interrupts during flash writing.
    Otherwise, dual bank is meaningless.
    Of course, as you said, it is a pity that the vector table is fixed in F2x.

    Thank you again.