Hello,
We have a Linux board based on the RZ/G2UL chip, with a Yocto-based project.
We have an issue when we use the "reboot" command.
We don't have this issue when we use the "poweroff" command.
Here is the stack trace:
[ 44.444490] systemd-shutdown[1]: Using hardware watchdog 'Renesas RZ/G2L WDT Watchdog', version 0, device /dev/watchdog0
[ 44.455615] systemd-shutdown[1]: Watchdog running with a timeout of 30s.
[ 44.507221] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 44.515544] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[ 44.536649] systemd-journald[175]: Received SIGTERM from PID 1 (systemd-shutdow).
[ 44.557369] audit: type=1335 audit(1719932941.989:7): pid=175 uid=0 auid=4294967295 tty=(none) ses=4294967295 comm="systemd-journal" exe="/lib/systemd/systemd-journald" nl-mcgrp=1 op=disconnect res=1
[ 44.588650] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[ 44.607848] systemd-shutdown[1]: Unmounting file systems.
[ 44.615222] systemd-shutdown[1]: All filesystems unmounted.
[ 44.620951] systemd-shutdown[1]: Deactivating swaps.
[ 44.626096] systemd-shutdown[1]: All swaps deactivated.
[ 44.631393] systemd-shutdown[1]: Detaching loop devices.
[ 44.640910] systemd-shutdown[1]: All loop devices detached.
[ 44.646668] systemd-shutdown[1]: Stopping MD devices.
[ 44.652234] systemd-shutdown[1]: All MD devices stopped.
[ 44.657591] systemd-shutdown[1]: Detaching DM devices.
[ 44.663025] systemd-shutdown[1]: All DM devices detached.
[ 44.668457] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[ 44.682536] systemd-shutdown[1]: Syncing filesystems and block devices.
[ 44.690719] systemd-shutdown[1]: Rebooting.
[ 44.724956] reboot: Restarting system
[ 44.728946] ------------[ cut here ]------------
[ 44.733552] No atomic I2C transfer handler for 'i2c-0'
[ 44.738722] WARNING: CPU: 0 PID: 1 at drivers/i2c/i2c-core.h:40 i2c_smbus_xfer+0x108/0x120
[ 44.746953] Modules linked in: moal(O) mlan(O)
[ 44.751396] CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G O 5.10.175-cip29-yocto-standard #1
[ 44.761182] Hardware name: Development board based on r9a07g043u11 (DT)
[ 44.767771] pstate: 60400085 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[ 44.773753] pc : i2c_smbus_xfer+0x108/0x120
[ 44.777918] lr : i2c_smbus_xfer+0x108/0x120
[ 44.782081] sp : ffff800011073bb0
[ 44.785380] x29: ffff800011073bb0 x28: ffff000009450000
[ 44.790674] x27: 0000000000000000 x26: 0000000000000000
[ 44.795966] x25: 0000000000000013 x24: 0000000000000000
[ 44.801259] x23: 0000000000000000 x22: 0000000000000058
[ 44.806552] x21: ffff800011073c16 x20: 0000000000000002
[ 44.811844] x19: ffff00000a3220c8 x18: 0000000000000030
[ 44.817136] x17: 0000000000000000 x16: 0000000000000000
[ 44.822428] x15: ffffffffffffffff x14: 0720072007200720
[ 44.827720] x13: ffff800010f11a18 x12: 00000000000004d1
[ 44.833012] x11: 000000000000019b x10: ffff800010f69a18
[ 44.838305] x9 : 00000000fffff000 x8 : ffff800010f11a18
[ 44.843597] x7 : ffff800010f69a18 x6 : 0000000000000000
[ 44.848890] x5 : 000000000000bff4 x4 : 0000000000000000
[ 44.854182] x3 : 00000000ffffffff x2 : 0000000000000000
[ 44.859475] x1 : 0000000000000000 x0 : ffff000009450000
[ 44.864768] Call trace:
[ 44.867205] i2c_smbus_xfer+0x108/0x120
[ 44.871026] i2c_smbus_write_byte_data+0x40/0x70
[ 44.875628] da9062_wdt_restart+0x30/0x90
[ 44.879623] watchdog_restart_notifier+0x1c/0x3c
[ 44.884226] atomic_notifier_call_chain+0x60/0x90
[ 44.888911] do_kernel_restart+0x24/0x30
[ 44.892820] machine_restart+0x50/0x54
[ 44.896554] __do_sys_reboot+0x200/0x260
[ 44.900460] __arm64_sys_reboot+0x24/0x30
[ 44.904454] el0_svc_common.constprop.0+0x78/0x1c4
[ 44.909225] do_el0_svc+0x24/0x9c
[ 44.912528] el0_svc+0x14/0x20
[ 44.915570] el0_sync_handler+0xb0/0xb4
[ 44.919389] el0_sync+0x180/0x1c0
[ 44.922689] ---[ end trace e98ed21c2fb1dfc6 ]---
[ 44.927654] ------------[ cut here ]------------
[ 44.932263] WARNING: CPU: 0 PID: 1 at kernel/rcu/tree_plugin.h:297 rcu_note_context_switch+0x44/0x340
[ 44.941444] Modules linked in: moal(O) mlan(O)
[ 44.945879] CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G W O 5.10.175-cip29-yocto-standard #1
[ 44.955665] Hardware name: Development board based on r9a07g043u11 (DT)
[ 44.962252] pstate: 20400085 (nzCv daIf +PAN -UAO -TCO BTYPE=--)
[ 44.968234] pc : rcu_note_context_switch+0x44/0x340
[ 44.973093] lr : __schedule+0xb4/0x6e0
[ 44.976824] sp : ffff800011073840
[ 44.980123] x29: ffff800011073840 x28: 0000000000000002
[ 44.985416] x27: 0000000000000000 x26: ffff00000a3220c8
[ 44.990709] x25: ffff000009450000 x24: ffff800010fe8000
[ 44.996001] x23: 0000000000000000 x22: ffff000009450000
[ 45.001294] x21: ffff000009450000 x20: ffff800010de29c0
[ 45.006586] x19: ffff00003fe109c0 x18: 0000000000000030
[ 45.011879] x17: 0000000000000000 x16: 0000000000000000
[ 45.017171] x15: ffffffffffffffff x14: 00000000000001f1
[ 45.022463] x13: 0000000000000000 x12: 0000000000000000
[ 45.027755] x11: 0000000000000000 x10: 0000000000000940
[ 45.033048] x9 : ffff8000110737c0 x8 : 0000000000000001
[ 45.038341] x7 : ffff00003fe0aba8 x6 : 1200104000000000
[ 45.043633] x5 : 1200104000000000 x4 : 0000000000000000
[ 45.048925] x3 : 0000000000000000 x2 : ffff800010ef8b50
[ 45.054218] x1 : ffff800010a8cec0 x0 : 0000000000000001
[ 45.059510] Call trace:
[ 45.061948] rcu_note_context_switch+0x44/0x340
[ 45.066459] __schedule+0xb4/0x6e0
[ 45.069846] schedule+0x70/0x104
[ 45.073062] schedule_timeout+0x80/0xf0
[ 45.076882] wait_for_completion_timeout+0x80/0x10c
[ 45.081740] riic_xfer+0xe4/0x160
[ 45.085046] __i2c_transfer+0x160/0x4ec
[ 45.088865] i2c_smbus_xfer_emulated+0xe0/0x600
[ 45.093376] __i2c_smbus_xfer+0x108/0x210
[ 45.097368] i2c_smbus_xfer+0x7c/0x120
[ 45.101101] i2c_smbus_write_byte_data+0x40/0x70
[ 45.105701] da9062_wdt_restart+0x30/0x90
[ 45.109695] watchdog_restart_notifier+0x1c/0x3c
[ 45.114293] atomic_notifier_call_chain+0x60/0x90
[ 45.118978] do_kernel_restart+0x24/0x30
[ 45.122885] machine_restart+0x50/0x54
[ 45.126618] __do_sys_reboot+0x200/0x260
[ 45.130524] __arm64_sys_reboot+0x24/0x30
[ 45.134517] el0_svc_common.constprop.0+0x78/0x1c4
[ 45.139288] do_el0_svc+0x24/0x9c
[ 45.142589] el0_svc+0x14/0x20
[ 45.145630] el0_sync_handler+0xb0/0xb4
[ 45.149450] el0_sync+0x180/0x1c0
[ 45.152749] ---[ end trace e98ed21c2fb1dfc7 ]---
[ 45.157487] da9062 0-0058: Failed to shutdown (err = -6)
[ 45.662853] Reboot failed -- System halted
Callstack:
static int da9062_wdt_restart(struct watchdog_device *wdd, unsigned long action,
                              void *data)
{
    struct da9062_watchdog *wdt = watchdog_get_drvdata(wdd);
    struct i2c_client *client = to_i2c_client(wdt->hw->dev);
    int ret;

    /* Don't use regmap because it is not atomic safe */
    ret = i2c_smbus_write_byte_data(client, DA9062AA_CONTROL_F,
                                    DA9062AA_SHUTDOWN_MASK);
    if (ret < 0)
        dev_alert(wdt->hw->dev, "Failed to shutdown (err = %d)\n", ret);

    /* wait for reset to assert... */
    mdelay(500);

    return ret;
}

s32 i2c_smbus_write_byte_data(const struct i2c_client *client, u8 command,
                              u8 value)
{
    union i2c_smbus_data data;

    data.byte = value;
    return i2c_smbus_xfer(client->adapter, client->addr, client->flags,
                          I2C_SMBUS_WRITE, command,
                          I2C_SMBUS_BYTE_DATA, &data);
}

/**
 * i2c_smbus_xfer - execute SMBus protocol operations
 * @adapter: Handle to I2C bus
 * @addr: Address of SMBus slave on that bus
 * @flags: I2C_CLIENT_* flags (usually zero or I2C_CLIENT_PEC)
 * @read_write: I2C_SMBUS_READ or I2C_SMBUS_WRITE
 * @command: Byte interpreted by slave, for protocols which use such bytes
 * @protocol: SMBus protocol operation to execute, such as I2C_SMBUS_PROC_CALL
 * @data: Data to be read or written
 *
 * This executes an SMBus protocol operation, and returns a negative
 * errno code else zero on success.
 */
s32 i2c_smbus_xfer(struct i2c_adapter *adapter, u16 addr,
                   unsigned short flags, char read_write,
                   u8 command, int protocol, union i2c_smbus_data *data)
{
    s32 res;

    res = __i2c_lock_bus_helper(adapter);
    if (res)
        return res;

    res = __i2c_smbus_xfer(adapter, addr, flags, read_write,
                           command, protocol, data);
    i2c_unlock_bus(adapter, I2C_LOCK_SEGMENT);

    return res;
}
EXPORT_SYMBOL(i2c_smbus_xfer);
As I understand it, the function 'da9062_wdt_restart()' is registered in a list of functions that the kernel calls on restart (from do_kernel_restart()).
In this stack trace, I don't know what the problem is:
and/or:
Does anyone know what is going wrong here?
What is error -6? ENXIO? Where does it come from?
I found that ENXIO comes from the i2c-riic.c driver; it is set when a NACK is received on the I2C bus.
DTS:
i2c0: i2c@10058000 {
    #address-cells = <1>;
    #size-cells = <0>;
    compatible = "renesas,riic-r9a07g043", "renesas,riic-rz";
    reg = <0 0x10058000 0 0x400>;
    interrupts = <GIC_SPI 350 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 348 IRQ_TYPE_EDGE_RISING>,
                 <GIC_SPI 349 IRQ_TYPE_EDGE_RISING>,
                 <GIC_SPI 352 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 353 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 351 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 354 IRQ_TYPE_LEVEL_HIGH>,
                 <GIC_SPI 355 IRQ_TYPE_LEVEL_HIGH>;
    interrupt-names = "tei", "ri", "ti", "spi", "sti", "naki", "ali", "tmoi";
    clocks = <&cpg CPG_MOD R9A07G043_I2C0_PCLK>;
    clock-frequency = <100000>;
    resets = <&cpg R9A07G043_I2C0_MRST>;
    power-domains = <&cpg>;
    status = "disabled";
};

i2c-riic.c:
static struct riic_irq_desc riic_irqs[] = {
    { .res_num = 0, .isr = riic_tend_isr, .name = "riic-tend" },
    { .res_num = 1, .isr = riic_rdrf_isr, .name = "riic-rdrf" },
    { .res_num = 2, .isr = riic_tdre_isr, .name = "riic-tdre" },
    { .res_num = 3, .isr = riic_stop_isr, .name = "riic-stop" },
    { .res_num = 5, .isr = riic_tend_isr, .name = "riic-nack" },
};

static int riic_i2c_probe(struct platform_device *pdev)
{
    struct riic_dev *riic;
    struct i2c_adapter *adap;
    struct resource *res;
    struct i2c_timings i2c_t;
    struct reset_control *rstc;
    int i, ret;
    ...
    for (i = 0; i < ARRAY_SIZE(riic_irqs); i++) {
        res = platform_get_resource(pdev, IORESOURCE_IRQ, riic_irqs[i].res_num);
        if (!res)
            return -ENODEV;

        ret = devm_request_irq(&pdev->dev, res->start, riic_irqs[i].isr,
                               0, riic_irqs[i].name, riic);
        if (ret) {
            dev_err(&pdev->dev, "failed to request irq %s\n", riic_irqs[i].name);
            return ret;
        }
    }
    ...
    return 0;

out:
    pm_runtime_disable(&pdev->dev);
    return ret;
}

static irqreturn_t riic_tend_isr(int irq, void *data)
{
    struct riic_dev *riic = data;

    if (readb(riic->base + RIIC_ICSR2) & ICSR2_NACKF) {
        /* We got a NACKIE */
        readb(riic->base + RIIC_ICDRR); /* dummy read */
        riic_clear_set_bit(riic, ICSR2_NACKF, 0, RIIC_ICSR2);
        riic->err = -ENXIO;
    } else if (riic->bytes_left) {
        return IRQ_NONE;
    }

    if (riic->is_last || riic->err) {
        riic_clear_set_bit(riic, ICIER_TEIE, ICIER_SPIE, RIIC_ICIER);
        writeb(ICCR2_SP, riic->base + RIIC_ICCR2);
    } else {
        /* Transfer is complete, but do not send STOP */
        riic_clear_set_bit(riic, ICIER_TEIE, 0, RIIC_ICIER);
        complete(&riic->msg_done);
    }

    return IRQ_HANDLED;
}
The function 'riic_tend_isr()' can be called for interrupts 0 and 5, which are mapped like this in the DTS:
- 0 = <GIC_SPI 350 IRQ_TYPE_LEVEL_HIGH>
- 5 = <GIC_SPI 351 IRQ_TYPE_LEVEL_HIGH>
p491:
|------------------|--------------------|--------------|----------------|----------------|
| Interrupt Source | Cause of Interrupt | Interrupt ID | SGI,PPI,SPI No | Interrupt Type |
|------------------|--------------------|--------------|----------------|----------------|
| I2C (ch0)        | INTRIICTEI0        | 382          | SPI 350        | Level          |
| I2C (ch0)        | INTRIICNAKI0       | 383          | SPI 351        | Level          |
Note that if I remove the DA9062 from the DTS:
1. The reboot works.
2. The command 'i2ctransfer 0 w2@0x58 0x13 0x02' successfully shuts down the board. This command does exactly what the driver does on shutdown.
3. But this is not a solution: we need to keep the DA9062 in the DTS and understand the failure.
So at this point, I just don't understand why we receive a NACK when the driver tries to tell the PMIC to shut down.
Is it possible that something has been shut down or de-initialised before the "shutdown" command is sent to the PMIC (I2C, clock, something else)? This action is the last step of the "reboot" sequence.
An issue might be that when you do a shutdown, the kernel shuts off interrupts. So if the I2C driver needs to use interrupts, it will not work correctly.
But if you just want to reboot the device, you do not need the PMIC. By default in the RZ BSP, the reboot command uses the watchdog timer to instantly reset the device. This WDT reset is the same as a PRESET reset, so writing to the PMIC is not needed.
Hello,
Thanks for the response.
I think the interrupts are still enabled, because error -6 (ENXIO) can only come from the function 'riic_tend_isr()' of this driver, and that function is called when we receive the NACK interrupt. :/
For the second part: I configured the PMIC in the DTS because we want its watchdog feature, but we don't necessarily need the reboot to go through the PMIC shutdown command. That step is not something I added; the da9062 driver does it automatically.
Also, note that systemd configures the watchdog timeout (with RebootWatchdogSec) before reaching this error, so in any case the board can reboot, but I would like to get rid of this "crash".
Also, can you tell me where this part is implemented:
By default in the RZ BSP, the reboot command uses the watchdog timer to instantly reset the device. This WDT reset is the same as a PRESET reset. So writing to the PMIC is not needed.
Are you talking about what systemd does (using RebootWatchdogSec), or something else, like in the kernel?
If it's on the kernel side, it's possible that adding the da9062 driver breaks it, because the DA9062 restart function has a higher priority (128) than the Renesas one.
But to check this, I need to know what we are talking about.
Dragontry said: Also, can you tell me where is this part:
Have a look in the WDT driver for RZ/G2L
https://github.com/renesas-rz/rz_linux-cip/blob/rz-5.10-cip41/drivers/watchdog/rzg2l_wdt.c
It registers a 'restart' handler:
static const struct watchdog_ops rzg2l_wdt_ops = {
    .owner = THIS_MODULE,
    .start = rzg2l_wdt_start,
    .stop = rzg2l_wdt_stop,
    .ping = rzg2l_wdt_ping,
    .set_timeout = rzg2l_wdt_set_timeout,
    .restart = rzg2l_wdt_restart,
};
You can see that function rzg2l_wdt_restart() forces an immediate RESET of the device.
static int rzg2l_wdt_restart(struct watchdog_device *wdev,
                             unsigned long action, void *data)
{
    struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);

    clk_prepare_enable(priv->pclk);
    clk_prepare_enable(priv->osc_clk);

    if (priv->devtype == WDT_RZG2L) {
        /* Generate Reset (WDTRSTB) Signal on parity error */
        rzg2l_wdt_write(priv, 0, PECR);

        /* Force parity error */
        rzg2l_wdt_write(priv, PEEN_FORCE, PEEN);
    } else {
Good!
So on my board, because we enable the PMIC driver in the DTS, the DA9062 WDT driver is pulled in. There are then two watchdog drivers that register a "restart" function, and the DA9062 one probably takes priority.
So I see two options:
1. Lower the priority of the DA9062 driver's restart function to the minimum (0: Restart handler of last resort, with limited restart capabilities)
2. Raise the priority of the Renesas driver's restart function to the maximum (255: Highest priority restart handler, will preempt all other restart handlers)
The DA9062 driver sets its priority to 128 in its probe function:
watchdog_set_restart_priority(&wdt->wdtdev, 128);
I agree. I think that will fix your problem.
Thanks,
I confirm that giving priority to the Renesas restart function solved the problem.
But if someone from Renesas comes by, I would be happy to understand why the PMIC DA9062 restart function does not work in this context
(maybe some peripheral has already been stopped?)