KSZ9131 Link up, no connection

Hello everyone,

We are using the RZ/G2L for our product, but we have a problem with the Ethernet connection. As the PHY, we are using the Microchip KSZ9131.

We have set up our device tree according to this documentation:
https://renesas.info/wiki/RZ-G/RZG_DeviceTree#Ethernet

&eth1 {
    pinctrl-names = "default";
    pinctrl-0 = <&eth1_rgmii_pins>;
    phy-handle = <&phy1>;
    phy-mode = "rgmii-id";

    status = "okay";

    phy1: phy@7 {
        reg = <7>;
        compatible = "ethernet-phy-ieee802.3-c22";
        interrupt-parent = <&pinctrl>;
        interrupts = <RZG2L_GPIO(32, 0) IRQ_TYPE_LEVEL_LOW>;
    };

};

&pinctrl {
    eth1_rgmii_pins: eth1  {
        pinmux =    <RZG2L_PORT_PINMUX(37, 0, 1)>, // ET1_MDC
                    <RZG2L_PORT_PINMUX(37, 1, 1)>, // ET1_MDIO
                    <RZG2L_PORT_PINMUX(29, 0, 1)>, // ET1_TXC
                    <RZG2L_PORT_PINMUX(29, 1, 1)>, // ET1_TX_CTL
                    <RZG2L_PORT_PINMUX(30, 0, 1)>, // ET1_TXD0
                    <RZG2L_PORT_PINMUX(30, 1, 1)>, // ET1_TXD1
                    <RZG2L_PORT_PINMUX(31, 0, 1)>, // ET1_TXD2
                    <RZG2L_PORT_PINMUX(31, 1, 1)>, // ET1_TXD3
                    <RZG2L_PORT_PINMUX(33, 1, 1)>, // ET1_RXC
                    <RZG2L_PORT_PINMUX(34, 0, 1)>, // ET1_RX_CTL
                    <RZG2L_PORT_PINMUX(34, 1, 1)>, // ET1_RXD0
                    <RZG2L_PORT_PINMUX(35, 0, 1)>, // ET1_RXD1
                    <RZG2L_PORT_PINMUX(35, 1, 1)>, // ET1_RXD2
                    <RZG2L_PORT_PINMUX(36, 0, 1)>; // ET1_RXD3
    };
};


In dmesg the phy gets detected:
[  120.049083] Microchip KSZ9131 Gigabit PHY 11c30000.ethernet-ffffffff:07: attached PHY driver [Microchip KSZ9131 Gigabit PHY] (mii_bus:phy_addr=11c30000.ethernet-ffffffff:07, irq=196)
[  122.900711] ravb 11c30000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[  122.900835] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

and, as far as I understand, the ethtool output also looks fine.

ethtool eth0
Settings for eth0:
        Supported ports: [ TP    MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: slave
        Port: Twisted Pair
        PHYAD: 7
        Transceiver: external
        MDI-X: Unknown
        Supports Wake-on: g
        Wake-on: d
        Current message level: 0x000000cc (204)
                               link timer rx_err tx_err
        Link detected: yes

However, we do not get any connection. DHCPCD fails to get an IP address, and setting IP address and gateway manually also does not allow us to ping or access other devices in the network.

Any help would be greatly appreciated.

Best
Christian



  • Try and delete these lines from the phy node

            interrupt-parent = <&pinctrl>;

            interrupts = <RZG2L_GPIO(32, 0) IRQ_TYPE_LEVEL_LOW>;

    Last week, we found that was causing the issue.

    The interrupt line is not needed or used. The kernel will poll the PHY using MDIO. But, if the interrupt line is in the device tree, then the PHY will not be polled and Ethernet might not work correctly.

    Chris
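One quick way to see whether the kernel is polling the PHY or waiting for an interrupt is the `irq=` field in the "attached PHY driver" log line. The sketch below pulls that field out of the dmesg line posted at the top of the thread. The helper name `phy_irq_mode` is made up for illustration. As far as I know, the kernel prints `irq=POLL` when it polls the PHY over MDIO and a number when an interrupt line is configured.

```shell
# Extract the irq= field from an "attached PHY driver" kernel log line.
# phy_irq_mode is a hypothetical helper name used only for this sketch.
phy_irq_mode() {
    printf '%s\n' "$1" | sed -n 's/.*irq=\([^)]*\)).*/\1/p'
}

# The log line quoted in the first post of this thread.
line='[  120.049083] Microchip KSZ9131 Gigabit PHY 11c30000.ethernet-ffffffff:07: attached PHY driver [Microchip KSZ9131 Gigabit PHY] (mii_bus:phy_addr=11c30000.ethernet-ffffffff:07, irq=196)'
mode=$(phy_irq_mode "$line")

case "$mode" in
    POLL) echo "PHY is polled over MDIO" ;;
    "")   echo "no irq= field found" ;;
    *)    echo "PHY uses interrupt $mode" ;;
esac
```

On the board, the same function could be fed with `dmesg | grep 'attached PHY driver'` after a rebuild to confirm that the device tree change actually took effect.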

  • Unfortunately, even after commenting out these lines, we are still not able to get a working Ethernet connection. As we are using a system on module, we have tested another variant with a different SoC, so we can confirm the PHY itself is working correctly. No problem there from 10 Mbit/s to 1 Gbit/s.

    We have also tried lowering the link speeds in the switch in order to test if the 1GBit/s is the problem, but even at just 10MBit/s we are having the same issue.

    Any other idea?

    Weirdly enough, the packets count is actually going up, no errors are reported.
    eth0: flags=-28605<UP,BROADCAST,RUNNING,MULTICAST,DYNAMIC>  mtu 1500
            inet 192.168.101.201  netmask 255.255.255.0  broadcast 192.168.101.255
            inet6 fe80::84f4:e3ff:fe35:5330  prefixlen 64  scopeid 0x20<link>
            ether 86:f4:e3:35:53:30  txqueuelen 1000  (Ethernet)
            RX packets 569  bytes 57160 (55.8 KiB)
            RX errors 0  dropped 8  overruns 0  frame 0
            TX packets 117  bytes 26738 (26.1 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
            device interrupt 164

  • We have done some further investigation. Setting a fixed IP and trying to ping the board from another machine, we can see it receiving the ARP requests and also trying to answer them. However, we cannot see the ARP replies using Wireshark on the machine that sent the ping.
    This leads me to believe, that there is a problem with the phy's tx configuration.

    Does anyone have any ideas?

  • This leads me to believe, that there is a problem with the phy's tx configuration.

    That sounds the same as the issue I saw with someone else. The Ethernet MAC driver was not being configured correctly because the messages from the PHY were not being read.

    Again, the issue was the interrupt line in the Device Tree (the interrupt line was not used in software, so the PHY was never polled).

    But, if you have already tried my suggestion and it does not work, then it might be something else.

    However, looking at the messages to/from the PHY chip over MDIO is the first place to start.

  • I can see a periodic communication between the PHY and MAC via MDIO. As far as I understand, it also seems to be all correct. The correct PHY is addressed and the PHY is responding to the read commands.
    If my decoding is correct, the MAC reads registers 0 and 1 of the PHY and the responses are as expected according to the PHY's datasheet.

    Edit:
    Also I forgot to mention: I have checked the generated .dtb file via device-tree-compiler and the interrupts are correctly removed.
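For anyone decoding the same MDIO reads: in the standard clause 22 register map, register 1 (BMSR) carries link status in bit 2 and auto-negotiation complete in bit 5. The sketch below shows how to pull those bits out of a captured value with shell arithmetic. The value `0x796D` is only an illustrative placeholder and was not captured in this thread, and `reg_bit` is a hypothetical helper name.

```shell
# Return the value (0 or 1) of bit $2 in register value $1.
# reg_bit is a hypothetical helper name used only for this sketch.
reg_bit() {
    echo $(( ($1 >> $2) & 1 ))
}

bmsr=0x796D   # placeholder BMSR value, NOT taken from the thread

echo "link status (bit 2):      $(reg_bit "$bmsr" 2)"
echo "autoneg complete (bit 5): $(reg_bit "$bmsr" 5)"
```

Note that the link status bit latches low, so a single read after a link drop can still show 0 even though the link is back up.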

  • So after borrowing a proper high-speed scope, I can now measure the RGMII signals. On the RX side everything looks correct, but I do not measure any signal on the TXC line. The data and control lines show activity, so I am not sure why this is happening.

    Does anyone have any ideas?

  • If my decoding is correct, the MAC reads registers 0 and 1 of the PHY and the responses are as expected according to the PHY's datasheet.

    Just FYI, the Renesas MAC driver just sends/receives raw MDIO packets. It is the upper layers in the kernel that decide what should be sent.

    Have a look at function ravb_set_rate_gbeth()

    Put some printk statements in there to see if/when it is being called, and what speed it is being set to.

    What should normally happen is that after the PHY establishes a link and picks a speed, the kernel will call the MAC driver and set the speed to match. You want to make sure that is happening.

  • Hi,

    If you measure no clock on the TXC line, you can check the RCSI bit (bit 10) of the DMAC Status Register (CSR). This bit indicates the status of the 250 MHz reference clock, which is used to generate the TX clock. If this bit is 0, ref_clk is being output to the MAC layer.

    Also, please check the ERCS bit (bit 18) of the DMAC Mode Register (CCC). The stop function for the 250 MHz reference clock (clk_miitx_gtx_refclk) should be disabled. If this bit is 0, it is okay.

  • Hello,

    the output for ETH0 DMAC Status Register:

    > devmem 0x11C3000C
    > 0x00000004

    and here for the DMAC Mode Register:

    > devmem 0x11C30000
    > 0x00000002

    So bit 10 and bit 18 are both 0.

    Yet still no signal on the txc line and again no problem reading signals on the data and ctl lines. Oddly enough, I measure a dc voltage (around 1.4V), on the txc line, which I can not explain myself.
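The bit reading above can be double-checked with plain shell arithmetic instead of by eye. This is a minimal sketch using the two values posted in this reply. The bit positions (RCSI = bit 10 of CSR, ERCS = bit 18 of CCC) are the ones given earlier in the thread, and `reg_bit` is a hypothetical helper name.

```shell
# Return the value (0 or 1) of bit $2 in register value $1.
# reg_bit is a hypothetical helper name used only for this sketch.
reg_bit() {
    echo $(( ($1 >> $2) & 1 ))
}

csr=0x00000004   # output of: devmem 0x11C3000C (as posted above)
ccc=0x00000002   # output of: devmem 0x11C30000 (as posted above)

echo "CSR.RCSI (bit 10): $(reg_bit "$csr" 10)"
echo "CCC.ERCS (bit 18): $(reg_bit "$ccc" 18)"
```

On the target, the literals could be replaced with `$(devmem 0x11C3000C)` and `$(devmem 0x11C30000)` to re-check after every configuration change.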

  • You should ping from your board to another board, then check the ARP packet on the destination board to see what data it receives, such as the MAC and IP addresses.

  • None. I do not get any outgoing traffic. I mirrored the port the motherboard is connected to and logged all traffic via Wireshark. I am able to detect the ARP requests when I ping from another machine, I can see the ARP table entries on the RZ/G2L after pinging, I can see the ARP response using tcpdump on the RZ/G2L, and I can see the RX and TX packet counts increasing using ifconfig. Also, I see the data changing on the TX data lines, but as stated before, no TX clock.

    But I do not see any outgoing packets on the network.

    I have now tested 3 different RZ/G2L modules, all show the same behavior, so I am quite certain it's a configuration issue, I just do not know where.

  • Just to be sure, have you already checked the TXC pin register configuration? I mean PMC2D, PFC2D.

  • Yes I have. It also looks good. Here is the devmem for the two registers.

    PMC2D
    > devmem 0x1103022D 8
    > 0x03

    If I understand the datasheet correctly, this is also correct, as both P29_0 and P29_1 are in Peripheral Function Mode.

    PFC2D
    > devmem 0x110304B4
    > 0x00000011

    According to the pin function list, function 1 is correct for P29_0, as it is ET1_TXC.

    From my understanding, this should also be correct?
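To make the PMC/PFC check reproducible for other ports, here is a rough sketch of the address arithmetic and field decoding. It assumes a pin controller base of `0x11030000`, one PMC byte at offset `0x200` and one 32-bit PFC word at offset `0x400` per port (index `0x10 + port`), and a 3-bit function field on a 4-bit stride per pin. Those assumptions reproduce the two addresses used above, but please verify them against the hardware manual. `pfc_field` is a hypothetical helper name.

```shell
PINCTRL_BASE=0x11030000   # assumed base, consistent with the addresses used above
port=29

# Assumed layout: register index is 0x10 + port number.
idx=$(( 0x10 + port ))
pmc_addr=$(printf '0x%08X' $(( $PINCTRL_BASE + 0x200 + idx )))
pfc_addr=$(printf '0x%08X' $(( $PINCTRL_BASE + 0x400 + idx * 4 )))
echo "PMC address: $pmc_addr"
echo "PFC address: $pfc_addr"

# Function-select field for pin $2 in PFC value $1 (assumed 4-bit stride, 3-bit field).
pfc_field() {
    echo $(( ($1 >> ($2 * 4)) & 0x7 ))
}

pmc=0x03         # output of: devmem 0x1103022D 8 (as posted above)
pfc=0x00000011   # output of: devmem 0x110304B4   (as posted above)

for pin in 0 1; do
    echo "P${port}_${pin}: PMC=$(( ($pmc >> pin) & 1 )) PFC=$(pfc_field "$pfc" "$pin")"
done
```

With the posted values, both pins of port 29 come out as peripheral mode with function field 1, which matches the reading in the post.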

  • Hi,

    Could you check the following CPG registers on your side?

    - CPG_PL6_ETH_SSEL: This register selects the source clock for the Ethernet 250 MHz clock.

    - If bit 0 is 0, please check in CPG_SAMPLL6_STBY whether PLL6 is in the active or the reset state: bit 0 (RESETB) should be 1.

    - If bit 0 is 1, please check in CPG_SAMPLL5_STBY whether PLL5 is in the active or the reset state: bit 0 (RESETB) should be 1.
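The decision in the list above can be written down as a small script, so the result does not depend on reading hex by eye. This is only a sketch: the register values below are placeholders and not measurements from this thread, the register addresses are deliberately left out because they are not given here, and `reg_bit` and `eth_pll_to_check` are hypothetical helper names.

```shell
# Return the value (0 or 1) of bit $2 in register value $1.
reg_bit() {
    echo $(( ($1 >> $2) & 1 ))
}

# Name of the PLL standby register to inspect for a given CPG_PL6_ETH_SSEL value.
eth_pll_to_check() {
    if [ "$(reg_bit "$1" 0)" -eq 0 ]; then
        echo CPG_SAMPLL6_STBY
    else
        echo CPG_SAMPLL5_STBY
    fi
}

ssel=0x00000000   # placeholder CPG_PL6_ETH_SSEL value, NOT measured
stby=0x00000001   # placeholder standby register value, NOT measured

echo "register to check: $(eth_pll_to_check "$ssel")"
echo "RESETB (bit 0):    $(reg_bit "$stby" 0) (should be 1)"
```

On the board, the two placeholders would be replaced with the `devmem` readings of the registers named above.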