On our proprietary board we have two Ethernet interfaces (one for the LAN and one for the WAN). After a random time, anywhere from minutes to days, the interfaces stop working (mainly the WAN one, which runs a Telnet server). When the interfaces crash, they don't even respond to ping.
I tried to debug the problem, but unfortunately e2 studio crashes after a while. What I have seen is that the IP helper threads keep running even when the interfaces don't respond to ping.
Now I have also enabled TraceX, hoping to learn something. What do you think I should focus on?
I confirm that isolating the network solved my problem, but the problem still exists when NetX is used on crowded networks.
Have you tried using Wireshark to see at what point the interface stops responding to pings? That might help narrow down your search.
Seemingly random crashes can often be attributed to stack or buffer overflows. Perhaps that would be the place to start with TraceX: see if you can spot any buffers or stacks that keep growing over time, as this could help identify the source of the problem.
Hopefully others in the forum will offer additional advice; this is always a tricky type of bug to chase down...
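One way to look for a stack that keeps growing, as suggested above, is to measure how much of each thread stack has never been touched. The following is a minimal sketch, not code from this thread. It assumes the stack was pre-filled with the byte 0xEF when the thread was created (the pattern ThreadX uses unless stack filling is disabled) and that the stack grows downward. The function name `stack_unused_bytes` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: unused stack bytes still hold the fill byte written at thread
   creation, and the stack grows downward, so untouched bytes sit at the
   low-address end of the stack buffer. */
#define STACK_FILL_BYTE 0xEFu

/* Counts the bytes, starting at the lowest address, that still hold the
   fill pattern. A result close to 0 suggests the stack is nearly full or
   has already overflowed. */
size_t stack_unused_bytes(const uint8_t *stack_start, size_t stack_size)
{
    size_t unused = 0;

    while (unused < stack_size && stack_start[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

On the target, the arguments would presumably come from the thread control block (in ThreadX, the `tx_thread_stack_start` and `tx_thread_stack_size` fields). Logging the result periodically for each thread shows whether any stack shrinks toward zero before the interface stops responding.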
I have done several tests. As you suggested, I also tried sniffing with Wireshark. When the port stops working it does not even respond to ARP requests (the attached file contains two ping tests, one to 192.168.1.33, which works, and one to 192.168.1.34, which is the port that is blocked in this case). I have also attached screenshots of the RTOS Resources windows.
Every now and then the Ethernet port starts working again for a while. From what I understand, the stack is not broken and the threads keep running. I have also attached the TraceX files from when it works and from when it crashes (but I can't make much sense of them).
Anyone have any ideas?
How are the packet pools configured? It sounds like you could be running out of packet space, which would cause the hangs. If you are handling the packets at the application level (i.e. not using one of the NetX applications), then it is up to the application to release received packets. If this is not done in a timely manner, or there are not enough packets available, the network will hang waiting for packet space to become available. It is also worth considering separate packet pools for the different applications using the network, so that one application cannot consume most of the packet space and starve other parts of the system.
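The starvation effect described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: it does not use the NetX API, and the type `pool_model_t`, the functions and the pool size of 4 are all hypothetical. It shows that a receiver which never releases its packets drains the pool, after which every further frame is dropped.

```c
#define POOL_SIZE 4u  /* hypothetical pool size, tiny on purpose */

typedef struct {
    unsigned free_count;      /* packets currently available */
    unsigned empty_requests;  /* allocations that found the pool empty */
} pool_model_t;

void pool_init(pool_model_t *p)
{
    p->free_count = POOL_SIZE;
    p->empty_requests = 0;
}

/* Returns 1 if a packet was allocated, 0 if the pool was empty. */
int pool_allocate(pool_model_t *p)
{
    if (p->free_count == 0) {
        p->empty_requests++;
        return 0;
    }
    p->free_count--;
    return 1;
}

void pool_release(pool_model_t *p)
{
    if (p->free_count < POOL_SIZE) {
        p->free_count++;
    }
}

/* Simulates `frames` incoming frames. When release_after_use is 0 the
   application keeps every packet it receives. Returns the number of
   frames dropped because no packet was available. */
unsigned simulate_receive(pool_model_t *p, unsigned frames, int release_after_use)
{
    unsigned dropped = 0;

    for (unsigned i = 0; i < frames; i++) {
        if (!pool_allocate(p)) {
            dropped++;
            continue;
        }
        if (release_after_use) {
            pool_release(p);
        }
    }
    return dropped;
}
```

With 10 frames and prompt release nothing is dropped. Without release, the first 4 frames consume the pool and the remaining 6 are dropped. Giving each application its own pool limits this kind of damage to the application that leaks.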
Use the below API to check the state of packet pools which may help determine if this is the problem.
Then I also have the statistics of the packet pool when the ethernet crashes.
WAN:
Total packets sent: 15627
Total bytes sent: 8752012
Send packets dropped: 1449
Total packets received: 2908634
Total bytes received: 7306566
Received packets dropped: 2853380
Received checksum errors: 0
Invalid packets: 9
Total fragments sent: 0
Total fragments received: 0
Total pool packets: 64
Free packets pool: 48
Empty pool request: 39
Empty pool suspensions: 0
Invalid packets release: 0
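To put the WAN figures above in proportion, the counters can be reduced to a drop ratio and a pool-exhaustion flag. The sketch below is self-contained and does not call NetX. The struct `net_stats_t` and both functions are hypothetical, and it is an assumption that on the target the fields would be filled from the NetX calls `nx_ip_info_get` and `nx_packet_pool_info_get`, whose output fields match the counters listed.

```c
#include <stdint.h>

/* Hypothetical container for the counters quoted in the post. */
typedef struct {
    uint32_t packets_received;
    uint32_t receive_dropped;
    uint32_t pool_total;
    uint32_t pool_free;
    uint32_t empty_pool_requests;
} net_stats_t;

/* Share of received packets that were dropped, in tenths of a percent
   (integer arithmetic, rounded down). Returns 0 if nothing was received. */
uint32_t receive_drop_permille(const net_stats_t *s)
{
    if (s->packets_received == 0) {
        return 0;
    }
    return (uint32_t)(((uint64_t)s->receive_dropped * 1000u) / s->packets_received);
}

/* Returns 1 if at least one allocation request found the pool empty. */
int pool_was_exhausted(const net_stats_t *s)
{
    return s->empty_pool_requests > 0;
}
```

For the posted values (2908634 received, 2853380 dropped) the drop ratio works out to 981 per mille, that is about 98.1% of received packets. The pool was found empty 39 times, although 48 of 64 packets were free when the snapshot was taken.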
Sniffing the Ethernet port shows that the ARP requests arrive but the machine does not respond (see attached image).
NOTE: I am now testing the system on its own network (before, it was connected to the corporate LAN) to see whether it still crashes. Even so, I should make it work in every scenario.
Empty pool requests means that 48 times you ran out of packets.
Hi Iarry. I had also increased the pool to 2048 packets, but it still crashed. Bear in mind that the system wasn't using the Ethernet ports (they were active, but the application was not sending or receiving user data). As Oscar has confirmed, the problem seems to occur when too many packets arrive (maybe broadcast or multicast): the Ethernet port stops working.
For the moment, on a separate network, the problem has not occurred again. In my case this is acceptable because the device works on networks with few devices, but for other projects we should understand whether there is a bug in one of the libraries.
As Paolo says, the problem is that the stack crashes, not that it temporarily has difficulty handling a high traffic volume.
In my case it is mandatory to solve the problem, because I don't know where the product will be used.
It is also important to get some feedback from Renesas people, to see whether this is a potential stack bug.