Author Topic: lwip - limiting CPU usage on STM32  (Read 3508 times)


Offline harerod (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 469
  • Country: de
  • ee - digital & analog
    • My services:
lwip - limiting CPU usage on STM32
« on: January 27, 2022, 07:17:31 pm »
Scenario: STM32F4 with lwIP and freeRTOS

This particular device is inside a safe environment, which must never have an internet connection. Protocol errors, e.g. due to lost UDP packets, will be handled by another layer.

One weak point of this whole setup is the variable CPU load lwIP generates. Normally we have sparse UDP exchanges, one telegram every second or so. At a higher packet rate, e.g. during TFTP transfers, lwIP really starts eating into the available CPU bandwidth.

What I have been looking for, but couldn't find a good answer to, is: how could we go about limiting the CPU bandwidth that lwIP may eat up?

One idea that came to mind is monitoring CPU load (WWDG+EWI (Early Wakeup Interrupt)) and dynamically changing the priority of, or disabling, the ETH interrupt. WWDG+EWI would be rather simple to add, since the device only uses IWDG so far.
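
A minimal sketch of how the WWDG+EWI idea could be structured. It assumes a low-priority background task refreshes the watchdog; if that task is starved long enough for the early wakeup interrupt to fire, the network load is shed, and it is restored once the background task runs again. `load_guard_t` and its functions are hypothetical names. The hardware access is left to the caller and would presumably map to `NVIC_DisableIRQ(ETH_IRQn)` / `NVIC_EnableIRQ(ETH_IRQn)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for an EWI-driven throttle. In real firmware the
 * struct is shared between an ISR and a task, so accesses need a critical
 * section or atomic operations. */
typedef struct {
    bool     throttled; /* true while the ETH interrupt should stay masked */
    uint32_t trips;     /* number of times the guard had to step in */
} load_guard_t;

void load_guard_init(load_guard_t *g)
{
    g->throttled = false;
    g->trips = 0;
}

/* Call from the WWDG early wakeup interrupt. The handler must also refresh
 * the watchdog, otherwise the reset follows shortly after.
 * Returns true if the caller should mask the ETH interrupt now. */
bool load_guard_on_early_wakeup(load_guard_t *g)
{
    if (g->throttled) {
        return false; /* already masked, nothing to do */
    }
    g->throttled = true;
    g->trips++;
    return true;
}

/* Call from the low-priority task after it has refreshed the watchdog:
 * if this task gets to run, there is CPU time to spare again.
 * Returns true if the caller should unmask the ETH interrupt now. */
bool load_guard_on_background_run(load_guard_t *g)
{
    if (!g->throttled) {
        return false;
    }
    g->throttled = false;
    return true;
}
```

The `trips` counter is only there for diagnostics: if it keeps climbing in the field, something on the network is misbehaving.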


Any comments, or better off-the-shelf solutions that I didn't find / think of?

edit 2201301639: corrected typo in headline
« Last Edit: January 30, 2022, 03:40:12 pm by harerod »
 

Offline thm_w

  • Super Contributor
  • ***
  • Posts: 6996
  • Country: ca
  • Non-expert
Re: lwip - limiting CPU usage on STM32
« Reply #1 on: January 28, 2022, 02:20:59 am »
Can you shut off the interrupt if the lwIP buffer reaches a certain size (without cutting off a packet)?
The RTOS task that drains the buffer would then run at a lower priority, so it will not re-enable the interrupt until it has had time to run and process the data.

But I guess that would need to be done within the interrupt itself.

Code:
/**
 * @brief STM32F2 Ethernet MAC interrupt service routine
 **/
 
void ETH_IRQHandler(void)
{
   bool_t flag;
   uint32_t status;
 
   //Interrupt service routine prologue
   osEnterIsr();
 
   //This flag will be set if a higher priority task must be woken
   flag = FALSE;
 
   //Read DMA status register
   status = ETH->DMASR;
 
   //Packet transmitted?
   if((status & ETH_DMASR_TS) != 0)
   {
      //Clear TS interrupt flag
      ETH->DMASR = ETH_DMASR_TS;
 
      //Check whether the TX buffer is available for writing
      if((txCurDmaDesc->tdes0 & ETH_TDES0_OWN) == 0)
      {
         //Notify the TCP/IP stack that the transmitter is ready to send
         flag |= osSetEventFromIsr(&nicDriverInterface->nicTxEvent);
      }
   }
 
   //Packet received?
   if((status & ETH_DMASR_RS) != 0)
   {
      //Clear RS interrupt flag
      ETH->DMASR = ETH_DMASR_RS;
 
      //Set event flag
      nicDriverInterface->nicEvent = TRUE;
      //Notify the TCP/IP stack of the event
      flag |= osSetEventFromIsr(&netEvent);
   }
 
   //Clear NIS interrupt flag
   ETH->DMASR = ETH_DMASR_NIS;
 
   //Interrupt service routine epilogue
   osExitIsr(flag);
}
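
A sketch of how the masking could be bolted onto an ISR like the one above. The register is replaced by a plain flag so the logic can be tried on a host; on the real part the two marked lines would presumably clear and set the RIE bit in `ETH->DMAIER` (check the reference manual). `rx_throttle_t`, `RX_PENDING_LIMIT` and the function names are made up for illustration, and the one-frame-per-interrupt counting is a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_PENDING_LIMIT 4u /* assumed threshold, tune for the application */

typedef struct {
    volatile uint32_t pending;   /* frames signalled but not yet processed */
    volatile bool     rx_irq_on; /* stand-in for the receive interrupt enable bit */
} rx_throttle_t;

/* ISR side: call in the "Packet received?" branch. */
void rx_throttle_on_frame_isr(rx_throttle_t *t)
{
    t->pending++;
    if (t->pending >= RX_PENDING_LIMIT) {
        /* real code (assumption): ETH->DMAIER &= ~ETH_DMAIER_RIE; */
        t->rx_irq_on = false;
    }
}

/* Task side: call after the stack has consumed one frame. In real firmware
 * this read-modify-write must be protected against the ISR. */
void rx_throttle_on_frame_done(rx_throttle_t *t)
{
    if (t->pending > 0) {
        t->pending--;
    }
    if (t->pending == 0 && !t->rx_irq_on) {
        /* real code (assumption): ETH->DMAIER |= ETH_DMAIER_RIE; */
        t->rx_irq_on = true;
    }
}
```

While the interrupt is masked the DMA keeps filling its descriptors, so nothing is cut off mid-frame; frames are only lost once the descriptor ring itself is full.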
 


But how often can this interrupt occur?
I can't see it happening so much that it overtakes everything else. lwip should not be running until any higher priority tasks are finished.
« Last Edit: January 28, 2022, 02:26:21 am by thm_w »
Profile -> Modify profile -> Look and Layout ->  Don't show users' signatures
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: us
Re: lwip - limiting CPU usage on STM32
« Reply #2 on: January 28, 2022, 03:33:39 am »
Quote
At a higher packet rate, e.g. TFTP transfers, lwIP really starts eating into the available CPU bandwidth.
Well, for TFTP, you can always ACK more slowly... Most of the IP protocol suites are sort of self-timed. If you want them to use less CPU, adjust the ACK timing (and make sure things like the Nagle algorithm are turned on; that'll optimize packetization based on performance as well as network bandwidth).
(that's generic advice.  I'm not 100% certain that it applies to lwip, or how, but ... it should.)
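
A sketch of the "ACK more slowly" idea for TFTP. Since TFTP is lock-step (the sender only transmits the next DATA block after it has seen the ACK for the previous one), delaying the ACK caps the incoming block rate. `tftp_ack_delay_ms` is a hypothetical helper, not part of lwIP; the timestamps are assumed to come from a free-running millisecond tick.

```c
#include <stdint.h>

/* Returns how many milliseconds to wait before sending the next ACK so that
 * at most one block is accepted per min_interval_ms. The unsigned subtraction
 * stays correct across a wrap of the millisecond tick counter. */
uint32_t tftp_ack_delay_ms(uint32_t now_ms, uint32_t last_ack_ms,
                           uint32_t min_interval_ms)
{
    uint32_t elapsed = now_ms - last_ack_ms;

    if (elapsed >= min_interval_ms) {
        return 0u; /* enough time has passed, ACK immediately */
    }
    return min_interval_ms - elapsed;
}
```

With 512-byte blocks and a 10 ms interval this limits the transfer to 100 blocks per second, about 51 kB/s. The interval has to stay well below the sender's retransmission timeout, otherwise the retransmitted blocks add load instead of removing it.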
 

Offline capt bullshot

  • Super Contributor
  • ***
  • Posts: 3033
  • Country: de
    • Mostly useless stuff, but nice to have: wunderkis.de
Re: lwip - limiting CPU usage on STM32
« Reply #3 on: January 28, 2022, 02:27:46 pm »
Back in the days, I had a similar issue with LwIP.
What I did, afair, was to limit the processing rate of the incoming packets, by some artificial delay.

MAC -> chained DMA -> bottom half interrupt handler enables upper half handler by OS call -> upper half handler runs in OS context (insert delay here afair) sets up new DMA buffers, and processes received packet.

Anything else worked by itself then, reducing the OS load as a result.
Safety devices hinder evolution
 

Offline harerod (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 469
  • Country: de
  • ee - digital & analog
    • My services:
Re: lwip - limiting CPU usage on STM32
« Reply #4 on: January 28, 2022, 05:46:35 pm »
Quote
...
But how often can this interrupt occur?
...


Whenever data appears at the Ethernet interface, which has either the correct IP address or is a broadcast. At 100Mbit this can be fairly often. This is the reason why I am looking for a way to throttle that interrupt.
You can reproduce the effect with any Ethernet tool that provides "stress testing". With "Packet Sender" that would be "intense traffic generator".

 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3995
  • Country: gb
  • Doing electronics since the 1960s...
Re: lwip - limiting CPU usage on STM32
« Reply #5 on: January 29, 2022, 06:15:31 pm »
I am not sure if this makes sense, because someone else working on my project did the Ethernet code (we are using ST's lwIP library, with MbedTLS, but that's a different thing). We are polling the Ethernet interface from an RTOS task instead of running it under interrupt.

This reduces ethernet performance but it puts us in control, so we cannot get overwhelmed by fast incoming packets. That's a known issue in embedded systems; there have been products which would practically hang because they could not cope with incoming data, and their ethernet controller didn't support hardware packet filtering (by IP).
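
A sketch of what polling "in control" could look like: the task handles at most a fixed number of frames per wake-up and then yields, so the worst-case CPU share of the network code is bounded. `eth_poll_with_budget` and `rx_poll_fn` are hypothetical names; `demo_poll_one` is only a stand-in for a real driver function that fetches one frame from the DMA ring and passes it to the stack.

```c
#include <stdint.h>

/* Hypothetical driver hook: returns nonzero if a frame was fetched and
 * processed, 0 if the receive ring is empty. */
typedef int (*rx_poll_fn)(void *ctx);

/* Process at most `budget` frames, then return so the task can sleep.
 * Returns the number of frames actually handled. */
uint32_t eth_poll_with_budget(rx_poll_fn poll_one, void *ctx, uint32_t budget)
{
    uint32_t handled = 0;

    while (handled < budget && poll_one(ctx)) {
        handled++;
    }
    return handled;
}

/* Stand-in for a real driver: ctx points to a count of waiting frames. */
int demo_poll_one(void *ctx)
{
    uint32_t *waiting = ctx;

    if (*waiting == 0) {
        return 0;
    }
    (*waiting)--;
    return 1;
}
```

In a FreeRTOS task this would be called in a loop followed by something like `vTaskDelay(pdMS_TO_TICKS(10))`. A budget of 4 frames per 10 ms period then caps the stack at 400 frames per second, no matter what arrives on the wire.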

And one can control the RTOS (FreeRTOS) priority, etc.

I can get you a more detailed description if you need it. I think it may be the same thing as westfw has suggested above.

We have a LAN8742, like one of ST's development boards. I don't know if this has hardware packet filtering by IP. I suspect not, otherwise how would you create a system which responds on multiple IPs (which is possible on say a PC).
« Last Edit: January 29, 2022, 06:19:16 pm by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline harerod (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 469
  • Country: de
  • ee - digital & analog
    • My services:
Re: lwip - limiting CPU usage on STM32
« Reply #6 on: January 29, 2022, 08:56:44 pm »
peter-h, thanks for sharing. Would you mind telling us which protocols you are using?

As for the PHY - think of this component as an interface between the MAC and the actual transport medium. The MAX232 of Ethernet, if you will. I have written adaptations for more PHY's with STM32's than I can remember. They are largely interchangeable. Adaptation is mostly about different ID registers and different special capabilities. They are connected to the MAC via two interfaces. One is called MDIO, I2C-like and used for configuration. The other is the data highway, the MII. Check your PHY datasheet for more info.

Information about the STM32F4 ETH Media Access Control can be found in RM0090 chapter ETH/MAC. It is this MAC that does basic filtering. Not every packet on the medium causes an interrupt, only if it fits certain criteria. Hence my comment about broadcast and IP-address in my last post.
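
For illustration, a software model of the most basic check such a MAC applies in hardware. It assumes plain perfect-match filtering on the destination MAC address plus broadcast; the actual filter options (hash tables, multicast, promiscuous mode) are described in RM0090 and are not modeled here. `mac_accepts` is a made-up name. Note that this check works on layer 2 addresses, so anything IP-specific is still sorted out later by lwIP.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6u

/* The destination address is the first six bytes of an Ethernet frame.
 * Accept the frame if it is addressed to us or to everybody. */
bool mac_accepts(const uint8_t *frame, const uint8_t own_mac[MAC_LEN])
{
    static const uint8_t broadcast[MAC_LEN] =
        {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

    return memcmp(frame, own_mac, MAC_LEN) == 0 ||
           memcmp(frame, broadcast, MAC_LEN) == 0;
}
```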
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: us
Re: lwip - limiting CPU usage on STM32
« Reply #7 on: January 29, 2022, 11:29:49 pm »
Quote
At a higher packet rate, e.g. TFTP transfers, lwIP really starts eating into the available CPU bandwidth.
Quote
which has either the correct IP address or is a broadcast. At 100Mbit this can be fairly often. This is the reason why I am looking for a way to throttle that interrupt.
So which is causing your performance problems?  The traffic that you are specifically using/requesting, extra broadcast/multicast traffic on the net, or "uninteresting" traffic directed at your ethernet address?  Is your net "healthy" (ie not prone to "broadcast storms", DoS attacks, or other issues)?  Does a packet trace from your "busy" period look reasonable? (alas, such traces are harder to get than they used to be.)

There's a principle, "Random Early Drop", which suggests that dropping packets "early" (i.e. before you spend a lot of CPU cycles trying to interpret them, or fill queues with them) is useful for avoiding congestion. But it's usually aimed at network congestion (and it's been pissing me off for 30+ years that very little of the "congestion avoidance" research has addressed endpoint congestion...)
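
A simplified sketch of such an early-drop decision applied at an endpoint, taken before any protocol processing. Unlike the schemes used in routers, it looks at the instantaneous queue length rather than a running average; the thresholds and the function name `early_drop` are assumptions for illustration. The caller supplies a uniformly distributed random number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decide whether to discard an incoming frame.
 * queue_len: frames currently waiting for the stack
 * min_th:    below this length nothing is dropped
 * max_th:    at or above this length everything is dropped
 * rnd:       uniformly random value in 0..65535 */
bool early_drop(uint32_t queue_len, uint32_t min_th, uint32_t max_th,
                uint16_t rnd)
{
    if (queue_len < min_th) {
        return false;
    }
    if (queue_len >= max_th) {
        return true;
    }
    /* Here min_th <= queue_len < max_th, so the divisor is nonzero.
     * The drop probability rises linearly from 0 at min_th towards 1 at
     * max_th, scaled to the 16-bit range of rnd. */
    uint64_t p = ((uint64_t)(queue_len - min_th) * 65536u) / (max_th - min_th);

    return rnd < p;
}
```

The point of the ramp is that the queue never gets the chance to fill completely, so the CPU time spent per accepted frame stays roughly constant under overload.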
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 27704
  • Country: nl
    • NCT Developments
Re: lwip - limiting CPU usage on STM32
« Reply #8 on: January 30, 2022, 12:08:14 am »
Quote
peter-h, thanks for sharing. Would you mind telling us which protocols you are using?

As for the PHY - think of this component as an interface between the MAC and the actual transport medium. The MAX232 of Ethernet, if you will. I have written adaptations for more PHY's with STM32's than I can remember. They are largely interchangeable. Adaptation is mostly about different ID registers and different special capabilities. They are connected to the MAC via two interfaces. One is called MDIO, I2C-like and used for configuration. The other is the data highway, the MII. Check your PHY datasheet for more info.

Information about the STM32F4 ETH Media Access Control can be found in RM0090 chapter ETH/MAC. It is this MAC that does basic filtering. Not every packet on the medium causes an interrupt, only if it fits certain criteria. Hence my comment about broadcast and IP-address in my last post.
You will need to handle broadcast packets and packets with the MAC address of your device otherwise things will be seriously broken. However a simple solution is to let the ethernet buffers overflow. If you have a decent MAC and microcontroller then it will use DMA to transfer data which sits on a bus which is normally not used for CPU <-> memory transfer (at least NXP's ARM microcontrollers are built this way). The MAC will keep pumping data into the buffers but they are not getting processed.

lwIP is polled using an OS thread (in a function called tcpip_thread), so if you put a sleep (20ms or so) in that polling thread, you'll tune lwIP down a bit without breaking stuff. But you'll get a much lower network data rate to your device, because it will take 20ms before the next packet is processed. Another option is to add a loop counter and execute the sleep every 10 loops (play with the number of loops and the delay a bit until you get the desired effect).
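
The loop-counter variant could be sketched like this. `loop_throttle_t` and `loop_throttle_tick` are hypothetical names; where exactly the call goes depends on how the receive loop of your port is structured.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t loops_per_sleep; /* e.g. 10: sleep once every 10 passes */
    uint32_t count;           /* passes since the last sleep */
} loop_throttle_t;

/* Call once per pass of the polling loop.
 * Returns true on every loops_per_sleep-th call, meaning "sleep now". */
bool loop_throttle_tick(loop_throttle_t *t)
{
    if (++t->count < t->loops_per_sleep) {
        return false;
    }
    t->count = 0;
    return true;
}
```

Under FreeRTOS the loop body would then contain something like `if (loop_throttle_tick(&t)) vTaskDelay(pdMS_TO_TICKS(20));`. Compared with sleeping on every pass, this lets short bursts through at full speed and only slows down sustained traffic.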

Another option is to use an external chip like the Wiznet W5500. This offloads dealing with network traffic completely and only forwards network data that is relevant.
« Last Edit: January 30, 2022, 12:11:02 am by nctnico »
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: us
Re: lwip - limiting CPU usage on STM32
« Reply #9 on: January 30, 2022, 01:34:26 am »
Quote
a simple solution is to let the ethernet buffers overflow.
But it sounds more impressive on your resume if you call it "Random Early Drop"!

(but yeah; in general you don't want to receive more pps than you can effectively process, and the earlier you get rid of them, the less processing they use.  Letting them fail at the MAC level ought to be fine, unless there is some degenerate condition that causes the important packets to be dropped more often than the unimportant packets.)
 

Online nctnico

  • Super Contributor
  • ***
  • Posts: 27704
  • Country: nl
    • NCT Developments
Re: lwip - limiting CPU usage on STM32
« Reply #10 on: January 30, 2022, 02:00:18 am »
Quote
a simple solution is to let the ethernet buffers overflow.
But it sounds more impressive on your resume if you call it "Random Early Drop"!

(but yeah; in general you don't want to receive more pps than you can effectively process, and the earlier you get rid of them, the less processing they use.  Letting them fail at the MAC level ought to be fine, unless there is some degenerate condition that causes the important packets to be dropped more often than the unimportant packets.)
That is always the issue with not being able to keep up. For a recent project I had to resort to a bit-banged MAC due to component shortage. Of course there is no way it can keep up if there is a huge burst of broadcast traffic, but the re-transmits implemented in most protocols mean that it works just fine nevertheless.
There are small lies, big lies and then there is what is on the screen of your oscilloscope.
 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3995
  • Country: gb
  • Doing electronics since the 1960s...
Re: lwip - limiting CPU usage on STM32
« Reply #11 on: January 30, 2022, 10:31:23 am »
Quote
Would you mind telling us which protocols you are using?

I am not 100% sure but IIRC it is HTTP, HTTPS (MbedTLS), NTP, DHCP, ICMP (you can ping it). There are a lot more sitting in the ST library, but I don't know which of these (e.g. TFTP) are actually enabled.

Thanks for the explanation of the interface. I have not been involved with this for over a year so I forgot (spent some time on the 25MHz/50MHz clock arrangement which seems to be a hotly debated topic). There is an RJ45 with integrated magnetics (the usual Hanrun type), then the LAN8742 which is connected using the reduced interface (RMII) to the 32F417.
« Last Edit: January 30, 2022, 10:54:28 am by peter-h »
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline harerod (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 469
  • Country: de
  • ee - digital & analog
    • My services:
Re: lwip - limiting CPU usage on STM32
« Reply #12 on: January 30, 2022, 03:38:44 pm »
Everybody, thanks for your input. Food for thought. For the moment I would prefer to stick with the setup described in my previous posts (STM32 + PHY).

Even if the device is in a safe environment, I am concerned with its weak points. Fast Ethernet allows frame rates well in excess of 100kHz, so any rogue participant in the local network could be a problem. One real-world example that I saw recently was a bug in third-party software, which started spamming the network.

peter-h, thanks for your input. You anticipated and already answered my next question via private mail: your MAC fills DMA buffers with pre-filtered packets, whose status is polled by an RTOS task. I can see how this would give CPU/bus bandwidth control. I will look into this concept. I will also have to check how much bandwidth is used when the buffers are full and packets are being dropped. Again, that is my job: checking RM0090 and my firmware setup.

edit 220131: thanks to westfw I corrected the frame rate estimate
« Last Edit: January 31, 2022, 08:51:51 am by harerod »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: us
Re: lwip - limiting CPU usage on STM32
« Reply #13 on: January 30, 2022, 08:55:59 pm »
Quote
Fast Ethernet allows frame rates well in excess of 1MHz


The maximum frame rate for 100Mbps (“fast”) Ethernet is “only” about 148kfps.
(A minimum frame occupies 84 bytes on the wire: the 64-byte frame plus 8 bytes of preamble and 12 bytes of inter-frame gap.)
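
The arithmetic behind that figure, as a small sketch. The 84 bytes are the 64-byte minimum frame plus 8 bytes of preamble and 12 bytes of inter-frame gap; the function names are made up for illustration.

```c
#include <stdint.h>

/* Bytes a minimum-size frame occupies on the wire. */
#define ETH_MIN_FRAME_BYTES 64u
#define ETH_PREAMBLE_BYTES   8u
#define ETH_IFG_BYTES       12u
#define ETH_MIN_WIRE_BITS \
    ((ETH_MIN_FRAME_BYTES + ETH_PREAMBLE_BYTES + ETH_IFG_BYTES) * 8u)

/* Worst-case number of frames per second for a given link speed. */
uint32_t eth_max_frames_per_sec(uint64_t link_bps)
{
    return (uint32_t)(link_bps / ETH_MIN_WIRE_BITS);
}

/* Shortest possible time between two frame starts, in nanoseconds. */
uint32_t eth_min_frame_interval_ns(uint64_t link_bps)
{
    return (uint32_t)((uint64_t)ETH_MIN_WIRE_BITS * 1000000000ull / link_bps);
}
```

For 100 Mbit/s this gives 148809 frames per second, i.e. one frame every 6.72 µs. For Gigabit Ethernet the same calculation yields about 1.49 million frames per second.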

 

Offline peter-h

  • Super Contributor
  • ***
  • Posts: 3995
  • Country: gb
  • Doing electronics since the 1960s...
Re: lwip - limiting CPU usage on STM32
« Reply #14 on: January 31, 2022, 07:43:22 am »
That is however a high interrupt rate - one every 7us.

The ISR would have to be very short. The HAL coders have a company requirement of a minimum ISR size of 100 lines of C, of which at least 30% have to be if() statements skipping past unused code, which makes that 70ns per line of C :)
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline harerod (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 469
  • Country: de
  • ee - digital & analog
    • My services:
Re: lwip - limiting CPU usage on STM32
« Reply #15 on: January 31, 2022, 08:56:50 am »
westfw - serves me right for working on a Sunday. I have corrected the frame rate estimate in my last post. Seems that I had slipped into the Gigabit Ethernet section:
https://wiki.networksecuritytoolkit.org/nstwiki/index.php/LAN_Ethernet_Maximum_Rates,_Generation,_Capturing_&_Monitoring
 

