Just going to interject my 2 cents now that this discussion has been going for a while..
If you have access to a JLink then also have a look at a tool from Segger called SystemView. You can use it to profile your application's CPU time, not only as a percentage count but also on a timeline graph with all the context switches and IRQs that are bouncing around in the system. It can be useful for detecting issues like priority inversion and such. I think there is a FreeRTOS port for it.
It runs off Segger's RTT backend, which is a FIFO structure for streaming trace data to the host machine. The application puts in event messages whenever it is doing something, so this is an instrumentation profiler, and the host just empties the buffer as fast as it can. I think there is also a second channel for redirecting printf() messages, so you can annotate your logs.
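The printf() side is then roughly a call like this; a minimal sketch assuming Segger's RTT sources are already in the project (the LogState function and the message text are just made-up examples):

```c
#include "SEGGER_RTT.h"   /* ships with Segger's RTT / SystemView packages */

/* Push a short annotation string up to the host over RTT channel 0
   (the default terminal channel), so it shows up alongside the trace. */
static void LogState(int newState)
{
    SEGGER_RTT_printf(0, "state -> %d\r\n", newState);
}
```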
Unfortunately, since I presume this is a commercial project, it's not free for you.. But if you need to dig a lot deeper into this it might be of use. A sampling profiler is usually also fine to get a rough idea of what is costing CPU time. Even a human-based sampler, i.e. running the application and halting it a couple of times at random, is usually good enough to get an idea.
If you want your application to be as robust as possible against timing side effects, then having everything event based is ideal. Usually all my RTOS tasks, if I use an RTOS, enter an infinite loop and then wait for signals/queue data before they start working on a particular command/action (see the sketch below). Those signals are sent by other tasks and/or IRQs. This means the tasks are never time-based sleeping but are always waiting; if they have work they go right into it. It's similar to (but with more overhead than) an event-based framework, which doesn't need an RTOS kernel at all but then loses preemptive switching. I may want to transition to that in the future (I have some ideas for this tailored towards low-power applications, which as you may know is my largest interest).
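To make the "waiting all the time" part concrete, here is a minimal sketch in CMSIS-RTOS v1 style, since that is the API I mention below; the signal bit, task name and ProcessData() handler are made-up placeholders:

```c
#include "cmsis_os.h"            /* CMSIS-RTOS v1 API: osSignal*, osWaitForever */

#define SIG_DATA_READY  (1 << 0) /* hypothetical signal bit */

osThreadId workerId;             /* filled in when the thread is created */

extern void ProcessData(void);   /* hypothetical: drain the RX buffer, etc. */

/* Worker task: no osDelay() anywhere, it just blocks until it is signalled. */
void WorkerTask(void const *arg)
{
    (void)arg;
    for (;;) {
        osEvent evt = osSignalWait(SIG_DATA_READY, osWaitForever);
        if (evt.status == osEventSignal) {
            ProcessData();
        }
    }
}

/* IRQ handler: stash the data somewhere, then just wake the task. */
void UART_IRQHandler(void)
{
    /* ...read the hardware into a buffer... */
    osSignalSet(workerId, SIG_DATA_READY);
}
```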
On the contrary, as Doctorandus says, if your application polls with osDelay()s instead, it's more prone to falling over when the CPU load changes drastically. It doesn't have to be dramatic, like the application hanging; its maximum throughput may simply drop, for example.
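For contrast, the polling pattern looks roughly like this (same made-up names as in the sketch above):

```c
#include "cmsis_os.h"

extern int  DataIsPending(void);  /* hypothetical check of a flag/buffer */
extern void ProcessData(void);

/* Polling pattern: wake up every 10 ms whether or not anything happened.
   If the CPU gets busier, these wakeups get delayed and throughput drops. */
void PollingTask(void const *arg)
{
    (void)arg;
    for (;;) {
        if (DataIsPending()) {
            ProcessData();
        }
        osDelay(10);              /* fixed, time-based sleep */
    }
}
```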
I think a WFI() would always be safe to add in a bare idle() task. The scheduler tick of an RTOS usually runs from a timer (= an IRQ), or parts of it run on demand (usually also originating from an IRQ or a chain of task executions). The context switching code may also run in its own IRQ in some RTOS designs. E.g. if you have an IRQ that sends data to a task to process further, the RTOS could check in its osSignalSet() function whether that task was waiting for that signal and whether it is OK to do a preemptive switch straight away (e.g. based on task priorities). If you write your application completely based on these kinds of signals or other IPC constructs (like queues etc.), then you could disable the scheduler tick, as it won't contribute anything to the scheduling behaviour; everything is handled "on demand" in e.g. the osSignalSet (task activation) and osSignalWait (task deactivation) functions. You would probably lose the osDelay() functions though, as that is the reason the scheduler runs on a fixed time interval (1 kHz = 1 ms ticks).
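For the WFI part, a minimal sketch of what I mean, assuming FreeRTOS (since a FreeRTOS port was mentioned) on a Cortex-M; the device header name is just an example, use whatever your part ships with:

```c
#include "FreeRTOS.h"     /* idle hook needs configUSE_IDLE_HOOK = 1 in FreeRTOSConfig.h */
#include "stm32f4xx.h"    /* hypothetical device header; pulls in CMSIS and __WFI() */

/* Called by FreeRTOS whenever nothing else is ready to run. */
void vApplicationIdleHook(void)
{
    /* Sleep the core until the next interrupt (scheduler tick, UART, DMA, ...).
       Execution resumes right here and the idle task keeps looping. */
    __WFI();
}
```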