Author Topic: STM 32F4 FPU registers and main() gotcha  (Read 352 times)

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3788
  • Country: gb
  • Doing electronics since the 1960s...
STM 32F4 FPU registers and main() gotcha
« on: Today at 01:50:47 pm »
I wonder why this
http://www.efton.sk/STM32/gotcha/g203.html
does not cause loads of trouble all over the place.

AIUI, it relates to C compilers treating the function main() differently when it comes to FPU stack operations. This is pretty weird, to be generating different code for a function, based on its name!

Maybe it is because I am using GCC (v11), and this version just happens to work, i.e. it does not emit the extra stack pushes/pops.

To work around this, the FPU enable code would need to go into the startup.s code i.e. before main() is entered. I am doing that but purely by accident; my startupxxx.s code called b_main() and that starts the FPU with

Code: [Select]
// ========== This was in SystemInit() ============

#if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
SCB->CPACR |= ((3UL << 10*2)|(3UL << 11*2));  /* set CP10 and CP11 Full Access */
#endif

That code is commonly used Cube MX ("HAL") stuff which you find all over the internet...

FreeRTOS seems to do it again when it starts up (inside main() this time):

Code: [Select]
/* Ensure the VFP is enabled - it should be anyway. */
vPortEnableVFP();

/* Lazy save always. */
*( portFPCCR ) |= portASPEN_AND_LSPEN_BITS;

and vPortEnableVFP() contains

/* This is a naked function. */
static void vPortEnableVFP( void )
{
__asm volatile
(
" ldr.w r0, =0xE000ED88 \n" /* The FPU enable bits are in the CPACR. */
" ldr r1, [r0] \n"
" \n"
" orr r1, r1, #( 0xf << 20 ) \n" /* Enable CP10 and CP11 coprocessors, then save back. */
" str r1, [r0] \n"
" bx r14 "
);
}

Does this make sense to anyone? It seems to be working by accident, but it is a really weird thing as it is C compiler dependent, and to be sure you want to enable the FPU in the startup.s code.
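
For reference, the two snippets above set the same four bits in CPACR. A minimal sketch of that, written so it also builds on a PC: the register address is passed in as a pointer instead of being hard-wired to 0xE000ED88, and the names CPACR_CP10_CP11_FULL and fpu_enable are made up for illustration, not taken from CMSIS or FreeRTOS.

```c
#include <stdint.h>

/* CP10 and CP11 each have a 2-bit access field in CPACR; 0b11 = full access.
   The result is the same value as the 0xf << 20 used in vPortEnableVFP(). */
#define CPACR_CP10_CP11_FULL ((3UL << (10 * 2)) | (3UL << (11 * 2)))

/* Hypothetical helper: the register is a parameter so it can point at a
   plain variable on a PC, or at 0xE000ED88 on the target. */
static inline void fpu_enable(volatile uint32_t *cpacr)
{
    /* Read-modify-write, so all other CPACR bits are left untouched. */
    *cpacr |= (uint32_t)CPACR_CP10_CP11_FULL;
}
```

On the target, ARM's own example code follows the CPACR write with DSB and ISB barriers before the first FP instruction; that part is left out here because it cannot run on a host.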
Z80 Z180 Z280 Z8 S8 8031 8051 H8/300 H8/500 80x86 90S1200 32F417
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 512
  • Country: sk
Re: STM 32F4 FPU registers and main() gotcha
« Reply #1 on: Today at 02:06:10 pm »
Quote
I wonder why this
http://www.efton.sk/STM32/gotcha/g203.html
does not cause loads of trouble all over the place.
Because main() usually does not contain FP operations, so usually the compiler does not need to stack FP registers.

Usually, main() consists only of a bunch of function calls. And, usually, those functions - especially if they handle FP - are located in separate files, thus are not subject to inlining.

Even with moderate FP usage within a function there's probably no stacking. I don't remember the details of the ABI, but there are many FP registers, so there are probably some that the callee doesn't need to preserve.

The problem happened to me because I don't write programs in the usual way, so quite a significant portion of my programs tend to be either explicitly, or inlined, in main() (I love spaghetti, and have and use a spaghetti-making machine).

Quote
my startupxxx.s code called b_main() and that starts the FPU
b_main() is a C function, and as such it is vulnerable to the same problem, potential FP register stacking - and it does not happen for the same reason: you most probably have no FP operations in that function.

Quote
Code: [Select]
/* This is a naked function. */
static void vPortEnableVFP( void )

If it's naked indeed (i.e. there is somewhere a prototype with __attribute__((naked))), then there's no C prologue thus no registers stacking and no vulnerability of the kind described. However, the functions leading to calling that vPortEnableVFP() *are* vulnerable - but, again, FreeRTOS functions most probably have no FP operations in them.

JW
« Last Edit: Today at 02:13:20 pm by wek »
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4236
  • Country: nz
Re: STM 32F4 FPU registers and main() gotcha
« Reply #2 on: Today at 02:50:52 pm »
Quote
I wonder why this
http://www.efton.sk/STM32/gotcha/g203.html
does not cause loads of trouble all over the place.

Nothing STM or even Arm-specific in that.

If you're going to use an FPU (or vector unit, on ISAs / cores that have them) then you need to enable them before running a function that uses them, where "using" could involve arithmetic or, yes, storing or loading FPU registers.

You can perfectly well do that in main(), just as long as main() is running in privileged mode and doesn't itself use the FPU (etc) before initialising it -- including using it by saving registers in the prologue.

This will apply to anything that has an initially-disabled functional unit: it's certainly true on RISC-V (both FPU and Vector units, if present and used, need to be changed from "Off" to "Initial" or "Clean" in the mstatus.FS and mstatus.VS fields) and I'd imagine it is similar on x86, MIPS, PowerPC, ... too.
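
For the RISC-V case the change is a two-bit field update. A rough sketch in C, assuming the field positions from the privileged spec (FS in mstatus bits 14:13, VS in bits 10:9) and working on a plain value rather than the real CSR; the helper and macro names are invented here.

```c
#include <stdint.h>

/* State encodings shared by mstatus.FS and mstatus.VS. */
enum ext_state { EXT_OFF = 0, EXT_INITIAL = 1, EXT_CLEAN = 2, EXT_DIRTY = 3 };

#define MSTATUS_VS_SHIFT 9   /* VS occupies bits 10:9  */
#define MSTATUS_FS_SHIFT 13  /* FS occupies bits 14:13 */

/* Hypothetical helper: return mstatus with one 2-bit field replaced. */
static inline uint64_t mstatus_set_field(uint64_t mstatus, unsigned shift,
                                         enum ext_state state)
{
    mstatus &= ~((uint64_t)3 << shift);          /* clear the old state */
    return mstatus | ((uint64_t)state << shift); /* insert the new one  */
}
```

On a real core the value would be read from and written back to mstatus with CSR instructions in machine mode, before the first FP or vector instruction runs.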
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3788
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM 32F4 FPU registers and main() gotcha
« Reply #3 on: Today at 03:13:36 pm »
Thank you both.

I am certainly not using floats before enabling the FPU (which is done in b_main() which then does a long jump to main() which never returns) and I would hope that if I was, it would comprehensively not work :)

It is probably by accident that main() does not use floats currently. I do have some printf() debug calls in there (printf() being mapped to come out on the SWV ITM debug port) which output longs but not floats. If they were floats, would that matter? I am confused.
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2247
  • Country: br
    • CADT Homepage
Re: STM 32F4 FPU registers and main() gotcha
« Reply #4 on: Today at 03:14:43 pm »
Today something similar happened when I worked on a small Win32 test app (network client).
There were no FPU operations in main(), but some in a thread started with CreateThread(). The app failed with "FPU not initialized" error. I solved the problem using _beginthreadex() instead and it worked. I learned that _beginthreadex() includes necessary CRT initializations.

Regards, Dieter
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3788
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM 32F4 FPU registers and main() gotcha
« Reply #5 on: Today at 03:41:14 pm »
That however is not the same thing. It is obvious that float ops with an uninitialised FPU are not going to work.
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2247
  • Country: br
    • CADT Homepage
Re: STM 32F4 FPU registers and main() gotcha
« Reply #6 on: Today at 04:03:13 pm »
I reported an actual incident and how the FPU remained uninitialized. Not on a STM32, but on Win32. Probably one can do something similar on a STM32.
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 512
  • Country: sk
Re: STM 32F4 FPU registers and main() gotcha
« Reply #7 on: Today at 04:19:24 pm »
Quote
I am certainly not using floats before enabling the FPU

The gotcha is that even if you don't use floats in a function before enabling the FPU, the compiler can.

If there are FP operations in a function - anywhere in that function - and those operations are so extensive that the compiler can't perform them using only the "callee-modifiable (*)" FP registers, it then stacks the "callee-saved" FP registers, and does so in the function's prologue, i.e. before any C line is executed. Normally, main() is no exception in this regard (there are command-line flags which can make it an exception, though; but that might be a topic for a different discussion).

Now if you jump to main() after FP has been enabled, no part of main() executes before FP is enabled, thus your main() is safe. If you enable FP in a different C function, that function is not safe; but again, you are not likely to do any FP operations (enabling FP itself does not count, as it does not use FP registers and FP instructions) in that function.
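
One way to arrange this in C is sketched below. It assumes GCC or Clang (for __attribute__((noinline))) and uses a plain variable as a stand-in for CPACR so that it builds on a PC; fake_cpacr, fpu_enable, app_run and startup_entry are illustrative names, not from any of the code in this thread.

```c
#include <stdint.h>

/* Stand-in for the CPACR register (0xE000ED88 on the target). */
static volatile uint32_t fake_cpacr;

/* No FP maths here, so the compiler has no reason to touch FP registers
   in this function's prologue. */
static void fpu_enable(volatile uint32_t *cpacr)
{
    *cpacr |= (uint32_t)0xF << 20; /* CP10 + CP11 full access */
}

/* All FP work lives in a separate function that must not be inlined into
   its caller. Any FP register stacking then happens in *this* prologue,
   which only runs after the FPU has been switched on. */
__attribute__((noinline)) static int app_run(int raw)
{
    float scaled = (float)raw * 1.5f + 0.25f;
    return (int)scaled;
}

/* Plays the role of b_main()/main(): enable first, then call FP code.
   It contains no FP operations of its own. */
static int startup_entry(void)
{
    fpu_enable(&fake_cpacr);
    return app_run(2);
}
```

The point of the split is only that the function doing the enabling stays free of FP code, including inlined FP code; whether a given compiler would actually stack FP registers for a particular function still has to be checked in the disassembly.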

And no worry: should you be caught by this gotcha, it's an immediate 100% fault (I'm not sure which one, but pending individual treatment they normally all escalate to HardFault anyway).

(*)
Quote from: ARM ABI ("Procedure Call Standard for the Arm® Architecture", chapter 6 The Base Procedure Call Standard, subchapter 6.1.2.1 VFP register usage conventions)
Registers s16-s31 (d8-d15, q4-q7) must be preserved across subroutine calls; registers s0-s15 (d0-d7, q0-q3) do not need to be preserved
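
The three register ranges in that quote name the same storage at different widths: each d register overlays two s registers, and each q register overlays four. A small sketch of that mapping, with helper names invented for illustration:

```c
#include <stdbool.h>

/* s(2n) and s(2n+1) overlay d(n); s(4n)..s(4n+3) overlay q(n). */
static inline int d_index_of_s(int s) { return s / 2; }
static inline int q_index_of_s(int s) { return s / 4; }

/* Per the quoted rule: s16-s31 are callee-saved, s0-s15 are not. */
static inline bool s_reg_is_callee_saved(int s)
{
    return s >= 16 && s <= 31;
}
```

So a function that needs any of s16-s31 (for instance to keep float values alive across a call) is the kind that gets FP register pushes in its prologue.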
JW
« Last Edit: Today at 04:26:10 pm by wek »
 
Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3788
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM 32F4 FPU registers and main() gotcha
« Reply #8 on: Today at 04:32:01 pm »
This FPU stuff is above my pay grade :) but is the problem that the stacking of the FPU registers fails if the FPU is not enabled? Then I can understand it. Those registers are unlikely to be accessible if the FPU is not enabled (same with SPI etc etc).

So you will be stacking garbage, and then when this is popped, the FPU is loaded with garbage. Or will the CPU get a permanent "wait state" from the non-enabled FPU?
« Last Edit: Today at 04:35:21 pm by peter-h »
 

Offline harerod

  • Frequent Contributor
  • **
  • Posts: 464
  • Country: de
  • ee - digital & analog
    • My services:
Re: STM 32F4 FPU registers and main() gotcha
« Reply #9 on: Today at 04:54:18 pm »
Quote
... Those registers are unlikely to be accessible if the FPU is not enabled (same with SPI etc etc). ...

Which is implied in the footnotes of the article you linked in your initial post: http://www.efton.sk/STM32/gotcha/g203.html
The FPU seems to be no different from any other peripheral on the STM32 - enable before first access. This may require a combination of power and clock.
I haven't used the STM32F4 FPU in such a long time, although I designed heaps of devices based on that MCU. During the first tests I wrote setup routines based on the datasheet.
This may have been before CooCox and Atollic became available. Where have the last ten years gone?
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2247
  • Country: br
    • CADT Homepage
Re: STM 32F4 FPU registers and main() gotcha
« Reply #10 on: Today at 05:04:06 pm »
Other people experienced hard faults with STM32 FPU while using FreeRTOS. Apparently initialization of FPU isn't 100% automatic. In my Win32 case I have to include some FPU usage in main() in order to make it work in the thread.
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 512
  • Country: sk
Re: STM 32F4 FPU registers and main() gotcha
« Reply #11 on: Today at 06:03:29 pm »
The FPU is part of the processor core, so it's not like other peripherals. This is ARM's rules, not ST's.

So, if you don't enable it, and attempt to access its registers, the processor throws UsageFault (ARM® v7-M Architecture Reference Manual B1.6.3 Pseudocode details of FP operation). If you don't have UsageFault enabled - which is the default - then it escalates to HardFault.
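
To confirm that a HardFault came from this cause, the fault status registers can be decoded. A sketch, assuming the ARMv7-M bit layout (NOCP is bit 3 of UFSR, which is the top half of CFSR, so bit 19 of CFSR; FORCED is bit 30 of HFSR; USGFAULTENA is bit 18 of SHCSR). The functions work on values passed in, so reading SCB->CFSR and SCB->HFSR in the handler is left to the caller, and the names are made up here.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_NOCP         (1UL << 19) /* UsageFault: coprocessor (FPU) access denied */
#define HFSR_FORCED       (1UL << 30) /* HardFault escalated from another fault */
#define SHCSR_USGFAULTENA (1UL << 18) /* set this to give UsageFault its own handler */

/* True if the fault was an FP/coprocessor access with the FPU disabled. */
static inline bool fault_is_nocp(uint32_t cfsr)
{
    return (cfsr & CFSR_NOCP) != 0;
}

/* True if this HardFault is an escalated configurable fault. */
static inline bool hardfault_is_escalated(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0;
}
```

In a HardFault handler, FORCED set together with NOCP set would point at an FP instruction executed while CP10/CP11 access was still disabled.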

JW
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 512
  • Country: sk
Re: STM 32F4 FPU registers and main() gotcha
« Reply #12 on: Today at 06:09:57 pm »
Quote
Other people experienced hard faults with STM32 FPU while using FreeRTOS.

That is not necessarily a consequence of *late* enabling of the FPU (i.e. accessing FP registers or executing FP instructions before enabling the FPU), as discussed in this thread.

For example, if FPU is enabled, upon interrupt/exception, the processor stacks (or reserves stack for, if lazy stacking is enabled, which is the default) half of the FPU registers, plus one status word (plus alignment if set so). That's extra 17-20 words, or up to extra 80 bytes, and that may be the difference between stack overflow or not.
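
The arithmetic can be checked with two structs. This is a sketch assuming the ARMv7-M frame layout (8 words for the basic frame; with FP context, s0-s15, FPSCR and one reserved word on top); the struct and constant names are illustrative.

```c
#include <stdint.h>

/* Basic exception frame: always pushed on exception entry. */
struct basic_frame {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

/* Extended frame: pushed (or just reserved, with lazy stacking) when the
   FP context is active. */
struct extended_frame {
    struct basic_frame core;
    uint32_t s[16];    /* s0-s15, the caller-saved half of the FP registers */
    uint32_t fpscr;    /* FP status and control register */
    uint32_t reserved; /* keeps the frame a multiple of 8 bytes */
};

/* Extra stack needed per exception entry once the FPU is in use. */
enum {
    FP_EXTRA_BYTES = sizeof(struct extended_frame) - sizeof(struct basic_frame)
};
```

That comes to 18 extra words (72 bytes) per stacked exception, and alignment padding, when it is needed, adds one more word, which sits inside the range given above.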

JW
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2247
  • Country: br
    • CADT Homepage
Re: STM 32F4 FPU registers and main() gotcha
« Reply #13 on: Today at 06:59:32 pm »
No, the person had enough stack space and fixed the problem by "manually" setting the FPU control register.
 

Offline wek

  • Frequent Contributor
  • **
  • Posts: 512
  • Country: sk
Re: STM 32F4 FPU registers and main() gotcha
« Reply #14 on: Today at 07:26:12 pm »
Quote
No, the person had enough stack space and fixed the problem by "manually" setting the FPU control register.
Interesting.

Can you please give some links?

Thanks,

JW
 

Offline dietert1

  • Super Contributor
  • ***
  • Posts: 2247
  • Country: br
    • CADT Homepage
Re: STM 32F4 FPU registers and main() gotcha
« Reply #15 on: Today at 08:04:59 pm »
https://forums.freertos.org/t/cortex-m4-hard-fault-when-using-floating-point-unit/10180
Now that I read it once more, it isn't that clear whether increasing the stack helped, or using the CPACR register, or both.
« Last Edit: Today at 08:10:26 pm by dietert1 »
 

Online peter-h (Topic starter)

  • Super Contributor
  • ***
  • Posts: 3788
  • Country: gb
  • Doing electronics since the 1960s...
Re: STM 32F4 FPU registers and main() gotcha
« Reply #16 on: Today at 08:50:56 pm »
Now I see why FreeRTOS (or at least my port of it, which was done by the guy who started off my Cube IDE project) enables the FPU when it starts up...
 

