Author Topic: Is there a way to find STM 32F CPUs which are upwards compatible with 32F417? (Read 6629 times)

nctnico · « **Reply #50 on:** July 11, 2023, 11:27:10 am »

Quote from: peter-h on July 11, 2023, 11:05:56 am

IMHO it would be an unusual case where one needed inline code on a 168MHz CPU. In most cases where a higher speed is needed, a new look at the job is better.

It is not so much about speed but just avoiding overhead and spending CPU cycles when it can be avoided. In some cases having a small function with some shared code is just nicer than copying the same code over & over while the piece of code is just not suitable to put into a macro.

Siwastaja · « **Reply #51 on:** July 11, 2023, 12:19:24 pm »

Quote from: nctnico on July 11, 2023, 09:52:28 am

that puts each function in a separate section so the linker removes any unused functions / sections. This gets in the way of using inline functions (speed optimisation) and other functions / symbols that need to be kept. It is a hot mess...

Weird. I have never witnessed any issue with -ffunction-sections and -gc-sections. Definitely does not mess up inlined functions, compiler makes the inlining decision separately from outputting the code into a specific section. And what do you mean by "symbols that need to be kept", gc-sections by definition knows exactly what needs to be kept; of course with self-made linker files you need the KEEP keyword(s) in the right place(s).

Or is this once again one of those "I once saw a problem 15 years ago and thus instruct everyone not to use these standard features", like when we discuss illegal unaligned access causing a busfault in every other ARM Cortex-M CPU except the one broken one you come across once, and thus no one can ever trust to see a busfault as per documentation?

Siwastaja · « **Reply #52 on:** July 11, 2023, 12:24:18 pm »

Quote from: nctnico on July 11, 2023, 11:27:10 am

It is not so much about speed but just avoiding overhead and spending CPU cycles when it can be avoided. In some cases having a small function with some shared code is just nicer than copying the same code over & over while the piece of code is just not suitable to put into a macro.

Also remember __attribute__((always_inline)) which is useful when you really want to control inlining manually. The static inline keywords only inline functions by coincidence if the compiler happens to feel this way. Some people are against manual inlining but I agree with you that it's a sensible thing to do in the microcontroller world specifically.
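A minimal sketch of the difference, assuming GCC or Clang. The helper names are made up for illustration and are not from any ST header:

```c
#include <assert.h>
#include <stdint.h>

/* Plain 'static inline': only a hint. The compiler inlines it when its
   own heuristics say so, which typically means not at -O0. */
static inline uint32_t field_insert(uint32_t reg, uint32_t mask,
                                    unsigned shift, uint32_t value)
{
    assert(((value << shift) & ~mask) == 0u);   /* value must fit the field */
    return (reg & ~mask) | (value << shift);
}

/* With always_inline, GCC and Clang inline every direct call,
   even with optimisation disabled. */
static inline __attribute__((always_inline))
uint32_t field_insert_forced(uint32_t reg, uint32_t mask,
                             unsigned shift, uint32_t value)
{
    assert(((value << shift) & ~mask) == 0u);
    return (reg & ~mask) | (value << shift);
}
```

Both functions compute the same thing; the only difference is who makes the inlining decision. Comparing the disassembly at -O0 and -O2 is the easiest way to see the effect on a given toolchain.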

nctnico · « **Reply #53 on:** July 11, 2023, 01:25:35 pm »

Quote from: Siwastaja on July 11, 2023, 12:19:24 pm

Quote from: nctnico on July 11, 2023, 09:52:28 am
that puts each function in a separate section so the linker removes any unused functions / sections. This gets in the way of using inline functions (speed optimisation) and other functions / symbols that need to be kept. It is a hot mess...

Weird. I have never witnessed any issue with -ffunction-sections and -gc-sections. Definitely does not mess up inlined functions, compiler makes the inlining decision separately from outputting the code into a specific section. And what do you mean by "symbols that need to be kept", gc-sections by definition knows exactly what needs to be kept; of course with self-made linker files you need the KEEP keyword(s) in the right place(s).

The problem is that the inline functions are lost outside the scope of the file they are defined in. Some of these functions are useful in several files where I like to leave it to the compiler to choose whether it is better to inline or not. So in my case using inline doesn't work together with -ffunction-sections. Same for symbols that are defined in the linker file, those need the KEEP keyword. The way I see it, using -ffunction-sections hampers the linker from doing a good job by itself. There ain't no free lunch and I'd rather not use -ffunction-sections.

Siwastaja · « **Reply #54 on:** July 11, 2023, 01:44:26 pm »

Did I understand correctly:
* You want the functions with external linkage, i.e., not static
* You want the compiler to be able to use optimization which makes a compilation-unit local copy of the code, inlining it, instead of emitting the call to the external function, when it sees fit (would most likely happen at -O3, as this obviously increases code size)
* -ffunction-sections prevents this optimization, you have compared exactly the same code with and without -ffunction-sections and with it, a call to the function is emitted instead of inlining

It seems to me these two features (optimization taking copies; emitted functions being placed at their own sections .text.something) are completely orthogonal and there should be no coupling. If true, you might want to submit a bug report to GCC project. But before that, you should verify if what you think happens is indeed happening.
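One way to get the combination described in the list above is the C99 inline model: the definition lives in a header, and exactly one .c file requests the out-of-line copy. This is a sketch under that assumption, not necessarily how nctnico's code is organised; the file names and clamp_i32 are hypothetical, and both "files" are shown in one listing for brevity:

```c
/* ---- clamp.h (hypothetical header, included by several .c files) ---- */
#ifndef CLAMP_H
#define CLAMP_H
#include <stdint.h>

/* C99 inline definition: visible in every file that includes the header,
   so the compiler is free to inline it wherever it sees fit. */
inline int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi)
{
    return (v < lo) ? lo : (v > hi) ? hi : v;
}
#endif

/* ---- clamp.c (exactly one translation unit) ---- */
/* This declaration makes the compiler emit the single external copy that
   is called wherever inlining was not chosen. With -ffunction-sections
   that copy goes into its own section (.text.clamp_i32), which
   --gc-sections can drop only if nothing references it. */
extern inline int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi);
```

Building the same sources with and without -ffunction-sections and comparing the disassembly would show whether the flag changes any inlining decision.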

Quote

The way I see it, using -ffunction-sections hampers the linker from doing a good job by itself. There ain't no free lunch and I'd rather not use -ffunction-sections.

Oh great, more random weasel words which mean nothing. Why should anyone listen to your "advice"?

nctnico · « **Reply #55 on:** July 11, 2023, 01:49:45 pm »

AFAIK the linker from GCC can inline functions just fine.

Siwastaja · « **Reply #56 on:** July 11, 2023, 02:17:10 pm »

Quote from: nctnico on July 11, 2023, 01:49:45 pm

AFAIK the linker from GCC can inline functions just fine.

That would be link-time optimization (LTO), and if you think gc-sections is too heavy stuff for you, I don't think you will like LTO. Here is some comparison:
https://interrupt.memfault.com/blog/best-and-worst-gcc-clang-compiler-flags#-ffunction-sections--fdata-sections----gc-sections
https://interrupt.memfault.com/blog/best-and-worst-gcc-clang-compiler-flags#-flto

tszaboo · « **Reply #57 on:** July 11, 2023, 07:33:05 pm »

Quote from: nctnico on July 11, 2023, 09:52:28 am

Quote from: tszaboo on July 11, 2023, 09:06:42 am
Quote from: nctnico on July 10, 2023, 10:13:10 pm
Until you start reading the documentation... So far I have found the ADC and internal flash documentation to be incorrect. The UART has issues as well when receiving a stream of data back-to-back from a lightly distorted source. As if a bit offset error is accumulating which shouldn't happen.
I wouldn't know, I just changed the target MCU and the pin definitions, recompiled it and the code was working. If you have seen errors, report them back and when they confirm it then it probably makes it into the errata, and maybe it even gets fixed in the next version. It doesn't feel like an Infineon MCU or a PIC32MZ riddled with dozens of design breaking errata to me (famous PIC32MZ errata for their 18MSPS quick ADC: The ADC doesn't work at all).
Then you are using the HAL (which also has compatibility issues BTW). But I want to get rid of the HAL because the HAL requires you to compile with the compiler option that puts each function in a separate section so the linker removes any unused functions / sections. This gets in the way of using inline functions (speed optimisation) and other functions / symbols that need to be kept. It is a hot mess...

Yes, I'm using that. For the removed inline, I can think of two things off the top of my head.
One is to use it as a #define with parameters instead of a function, also called a function-like macro.
The other is telling the linker specifically not to remove it with #pragma.
It's entirely possible that you thought of both of these and they don't work, and that I don't understand your problem completely.

Quote from: AndyC_772 on July 11, 2023, 10:40:16 am

@peter-h: you need ISB / DSB instructions in code running on M7 which requires memory accesses to happen in a particular order. Without them you can get into situations where the expected flow of operations (ie. do this, and then do that) doesn't match the order in which things actually do happen. It's important when doing direct hardware access, eg. enable a DMA controller, then read its 'busy' bit to check if it's still running.

@nctnico: I completely agree about the HAL, it's an obfuscation layer, not an abstraction layer. You still need to RTFM, you still need to understand the underlying hardware in detail, and code isn't directly portable between CPUs with differing peripherals.

IMHO the HAL is not really written for you, it's written for another set of libraries on top of it to use it, which you ultimately call in your code.

peter-h · « **Reply #58 on:** July 12, 2023, 12:55:04 am »

Quote

IMHO the HAL is not really written for you, it's written for another set of libraries on top of it to use it, which you ultimately call in your code.

I don't understand that. The HAL_* functions are generally called by user code.

The way it is supposed to work is that you configure the CPU type in some .h file. Maybe the CUBE config for CPU type does this; I've got it documented in the project doc but have never tried it. My Cube config remains fixed at 32F417.

The HAL_* code is full of #defines for the CPU type. But equally notable are the parts which don't have a #define and which presumably indicate which bits of hardware are the same software-wise.

Then any CPU-dependent HAL_* functions do different stuff according to CPU type. Overall, this works. The code is sometimes highly convoluted e.g. a function to configure say PA8 GPIO to be an output, pullup, high speed pullup, etc is incredibly convoluted. But it works. Most devs would just dive straight into the AF config registers, with practically impenetrable code.

tszaboo · « **Reply #59 on:** July 12, 2023, 07:46:24 am »

Quote from: peter-h on July 12, 2023, 12:55:04 am

Quote
IMHO the HAL is not really written for you, it's written for another set of libraries on top of it to use it, which you ultimately call in your code.

I don't understand that. The HAL_* functions are generally called by user code.

The way it is supposed to work is that you configure the CPU type in some .h file. Maybe the CUBE config for CPU type does this; I've got it documented in the project doc but have never tried it. My Cube config remains fixed at 32F417.

The HAL_* code is full of #defines for the CPU type. But equally notable are the parts which don't have a #define and which presumably indicate which bits of hardware are the same software-wise.

Then any CPU-dependent HAL_* functions do different stuff according to CPU type. Overall, this works. The code is sometimes highly convoluted e.g. a function to configure say PA8 GPIO to be an output, pullup, high speed pullup, etc is incredibly convoluted. But it works. Most devs would just dive straight into the AF config registers, with practically impenetrable code.

There are a large number of examples where you use the HAL but not directly in your code.
HAL is used for generated code by the STMCube thing, whatever it's called now.
The higher level libraries for example implement UART by having it as a printf() where you don't have to call HAL functions.
There are a number of RTOSes that use the HAL, and you are only supposed to call the RTOS modules, not the HAL libraries. STM32duino implements the Arduino equivalent functions. MicroPython uses the HAL libraries.

peter-h · « **Reply #60 on:** July 12, 2023, 08:47:04 am »

You mean Cube MX - the "push button code generator".

The rest I don't recognise.

If you select FreeRTOS, Cube imports FR into your project. Same with LWIP, IIRC (I didn't do those parts).

Printf() to a UART? I never saw that from ST. You implement putc() or some variation which printf() then calls, to go to a UART. But in an embedded system this is rare. I actually do have a printf literally but it is patched to come out via the SWV ITM console output (a data channel on the STLINK V3 debuggers, which is very good for debugs when it works; it is very fast). Normally one uses sprintf() or better still snprintf(), in embedded.
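A rough sketch of the snprintf() approach: format into a bounded buffer, then push the bytes through a byte-output hook. The sink here is a hypothetical stand-in that only captures bytes; on a target it would write to a UART data register or the ITM stimulus port instead. The names dbg_printf and sink_putc are illustrative, not ST functions:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical byte sink; replace the body with the real output path. */
static char   sink_buf[128];
static size_t sink_len;

static void sink_putc(char c)
{
    if (sink_len < sizeof sink_buf)
        sink_buf[sink_len++] = c;
}

/* Format into a bounded stack buffer, then emit byte by byte.
   Returns the number of bytes sent; over-long output is truncated
   to fit the buffer instead of overflowing it. */
static int dbg_printf(const char *fmt, ...)
{
    char line[64];
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    int n = vsnprintf(line, sizeof line, fmt, ap);
    va_end(ap);

    if (n < 0)
        return n;                       /* formatting error */
    if ((size_t)n >= sizeof line)
        n = (int)sizeof line - 1;       /* output was truncated */
    for (int i = 0; i < n; i++)
        sink_putc(line[i]);
    return n;
}
```

The fixed 64-byte line buffer is an arbitrary choice; it bounds stack use, which matters more on a small RTOS task stack than on a PC.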

Never touched arduino or python.

No FreeRTOS, LWIP, TLS code I have seen calls any HAL functions, AFAICT.

nctnico · « **Reply #61 on:** July 12, 2023, 09:04:27 am »

Quote from: peter-h on July 12, 2023, 08:47:04 am

You mean Cube MX - the "push button code generator".

The rest I don't recognise.

If you select FreeRTOS, Cube imports FR into your project. Same with LWIP, IIRC (I didn't do those parts).

Printf() to a UART? I never saw that from ST. You implement putc() or some variation which printf() then calls, to go to a UART. But in an embedded system this is rare.

It depends a bit on how you organise your projects. I do things entirely differently: all my embedded projects have a command line interface that at least uses a serial port but the commands can be streamed over any type of interface allowing all kinds of interfacing. I printf to everything! The basic purpose is to do debugging and it has proven super handy to collect data while devices are running. Many of my projects have some kind of interface or control function and being able to see data and/or tweak values on the fly instead of having to recompile code for every try is super useful. But a command line interface can also be used to transfer firmware updates or do remote settings. I even have a modified version that implements a SCPI with nested command tree structures.
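A very small sketch of such a command line interface, assuming a flat table of commands and a single tunable value. Everything here (the command names, the gain variable, cli_exec) is invented for illustration and is not nctnico's implementation. The reply is written into a caller-supplied buffer so the same code can sit behind a UART, USB CDC or any other transport:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int gain = 1;   /* example tunable, purely illustrative */

typedef int (*cmd_fn)(const char *arg, char *out, size_t out_len);

static int cmd_get_gain(const char *arg, char *out, size_t out_len)
{
    (void)arg;
    return snprintf(out, out_len, "gain=%d", gain);
}

static int cmd_set_gain(const char *arg, char *out, size_t out_len)
{
    if (arg == NULL || *arg == '\0')
        return snprintf(out, out_len, "ERR missing value");
    gain = (int)strtol(arg, NULL, 10);
    return snprintf(out, out_len, "OK");
}

static const struct { const char *name; cmd_fn fn; } cmd_table[] = {
    { "gain?", cmd_get_gain },
    { "gain",  cmd_set_gain },
};

/* Execute one line such as "gain 5". The line is split in place and
   the reply text is written to 'out'. */
static int cli_exec(char *line, char *out, size_t out_len)
{
    assert(line != NULL && out != NULL && out_len > 0);
    char *arg = strchr(line, ' ');
    if (arg != NULL)
        *arg++ = '\0';                  /* split "name arg" */
    for (size_t i = 0; i < sizeof cmd_table / sizeof cmd_table[0]; i++)
        if (strcmp(line, cmd_table[i].name) == 0)
            return cmd_table[i].fn(arg, out, out_len);
    return snprintf(out, out_len, "ERR unknown command");
}
```

Tweaking a value on the fly is then just a matter of typing "gain 5" in a terminal and reading it back with "gain?".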

Siwastaja · « **Reply #62 on:** July 12, 2023, 09:35:59 am »

Quote from: nctnico on July 12, 2023, 09:04:27 am

It depends a bit on how you organise your projects. I do things entirely differently: all my embedded projects have a command line interface that at least uses a serial port but the commands can be streamed over any type of interface allowing all kinds of interfacing. The basic purpose is to do debugging and it has proven super handy to collect data while devices are running. Many of my projects have some kind of interface or control function and being able to see data and/or tweak values on the fly instead of having to recompile code for every try is super useful. But a command line interface can also be used to transfer firmware updates or do remote settings. I even have a modified version that implements a SCPI with nested command tree structures.

Even if this is some extra work upfront, it pays back the time spent many times over. This is generally very good advice, even though we would probably disagree on implementation details.

peter-h · « **Reply #63 on:** July 12, 2023, 10:24:04 am »

Quote

But a command line interface can also be used to transfer firmware updates or do remote settings.

Sure; I have that, but instead of a UART I use the USB VCP (CDC). It is very fast, and internally maps onto serial port 0 (1,2,3,4 are physical UARTs). USB is totally universal, and with Teraterm on a PC the job is done.

The SWV ITM thing is good if you are running a debugger; it needs no interrupts enabled and works immediately from startup, no CPU config needed, and runs at > 1MB/sec.

nctnico · « **Reply #64 on:** July 12, 2023, 10:47:17 am »

But you can't use a debugger in the field that easily. Most of my customers are using the command line interface for monitoring and other system verification purposes. Even if the device doesn't have a USB-CDC interface, a simple UART to USB converter and a terminal emulator is all they need.

AndyC_772 · « **Reply #65 on:** July 12, 2023, 10:52:29 am »

Quote from: peter-h on July 12, 2023, 10:24:04 am

instead of a UART I use the USB VCP (CDC). It is very fast, and internally maps onto serial port 0 (1,2,3,4 are physical UARTs). USB is totally universal, and with Teraterm on a PC the job is done.

That's really interesting, I'd like to be able to do that. Can you give me an idea of what's involved on the MCU and PC ends please?

peter-h · « **Reply #66 on:** July 12, 2023, 10:58:07 am »

You need to have a CPU with USB, then implement the CDC USB profile, then create a read and a write function to it, then install the ST VCP driver on the PC (this is auto downloaded on win10 and above) and install a terminal app on the PC e.g. Teraterm. Getting USB to do anything is quite a lot of work and someone else set it up for me. I believe he used Cube MX to generate the code. It didn't work properly (e.g. there was no flow control so data would sometimes be lost) so I did extra work on it. If you dig around old posts of mine on the topic of USB CDC etc you will find some stuff. I would not want to do it all on my own, ever. There was a lot of debugging to see where USB packets were going, etc.

BTW, wek, re those DISCO boards and LCD, I have it here. STM32F407G-DISC1, underneath it is a DM-STF4BB 120601 V1.1 (with ETH and other stuff), and a 40-way ribbon cable from "CON3" on that to an LCD board DM-LCD35RT 120700 V1.0. So there was an LCD interfaceable to the 407-DISC. I found a file from 2019

Code:

/**
  ******************************************************************************
  * @file    stm324xg_discovery_lcd.c
  * @author  MCD Application Team
  * @version V1.0.0
  * @date    30-September-2011
  * @brief   This file includes the LCD driver for AM-240320L8TNQW00H (LCD_ILI9320)
  *          and AM240320D5TOQW01H (LCD_ILI9325) Liquid Crystal Display Modules
  *          of STM324xG-EVAL evaluation board(MB786) RevB.
  ******************************************************************************

The early project also used PolarSSL, later changed to MbedTLS. I don't know why.

Back to "HAL_*" ST like structures and the more the better. Structures of structures and typedefs of structures of structures. Structs everywhere.

wek · « **Reply #67 on:** July 12, 2023, 04:20:03 pm »

Quote from: peter-h on July 12, 2023, 10:58:07 am

BTW, wek, re those DISCO boards and LCD, I have it here. STM32F407G-DISC1, underneath it is a DM-STF4BB 120601 V1.1 (with ETH and other stuff), and a 40-way ribbon cable from "CON3" on that to an LCD board DM-LCD35RT 120700 V1.0.

There was much enthusiasm around the 'F407 Disco when it came out, with numerous third parties providing such boards, which then even made it into ST's examples (in much the same way as the Adafruit LCD add-on board is part of the demo of most if not all Nucleo-64 boards).

Quote

So there was an LCD interfaceable to the 407-DISC. I found a file from 2019

You mean 2011. That's around the time when the 'F407 - and thus the 'F407 Disco - appeared.

But since you wrote that you intend to use an SPI-connected LCD, I meant you can test the performance - very roughly - using the 'F429 Disco. The controller of the LCD there is capable of 3 modes - raw RGB (where the controller is transparent), parallel (similar to what you have, but here it's unusable as the LCD is not connected to the FMC pins), and SPI. The official examples use the raw RGB mode through the LTDC module on the 'F429 - the primary purpose of that board is to showcase LTDC and also the SDRAM interface in FMC, that's where the official example keeps the framebuffer (in the parallel and SPI modes the framebuffer is in the internal RAM of the controller on the LCD module).

However, my example uses the SPI interface so you can use it as a starting point, if you want.

JW

peter-h · « **Reply #68 on:** July 12, 2023, 05:18:07 pm »

OK; right.

However I would think that - for a known SPI data rate - a far more relevant parameter would be how clever your graphics routines are.

Take a simple example: draw an analog clock. You can redraw the whole clock every second, or even every 200ms for a smooth seconds hand. With a 256x256 LCD and 21 Mbit/s SPI, that is 200kbytes (24 bit colour), which takes 100ms to shift out. That would work. One RTOS task constantly shifting out the 200kbytes of data, and another RTOS task constantly generating the hand positions. A bit of smart sync to avoid artefacts... But a decent library would never do that. It would redraw just the seconds hand mostly, plus the "damaged" parts of the other hands and the background. Especially as the 200kbytes would eat all the RAM you had.
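The arithmetic behind those figures can be checked with a few lines of C. This is a back-of-envelope sketch that ignores command overhead and gaps between transfers; the function names are made up. For 256x256 pixels at 24 bits it gives 196608 bytes, and at 21 Mbit/s about 75 ms of pure transfer time, so the 200kbytes and 100ms quoted above are rounded-up figures:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one full frame. */
static uint32_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t bits_per_pixel)
{
    assert(bits_per_pixel % 8u == 0u);
    return width * height * (bits_per_pixel / 8u);
}

/* Time to shift 'bytes' out over SPI, in microseconds, assuming the
   bus runs back-to-back with no overhead. */
static uint32_t spi_transfer_us(uint32_t bytes, uint32_t spi_bits_per_second)
{
    assert(spi_bits_per_second > 0u);
    return (uint32_t)(((uint64_t)bytes * 8u * 1000000u) / spi_bits_per_second);
}
```

The same two functions make it easy to see what a partial redraw buys: a 32x256 strip containing the seconds hand is one eighth of the data and one eighth of the time.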

Actually this is a good example for a CPU with more RAM: a 32F437 over a 32F417.

Yes, sorry, 2011. The file was dated 2019 presumably because somebody was editing it.

wek · « **Reply #69 on:** July 12, 2023, 05:40:29 pm »

> However I would think that - for a known SPI data rate - a far more relevant parameter would be how clever your graphics routines are.

That heavily depends on the target requirements. In some cases you can't avoid the need for raw power.

JW

peter-h · « **Reply #70 on:** July 12, 2023, 08:11:18 pm »

OK, sure, but then you won't get that with a 437 either.

In so many scenarios, RAM = power, so you could leverage your existing software + experience by a redesign of your board which has some RAM chips on it, a few MB. Beyond that, yeah, you need a faster CPU and then you have to redesign most of your board, and much software. RAM is a central issue in many microcontroller projects, and one tends to have far too little of it on-chip. And if you have it off-chip you lose much of your GPIO.

OTOH one can get really clever with graphics. I've done some impressive graphics with a Z80 at 8MHz, DMA, dual video RAMs, a graphics controller (9367, later UPD7220)... Lots of potential for tricks. But again you may need RAM.

peter-h · « **Reply #71 on:** July 28, 2023, 11:09:25 am »

How about this

Pinout is practically the same.

The one on the right will need water cooling

AndyC_772 · « **Reply #72 on:** July 28, 2023, 12:43:40 pm »

There's a note on migrating the 100 pin LQFP from STM32F4xx to STM32F765 in the data sheet for the 765. Check the pins along the bottom row of that drawing, they're all offset by one.

peter-h · « **Reply #73 on:** July 28, 2023, 01:14:48 pm »

Right; this similarity is obviously not accidental. But you have a 3x higher power dissipation
https://www.eevblog.com/forum/microcontrollers/stm-32f4-reading-cpu-temperature/25/

So I wonder what people do... apart from one of these

With an RTOS you could run what needs to run and then do a WFI. But for any sustained full speed operation you will be looking at a chip temp approaching 100C.
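A sketch of the "run what needs to run, then WFI" idea, assuming FreeRTOS with configUSE_IDLE_HOOK set to 1 so that the kernel calls vApplicationIdleHook() from the idle task. The counter and the macro name are illustrative; on a host build the macro expands to nothing so the sketch still compiles:

```c
#include <stdint.h>

/* WFI only exists on the target; on a host build this is a no-op. */
#if defined(__ARM_ARCH_7EM__) || defined(__ARM_ARCH_7M__)
#define CPU_WAIT_FOR_INTERRUPT() __asm volatile ("wfi" ::: "memory")
#else
#define CPU_WAIT_FOR_INTERRUPT() ((void)0)
#endif

/* Illustrative counter: how often the idle task got to run. */
static volatile uint32_t idle_entries;

void vApplicationIdleHook(void);

/* Called repeatedly by the FreeRTOS idle task when no other task is
   ready. The core sleeps until the next interrupt (for example the
   RTOS tick) instead of spinning at full speed. */
void vApplicationIdleHook(void)
{
    idle_entries++;
    CPU_WAIT_FOR_INTERRUPT();
}
```

How much this lowers the die temperature depends entirely on how much idle time the application actually has; under sustained full load the hook is rarely reached.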

Doctorandus_P · « **Reply #74 on:** July 28, 2023, 01:22:52 pm »

Quote from: DavidAlfa on July 07, 2023, 05:50:37 pm

I suggest you talk about this with ST support, nobody will assist you better.

Huh? That is something I disagree with.
ST has made far too many type numbers, and apparently whole swaths of them are made from the same wafers / dies.
As a result, countless engineers waste many cumulative hours figuring out whether there are compatibility issues or not.
And it's all the fault of the marketing folks. Having a lot of "different" ICs in their portfolio looks nice for ST, while they claim not to be responsible for all those wasted hours. It's a consequence of the ugly world we live in.

