Author Topic: How are interrupts handlers implemented? (Read 9648 times)

IanB · « **Reply #25 on:** May 08, 2022, 08:33:31 pm »

Quote from: nctnico on May 08, 2022, 07:53:50 pm

Quote from: Simon on May 07, 2022, 09:22:31 pm
I'm curious about what code is written in the back end (and beyond looking at how header files go down a wormhole of bits being defined in yet another file) for interrupts. So I know that when an interrupt occurs the processor saves the current state and runs off to a specified memory location. To me, the user, this translates into the automatic calling of a function that the chip manufacturer has predefined. But what code has the manufacturer written in order to have that function be placed in a certain physical location in memory?
The only right answer to this question is: it depends. It depends on which microcontroller / processor is used; there is no universal way. So please specify the microcontroller / processor you are interested in.

Furthermore, the technical answer to this question with any particular microcontroller lies in the datasheet. Processors do not execute the C language, they execute machine code. The datasheet will specify exactly how the hardware handles interrupts and what machine code you have to write to process those interrupts.

C compilers present a high level abstraction that eventually is translated to machine code. Any given C compiler and toolchain generating code for specific hardware will have a particular way to abstract the interrupt handling and make it available to your program.

It is sometimes suggested that to get a good understanding of microcontrollers you should write some simple programs in machine language/assembly language. When dealing with interrupts, I/O and peripheral interfacing, this is an especially good idea. Once you follow the datasheet and see what the hardware is doing, you can get a much better idea of what your development environment is doing behind the scenes.

cv007 · « **Reply #26 on:** May 08, 2022, 08:52:41 pm »

Here is the smallest generated code that will function for a cortex-m, which it seems you are focused on-

https://godbolt.org/z/K6Pfqq5oW

Built with no default libraries and no other startup code: just a single source file and a linker script. There are no includes involved and no manufacturer code; the compiler is doing as it's told but adding nothing. This will only run an infinite loop, but it will run. It just shows that for a cortex-m0 or similar, it's not a difficult job to get the vectors set up, and you can do it yourself if you are inclined to. Not necessarily easy at first, but once the idea hits home it's certainly doable.

Using the manufacturer's code means you get a linker script and startup files, so you are left only with the task of creating an interrupt function with the same name as the one you want to use, which then 'overrides' the weak function of the same name that the startup code created. Certainly easier to use until you want to do something unusual.

Here is a stm32/cortex-m0plus startup/linker example (which is in C++, makes no difference)-
https://github.com/cv007/NUCLEO32_G031K8_B/blob/main/startup.cpp
In this case the RAM is used for the vector table, so the example looks more complicated than the simple one above, but it is not difficult once you understand that the vector table is just a collection of function addresses, plus the initial stack top value.
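To make the "collection of function addresses, plus the initial stack top value" idea concrete, here is a minimal host-side sketch in C. It is not the code from either link above: `vector_entry`, `demo_vectors`, `demo_take_exception` and the handler names are made up for illustration, and the "hardware" is simulated by an ordinary function so the sketch can be built and run on a PC. On a real Cortex-M the table would instead be placed at the start of flash by the linker script.

```c
#include <stddef.h>

typedef void (*demo_handler)(void);

/* One slot of the table: slot 0 holds a stack address, every other slot
   holds a handler address. A union keeps the sketch portable on a PC. */
typedef union {
    void        *stack_top;
    demo_handler handler;
} vector_entry;

/* Slot numbers follow the Cortex-M layout for the entries used here. */
enum {
    VEC_STACK = 0, VEC_RESET = 1, VEC_NMI = 2, VEC_HARDFAULT = 3,
    VEC_SYSTICK = 15, VEC_COUNT = 16
};

static unsigned char demo_stack[1024];   /* stand-in for the stack RAM */
static int reset_runs, systick_runs, default_runs;

static void demo_reset(void)   { reset_runs++; }
static void demo_systick(void) { systick_runs++; }
static void demo_default(void) { default_runs++; }

/* The table itself: nothing but addresses. Slots not listed stay empty. */
static const vector_entry demo_vectors[VEC_COUNT] = {
    [VEC_STACK]     = { .stack_top = demo_stack + sizeof demo_stack },
    [VEC_RESET]     = { .handler = demo_reset },
    [VEC_NMI]       = { .handler = demo_default },
    [VEC_HARDFAULT] = { .handler = demo_default },
    [VEC_SYSTICK]   = { .handler = demo_systick },
};

/* Simulates what the core does for exception n: fetch slot n, call it.
   Returns 0 on success, -1 if the slot is out of range or empty. */
int demo_take_exception(unsigned n)
{
    if (n == VEC_STACK || n >= VEC_COUNT || demo_vectors[n].handler == NULL)
        return -1;
    demo_vectors[n].handler();
    return 0;
}
```

Calling `demo_take_exception(VEC_SYSTICK)` runs `demo_systick` once. A vendor startup file does essentially the same thing, except that it fills every slot with a weak default handler that your own function of the same name replaces at link time.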

Getting a general answer to the question without a specific mcu in mind does not work.

T3sl4co1l · « **Reply #27 on:** May 08, 2022, 09:12:50 pm »

Does that actually work, inline linker script?! Or is that just an example (godbolt still uses whatever it uses stock)?

Tim

westfw · « **Reply #28 on:** May 08, 2022, 09:24:44 pm »

Code: [Select]

#if 0 //linker script
   :
#endif

oooh... That's sort-of cute! Does anyone put their linker scripts in their C source and extract it with the C preprocessor for doing builds?

cv007 · « **Reply #29 on:** May 08, 2022, 11:45:17 pm »

Quote

Or is that just an example

You cannot put a linker script in the online compiler, so it's just a listing of the linker script as it would be in a linker script file. If one actually uses that linker script code (in a linker script) and compiles the startup code (in a source file), you get what is in the comments at the end of the online example (objdump).

westfw · « **Reply #30 on:** May 09, 2022, 02:18:25 am »

Heh. It's one of those contradictions of modern programming:

"To really understand how this works, you should write some code in bare assembly language with no vendor-provided code."
"This is a modern CPU; there is no reason for you to ever try to program it in assembly language!"

----
Creating and/or copying a vector table from Flash to RAM is pretty common (when possible.) You need to do that, or something like that, to get maximum performance out of run-time changeable ISRs. Or other reasons.
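As a rough illustration of the Flash-to-RAM copy, here is a host-side sketch in C. Everything in it is an assumption made for the example: the table is only 8 entries long, the names (`flash_table`, `ram_table`, `sim_vtor`, `sim_set_handler`, ...) are invented, and the pointer `sim_vtor` merely stands in for a vector table offset register, which not every core has.

```c
#include <stddef.h>
#include <string.h>

#define SIM_VECTORS 8

typedef void (*sim_isr)(void);

static int flash_isr_calls, runtime_isr_calls;

void flash_isr(void)   { flash_isr_calls++; }
void runtime_isr(void) { runtime_isr_calls++; }

/* "Flash" copy: const, fixed at build time. */
static const sim_isr flash_table[SIM_VECTORS] = {
    flash_isr, flash_isr, flash_isr, flash_isr,
    flash_isr, flash_isr, flash_isr, flash_isr,
};

/* "RAM" copy: writable, so handlers can be swapped while running. */
static sim_isr ram_table[SIM_VECTORS];

/* Stand-in for the register that says which table the core fetches from. */
static const sim_isr *sim_vtor = flash_table;

/* Copy the table to RAM and repoint the simulated register at the copy. */
void sim_relocate_vectors(void)
{
    memcpy(ram_table, flash_table, sizeof ram_table);
    sim_vtor = ram_table;
}

/* Replace one handler at run time; refused until the table is in RAM. */
int sim_set_handler(unsigned n, sim_isr fn)
{
    if (n >= SIM_VECTORS || sim_vtor != ram_table || fn == NULL)
        return -1;
    ram_table[n] = fn;
    return 0;
}

/* Simulated exception entry: one table lookup, then the call. */
void sim_raise(unsigned n)
{
    if (n < SIM_VECTORS)
        sim_vtor[n]();
}
```

After `sim_relocate_vectors()`, a call such as `sim_set_handler(3, runtime_isr)` changes what `sim_raise(3)` runs without touching flash. The alternative, a fixed flash handler that looks up a function pointer and calls it, costs an extra indirection on every interrupt, which is presumably the "maximum performance" point.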

The CPUs that treat RESET as a type of exception somewhat complicate matters. When that was less common, a CPU on reset would start at some location, and the vectors might be somewhere else (someone mentioned x86 with "start" in high memory and vectors in low memory...) This means that "initial" vectors need to be in ROM/Flash somehow, which is ... so-so.

cv007 · « **Reply #31 on:** May 09, 2022, 03:56:06 am »

Quote

"This is a modern CPU; there is no reason for you to ever try to program it in assembly language!"

So how does that idea fail when using a cortex-m? Except for some things like mrs/dsb/nop instructions, what else requires one to get into assembly if they would rather not?

Quote

This means that "initial" vectors need to be in ROM/Flash somehow, which is ... so-so.

Not all of them. In the link for the stm32 startup, only the stack and reset/nmi/hardfault addresses are in flash. Could probably get by with just the first two, but if a hardfault takes place in the code before the vectors are set up, you get to a known location and can probably figure out what you did wrong instead of ending up who knows where. Once working, could probably eliminate the latter two, but it makes little difference so they stay in place.

HwAoRrDk · « **Reply #32 on:** May 09, 2022, 03:59:59 am »

Quote from: T3sl4co1l on May 08, 2022, 07:14:14 pm

Also to be clear, AVR literally jumps to the interrupt address -- you could write the whole ISR right there in the IVT, if you guarantee nothing ever uses the intervening vectors and jumps into the middle of that ISR! Neat, but not very useful.

I remember reading some blog post where the author did just this - put the whole ISR in the IVT. Don't remember what the overall purpose of the code was, but the author wanted super-minimal latency on the ISR, and it was small enough to put in the IVT. I think also it was the only interrupt to be handled (apart from reset vector, obviously), so the entire remaining table could be used.

peter-h · « **Reply #33 on:** May 09, 2022, 08:09:02 am »

Quote

ARM Cortex CPUs have been designed so that interrupt handlers can be completely normal functions. The CPU internally (in hardware, not software) saves the state of whatever was running, by pushing registers in stack and popping them back after the function returns. Also because the vector table is just a list of function addresses (and usually relocatable in RAM), this can't get any easier for the programmer.

Coming from an assembler background, and Z80 etc where you have to save everything yourself, it took me a while to realise this

However, the ISR still has to clear the interrupt source (the IP - interrupt pending - or whatever bit). And that in turn enables lower priority interrupts to get serviced, so you can choose the point at which you clear that bit. I often wrote ISRs where I cleared the IP right away, which I suspect few people do. It was often necessary because the old CPUs were relatively slow.

Siwastaja · « **Reply #34 on:** May 09, 2022, 08:43:34 am »

Quote from: peter-h on May 09, 2022, 08:09:02 am

Quote
ARM Cortex CPUs have been designed so that interrupt handlers can be completely normal functions
However, the ISR still has to clear the interrupt source (the IP - interrupt pending - or whatever bit). And that in turn enables lower priority interrupts to get serviced, so you can choose the point at which you clear that bit.

This is incorrect: you don't need to clear anything. Lower priority interrupts get served as soon as the higher priority ISR function returns.

However, some peripherals may need their interrupt status bit cleared in the peripheral register, but this is completely manufacturer-specific and not related to the ARM core. Often no explicit clear is needed; for example, reading a data register often also clears the peripheral's interrupt signal.
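A small host-side sketch in C of why the peripheral flag matters even though the core needs no explicit "clear". The timer, its `SR` register, the `FAKE_TIMER_UIF` bit and all function names are hypothetical, and the interrupt line is modelled very crudely as "re-enter the handler while the flag is still set". Real peripherals differ in how a flag is cleared, so the datasheet decides.

```c
#include <stdint.h>

/* Hypothetical timer peripheral with a single status register. */
typedef struct {
    volatile uint32_t SR;            /* bit 0: update interrupt flag */
} fake_timer;

#define FAKE_TIMER_UIF (1u << 0)

static fake_timer TIMER0;
static unsigned good_isr_calls, bad_isr_calls;

/* Clears the flag in the peripheral, so the request goes away. */
void timer_isr_clears_flag(void)
{
    TIMER0.SR &= ~FAKE_TIMER_UIF;
    good_isr_calls++;
}

/* Forgets to clear the flag: the request stays asserted. */
void timer_isr_forgets_flag(void)
{
    bad_isr_calls++;
}

/* Crude model of a level-sensitive interrupt line: as long as the
   peripheral flag is set, the handler is entered again. The limit only
   exists so the simulation terminates. Returns the number of entries. */
unsigned simulate_irq_line(void (*isr)(void), unsigned limit)
{
    unsigned entries = 0;

    TIMER0.SR |= FAKE_TIMER_UIF;     /* the peripheral raises its flag */
    while ((TIMER0.SR & FAKE_TIMER_UIF) != 0 && entries < limit) {
        isr();
        entries++;
    }
    return entries;
}
```

With the first handler the simulated line fires once; with the second it keeps firing until the artificial limit stops it, which on real hardware would look like the main code never getting to run again.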

peter-h · « **Reply #35 on:** May 09, 2022, 09:15:16 am »

Quote

Lower priority interrupts get served as soon as the higher priority ISR function returns.

The CPU must then contain an up/down counter which counts calls and returns of nested function calls within the ISR, and enables lower priority interrupts when the counter returns to zero. Or maybe they save the SP and look for when it matches again. I looked through the ST HAL code ISRs and it is extremely convoluted but they seem to be clearing the IP bits when appropriate, but do nothing else regarding interrupts. Their ISRs are huge...

On the Z80 etc families, you have an IRET/RETI instruction which re-enabled the lower priority ints.

I did a google to try to find out how the ARM32 "RETI" (which doesn't exist as such) is implemented but found nothing. And obviously an ISR can call functions...

DiTBho · « **Reply #36 on:** May 09, 2022, 09:54:30 am »

Quote from: westfw on May 08, 2022, 09:24:44 pm

Does anyone put their linker scripts in their C source and extract it with the C preprocessor for doing builds?

No, because it's evil and more prone to fail

ejeffrey · « **Reply #37 on:** May 09, 2022, 03:27:47 pm »

Quote from: peter-h on May 09, 2022, 09:15:16 am

Quote
Lower priority interrupts get served as soon as the higher priority ISR function returns.

The CPU must then contain an up/down counter which counts calls and returns of nested function calls within the ISR, and enables lower priority interrupts when the counter returns to zero.

They use a magic value in the return address register that tells it how to restore the state. Attempting to load that value to the PC by e.g. a conventional return triggers the interrupt return behavior.

ejeffrey · « **Reply #38 on:** May 09, 2022, 04:23:18 pm »

Quote from: DiTBho on May 09, 2022, 09:54:30 am

Quote from: westfw on May 08, 2022, 09:24:44 pm
Does anyone put their linker scripts in their C source and extract it with the C preprocessor for doing builds?

No, because it's evil and more prone to fail

Then I'm convinced someone has not only done it but also mandated it as the only correct style within their tiny fiefdom :>

DiTBho · « **Reply #39 on:** May 09, 2022, 06:55:21 pm »

Quote from: ejeffrey on May 09, 2022, 04:23:18 pm

Then I'm convinced someone has not only done it but also mandated it as the only correct style within their tiny fiefdom :>

yup, like what Infineon did with their RAD

nctnico · « **Reply #40 on:** May 09, 2022, 09:48:47 pm »

Quote from: IanB on May 08, 2022, 08:33:31 pm

Quote from: nctnico on May 08, 2022, 07:53:50 pm
Quote from: Simon on May 07, 2022, 09:22:31 pm
I'm curious about what code is written in the back end (and beyond looking at how header files go down a wormhole of bits being defined in yet another file) for interrupts. So I know that when an interrupt occurs the processor saves the current state and runs off to a specified memory location. To me, the user, this translates into the automatic calling of a function that the chip manufacturer has predefined. But what code has the manufacturer written in order to have that function be placed in a certain physical location in memory?
The only right answer to this question is: it depends. It depends on which microcontroller / processor is used; there is no universal way. So please specify the microcontroller / processor you are interested in.

Furthermore, the technical answer to this question with any particular microcontroller lies in the datasheet. Processors do not execute the C language, they execute machine code. The datasheet will specify exactly how the hardware handles interrupts and what machine code you have to write to process those interrupts.

C compilers present a high level abstraction that eventually is translated to machine code. Any given C compiler and toolchain generating code for specific hardware will have a particular way to abstract the interrupt handling and make it available to your program.

And that isn't even true for all cases. On ARM Cortex-M microcontrollers you do not need assembly at all to get the microcontroller going. The CPU core is designed to call C functions from an interrupt vector directly (including main()). Like I wrote: how interrupts are handled depends entirely on how the CPU core and interrupt handling is implemented. On some controllers the interrupts are handled by a separate peripheral!

SiliconWizard · « **Reply #41 on:** May 10, 2022, 02:17:45 am »

If you're specifically considering ARM Cortex-M targets, and have questions about priorities, the following may help: https://community.arm.com/arm-community-blogs/b/embedded-blog/posts/cutting-through-the-confusion-with-arm-cortex-m-interrupt-priorities

brucehoult · « **Reply #42 on:** May 10, 2022, 02:35:51 am »

Quote from: ejeffrey on May 09, 2022, 03:27:47 pm

Quote from: peter-h on May 09, 2022, 09:15:16 am
Quote
Lower priority interrupts get served as soon as the higher priority ISR function returns.

The CPU must then contain an up/down counter which counts calls and returns of nested function calls within the ISR, and enables lower priority interrupts when the counter returns to zero.

They use a magic value in the return address register that tells it how to restore the state. Attempting to load that value to the PC by e.g. a conventional return triggers the interrupt return behavior.

What does this magic value look like?

peter-h · « **Reply #43 on:** May 10, 2022, 04:36:37 am »

I started an arm32 specific thread here
https://www.eevblog.com/forum/microcontrollers/how-does-st-32f4-know-when-an-isr-has-finished/msg4165615/#msg4165615

westfw · « **Reply #44 on:** May 10, 2022, 04:51:05 am »

Quote

What does this magic value look like?

From ARMv6m (CM0, CM0+) Architecture reference manual (section B.1.5.6 "Exception Entry Behavior")

Code: [Select]

If CONTROL.SPSEL == '0' then
    LR = 0xFFFFFFF9;
else
    LR = 0xFFFFFFFD;

SPSEL says which stack pointer is used (there are two.)

CM3/CM4 (ARMv7m) is slightly more complex, with 6 different magic values (From FFFFFFE1 to FFFFFFFD) depending on mode (Thread/Handler), Stack (Main/Process), and whether it's saving floating point context or not.
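For anyone who wants to poke at these values, here is a small decoder sketch in C that runs on a PC. The function and type names are made up. The bit meanings (bit 2 selects the stack, bit 3 selects Thread vs Handler mode, bit 4 is clear when a floating-point frame was stacked) and the six concrete values are my reading of the ARMv6-M and ARMv7-M manuals, so treat them as something to check against the manual for your core.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool valid;          /* one of the six known magic values           */
    bool thread_mode;    /* true: return to Thread, false: to Handler   */
    bool process_stack;  /* true: unstack from PSP, false: from MSP     */
    bool fp_frame;       /* true: frame also holds floating-point state */
} exc_return_info;

/* Mirrors the two-line pseudocode quoted above: the value that
   exception entry places in LR, depending on CONTROL.SPSEL. */
uint32_t exc_return_on_entry_v6m(bool control_spsel)
{
    return control_spsel ? 0xFFFFFFFDu : 0xFFFFFFF9u;
}

/* Classify a value found in LR inside a handler. */
exc_return_info decode_exc_return(uint32_t lr)
{
    exc_return_info info = { false, false, false, false };

    switch (lr) {
    case 0xFFFFFFE1u: case 0xFFFFFFE9u: case 0xFFFFFFEDu:
    case 0xFFFFFFF1u: case 0xFFFFFFF9u: case 0xFFFFFFFDu:
        info.valid         = true;
        info.thread_mode   = (lr & (1u << 3)) != 0;
        info.process_stack = (lr & (1u << 2)) != 0;
        info.fp_frame      = (lr & (1u << 4)) == 0;
        break;
    default:
        break;           /* an ordinary return address or a reserved value */
    }
    return info;
}
```

For example, `decode_exc_return(0xFFFFFFFD)` reports a return to Thread mode using the process stack with no floating-point frame. This is also why a handler can end with a plain function return: loading one of these values into the PC is what the core recognises as "exception return".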

brucehoult · « **Reply #45 on:** May 10, 2022, 05:36:43 am »

Quote from: westfw on May 10, 2022, 04:51:05 am

Quote
What does this magic value look like?
From ARMv6m (CM0, CM0+) Architecture reference manual (section B.1.5.6 "Exception Entry Behavior")
Code: [Select]
If CONTROL.SPSEL == '0' then
    LR = 0xFFFFFFF9;
else
    LR = 0xFFFFFFFD;

SPSEL says which stack pointer is used (there are two.)

CM3/CM4 (ARMv7m) is slightly more complex, with 6 different magic values (From FFFFFFE1 to FFFFFFFD) depending on mode (Thread/Handler), Stack (Main/Process), and whether it's saving floating point context or not.

That's very interesting. That sounds to me like some ROM with 4 bytes of Thumb code at each entry point.

I don't think I have any boards with any of the above cores (my ARM stuff is all Cortex A). I have a Teensy with a CM7. Might be interesting to poke around.

DiTBho · « **Reply #46 on:** May 10, 2022, 07:28:39 am »

I don't like it. ARM was simpler years ago.

brucehoult · « **Reply #47 on:** May 10, 2022, 08:48:27 am »

I think I'll continue this here rather than in the more specific thread about how the magic return value works.

I'm looking at an NXP document: https://www.nxp.com/docs/en/application-note/AN12078.pdf

It lists interrupt latency for various cores as:

CPU core	Cycles
Cortex-M0	16
Cortex-M0+	15
Cortex-M3/M4	12
Cortex-M7	10~12

This document shows toggling a GPIO pin on and off after a timer interrupt (which also sends a signal to an output pin) using the following code on an i.MX RT1050 (Cortex-M7) with zero wait state memory:

Code: [Select]

LDR.N R0, [PC, #0x78] ; GPIO2_DR
MOV.W R1, #8388608 ; 0x800000
STR R1, [R0]
MOVS R1, #0
STR R1, [R0]
BX LR ; not shown but I assume

With an oscilloscope they get figures of 10 cycles to enter the interrupt handler, 34 cycles to toggle the pin on, 32 cycles to toggle the pin off. (STR to IO space is much slower than the core speed)

Cortex-M is easy to use, and that's cool, but very "one size fits all". There WILL have been 8 words of stuff stacked by the time you get to the first instruction in your own handler code.
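Those 8 words can be written down as a C struct. This is a sketch under assumptions: the type and helper names are invented, it covers only the basic frame without floating-point state, and it ignores any alignment padding word the core may add. The register order is the one I know from the ARM architecture manuals.

```c
#include <stddef.h>
#include <stdint.h>

/* Basic frame pushed by the core on exception entry, lowest address
   first. These are the caller-saved registers of the ARM procedure call
   standard, which is why the handler can be a plain C function. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;     /* LR of the interrupted code, not the magic value */
    uint32_t pc;     /* where the interrupted code will resume          */
    uint32_t xpsr;
} cm_basic_frame;

/* Hypothetical helper: given a pointer to the stacked frame, report the
   address the interrupted code will resume at. */
uint32_t cm_frame_return_address(const cm_basic_frame *frame)
{
    return frame->pc;
}
```

The struct is 32 bytes, so every interrupt pays for 8 stores on entry and 8 loads on exit whether or not the handler needs all of those registers. Getting a pointer to the frame on real hardware needs a read of the stack pointer at handler entry, which is outside what this sketch shows.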

RISC-V instead puts you in the handler with only a pipeline flush of delay (typically 2-3 cycles), but nothing at all has been saved. But it does give you flexibility.

There are some examples in:

https://github.com/riscv/riscv-fast-interrupt/blob/master/clic.adoc#interrupt-handling-software

Here's a simple non-preemptable interrupt handler that just increments a counter in RAM.

Code: [Select]

      addi sp, sp, -8                # Create a frame on stack.
      sw a0, 0(sp)                   # Save working register.

      sw a1, 4(sp)                   # Save working register.
      lui a0, %hi(INTERRUPT_FLAG)

      sw x0, %lo(INTERRUPT_FLAG)(a0) # Clear interrupt flag.
      lui a1, %hi(COUNTER)

      addi a1, a1, %lo(COUNTER)      # Get counter address.
      li a0, 1

      amoadd.w x0, (a1), a0          # Increment counter in memory.

      lw a1, 4(sp)                   # Restore registers.
      lw a0, 0(sp)

      addi sp, sp, 8                 # Free stack frame.
      mret                           # Return from handler using saved mepc.

I've rearranged that slightly from the code at the link, expanding two pseudo-instructions, assigning concrete frame size, and scheduling and grouping instructions for a hypothetical simple in-order dual-issue core that can do two stores (into a store buffer) or two ALU ops in the same clock cycle, and the 2nd ALU op can depend on the first one (skewed pipes). If I understand the materials I found properly, this is right for the Cortex-M7, so I'm assuming similar µarch for a RISC-V.

What we see is that we're already into the first instruction of the actual useful interrupt code with two working registers available on the 6th clock cycle (3rd for dual-issue), or probably 9 and 6 cycles respectively once you add the pipeline refill.

This same example needs only the amoadd modified to instead set or clear a GPIO pin. Something like reading a character from a UART buffer and writing it into a software buffer could be done with the same two working registers and a handful more instructions.
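The UART case might look like this at the C level. This is a host-side sketch, not code from the CLIC document: the UART, its `DR` register, the buffer size and all function names are hypothetical, and the interrupt is faked by calling the handler directly.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUF_SIZE 16u              /* power of two, so masking works */

/* Hypothetical UART: reading DR returns the received byte. */
typedef struct {
    volatile uint32_t DR;
} fake_uart;

static fake_uart UART0;
static volatile uint8_t  rx_buf[RX_BUF_SIZE];
static volatile uint32_t rx_head;    /* written only by the ISR       */
static volatile uint32_t rx_tail;    /* written only by the main code */

/* ISR body: one load from the peripheral, one store into the buffer. */
void uart_rx_isr(void)
{
    uint8_t  byte = (uint8_t)UART0.DR;
    uint32_t next = (rx_head + 1u) & (RX_BUF_SIZE - 1u);

    if (next != rx_tail) {           /* drop the byte if the buffer is full */
        rx_buf[rx_head] = byte;
        rx_head = next;
    }
}

/* Main-code side: fetch one byte if one is available. */
bool uart_read_byte(uint8_t *out)
{
    if (rx_tail == rx_head)
        return false;
    *out = rx_buf[rx_tail];
    rx_tail = (rx_tail + 1u) & (RX_BUF_SIZE - 1u);
    return true;
}

/* Host-only hook: pretend a byte arrived and the interrupt fired. */
void fake_uart_receive(uint8_t byte)
{
    UART0.DR = byte;
    uart_rx_isr();
}
```

Because the ISR only ever writes `rx_head` and the main code only ever writes `rx_tail`, this single-producer, single-consumer buffer needs no interrupt disabling on a single core. One slot is kept empty to tell "full" from "empty", so the buffer holds at most 15 bytes.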

There is example code at ...

https://github.com/riscv/riscv-fast-interrupt/blob/master/clic.adoc#c-abi-trampoline-code

... for enabling interrupt handlers to be written as standard ABI C functions, with support for interrupt chaining and late-arrival of high priority interrupts. There is extensive commentary there of which parts are run with interrupts disabled and which with interrupts enabled, and also how it all works in general.

The code there is for the standard RISC-V ABI, which requires 16 registers to be saved, vs 8 (including PSW) on Cortex-M.

There are proposals to define an "embedded ABI" with fewer argument registers (perhaps 4 like ARM, vs 8 normally) and fewer temporary registers (perhaps 2 instead of 7) so that only maybe 7 registers need to be saved. While this would certainly make interrupt latency for C handlers much lower, experiments with modifying the compiler for this ABI show slow down and code expansion of normal mainline (background code) of up to 30% because of all the extra register spills required.

So unless the interrupt rate is extremely high or the background processing undemanding it's probably better to stick with the standard ABI! And if there is some particular interrupt that needs very low latency, it can always be written in assembly language. Or in C using __attribute__((interrupt)), which saves only the registers the function actually uses -- calling a normal ABI function from the interrupt function results in a full register save.
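A minimal sketch of the `__attribute__((interrupt))` route, written so that it also builds on a PC. The macro `DEMO_ISR` and the function names are made up. The attribute is only applied when compiling for RISC-V, because other targets give the same attribute a different meaning; elsewhere the handler is just a normal function, which is enough to check the logic but says nothing about the generated prologue.

```c
#include <stdint.h>

/* GCC's RISC-V port accepts __attribute__((interrupt)) on a function.
   On any other target the macro is empty and the function is ordinary. */
#if defined(__riscv)
#define DEMO_ISR __attribute__((interrupt))
#else
#define DEMO_ISR
#endif

static volatile uint32_t demo_ticks;

/* Leaf handler: it uses very few registers and calls nothing, so on
   RISC-V the compiler-generated save/restore stays short. */
DEMO_ISR void demo_timer_isr(void)
{
    demo_ticks++;
}

/* Read side for the main code. */
uint32_t demo_tick_count(void)
{
    return demo_ticks;
}
```

On a RISC-V target `demo_timer_isr` must only be entered by the hardware, never called directly, since it returns with an interrupt-return instruction rather than a normal return. Adding a call to an ordinary function inside it would bring back the full register save described above.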

brucehoult · « **Reply #48 on:** May 10, 2022, 08:53:44 am »

Quote from: DiTBho on May 10, 2022, 07:28:39 am

I don't like it. ARM was simpler years ago.

Simpler internally, or simpler to use?

DiTBho · « **Reply #49 on:** May 10, 2022, 10:04:14 am »

Quote from: brucehoult on May 10, 2022, 08:53:44 am

Simpler internally, or simpler to use?

Internally. I am a RISC-purist, MIPS-addicted.

