Author Topic: Do i really need to use an RTOS? (Alternatives to finite state machines?)  (Read 18464 times)


Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 273
  • Country: ua

I still don't follow. This is not what RTOS does. Normally either the CPU core (for example, ARM Cortex), or the C compiler does this register saving and task switching. You of course have to configure the timer interrupt but this is not difficult at all.


I don't think so. C compiler certainly does not do any task switching, and it certainly does not do register saving for that. That's an OS's job.

What, in your opinion, does an RTOS do, if task switching / register saving is done by hardware or a compiler?
« Last Edit: June 12, 2022, 12:02:27 pm by tellurium »
Open source embedded network library https://github.com/cesanta/mongoose
TCP/IP stack + TLS1.3 + HTTP/WebSocket/MQTT in a single file
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi

I still don't follow. This is not what RTOS does. Normally either the CPU core (for example, ARM Cortex), or the C compiler does this register saving and task switching. You of course have to configure the timer interrupt but this is not difficult at all.


I don't think so. C compiler certainly does not do any task switching, and it certainly does not do register saving for that. That's an OS's job.

You need to be more specific about what you practically mean by task switching, and which "registers" you are talking about, since clearly they are not CPU registers. I have no idea what you are talking about!

The CPU (plus, potentially, the compiler) does exactly what you mentioned: CPU context and general-purpose register stacking, a jump to the handler, then context retrieval and a jump back into what was interrupted, so that the interrupted code never knows anything happened.

An OS offers some additional features like separate per-task stack spaces (so the SP needs to be modified), which is great for memory protection. But in many MCU projects that is completely unnecessary, since sharing the stack Just Works.
« Last Edit: June 12, 2022, 12:04:40 pm by Siwastaja »
 

Offline abquke

  • Regular Contributor
  • *
  • Posts: 128
  • Country: us
RTOSes are there to provide a framework so the programmer doesn't have to use semaphores directly. From what I've seen, anyway.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi
RTOSes are there to provide a framework so the programmer doesn't have to use semaphores directly. From what I've seen, anyway.

WTF. RTOS is exactly what brings you the semaphores, which you do use "directly". Because it's an OS concept and requirement. Do you have any idea what you are talking about?

I hate this kind of namedropping.

In every freaking discussion about RTOS.

"Oh, you need an RTOS because RTOS is the only way to do flux capacitor recirculation. Without RTOS, you need to write your own space vector route planner which is at least 100 000 lines of code, so better use RTOS"

I have witnessed this behavioral pattern so many times (and not only on this forum) that if someone suggests using an RTOS for a project, I'm really careful and evaluate their actual skills before committing to the idea. It's possible they are completely out of their depth and unable to do even a simple project, because they think an RTOS is a silver bullet that replaces understanding and thinking, while in reality an RTOS brings many abstracted concepts you need to understand well at a fundamental level. Parallel programming, mutexes, semaphores etc. are not completely trivial. For really complex and large projects, with an experienced team who knows an RTOS (and the tools and instrumentation it provides) well, I'm open to the idea, though. There it will probably help.
« Last Edit: June 12, 2022, 12:09:31 pm by Siwastaja »
 

Offline abquke

  • Regular Contributor
  • *
  • Posts: 128
  • Country: us
RTOSes are there to provide a framework so the programmer doesn't have to use semaphores directly. From what I've seen, anyway.

WTF. RTOS is exactly what brings you the semaphores, which you do use "directly". Because it's an OS concept and requirement. Without OS, no semaphores. Do you have any idea what you are talking about?

I hate this kind of namedropping.

In every freaking discussion about RTOS.

"Oh, you need an RTOS because RTOS is the only way to do flux capacitor recirculation. Without RTOS, you need to write your own space vector route planner which is at least 100 000 lines of code, so better use RTOS"

If you set a bit in an ISR and then check if the bit is set in main or elsewhere, it's a semaphore. Just like data structures can be "objects" if the programmer wants to imagine them that way.
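The pattern being described can be sketched in a few lines of hosted C (the function and variable names are made up for illustration; on a real MCU `uart_rx_isr` would sit in the vector table, and `poll_once` would be one iteration of main):

```c
#include <stdbool.h>

/* "ISR" sets a flag; the main context notices it, clears it, and does
 * the actual work. The volatile qualifier stops the compiler from
 * caching the flag in a register across the polling loop. */
static volatile bool rx_ready;       /* set in ISR context, read in main */

void uart_rx_isr(void) {
    rx_ready = true;                 /* keep the ISR short: just signal */
}

/* One superloop iteration: returns 1 if an event was handled. */
int poll_once(void) {
    if (rx_ready) {
        rx_ready = false;            /* acknowledge before processing */
        return 1;                    /* pretend we processed the byte */
    }
    return 0;
}
```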
 
The following users thanked this post: Siwastaja

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi
RTOSes are there to provide a framework so the programmer doesn't have to use semaphores directly. From what I've seen, anyway.

WTF. RTOS is exactly what brings you the semaphores, which you do use "directly". Because it's an OS concept and requirement. Without OS, no semaphores. Do you have any idea what you are talking about?

I hate this kind of namedropping.

In every freaking discussion about RTOS.

"Oh, you need an RTOS because RTOS is the only way to do flux capacitor recirculation. Without RTOS, you need to write your own space vector route planner which is at least 100 000 lines of code, so better use RTOS"

If you set a bit in an ISR and then check if the bit is set in main or elsewhere, it's a semaphore. Just like data structures can be "objects" if the programmer wants to imagine them that way.

OK, that clarifies what you meant, thanks and sorry.

But an RTOS does not magically solve resource sharing at all. You have the exact same responsibility to manage the resources, whether by setting flags or by using mutex/semaphore mechanisms. And parallel programming with shared resources is notoriously difficult; bugs with rarely-appearing race conditions do happen.

And large linear functions which take mutexes, with other functions waiting for those mutexes to be released, are not at all easy to analyze and prove correct.
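As a stand-in for RTOS primitives (using POSIX threads, since they are easy to try on a PC; all names here are invented for illustration), this sketch shows that the responsibility really does stay with the programmer: the count comes out right only because the lock is taken around every single access.

```c
#include <pthread.h>

/* Mutex-protected shared counter. Forget the lock in even one place
 * and you get exactly the rarely-appearing race conditions described
 * above -- the mutex is a tool, not a solution. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* programmer must remember this */
        shared_counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

long run_two_workers(void) {
    pthread_t a, b;
    shared_counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;           /* 200000 only thanks to the mutex */
}
```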
 

Offline abquke

  • Regular Contributor
  • *
  • Posts: 128
  • Country: us

OK, that clarifies what you meant, thanks and sorry.

But an RTOS does not magically solve resource sharing at all. You have the exact same responsibility to manage the resources, whether by setting flags or by using mutex/semaphore mechanisms. And parallel programming with shared resources is notoriously difficult; bugs with rarely-appearing race conditions do happen.

And large linear functions which take mutexes, with other functions waiting for those mutexes to be released, are not at all easy to analyze and prove correct.

Nope. But it does let one put a shiny sticker on it that says "RTOS" and allow inexperienced (read "cheap") programmers access to real time MCU programming.
 
The following users thanked this post: Siwastaja

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi

OK, that clarifies what you meant, thanks and sorry.

But an RTOS does not magically solve resource sharing at all. You have the exact same responsibility to manage the resources, whether by setting flags or by using mutex/semaphore mechanisms. And parallel programming with shared resources is notoriously difficult; bugs with rarely-appearing race conditions do happen.

And large linear functions which take mutexes, with other functions waiting for those mutexes to be released, are not at all easy to analyze and prove correct.

Nope. But it does let one put a shiny sticker on it that says "RTOS" and allow inexperienced (read "cheap") programmers access to real time MCU programming.

This is exactly what I have seen, and it's a pretty dangerous pattern. Now of course if it's some trivial ILB (Industrial LED Blinker), then it does not matter. An experienced developer does it bare-metal in a week, well verified and tested. Alternatively, you can choose to use an RTOS, hire ten inexperienced monkeys to do it in a year, and it will finally work. And everybody's proud because NASA used an RTOS on Mars, too.

But what I am worried about is seeing fairly complex projects turn into chaos because no one has any idea how to approach them: simple bare-metal approaches are close to their limits, and it's hard to find a good team. An RTOS seemingly offers more, but if most RTOS developers you see are those who developed the aforementioned ILB, they won't succeed in an actually challenging project. So you need either an experienced bare-metal team or an experienced RTOS team.

And this is unsurprising, because all the true challenges are fundamental to software design and implementation, and an RTOS cannot magically solve them. Personally, I have found that the more control I have over things, the better the chances of success. Thus I avoid excess layers of abstraction, and very much prefer to write nested interrupt handlers instead of threads with yielding or locking calls; but this is a matter of taste, since both have pretty much the same challenges and require similar understanding.
« Last Edit: June 12, 2022, 12:33:22 pm by Siwastaja »
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 273
  • Country: ua

You need to be more specific what you practically mean with task switching, and which "registers" you are talking about, clearly not about CPU registers. I have no idea what you are talking about!

CPU (+ potentially compiler) does exactly what you mentioned, CPU context + general purpose register stacking, jump to the handler, context retrieval and jump back into what was interrupted, so that the interrupted part does not know anything about what happened.

OS offers some additional features like separate per-task stack spaces (so that SP needs to be modified), which is great for memory protection. But in many MCU projects completely unnecessary, since sharing the stack Just Works.

What you've described is interrupt handling, and yes it works exactly as you've described.

It is possible to organise a program by clever use of interrupts and priorities - there is no doubt about that. In this case, however, an interrupt handler must be written in a special way. For example, it should not use infinite loops. It should return as fast as possible. It cannot keep state on a stack between calls - the state should be kept elsewhere.

When a "task" is used, what is usually meant is a user-written function with its own stack, whose behavior is very different from an ISR's. Tasks often run for a long time, or forever. For example, a task could process UART input to implement a CLI:

Code: [Select]
void *uart_cli_task(void *param) {
   char buf[100]; // On-stack state
   int len = 0;
   for (;;) { uart_read(); ..... }   // Infinite loop!
}

That's what a "task" is. Of course you can implement the same using a superloop. But in that case, your "tasks" cannot use infinite loops or long-executing calls. Of course you can use interrupts to call some other "tasks" - but again, the ISRs would need to be written with care.
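As a sketch of what that restructuring looks like, here is the same CLI idea rewritten so it can be called from a superloop: the state moves off the stack into a struct, and each call consumes at most one byte and returns immediately (the names are illustrative, not from any real API):

```c
#include <string.h>

/* Superloop-friendly CLI "task": state lives in a struct, not on the
 * stack, because the function must return between bytes. */
struct cli_state {
    char buf[100];   /* was on-stack state in the task version */
    int  len;
};

/* Feed one received byte; returns 1 when a full line is ready in s->buf.
 * No infinite loop, no blocking -- the superloop calls this whenever the
 * UART has a byte available. */
int cli_feed(struct cli_state *s, char c) {
    if (c == '\n') {
        s->buf[s->len] = '\0';
        s->len = 0;
        return 1;                    /* caller processes the command */
    }
    if (s->len < (int)sizeof(s->buf) - 1)
        s->buf[s->len++] = c;
    return 0;                        /* still accumulating */
}
```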

If firmware has many such "tasks", where any "task" can block for a long time, then using a task scheduler - which is what an RTOS provides - is a viable solution. One cannot directly use those "tasks" as ISRs; the architecture of the firmware would have to change.

Writing firmware in an "RTOS" way is arguably easier. Writing firmware without an RTOS, with clever use of interrupts and priorities, requires more skill and is harder - but it often results in considerably better performance.
 

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 273
  • Country: ua
Let's have a practical example.

The firmware needs to make several TLS requests. Each TLS request requires a lengthy crypto calculation; let's say it executes in 3 seconds. Sometimes there is 1 request in flight, sometimes 3 or 5. All requests are equal in importance:

Code: [Select]
void make_request(const char *destination) {
  struct state state;
  connect(&state);         // Can take a very long time. Uses an external crypto library, so cannot be rewritten as an FSM
  send_request(&state);
  read_response(&state);   // Can take a very long time
  process_response(&state);
}

That's a good case for using a scheduler (an RTOS): each such "task" will get its own time slice, and all will progress simultaneously.

Implementing that using interrupts and ISRs would be possible, but IMO more complicated.
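A rough sketch of the scheduler-based approach, using POSIX threads on a PC as a stand-in for RTOS tasks (the blocking TLS steps are stubbed out, and all names are placeholders): each request gets its own thread with its own stack, and the scheduler runs the others while one blocks.

```c
#include <pthread.h>

/* Each request runs in its own thread; connect()/read_response() would
 * block here, and the scheduler would simply run the other
 * request-threads meanwhile. */
struct request { int id; int result; };

static void *make_request(void *param) {
    struct request *r = param;
    /* Stand-in for connect(), send_request(), read_response(),
     * process_response() -- the lengthy crypto work. */
    r->result = r->id * 2;
    return NULL;
}

/* Launch n concurrent requests (up to 8) and wait for them all. */
int run_requests(int n, struct request *reqs) {
    pthread_t tid[8];
    if (n > 8) return -1;
    for (int i = 0; i < n; i++)
        pthread_create(&tid[i], NULL, make_request, &reqs[i]);
    for (int i = 0; i < n; i++)
        pthread_join(tid[i], NULL);  /* all progress "simultaneously" */
    return 0;
}
```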
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 18014
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics


This is exactly what I have seen, and it's a pretty dangerous pattern. Now of course if it's some trivial ILB (Industrial LED Blinker), then it does not matter. An experienced developer does it bare-metal in a week, well verified and tested. Alternatively, you can choose to use an RTOS, hire ten inexperienced monkeys to do it in a year, and it will finally work. And everybody's proud because NASA used an RTOS on Mars, too.

But what I am worried about is seeing fairly complex projects turn into chaos because no one has any idea how to approach them: simple bare-metal approaches are close to their limits, and it's hard to find a good team. An RTOS seemingly offers more, but if most RTOS developers you see are those who developed the aforementioned ILB, they won't succeed in an actually challenging project. So you need either an experienced bare-metal team or an experienced RTOS team.

And this is unsurprising, because all the true challenges are fundamental to software design and implementation, and an RTOS cannot magically solve them. Personally, I have found that the more control I have over things, the better the chances of success. Thus I avoid excess layers of abstraction, and very much prefer to write nested interrupt handlers instead of threads with yielding or locking calls; but this is a matter of taste, since both have pretty much the same challenges and require similar understanding.

It's the Arduino mentality. I started my programming on the Arduino, so I can't speak for life before it, but I only ever did one project in Arduino, as a proof of concept, before converting to bare metal. More recently I became aware of code configurators and RTOSes, which seem to have a buzz about them these days, a bit like the Arduino, because people can do things without fully understanding what they are doing; or at least that is the perception. Everyone wants a shortcut, and you will get people who see any problem as needing the one solution they are a fan of, even if it is over the top.

I approach these miracle solutions with caution, which is why I have asked a few questions having seen the discussion. I always ask myself: when I have to pick up this project again in 5 years, how will I go about refamiliarizing myself with it? If it is too complex in its layout, will I manage it? That is why I thought up my own very small scheduling system: not because it lets me get better results or work faster, but because it will read more easily later.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4443
  • Country: nz
If you set a bit in an ISR and then check if the bit is set in main or elsewhere, it's a semaphore. Just like data structures can be "objects" if the programmer wants to imagine them that way.

That's not a semaphore, it's a flag. They might be the same thing on a boat, but they're not in programming.
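The difference can be shown with minimal single-threaded models of each (illustrative only; a real semaphore also blocks and wakes waiters atomically): a flag is one bit, so setting it twice before anyone looks loses an event, while a counting semaphore keeps a count and loses nothing.

```c
/* Flag vs. counting semaphore, reduced to their bare semantics. */
static int flag;        /* 0 or 1: a second "set" before a "take" is lost */
static int sem_count;   /* 0..N: every "give" is remembered */

void flag_set(void)  { flag = 1; }
int  flag_take(void) { int v = flag; flag = 0; return v; }

void sem_give(void)  { sem_count++; }
int  sem_take(void)  { return sem_count > 0 ? (sem_count--, 1) : 0; }
```

Set the flag twice and you can take it only once; give the semaphore twice and you can take it twice.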

 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi
It is possible to organise a program by clever use of interrupts and priorities - there is no doubt about that. In this case, however, an interrupt handler must be written in a special way. For example, it should not use infinite loops. It should return as fast as possible. It cannot keep state on a stack between calls - the state should be kept elsewhere.

Of course you can keep state: if the handler is one function, the static keyword inside the function is enough. If the handler is multiple functions, then make the variables compilation-unit (file) scope statics. With an RTOS you can change the terminology so that now it's a "stack", which grows downwards like stacks do, but I don't see a large fundamental difference. What I do see is people overrunning the stack with an RTOS and then experimentally modifying the task configuration to increase stack space per task, wasting a lot of memory in the process, and still not proving there actually is enough memory. I don't like it.
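For example, a function-local static is enough to accumulate state across "ISR" invocations; a hosted-C sketch, where `sample_isr` is a hypothetical handler name rather than a real vector:

```c
/* State across handler invocations without any per-task stack: the
 * statics live in .bss and survive between calls, just as a task's
 * stack variables would survive across blocking calls. */
int sample_isr(int sample) {
    static int sum;                  /* accumulated between invocations */
    static int count;
    sum += sample;
    if (++count == 4) {              /* every 4th call, report average */
        int avg = sum / 4;
        sum = count = 0;
        return avg;
    }
    return -1;                       /* not ready yet */
}
```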

I also don't like the "as fast as possible" rule of thumb. Doing things takes the time it takes; if you can't make it in time, it's not about the coding paradigm (ISR vs. concurrent thread); both will fail, and you need to change the actual algorithm to succeed in the given time. Interrupt priorities solve this problem: if something more urgent happens, it is served, and the lower-priority task's execution continues afterwards.

I do remember, of course, working with the AVR, where nesting interrupts was considered a "trick" (which I never actually used), and hence there were rules of thumb like "keep ISRs very short, only set flags, which you check in the main loop". I don't like this pattern very much, and with ARM Cortex MCUs you don't need it unless it feels like the right way. Instead, you can just assign IRQ priorities, and this seems to align pretty well with the actual physical nature of things. Inductor overcurrent comparator? That's highest priority. Collecting a datapoint from a sensor? That's lower. Processing all 128 accumulated samples once the previously mentioned handler has collected 'em all? That's even lower.

You can of course do the same in RTOS threads / tasks, running infinite loops and waiting for events / yielding / waiting for semaphores / whatever the term. Is it easier? Not for me, I think it's of equal difficulty, just different.

Quote
That's what a "task" is. Of course you can implement the same using a superloop. But in that case, your "tasks" cannot use infinite loops or long-executing calls. Of course you can use interrupts to call some other "tasks" - but again, the ISRs would need to be written with care.

You don't really need superloops. This is the superloop from a fairly complex inertial measurement + sensor fusion + processing system:
Code: [Select]
while(1)
{
; //handle_uart_send_binary();
}

You can see I had some weird idea about adding something to the superloop and then changed my mind and commented it out. Because superloops are kinda crappy.

So just keep state and trigger functions. But I do see the point: every trigger results in a certain function executing from the beginning. You see this as a limitation; I see it as a strength, because it forces you to partition the code into smaller processing units, where each function is connected to some actual trigger mechanism. This trigger mechanism can be flexible: it can be an IRQ from a peripheral; it can be a software interrupt from a higher-priority interrupt handler; or it can be just a regular function call. But every function is triggered by something; things do not just "magically happen".
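One possible sketch of such a trigger mechanism in plain C: a table mapping event numbers to handler functions, the way IRQ numbers map to vectors. All names are invented; this is a software stand-in for what the NVIC does in hardware.

```c
/* Trigger table: each function is tied to an explicit event number,
 * so nothing "magically happens" -- every handler has a trigger. */
#define EV_MAX 4

static int last_handled = -1;        /* records which handler ran last */

static void on_overcurrent(void) { last_handled = 0; }  /* most urgent */
static void on_sample(void)      { last_handled = 1; }

static void (*trigger_table[EV_MAX])(void) = {
    on_overcurrent,                  /* event 0 */
    on_sample,                       /* event 1; 2..3 unused (NULL) */
};

/* Dispatch one event; returns the event number, or -1 if unhandled. */
int dispatch(int ev) {
    if (ev < 0 || ev >= EV_MAX || !trigger_table[ev]) return -1;
    trigger_table[ev]();
    return ev;
}
```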

And this is way way better than setting flags and checking them in superloop, although that works, too.
 
The following users thanked this post: tellurium

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less


Not true, for preemptive RTOSs where a task can be suspended between any two instructions.

The RTOS contains a scheduler, whose job is to determine which task should be executing at any given instant. The scheduler is not executed unless there is a reason to re-determine which task should be running. Hence until something happens, the current task will continue executing.


Ah yes, of course, I was not considering that the RTOS can do the work in a timed interrupt. That makes sense. So if I understand correctly, the stack has to be saved so that the current task can be picked up again, and when it resumes, anything changed by the task it gave way to is still unknown to it. Wow, so these tasks must be quite isolated from each other, and must have enough to do that the swap overhead is worth it. I don't think my programs are there yet; if anything they are made in small chunks that talk to each other with global variables.

Nothing should be done in any interrupt, except determine the event that occurred and mutate that into a message put into a queue for the scheduler to observe, and optionally kick the scheduler into life.

The task state has to be saved when a task switch occurs. What's in the task state is processor dependent and is invisible to C compilers. Typically it will include the PC, the stack pointer, the condition codes, and the register set including FP registers. Note that many of those can be automatically saved by hardware when the interrupt is recognised and processed.

But on some processors there is a vast amount of "hidden" state, e.g. any out-of-order instructions that are halfway to being completed. With the Itanium there were thousands of such registers, instructions being speculatively executed, TLB spills and other cache misses - and that significantly impacted the process-switch time.

All that is why process switch time in Unix/Windows/etc machines is so heavyweight, and hence the preference for cooperatively scheduled "green threads" within a single process.
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi
Let's have a practical example.

A firmware need to create several TLS requests. Each TLS request requires a lengthy crypto calculation, let's say it executes in 3 seconds. Sometimes it needs 1 simultaneous request, sometimes 3 or 5. All requests are equal in importance:

Code: [Select]
void make_request(const char *destination) {
  struct state state;
  connect(&state);         // Can take a very long time. Uses an external crypto library, so cannot be rewritten as an FSM
  send_request(&state);
  read_response(&state);   // Can take a very long time
  process_response(&state);
}


Yeah, it's a typical example. But OTOH, it is the antithesis of what real-time systems are. If you have any real-time requirements, then is it an option to take a black-box library which takes "approximately 3 seconds" and time-slice it across 3-5 instances which, together, maybe take 5*3 seconds on average, with no idea about worst-case timing, and no idea about the order in which the calls finish? Of course not; if it matters, you just have to dig into the library and rewrite or modify it for your real-time needs.

Of course, often this does not matter at all, but then it is a general-purpose computing case, and sounds like a job for a general-purpose OS; nothing to do with real time. Which is the point: there are many reasons why operating systems exist. But an RTOS? All the arguments for RTOSes are really arguments for a GPOS (general-purpose OS), while the real-time capabilities are pretty unimpressive.

Frankly, given your problem, I would consider the following two options:
* Bare-metal MCU: just make make_request a low-priority task (either in the superloop, or in a low-priority ISR) and let others interrupt it. If you have five, run it five times, using a FIFO request queue for example. The compromise is that if the five TLS requests are pushed into the queue at roughly the same time, they won't finish at roughly the same time, but at ~3-second intervals, in the order they were pushed. But your version won't prove the worst-case time or order anyway. Which is better is arguable; one could say that the first TLS request, from 1.5 seconds ago, deserves to be completed before any work starts on the next one.

* Real computer, real OS: throw a Raspberry Pi or some more serious industrial PC at the problem. Now it can run full Linux or BSD or whatever, with much greater capabilities and extensibility. Let a separate microcontroller handle all the real-time stuff.
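The FIFO request queue mentioned in the first option could be sketched as a simple single-producer/single-consumer ring buffer (names made up; on a real MCU the push side would typically run in an ISR, the pop side in the low-priority context):

```c
/* SPSC ring buffer of request ids: push in arrival order, serve in
 * arrival order. QSIZE is a power of two so the index wrap is cheap. */
#define QSIZE 8

static int q[QSIZE];
static unsigned q_head, q_tail;      /* head: next pop, tail: next push */

/* Returns 0 on success, -1 if the queue is full. */
int q_push(int id) {
    if (q_tail - q_head == QSIZE) return -1;
    q[q_tail++ % QSIZE] = id;
    return 0;
}

/* Returns the oldest request id, or -1 if the queue is empty. */
int q_pop(void) {
    if (q_head == q_tail) return -1;
    return q[q_head++ % QSIZE];
}
```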

But yours sounds like a valid case for a cheap IoT device: you can't afford a real computer (price, size, power budget), and that's fine; you can't afford real-time designers, and don't need real-time design, so that's fine too -> timing doesn't need to be predictable, no need for complicated software engineering, any developer will do, pre-existing code will do -> throw a small OS that can run on an MCU at the problem -> it just happens that such an OS is called an "RTOS" -> mention it in your CV, because NASA also used an RTOS on Mars.
 
The following users thanked this post: tellurium

Offline tellurium

  • Frequent Contributor
  • **
  • Posts: 273
  • Country: ua
What I do see is people overrunning the stack with an RTOS and then experimentally modifying the task configuration to increase stack space per task, wasting a lot of memory in the process, and still not proving there actually is enough memory. I don't like it.

Yes! Memory waste and stack overflows are very common.

Quote
Instead, you can just assign IRQ priorities, and this seems to align pretty well with the actual physical nature of things. Inductor overcurrent comparator? That's highest priority. Collecting a datapoint from a sensor? That's lower. Processing all 128 accumulated samples once the previously mentioned handler has collected 'em all? That's even lower.

That is a very nice approach indeed.
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi
Nothing should be done in any interrupt, except determine the event that occurred and mutate that into a message put into a queue for the scheduler to observe, and optionally kick the scheduler into life.

The task state has to be saved when a task switching occurs. What's in the task state is processor dependent and is invisible to C compilers. Typically it will include the PC, stack pointer, the condition codes, the register set including FP registers. Note that many of those can be automatically saved by hardware when the interrupt is recognised and processed.

It is not a coincidence that ARM Cortex MCUs handle saving and restoring this state fully, so that interrupt handlers can be just plain functions. This makes programming said CPUs easy, because you totally can do the stuff in the ISRs directly, instead of pushing event messages and popping them in a separate scheduler. The end result is functionally the same, with two key differences:
* Performance is better when you don't need to run your own scheduler / queue-handling code; high-priority events can be responded to in dozens of nanoseconds.
* When you don't need to write any scheduler code, you can't make mistakes writing it.

This results in event-driven code which is quite easy to write, understandable, and easy to analyze (because each function is related to some actual trigger mechanism). Similar patterns have been popular in surprising places like web development: instead of a JavaScript onClick() function, you have a C onOvercurrent(). And not some overcurrentGovernor() while(1) if(overcurrent) blah_blah task, which is the RTOS way; that does not look any cleaner, quite the opposite, and only adds a performance penalty.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
But an RTOS does not magically solve resource sharing at all. You have the exact same responsibility to manage the resources, whether by setting flags or by using mutex/semaphore mechanisms. And parallel programming with shared resources is notoriously difficult; bugs with rarely-appearing race conditions do happen.

And large linear functions which take mutexes, with other functions waiting for those mutexes to be released, are not at all easy to analyze and prove correct.

RTOSs certainly aren't magic, and merely mutate the issues into a different form.

As has been known since the '60s, mutexes and semaphores are the fundamental mechanisms necessary for RTOSs. But, as you imply, using "naked" mutexes/semaphores is error-prone and leads to spaghetti-like code with subtle failure modes. For most practical RTOS applications, it is better to use one of the many slightly higher-level abstractions which are built on mutexes/semaphores.

Such abstractions are well known and appear as functions in typical RTOSs, e.g. mailboxes, queues/buffers, fork-join and so on. Curiously, one of the better sets of examples was Doug Lea's Concurrency Utilities, which he created in Java 25 years ago. Those were based on well-proven real-time design strategies from other domains, and were so successful that they are now incorporated into the standard Java library. https://docs.oracle.com/javase/8/docs/technotes/guides/concurrency/overview.html https://docs.oracle.com/javase/tutorial/essential/concurrency/
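As one example of such an abstraction, here is a one-slot mailbox built on a mutex and a condition variable using POSIX threads, so callers pass a value between contexts without touching the naked primitives themselves. A sketch only: RTOS mailboxes are typically queued and ISR-safe, which this is not.

```c
#include <pthread.h>

/* One-slot mailbox: post() deposits a value, wait() blocks until one
 * is available. The mutex/condvar plumbing is hidden inside. */
struct mailbox {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int             value;
    int             full;            /* 0 = empty, 1 = holds a value */
};

void mbox_init(struct mailbox *m) {
    pthread_mutex_init(&m->mu, NULL);
    pthread_cond_init(&m->cv, NULL);
    m->full = 0;
}

void mbox_post(struct mailbox *m, int v) {
    pthread_mutex_lock(&m->mu);
    m->value = v;
    m->full = 1;
    pthread_cond_signal(&m->cv);     /* wake a waiting receiver */
    pthread_mutex_unlock(&m->mu);
}

int mbox_wait(struct mailbox *m) {   /* blocks until a value arrives */
    pthread_mutex_lock(&m->mu);
    while (!m->full)
        pthread_cond_wait(&m->cv, &m->mu);
    m->full = 0;
    int v = m->value;
    pthread_mutex_unlock(&m->mu);
    return v;
}
```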
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
And this is unsurprising, because all the true challenges are fundamental to software design and implementation, and an RTOS cannot magically solve them. Personally, I have found that the more control I have over things, the better the chances of success. Thus I avoid excess layers of abstraction, and very much prefer to write nested interrupt handlers instead of threads with yielding or locking calls; but this is a matter of taste, since both have pretty much the same challenges and require similar understanding.

Yes and no.

While I definitely avoid excess levels of abstraction, the abstractions in a well-implemented RTOS enable me to ignore (and be ignorant of) many processor-specific details. That lets me concentrate my attention on solving my problem, including avoiding the traps associated with undisciplined multitasking and undisciplined interrupts.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
If you set a bit in an ISR and then check if the bit is set in main or elsewhere, it's a semaphore. Just like data structures can be "objects" if the programmer wants to imagine them that way.

That's not a semaphore, it's a flag. They might be the same thing on a boat, but they're not in programming.

Neatly put :)
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Nothing should be done in any interrupt, except determine the event that occurred and mutate that into a message put into a queue for the scheduler to observe, and optionally kick the scheduler into life.

The task state has to be saved when a task switching occurs. What's in the task state is processor dependent and is invisible to C compilers. Typically it will include the PC, stack pointer, the condition codes, the register set including FP registers. Note that many of those can be automatically saved by hardware when the interrupt is recognised and processed.

It is not a coincidence that ARM Cortex MCUs handle saving and restoring this state, fully, so that interrupt handlers can be just plain functions.

Without direct knowledge, I am suspicious of that statement in the more complex ARMs. Do they save TLBs and similar?

Quote
This makes programming said CPUs easy because you totally can do the stuff in the ISRs directly, instead of pushing the event messages and popping them in another scheduler. The end result is functionally the same, with two key differences:
* Performance is better when you don't need to run your own scheduler / queue handling code - high-priority events can be responded to in dozens of nanoseconds
* When you don't need to write any scheduler code, you won't make mistakes writing it.

Can do and should do are, of course, different things.

Guaranteeing response times with multiple levels of interrupt (or task priorities for that matter) is non-trivial. Consider a lower-priority interrupt being interrupted by a higher-level interrupt. (And s/interrupt/task, of course!). Forgetting about priority inversion is what screwed up the Mars Pathfinder!

Quote
This results in event-driven code which is quite easy to write, understand, and analyze (because each function is related to some actual trigger mechanism). Similar patterns have been popular in surprising places like web development. Instead of a JavaScript onClick() function, you have a C onOvercurrent(). And not some overcurrentGovernor() while(1) if(overcurrent) blah_blah task, which is the RTOS way, but looks no cleaner (quite the opposite) and only incurs a performance penalty.

Event-driven design patterns are extremely valuable at many levels, up to and including soft realtime billing systems.

It has always surprised me that softies have so much difficulty thinking in those terms, and have had to introduce/codify them via strange mechanisms often involving magic words like "inversion of control". Yes, I realise IoC is an overloaded (and therefore ambiguous) term!
 

Offline Siwastaja

  • Super Contributor
  • ***
  • Posts: 8732
  • Country: fi
It is not a coincidence that ARM Cortex MCUs handle saving and restoring this state, fully, so that interrupt handlers can be just plain functions.

Without direct knowledge, I am suspicious of that statement in the more complex ARMs. Do they save TLBs and similar?

Quick Googling* would have revealed that Cortex-M CPUs do not have virtual addressing, and hence no address translation unit; the memory protection hardware in higher-end models (e.g. M7) is only a protection unit (MPU), not an MMU. Of course the purpose is to keep the core simple enough, suitable for real-time microcontroller applications, so that the designers don't need to think about complex edge cases like TLBs during CPU design, and won't need to push that complexity onto the software developer, either.

*) https://www.sciencedirect.com/topics/engineering/arm-cortex
"Unlike the Memory Management Unit (MMU) in application processors (e.g., Cortex-A processors), the MPU does not offer address translation (i.e., it has no virtual memory support). The reason for Cortex-M processors not supporting the MMU feature is to ensure that the processor system can deal with real-time requirements: When an MMU is used for virtual memory support and when there is a Translation Lookup Buffer (TLB) miss (i.e., a logical address needs to be translated to a physical address but the address translation details are not available in the local buffer), the MMU needs to carry out a page table walk. The page table walk operation is needed to obtain the address translation information. However, because during the page table walk operation the processor might not be able to deal with interrupt requests, the use of an MMU is not ideal for real-time systems."
 

Offline Simon

  • Global Moderator
  • *****
  • Posts: 18014
  • Country: gb
  • Did that just blow up? No? might work after all !!
    • Simon's Electronics


Nothing should be done in any interrupt, except determine the event that occurred and mutate that into a message put into a queue for the scheduler to observe, and optionally kick the scheduler into life.

The task state has to be saved when a task switch occurs. What's in the task state is processor dependent and is invisible to C compilers. Typically it will include the PC, stack pointer, the condition codes, and the register set including FP registers. Note that many of these can be saved automatically by hardware when the interrupt is recognised and processed.

There is only one way to execute more than one thread of code, and that is that the second piece of code comes in via an interrupt. Unless the code in the interrupt handler swaps the current stack for at least the stack of something that will then decide whose stack to reload, you will just have one thread of execution with another interrupting it until it is done. Sounds like a mess, but I guess you have to make the CPU appear to be more than one somehow.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
It is not a coincidence that ARM Cortex MCUs handle saving and restoring this state, fully, so that interrupt handlers can be just plain functions.

Without direct knowledge, I am suspicious of that statement in the more complex ARMs. Do they save TLBs and similar?

Quick Googling* would have revealed that Cortex-M CPUs do not have virtual addressing, and hence no address translation unit; the memory protection hardware in higher-end models (e.g. M7) is only a protection unit (MPU), not an MMU. Of course the purpose is to keep the core simple enough, suitable for real-time microcontroller applications, so that the designers don't need to think about complex edge cases like TLBs during CPU design, and won't need to push that complexity onto the software developer, either.

*) https://www.sciencedirect.com/topics/engineering/arm-cortex
"Unlike the Memory Management Unit (MMU) in application processors (e.g., Cortex-A processors), the MPU does not offer address translation (i.e., it has no virtual memory support). The reason for Cortex-M processors not supporting the MMU feature is to ensure that the processor system can deal with real-time requirements: When an MMU is used for virtual memory support and when there is a Translation Lookup Buffer (TLB) miss (i.e., a logical address needs to be translated to a physical address but the address translation details are not available in the local buffer), the MMU needs to carry out a page table walk. The page table walk operation is needed to obtain the address translation information. However, because during the page table walk operation the processor might not be able to deal with interrupt requests, the use of an MMU is not ideal for real-time systems."

I was thinking of the Cortex A series, which quick googling indicates are superscalar with TLB caches and branch prediction caches, both of which need to be invalidated and refilled. https://www.jblopen.com/arm-cortex-a-interrupt-latency/ That article indicates interrupt latency (i.e. to first instruction in the ISR) of 0.75-3.5µs.

The Cortex-A series are definitely used as MCUs, e.g. inside Zynq-7000 chips which also contain FPGA fabric. One of the more intriguing possibilities is that one core could run Linux and the other an RTOS, together with processing in the FPGA. Maybe that is done in the new low-end Tek scopes.

There are, of course, other simpler Cortex series; the range is one of the strengths of the ARM ecosystem.
 

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20434
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less


Nothing should be done in any interrupt, except determine the event that occurred and mutate that into a message put into a queue for the scheduler to observe, and optionally kick the scheduler into life.

The task state has to be saved when a task switch occurs. What's in the task state is processor dependent and is invisible to C compilers. Typically it will include the PC, stack pointer, the condition codes, and the register set including FP registers. Note that many of these can be saved automatically by hardware when the interrupt is recognised and processed.

There is only one way to execute more than one thread of code, and that is that the second piece of code comes in via an interrupt. Unless the code in the interrupt handler swaps the current stack for at least the stack of something that will then decide whose stack to reload, you will just have one thread of execution with another interrupting it until it is done. Sounds like a mess, but I guess you have to make the CPU appear to be more than one somehow.

That is not the standard meaning of "thread".

Arguably a better way of making the "CPU appear to be more than one" is exemplified in the XMOS xCORE processors. To a very useful approximation, the RTOS's functionality is implemented in hardware.

In xCORE devices an 8-"core" "tile" contains 8 sets of registers and one execution pipeline. One instruction is executed from each core in turn, so effectively there are 8 processors each running at 1/8 the clock speed, and you normally dedicate one core to one I/O peripheral or computational task. A 32-core 4000 MIPS embedded chip costs ~£20 one-off at Digikey. Inter-core comms messages use the same hardware and software abstractions as I/O through peripherals. Benefits: no interrupts, no caches, and latency guaranteed statically by the IDE, without executing code, crossing your fingers and hoping you've spotted/measured the worst case.
« Last Edit: June 12, 2022, 04:50:52 pm by tggzzz »
 

