Author Topic: Why do people not like Microchip? (Read 76118 times)

Howardlong · « **Reply #250 on:** January 08, 2022, 08:15:56 pm »

A bit of a Microchip boi myself, but this seems more than a little out of step.

A 2022 hardware debugger at 1980s prices. $1799.

https://www.microchip.com/en-us/development-tool/DV244140

Admittedly it has plenty of corner case functionality, but for 95%+ of debugging scenarios I just don’t get it.

If you need it, you need it, I guess.

westfw · « **Reply #251 on:** January 08, 2022, 11:25:19 pm »

Quote

I think AVRs were never even remotely popular in commercial or industrial products. The market share just never was there. Thanks to the hobbyist and university appeal [might have caused some sales]

People keep saying stuff like this, but in 2015 (the last year before the Microchip acquisition) Atmel had over $330M of revenue from their microcontrollers. Not to be sneezed at, and quite unlikely if they served "mostly hobbyists."

AaronD · « **Reply #252 on:** January 09, 2022, 03:01:09 am »

Quote from: T3sl4co1l on January 08, 2022, 10:24:18 am

Mega-0 and AVR-DA are absolutely bristling with peripherals. I'm not too familiar with what special things PICs have, but they may be on par now, give it a look.

Just googled the datasheets. Only got one of them. Google really goes off in the weeds with the Mega-0! The AVR-DA is distantly approaching the PIC in terms of hardware peripherals, and it has some interesting features that the PIC doesn't directly, like an event system that gets its own chapter instead of a bunch of direct connections between everything, but probably nothing that I couldn't "Erector-set" a modern PIC to do in hardware too. They're really flexible! Especially if you do treat it like an Erector set, and not strictly like it might have been envisioned. If you manage all of that well, you can almost have it do everything in hardware; and so the famously-awful free compiler running at 1/4-speed anyway, doesn't seem all that bad anymore because it hardly does anything anyway beyond the initial setup.

The PIC's gated timer, for example, doesn't appear at all on an AVR, to my knowledge, and it can be used separately as a free-running timer and a flexible-sourced interrupt trigger. Or you can use them together to measure a pulse width in hardware, and present it to the ISR as the stopped timer value. Or you can use them to add a delay to an event trigger, by connecting the source to the gate and the destination to the timer overflow, and setting the timer to overflow "soon". Etc. And that's just one peripheral!

A general-purpose DMA would be a big game-changer, but the closest that either of them comes to that is dedicated to the USB peripheral alone, and it can only talk to a user-configured block of memory, not to another peripheral.

One of my recent projects deliberately has both a PIC and an AVR in it, talking to each other. The PIC manages a digitally-controlled analog signal path that has some precise timing involved to switch things in and out of circuit, relative to an external clock, so all of that timing is done in hardware with the CPU just "playing housekeeper" around it. The AVR uses its fast instruction rate to semi-bit-bang a fast pulse-width-based serial protocol. It uses the PWM module to create the exact timing in hardware, but the protocol is so fast that about all it can do (with GCC fully optimized for speed at the expense of size and still refactored until it worked) is get the next bit, choose one of two constants, busy-wait a few cycles for the PWM interrupt flag (less than the interrupt latency), set the new duty-cycle, and repeat. Once it's done sending a frame, it can let the main loop catch up. Everything else is set to be slow enough that it can be ignored for an entire frame of this without clobbering a hardware buffer.
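To make that inner loop a bit more concrete, here is a rough host-side sketch of the "get the next bit, choose one of two constants" step. It is plain C, not the actual AVR code: `frame_to_duties`, `DUTY_ZERO` and `DUTY_ONE` are made-up names, the compare values are arbitrary, and MSB-first bit order is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PWM compare values for a "0" bit and a "1" bit.
 * The real numbers depend on the protocol and the PWM period. */
#define DUTY_ZERO 5u
#define DUTY_ONE  15u

/* Expand a frame into one PWM compare value per bit, MSB first (assumed).
 * On the real chip each value would be written to the compare register
 * right after the PWM flag is seen; here they are just stored in `out`.
 * Returns the number of values written. */
size_t frame_to_duties(const uint8_t *frame, size_t len,
                       uint8_t *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (n >= out_cap) {
                return n;   /* output buffer full, stop early */
            }
            out[n++] = ((frame[i] >> bit) & 1u) ? DUTY_ONE : DUTY_ZERO;
        }
    }
    return n;
}
```

On the target, the store into `out` is replaced by the busy-wait on the PWM flag followed by the compare-register write, which is why there is no time left for anything else during a frame.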

I might have been able to do that with the PIC's modulation peripheral instead, which is essentially a 2:1 MUX with some extra flair, but I only thought of it just now. It's meant to take a bit stream as the selection input, and switch between two other signals, like DC and a defined frequency for On/Off Keying, or two different frequencies, or whatever. In this case, I would give it two fixed-duty PWM signals, both clocked in reference to the bit stream so that each bit gets exactly one period.
(Maybe both PWM's use the same timer, and one is also fed to the SPI's clock pin in slave mode, then the SPI's data goes to the modulator selection? Then I'd only have to keep up with bytes and not bits! If it also had a DMA, then this entire software driver would be obsolete! Just set it up and let it run, with a static array as its input.)

That capability in peripherals is what I was looking to see paired with a boatload of RAM and a fast, capable, and open-source-supported 8-bit CPU. Like I said above, they seem to be getting there, but it's definitely not complete yet.

Quote from: Kleinstein on January 08, 2022, 11:59:32 am

The 8 bit µCs are made for the small jobs - like those 90% that get away with 16 kB or less.

I guess "small" depends on where you're coming from. I learned on PIC16's that had just over 300 bytes of RAM and maybe 1k of Flash. Then for the same price (before MCP bought them), I could get an ATmega with 4x the instruction rate for the same external clock (no PLL either way), 4k of RAM, and more Flash than I could ever want! That was in exchange for less variety in peripherals, but it still had the SPI that I needed to run a bunch of shift registers for things like buttons and LED's, and it had *two* UART's so I could dedicate one to a DMX transceiver and still have my terminal-spew debugging. Wow!

Then I filled up the RAM with a giant 2-dimensional array of scenes and channel-structs, and used the hardware multiplier (another wow!) to scale each channel's value to its scene master and then the final output to the grand master. The other data in the channel-struct was to facilitate some "cheater functions" for when you didn't want the standard "highest takes priority" mixing scheme. And the main loop would actually get through all of that, in the time it took to send an interrupt-driven DMX packet! (no DMA) Wow again! Plus mapping the small physical control panel to different parts of the array like the layers/pages of a digital audio mixing desk, etc. I could never have even dreamed of doing that on the PIC's that I knew.
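For anyone who hasn't seen that kind of mixing, a minimal sketch of the per-channel math, assuming 8-bit levels and masters. The names `scale8` and `htp_channel` are invented for illustration, the "cheater functions" are left out, and the `(master + 1) >> 8` trick is one common way to avoid a divide, not necessarily what the original firmware did.

```c
#include <stddef.h>
#include <stdint.h>

/* Scale an 8-bit level by an 8-bit master using one 8x8 multiply.
 * master = 255 passes the level through unchanged, master = 0 gives 0.
 * The product is at most 255 * 256 = 65280, so it fits in 16 bits. */
static uint8_t scale8(uint8_t level, uint8_t master)
{
    return (uint8_t)(((uint16_t)level * ((uint16_t)master + 1u)) >> 8);
}

/* One output channel: scale each scene's level by that scene's master,
 * keep the highest result ("highest takes priority"), then scale the
 * winner by the grand master. */
uint8_t htp_channel(const uint8_t *levels, const uint8_t *masters,
                    size_t scenes, uint8_t grand)
{
    uint8_t highest = 0;
    for (size_t s = 0; s < scenes; s++) {
        uint8_t v = scale8(levels[s], masters[s]);
        if (v > highest) {
            highest = v;
        }
    }
    return scale8(highest, grand);
}
```

Run once per channel, that is one multiply per scene plus one for the grand master, which is where the hardware multiplier earns its keep.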

So to me, that AVR was huge! Big chip for a big job! It felt more like a PC than the cramped PIC's that I was used to...but the PIC's still had more capable peripherals, and from what I've seen somewhat recently, they still do.

Quote from: Kleinstein on January 08, 2022, 11:59:32 am

If you want lots of memory, this area is already lost to the ARM based µCs.

Absolutely yes! But the massive problem there is a quantum leap in complexity, so that a reasonably-skilled 8-bit guy is still completely lost in that world. I gather that the 32-bit world is much more library-driven than the 8-bit world is, but the old habit of rolling your own everything so that you know exactly how it works and why it's failing at the moment, dies hard.

This might be a case of me thinking differently from the rest of the industry, but I have also yet to see a decently-documented library. The alternative is understanding multiple 1,000+ page datasheets, some of which are hard to find because you're supposed to use the library instead...

Kleinstein · « **Reply #253 on:** January 09, 2022, 09:04:39 am »

Going from an 8 bit µC to a 32 bit one is not that bad. I just did that step with not much trouble.

It still takes some time, with a new IDE and new peripherals. I don't think the step from AVR to PIC (or the other way around) would be much easier.
Using libraries for the HW interface can add a bit to this, but one is not forced to use them. It may still be a good idea with USB or similar more complex parts and the initialization.
Things may get more troublesome if you want to program in ASM - the 8 bit µCs are usually still very predictable in the execution time, something that is often lost with caches and buffers with the 32 bit µCs.

The main point I missed was a good simulator to check out the HW details in a simulation instead of debugging the real HW, which can be a bit more tricky. Maybe here I was spoiled from the really good simulator in AVR studio.

Siwastaja · « **Reply #254 on:** January 09, 2022, 11:32:12 am »

Quote from: AaronD on January 09, 2022, 03:01:09 am

One of my recent projects deliberately has both a PIC and an AVR in it, talking to each other. The PIC manages a digitally-controlled analog signal path that has some precise timing involved to switch things in and out of circuit, relative to an external clock, so all of that timing is done in hardware with the CPU just "playing housekeeper" around it. The AVR uses its fast instruction rate to semi-bit-bang a fast pulse-width-based serial protocol. It uses the PWM module to create the exact timing in hardware, but the protocol is so fast that about all it can do (with GCC fully optimized for speed at the expense of size and still refactored until it worked) is get the next bit, choose one of two constants, busy-wait a few cycles for the PWM interrupt flag (less than the interrupt latency), set the new duty-cycle, and repeat. Once it's done sending a frame, it can let the main loop catch up. Everything else is set to be slow enough that it can be ignored for an entire frame of this without clobbering a hardware buffer.

I absolutely see what you did here, but you could consider going to modern microcontrollers. Many in the STM32 series, for example, come with as many and as advanced peripherals as the PICs, but additionally have enough oomph in the core - that would have been a single-chip solution. And as nctnico would definitely agree, a single-MCU project is easier to manage than a dual-MCU project.

Recently I had to implement a stupid legacy 1970's IBM protocol with quite high bitrate, for which there is no special peripheral available on any normal microcontroller on the market - some old 8-bitters from the early 1990's, which are neither PIC nor AVR, are the only parts that have one. So I had to bitbang it. And because the stupid protocol has details like bit stuffing to handle, the amount of code to handle that is more than 1 or 2 lines. So what are the options? If I had to do it 10 years ago, I'd have a separate AVR or PIC there, for which I would write the code in assembly bitbanging the clock and data signals exactly, and communicate to another MCU with more relaxed timing. Today? Just use Cortex M7 and run the stupid protocol in interrupts, written in plain "high level" C. 555kHz interrupt rate sounds impossible from the viewpoint of 20 years ago. Today you can just do it. Best thing - everything else works as usual. Data processing, logging, SPIs, CAN communication, etc. It's just one more interrupt source, eating some % of CPU time and increasing worst-case jitter of lower priority operation by some µs.
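To show why bit stuffing is more than a line or two, here is a small sketch of the transmit side. The post doesn't name the protocol or its exact rule, so this assumes the common HDLC-style rule (insert a 0 after five consecutive 1s); `stuff_bits` is a made-up name, and bits are stored one per byte purely to keep the example readable.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy bits (one per array element, each 0 or 1) from `in` to `out`,
 * inserting a 0 after every run of five consecutive 1s.
 * Returns the number of bits written, or 0 if `out` is too small
 * (or the input is empty). */
size_t stuff_bits(const uint8_t *in, size_t n_in,
                  uint8_t *out, size_t out_cap)
{
    size_t n_out = 0;
    unsigned ones = 0;              /* length of the current run of 1s */

    for (size_t i = 0; i < n_in; i++) {
        if (n_out >= out_cap) {
            return 0;
        }
        out[n_out++] = in[i];

        if (in[i]) {
            if (++ones == 5) {
                if (n_out >= out_cap) {
                    return 0;
                }
                out[n_out++] = 0;   /* the stuffed bit */
                ones = 0;
            }
        } else {
            ones = 0;
        }
    }
    return n_out;
}
```

In an interrupt-driven version the same run counter lives in the ISR's state and the stuffed bit costs one extra interrupt, so the output length varies with the data.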

Similarly, I have built a software defined switch mode converter where I could not get the hardware to do "everything" despite the good set of peripherals, but had to resort to ISR running for every cycle - and do that at high switching frequency (maybe 300kHz). Again, enter Cortex M7 and let the brute force handle it. No need for separate chip; the same chip did handle communication, image sensor access and processing, two three-phase BLDC controllers, accessing six inertial measurement units, doing navigation and whatnot.

So CPU processing power is not only for processing audio or images or whatever like that. It is helpful in low-level IO. It allows you to replace a tight ASM implementation (which is a lot of work to implement; AND totally blocks the CPU from doing anything else, driving into dual-MCU design) with interrupt-based FSM, leaving enough CPU time for everything else. After all, 12 cycles of interrupt latency at 400MHz equals half a clock cycle at 16MHz. (Threading model, using a multitasking operating system, is another alternative to that interrupt-based FSM, but I prefer the bare metal way, unless I need fully implemented communication and filesystem stacks from the OS.)
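The numbers are easy to check. A tiny sketch of the arithmetic, with made-up helper names; applying the 400MHz clock to the 555kHz interrupt rate mentioned earlier is an assumption for illustration.

```c
#include <stdint.h>

/* Time taken by `cycles` clock cycles at `hz`, in picoseconds. */
uint64_t cycles_to_ps(uint64_t cycles, uint64_t hz)
{
    return cycles * 1000000000000ull / hz;
}

/* Whole clock cycles available between interrupts firing at `irq_hz`. */
uint64_t cycles_per_irq(uint64_t cpu_hz, uint64_t irq_hz)
{
    return cpu_hz / irq_hz;
}
```

`cycles_to_ps(12, 400000000)` gives 30000 ps (30 ns), while half of `cycles_to_ps(1, 16000000)` is 31250 ps (31.25 ns), so the two are about equal. At 400MHz, `cycles_per_irq(400000000, 555000)` leaves 720 cycles per interrupt period for the ISR and everything else.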

Simon · « **Reply #255 on:** January 09, 2022, 01:01:51 pm »

The CPU is of little importance, apart from the fact that AVR is 4 times faster at executing code compared to the PIC (the 8 bit PIC, that is). I could not care less what the CPU is as long as I know how many bits it is and have a compiler for it. This is what I never get about the fuss of one versus the other. AVR executes faster at the same clock rate, end of. Then it's just a case of the peripherals. Clearly, the fact that Microchip are releasing entire new ranges of AVR based micros means that they do think it's a worthy CPU, and I suspect that if people stop buying the 8 bit PICs they will happily stop making them, the same as the "legacy" AVR stuff that had the same peripherals but not identical. For example, if you look at the tiny and the mega ADC it's all the same, but the bit settings in the register of the same name on both are different for the same voltage reference.

These days manufacturers are standardizing these peripherals, which is why we now have counter types A, B, C and D. It means very clearly that if you have code that works for one of those counter types, it will work for that counter type no matter which AVR it is on.

T3sl4co1l · « **Reply #256 on:** January 09, 2022, 01:04:25 pm »

Quote from: AaronD on January 09, 2022, 03:01:09 am

Just googled the datasheets. Only got one of them. Google really goes off in the weeds with the Mega-0!

Huh, go figure...

Example part is a '3208, here's the mfg page,
https://www.microchip.com/en-us/product/ATmega3208
or see the rest of the family from there.

Quote

The PIC's gated timer, for example, doesn't appear at all on an AVR, to my knowledge, and it can be used separately as a free-running timer and a flexible-sourced interrupt trigger. Or you can use them together to measure a pulse width in hardware, and present it to the ISR as the stopped timer value. Or you can use them to add a delay to an event trigger, by connecting the source to the gate and the destination to the timer overflow, and setting the timer to overflow "soon". Etc. And that's just one peripheral!

DA's timer D has some pretty interesting features, including generating and accepting events; I'm not sure about something like masked clocks, but there are duty cycle capture modes (you still have to do the division in software, but the period and width are measured in hardware anyway).
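The software side of that is small. A minimal sketch, assuming the hardware hands you 16-bit width and period counts taken at the same timer clock; `duty_permille` is a made-up name and nothing here is specific to any AVR register.

```c
#include <stdint.h>

/* Duty cycle in tenths of a percent (0..1000) from captured counts,
 * rounded to nearest. Assumes width <= period.
 * Returns 0 when period is 0 to avoid dividing by zero. */
uint16_t duty_permille(uint16_t width, uint16_t period)
{
    if (period == 0) {
        return 0;
    }
    return (uint16_t)(((uint32_t)width * 1000u + period / 2u) / period);
}
```

The 32-bit intermediate is the expensive part on an 8-bit core, so it is worth doing only when the result is actually needed, not on every capture.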

Quote

A general-purpose DMA would be a big game-changer, but the closest that either of them comes to that is dedicated to the USB peripheral alone, and it can only talk to a user-configured block of memory, not to another peripheral.

XMEGA A-series have DMA (and some of the larger ones in lower-letter series?). I haven't used it, but it looks pretty rich, pages of registers. So, you can probably pull tricks like one channel driving the others for rich scripting behavior, or maybe even Turing completeness, I have no idea. Not sure about DA and family; at least not the 64DAxx I was looking at.

There's also the CCL (configurable custom logic), which a lot of PICs have I think; and, integrating with the event system, you can get a lot of logic and sequential functionality that way.

Something kind of unique that I've seen on a few PICs, SMPS control -- you can build a peak current mode control for example, almost trivially; similar functionality I think is constructible with timers and events. Probably at greater expense to hardware resources -- that is, you're tying up a whole timer and a couple event channels to do it, and maybe that limits what else you can do -- but it's also not too common I would guess, that you need to make use of even a fraction of the available resources.

Quote

One of my recent projects deliberately has both a PIC and an AVR in it, talking to each other. The PIC manages a digitally-controlled analog signal path that has some precise timing involved to switch things in and out of circuit, relative to an external clock, so all of that timing is done in hardware with the CPU just "playing housekeeper" around it. The AVR uses its fast instruction rate to semi-bit-bang a fast pulse-width-based serial protocol. It uses the PWM module to create the exact timing in hardware, but the protocol is so fast that about all it can do (with GCC fully optimized for speed at the expense of size and still refactored until it worked) is get the next bit, choose one of two constants, busy-wait a few cycles for the PWM interrupt flag (less than the interrupt latency), set the new duty-cycle, and repeat. Once it's done sending a frame, it can let the main loop catch up. Everything else is set to be slow enough that it can be ignored for an entire frame of this without clobbering a hardware buffer.

Ah... interesting to note that, whereas MPLAB intentionally cripples its output; avr-gcc is just not very good at it. I've clocked it at about half the speed of hand-optimized assembly, at least for DSP (YMMV for other activities). GCC is good at optimizing, but it's done entirely on the internal representation (GIMPLE), and the target architecture is simply translated from that with little postprocessing (AFAIK?). It does very well for ARM and x86 -- of course, these are the most highly developed branches, or at least, I would guess -- but AVR and such, differ pretty significantly from the IR, and so there's a lot of wasted busywork, like, shifting around register allocations, extending sign into registers that aren't ultimately read, etc. And it only inlines 8x8 MULs; anything larger is sign-extended and called out to library (e.g. __mulhisi3). So any kind of arithmetic can be a challenge to optimize, short of assembling it yourself.
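For what it's worth, one manual workaround is to spell the wide multiply out as 8x8 partial products. A minimal sketch in plain C, unsigned only to keep it short; `mul16x16_u` is a made-up name, and whether a given avr-gcc version turns this into better code than its library call is something to check in the listing, not assume.

```c
#include <stdint.h>

/* 16x16 -> 32 unsigned multiply built from four 8x8 -> 16 products:
 * a * b = al*bl + (al*bh + ah*bl) * 2^8 + ah*bh * 2^16 */
uint32_t mul16x16_u(uint16_t a, uint16_t b)
{
    uint8_t al = (uint8_t)a;
    uint8_t ah = (uint8_t)(a >> 8);
    uint8_t bl = (uint8_t)b;
    uint8_t bh = (uint8_t)(b >> 8);

    /* Each product is at most 255 * 255 = 65025, so it fits in 16 bits. */
    uint16_t ll = (uint16_t)(al * bl);
    uint16_t lh = (uint16_t)(al * bh);
    uint16_t hl = (uint16_t)(ah * bl);
    uint16_t hh = (uint16_t)(ah * bh);

    return (uint32_t)ll
         + ((uint32_t)lh << 8)
         + ((uint32_t)hl << 8)
         + ((uint32_t)hh << 16);
}
```

Each of the four products maps onto a single hardware MUL; the rest is adds with carry, which is the part hand-written assembly tends to schedule better.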

On the upside, the ISA is pretty reasonable, the main inorthogonality being some instructions limited to upper sets of registers (i.e. r16-r31, etc.). As a load-store architecture, it's rather verbose, so maybe not all that pleasant to purely assemble in.

I think the same is kind of true of PIC as well, just in different ways; everything has to pull through the W register, but you have fast page access and that. Dunno how adaptable it is, putting ASM with PIC-C. (What even is the compiler's ABI, how does it deal with hardware stack and page access? I haven't read any on it. Not that I care to; just that, this is critical information to be able to do that.) Instructions are "slower" too (~4 clocks/ins), but the clock is typically much higher so they're comparable in that respect. (Heck, same is true of competing ARMs -- those lacking pipelining and cache!)

Quote

I might have been able to do that with the PIC's modulation peripheral instead, which is essentially a 2:1 MUX with some extra flair, but I only thought of it just now. It's meant to take a bit stream as the selection input, and switch between two other signals, like DC and a defined frequency for On/Off Keying, or two different frequencies, or whatever. In this case, I would give it two fixed-duty PWM signals, both clocked in reference to the bit stream so that each bit gets exactly one period.
(Maybe both PWM's use the same timer, and one is also fed to the SPI's clock pin in slave mode, then the SPI's data goes to the modulator selection? Then I'd only have to keep up with bytes and not bits! If it also had a DMA, then this entire software driver would be obsolete! Just set it up and let it run, with a static array as its input.)

Sounds like something that should perhaps be expanded out into SPI frames or something instead? -- but obviously this isn't enough info to tell. Especially tricky if it needs perfect timing; these devices rarely(?) have buffered SPI so there will inevitably be dead time between frames, while the loop/interrupt latency cranks through. DMA can help, but may still suffer from latency due to bus contention. (In which case, a bus matrix may pay off -- which most of these have, I think, but you're still limited by priority access to SRAM or the like, as both DMA and CPU will be needing that from time to time.)

Siwastaja's example of bit-stuffed protocols would be the kind of example that, while you could transmit it via SPI (I mean, given plenty of other assumptions, too), the amount of cranking required to prepare the next frame is nontrivial. And might not be easily tamed with lookup tables -- do mind, you can easily pack in a 64kB table into these devices, as AVR's PGM space is 16-bit address and width, so supports 128kB without fumbling with extended addresses* (up to 384k are available).

*Except you'll still be fumbling for that extra bit because LPM (load from program memory) fetches a byte at a time. But if you place the table entirely in "high" memory, you probably don't have to touch the extended address register (RAMPZ).

Heck... y'know, they could've easily made that a load-word instruction instead, and loaded adjacent registers (e.g. LPM r16:r17, Z+), or even removed the choice and make it implicit (say r0:r1, like the destination of MUL). Or put in byte and word variants. Shrug, it is what it is. I digress...

Quote

This might be a case of me thinking differently from the rest of the industry, but I have also yet to see a decently-documented library. The alternative is understanding multiple 1,000+ page datasheets, some of which are hard to find because you're supposed to use the library instead...

The problem that occurs here, is that, no one wants to deal with the complexity, of course -- so you have a cornucopia of options to choose from, none of which is especially better than the others. If you stick to the mfg tools (e.g. ST's CubeIDE stuff), they'll handle all this for you with a few switches allocating the hardware resources and such, and zoop -- code generation, there you go, add user code and you're off. Oh, and don't mind that 50kB bloat that you've just linked into your "blinky" application. I mean, they give 128kB+ of Flash on these things, but they also seem to do a damn good job using it up for you, as well.

And as I recall, not a lot of that bloat even drops off with -flto, it's not just superfluous crap, it's getting called from somewhere, somehow.

So, if you want to pare down the bloat, or speed up init, or just do things slightly differently from the official ways -- you're on your own.

One would hope for a sort of resource compiler, that doesn't just generate code snippets, but actually writes just the functions and initializers you need. But that would be a tremendous amount more work, to construct and test and debug and support, on top of the hardware selector, on top of the IDE, on top of whatever libraries and compiler support you need to get your new chips into mainline projects (GCC, libc, CMSIS, whatever). And it's only to support... people like us? Just so we can save a few bucks avoiding the upscale MCU that's more profitable for the manufacturer anyway? It feels intentional, but it's really just the coincidence that we're on the exact opposite end of mainstream development.

Speaking of profits -- these fancier AVRs tend to be rather expensive besides. Like, ATXMEGA64D3 is a bit over $4. ATMEGA3208, $1.25. AVR64DA64, $2. Ye olde ATMEGA328P is $1.50, and ATTINY402 from $0.50. They're making some improvements in the newer lines, and, I don't know offhand if XMEGAs were ever cheaper, if they're being phased out by raising prices or something; but the older classics clearly have staying power, like the MEGA and TINY. Clearly, XMEGA's been priced well above competing ARMs (e.g., ATSAME, STM32F0, etc.), and that's one of the smaller parts in the family even.

And, however that compares with PICs, you'll know better than me offhand, heh.

Tim

AaronD · « **Reply #257 on:** January 09, 2022, 07:38:57 pm »

Wow! Thanks Siwastaja and T3sl4co1l! Just goes to show how much I need to get into the 32-bit world. Now if I can just wrap my head around the config...

Siwastaja · « **Reply #258 on:** January 09, 2022, 07:54:19 pm »

Quote from: AaronD on January 09, 2022, 07:38:57 pm

Wow! Thanks Siwastaja and T3sl4co1l! Just goes to show how much I need to get into the 32-bit world. Now if I can just wrap my head around the config...

I went STM32 directly from AVR and didn't change my workflow or mindset in any way. No libraries, no IDE, no code generation - also no problems, after the initial shock and lack of proper tutorials at the time.

It's still the same. Look up registers in the reference manual, think, write code. It's almost a decade now, and the "cool kid" way of the internet forums has changed at least twice during that time. The fact I'm doing it "wrong" doesn't change, but how to supposedly do it "right" is a moving target. Make your conclusion and choose between what you already are familiar with, or a new strategy / toolset you need to relearn every 5 years.

Sure, higher end controllers offer more possibilities, and with more possibilities, there is more complexity, but that doesn't fundamentally change the workflow in any way. Ignore people who say that full paradigm change or using libraries is required. That is an acceptable way to work, of course as evidenced by many doing that, but it isn't the only one.

Documentation of a modern M7 device might be 5000 pages where an AVR is 500 pages, but that is only because it has many peripherals you never use anyway. Given the project complexity isn't increased, the added mental weight from some STM32H7 device, compared to an AVR, is maybe 3-4x, not many orders of magnitude.

This extra "mental weight" consists of small things like having to write a correct value into FLASH waitstate control register, set up PLL clock dividers and multipliers and voltage scaling values, configuring memory regions in linker script however you wish to use the various separate memories available to you... and so on. But it's still within a screenful or two of code, it doesn't take forever to learn. Fundamental process itself is the same. It's like if you learned to drive in a town with population of 10000, going to a bigger city of 100000 might be a bit demanding, but you can make it, one intersection at a time.
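As a rough idea of the bookkeeping involved, here is the arithmetic behind a typical integer PLL and a flash wait-state setting. This is a generic sketch, not any particular STM32's clock tree: the function names are invented, the divide/multiply/divide shape is just a common arrangement, and the frequency-per-wait-state figure must come from the reference manual for the chosen voltage range.

```c
#include <stdint.h>

/* f_out = f_in / M * N / P, a common integer-PLL arrangement.
 * Returns 0 if either divider is 0. */
uint32_t pll_out_hz(uint32_t in_hz, uint32_t div_m,
                    uint32_t mul_n, uint32_t div_p)
{
    if (div_m == 0 || div_p == 0) {
        return 0;
    }
    uint64_t vco = (uint64_t)in_hz / div_m * mul_n;  /* 64-bit to avoid overflow */
    return (uint32_t)(vco / div_p);
}

/* Wait states needed when each wait state buys `hz_per_wait_state` of
 * clock speed: 0 up to one band, 1 up to two bands, and so on.
 * The band size is a caller-supplied assumption, not a datasheet value. */
uint32_t flash_wait_states(uint32_t cpu_hz, uint32_t hz_per_wait_state)
{
    if (cpu_hz == 0 || hz_per_wait_state == 0) {
        return 0;
    }
    return (cpu_hz - 1u) / hz_per_wait_state;
}
```

For example, an 8MHz input with M = 1, N = 100, P = 2 gives 400MHz, and with a hypothetical 70MHz band that needs 5 wait states. The wait states have to be written before the clock is switched up, which is the kind of ordering detail the reference manual spells out.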

Actually I could say the biggest surprise to me was to learn what "linker script" is, despite the concept being half a century old. It's just that avrgcc had always generated (or used the right one) for me. With more advanced controllers, there are reasons why "one size fits all" linker script is not supplied automatically by the compiler. When I started with STM32, I simply didn't find a proper example linker script, and instead found a broken one which "kind of" worked, causing me a massive headache. Now I think you can pretty easily find example linker scripts from the examples by the MCU vendors, no need to trust random hobbyist websites.

ataradov (https://github.com/ataradov/) has a good set of simple no-bullshit MCU starter projects. I haven't used them but at a quick glance, I can confirm his approach is sane; a good reference resource.

NorthGuy · « **Reply #259 on:** January 09, 2022, 08:55:41 pm »

Quote from: AaronD on January 09, 2022, 07:38:57 pm

Just goes to show how much I need to get into the 32-bit world. Now if I can just wrap my head around the config...

It is a myth that different worlds exist. All MCUs are roughly the same. If you can work with MCU A, you can work with MCU B using the same principles. All is very similar - how to arrange things in time, how to divide work between CPU and peripherals, how to think and plan before writing code. If something looks complex to you, 99% of the time this is because you don't know how it works. Once you figure this out, it will not look so complex any more.

Choosing an MCU is not a matter of getting into a particular "world", but rather reconciling your requirements against a particular MCU. Often, important characteristics have nothing to do with the MCU internals at all, like you may want a very small MCU, an MCU which is easy to route, an MCU which consumes very low power, or whatever. Most of the time, you envision how you would do things then you select the MCU which can do what you have envisioned.

AaronD · « **Reply #260 on:** January 09, 2022, 09:34:24 pm »

Quote from: Siwastaja on January 09, 2022, 07:54:19 pm

The fact I'm doing it "wrong" doesn't change...

Haha! I do things "wrong" all the time. I've heard it called the "hacker mentality", where most people look at a device and think, "What does this do?", but a hacker looks at it and thinks, "What can I make this do?" Of course, it's "all wrong" compared to the original intent, but I got something cool out of it and it actually works pretty well!

Quote from: Siwastaja on January 09, 2022, 07:54:19 pm

Actually I could say the biggest surprise to me was to learn what "linker script" is, despite the concept being half a century old. It's just that avrgcc had always generated (or used the right one) for me. With more advanced controllers, there are reasons why "one size fits all" linker script is not supplied automatically by the compiler. When I started with STM32, I simply didn't find a proper example linker script, and instead found a broken one which "kind of" worked, causing me a massive headache. Now I think you can pretty easily find example linker scripts from the examples by the MCU vendors, no need to trust random hobbyist websites.

Interestingly, I spent several weeks beating my head against the wall about an 8-bit linker script. My company was making a clone of its own gadget, not really a port because both the chip and the circuit were so different, from an 8051 with USB to a PIC16F1454. Using the pro compiler for the PIC, the final project was still over half of the Flash size, and we wanted to update ALL of it over USB using our existing tool and its protocol. Not that hard to implement the protocol, but how to update ALL of Flash when we were using over half of it, while keeping it "brick proof" through a random power loss? (the user might have to re-load the firmware to make it useful at all again, but at least it should still have enough to do that)

I ended up with a custom bootloader of sorts that contained the USB stack and main loop and never released control. Instead, it had a bunch of application functions - a "main" function that gets called inside the main loop, the single ISR, and the USB comms that the bootloader didn't grab for itself - and it would only call them if it knew it had a valid application section. It could download a new copy of itself, run the checksum, then jump to a protected bit of assembly to copy that to where it's supposed to be and then jump to it, and then the new "quasi-bootloader" could download the new app section and run the checksum again.

Both sections were part of the same project so that the function calls would "just work" as usual, and some explicit directives in the source code and some explicit linking got it all placed in memory like it needed to be. Then one more post-processing step split the hex file into two files, and a slight modification to the PC app gave it a state machine to make it download twice, expecting it to drop off the bus and reconnect in between.

It worked!
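The core of it, stripped of USB and Flash writes, looks roughly like this. It's a host-side sketch with invented names (`app_image_t`, `boot_loop_once`, `demo_app`) and a plain 16-bit additive checksum standing in for whatever the real one was.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the application section. */
typedef struct {
    const uint8_t *data;        /* application bytes in Flash        */
    size_t len;                 /* number of bytes to check          */
    uint16_t stored_sum;        /* checksum stored next to the image */
    void (*app_main)(void);     /* called once per main-loop pass    */
} app_image_t;

/* Simple 16-bit additive checksum (an assumption, for illustration). */
static uint16_t sum16(const uint8_t *p, size_t n)
{
    uint16_t s = 0;
    while (n--) {
        s = (uint16_t)(s + *p++);
    }
    return s;
}

int app_is_valid(const app_image_t *img)
{
    return img->len > 0 && sum16(img->data, img->len) == img->stored_sum;
}

/* One pass of the boot section's main loop. The boot section always
 * services USB itself; it only calls into the application when the
 * checksum matches. Returns 1 if the application ran. */
int boot_loop_once(const app_image_t *img)
{
    /* USB servicing by the boot section would go here. */
    if (img->app_main != NULL && app_is_valid(img)) {
        img->app_main();
        return 1;
    }
    return 0;
}

/* Stand-in application entry point, used only to exercise the sketch. */
static int demo_calls;
static void demo_app(void)
{
    demo_calls++;
}
```

Because the boot section never hands over control, a half-written application just means `boot_loop_once` keeps returning 0 while USB stays alive for another download.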

Quote from: Siwastaja on January 09, 2022, 07:54:19 pm

ataradov (https://github.com/ataradov/) has a good set of simple no-bullshit MCU starter projects. I haven't used them but at a quick glance, I can confirm his approach is sane; a good reference resource.

Hmm... At the bottom of his supported devices list is the new(ish) Raspberry Pi Pico. Dual-core M0+ with pretty much standard peripherals, and the RPi's famous support. I had forgotten about that. It could be interesting to play with...

Quote from: NorthGuy on January 09, 2022, 08:55:41 pm

If something looks complex to you, 99% of the time this is because you don't know how it works. Once you figure this out, it will not look so complex any more.

Absolutely! But if all you have is a hammer, then everything looks like a nail. It's hard to think differently unless you've already been there. Once you have the right generalization, *then* you can have different but equally-valid tools. The larger, more capable architectures feel to me like power screwdrivers or maybe an air wrench. (No, you don't beat the screws in!!!)

NorthGuy · « **Reply #261 on:** January 09, 2022, 10:47:33 pm »

Quote from: AaronD on January 09, 2022, 09:34:24 pm

But if all you have is a hammer, then everything looks like a nail. It's hard to think differently unless you've already been there. Once you have the right generalization, *then* you can have different but equally-valid tools. The larger, more capable architectures feel to me like power screwdrivers or maybe an air wrench. (No, you don't beat the screws in!!!)

I would rather think of MCUs as toolboxes. You get a big toolbox with screwdrivers, hammers, drills, saws, various drill bits, blades, wrenches, but also some exotic tools you don't know how to use or what they're for. You get a task, such as building a table, and you need to make sure that you select a box that has all the tools that you need. If you're missing something you'll have to improvise - such as beat nails with a wrench, or drill pilot holes and push nails in with pliers. Same with MCUs - they must have everything you need for your task.

T3sl4co1l · « **Reply #262 on:** January 10, 2022, 01:06:55 am »

Yeah, they're often well enough documented to do it by raw registers. What you get from libraries -- of varying sorts, from headers (like the workflow with most AVRs) to CMSIS to mfg HAL stuff or others -- is portability: some ease in switching between devices. Which might be between different sizes of a family, to other families (like, between flavors of AVR), to whole different cores (maybe AVR vs. ATSAM?) or even other manufacturers (not sure there's any embedded examples of this, but PCs and phones are a good example of multi-platform support, at least by higher-level means: the drivers may vary, but they all run Android or Linux or whatever, and from there, the same ARM binaries).

The most basic thing you can do is use raw register offsets and settings, just magic numbers everywhere; that's obviously just very poorly documented, besides being utterly brittle in portability; if anything on the underlying hardware changes, it's a complete wash.

Put those numbers into device-specific headers, and you have something like what AVR has. They go with avr-libc, which provides platform-specific implementations of standard (and some nonstandard/extended) C functions. You may not have to change program code between sibling MCUs, but likely will between sub-families or higher.

Some of those differences can be abstracted away with a driver layer, which is what CMSIS is for AFAIK. More of a thin driver layer over the available devices; a standardized interface so you input the options/actions and get compile or runtime errors if something isn't supported.
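
To make the three levels concrete, here is a rough sketch that runs on a PC. The register file is faked with an array, and every name in it (`fake_io`, `DEMO_PORTB`, `led_set`, and so on) is made up for illustration; it only mimics the flavour of device headers and a thin driver layer, not any real vendor API.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for memory-mapped I/O so the sketch runs on a PC. */
uint8_t fake_io[256];

/* Level 1: raw offsets and magic numbers. Works, but says nothing about
 * intent and breaks as soon as the register map changes. */
void led_on_magic(void)
{
    fake_io[0x25] |= 0x20;
}

/* Level 2: the same numbers moved into a device-specific header
 * (hypothetical names and addresses). */
#define DEMO_PORTB (fake_io[0x25])
#define DEMO_PB5   5

void led_on_named(void)
{
    DEMO_PORTB |= (uint8_t)(1u << DEMO_PB5);
}

/* Level 3: a thin driver layer. The caller states what it wants and gets
 * an error back if this device does not support it. */
typedef enum { LED_STATUS } led_id_t;

int led_set(led_id_t id, int on)
{
    if (id != LED_STATUS)
        return -1;
    if (on)
        DEMO_PORTB |= (uint8_t)(1u << DEMO_PB5);
    else
        DEMO_PORTB &= (uint8_t)~(1u << DEMO_PB5);
    return 0;
}

/* All three levels end up writing the same bit. */
void demo_levels_agree(void)
{
    fake_io[0x25] = 0;
    led_on_magic();
    uint8_t magic = fake_io[0x25];
    fake_io[0x25] = 0;
    led_on_named();
    uint8_t named = fake_io[0x25];
    fake_io[0x25] = 0;
    assert(led_set(LED_STATUS, 1) == 0);
    assert(magic == named && named == fake_io[0x25]);
}
```

Only the third form survives a change of device without touching the calling code, which is the portability being traded against the extra layer.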

And HAL stuff might range from device to family to manufacturer level support; of course, stuff that compiles on all platforms is also going to be very bloated. At least give or take how it's structured; a fair amount of portability can be afforded by C++ templates and stuff, but that's not available in plain C and pretty much has to be written to do anything.

Along with that, presumably the higher level stuff has been tested, at least to some degree -- you might overlook the errata reading the datasheet, or stumble on a bug that hasn't been reported yet, and pull out much hair in the process. Whereas in a HAL you have some hope that they've actually written it in already. Or, I mean, who knows -- again, the fundamental problem with ever-higher level software is, it's made for whatever purpose, and whether it was at all very much documented, for that purpose or other, or how well it's been tested, period, or on other platforms; it can be all over the place.

And I mean, look at Arduino; it's cheesy and slow, but it's accessible as hell, and supported on diverse embedded platforms.

And it's not like the problem is unique to software; plenty of hardware is full of crappy design (PIC errata, anyone?). It's a bit harder to change than software, so it tends to be a bit better tested, but it's all a matter of degree; nothing is perfect.

Tim

Siwastaja · « **Reply #263 on:** January 10, 2022, 08:42:20 am »

Quote from: NorthGuy on January 09, 2022, 10:47:33 pm

I would rather think of MCUs as toolboxes. You get a big toolbox with screwdrivers, hammers, drills, saws, various drill bits, blades, wrenches, but also some exotic tools you don't know how to use or what they're for. You get a task, such as building a table, and you need to make sure that you select a box that has all the tools that you need. If you're missing something you'll have to improvise - such as beat nails with a wrench, or drill pilot holes and push nails in with pliers. Same with MCUs - they must have everything you need for your task.

Yeah, and some prefer to wrap all their tools inside of paper printed with pictures of hammers, because in a CS class they were told that the more wrapping the better, and besides, a hammer is always good. Then they try to use the tools through that hammer paper and wonder why this is all so difficult, but finally get the job done by accepting whatever table comes out as a result. Then they go to the interwebz to brag how nice and well detailed the hammer picture is, and wonder how difficult it must be if you remove those wrappings. After all, that means you have to draw all the hammer pictures from scratch every time you do anything; what an insane waste of time.

And after 5 years, hammers go out of fashion, and are replaced with drills.

Boy I love tool analogies!

Codemonkey · « **Reply #264 on:** January 10, 2022, 09:03:40 am »

Quote from: Simon on January 07, 2022, 08:21:26 pm

You got the seminar and the board, how was the board not cheaper? seminars cost money you know.

The cost to the company was about the same regardless. You could also buy the board without the seminar (which was actually run by the distributors, not Atmel) but the cost was still about the same.

Kleinstein · « **Reply #265 on:** January 10, 2022, 09:32:13 am »

The libraries may provide better portability, but they also add more manuals to read and understand and an additional layer of bugs. At least for the STM32 HAL part the portability has its limits - if the HW is different the way to use HAL can also be different. Including more advanced features, not supported on all chips, in the HAL functions can also be very confusing and make things more complicated than actually needed.
I like the HAL part for the initialization, but that is about it.

The errata are a big thing and some of the PICs were pretty bad in that respect (e.g. the early PIC18 with non-working interrupt priorities - the only workaround was not using this feature as it could lead to stack corruption). Errata were also a big thing with the early Xmegas, severely crippling the ADC. It is a shame that the normal datasheets (especially the electronic form) do not have at least hints as to which parts may be affected by errata. Not a severe one, but odd, were the internal caps for a 32 kHz crystal in the AVRs. The DS lists them, but AFAIK they never worked and still come up in new datasheets - they could have just skipped that from the mega8 on. In this case I think this is broken by design - switchable caps just can't work well, as the caps need to be low ESR.

SiliconWizard · « **Reply #266 on:** January 10, 2022, 05:30:19 pm »

Quote from: Kleinstein on January 10, 2022, 09:32:13 am

The libraries may provide better portability, but they also add more manuals to read and understand and an additional layer of bugs. At least for the STM32 HAL part the portability has its limits - if the HW is different the way to use HAL can also be different. Including more advanced features, not supported on all chips, in the HAL functions can also be very confusing and make things more complicated than actually needed.
I like the HAL part for the initialization, but that is about it.

I agree with that. It's usually fine for initializing peripherals / MCU features. But in all other cases, it's often pretty inefficient. (I'm thinking, in particular, about all the functions for transmitting data from/to a peripheral using DMA. They reconfigure the DMA channel at every call. That makes it easy to use when you're just testing stuff, but for real work, you usually need to rewrite them anyway.)

Siwastaja · « **Reply #267 on:** January 10, 2022, 05:37:02 pm »

But, all DMA controllers I've worked with are trivially simple. Maybe there is some beast out there which requires manufacturer to chime in with driver code?

I follow the practice of being explicit to the point, and no more. So if I want to tell the DMA controller in STM32 that the transfer length of the next transfer is 123, but everything else is like before, I do this by writing:

DMA1_Channel2->CNDTR = 123;

instead of
config_struct.irrelevant_parameter = 0;
config_struct.some_fixed_but_important_parameter = CONSTANT_TO_A_FIXED_VALUE;
(copypaste 10 more unrelated lines, one of them accidentally wrong to add an interesting hidden bug)
DoStupidHALThing(DMA1_handle_thing, &config_struct);

Performance difference is obvious and undisputed (and yes, it's very common requirement to reconfigure DMA in interrupts at high interrupt rate), but I claim that the first one is also easier to understand.
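
A rough way to see the cost difference without any hardware is to count register writes in a simulation. The sketch below does that on a PC; the register block, the config struct and both functions are invented stand-ins (not ST's headers or HAL), so treat the numbers as an illustration of the argument rather than a measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up register block for one DMA channel. */
typedef struct {
    uint32_t ctrl;
    uint32_t count;
    uint32_t periph_addr;
    uint32_t mem_addr;
} fake_dma_channel_t;

/* Made-up HAL-style configuration struct. */
typedef struct {
    uint32_t ctrl;
    uint32_t count;
    uint32_t periph_addr;
    uint32_t mem_addr;
} fake_dma_config_t;

unsigned fake_dma_writes; /* counts simulated register writes */

/* HAL style: every call rewrites every register from the struct. */
void fake_hal_dma_configure(fake_dma_channel_t *ch, const fake_dma_config_t *cfg)
{
    ch->ctrl = cfg->ctrl;               fake_dma_writes++;
    ch->count = cfg->count;             fake_dma_writes++;
    ch->periph_addr = cfg->periph_addr; fake_dma_writes++;
    ch->mem_addr = cfg->mem_addr;       fake_dma_writes++;
}

/* Direct style: touch only the one field that actually changes. */
void dma_set_length(fake_dma_channel_t *ch, uint32_t n)
{
    ch->count = n;
    fake_dma_writes++;
}

void demo_dma(void)
{
    fake_dma_channel_t ch = {0};
    fake_dma_config_t cfg = { .ctrl = 0x1u, .count = 64u,
                              .periph_addr = 0x1000u, .mem_addr = 0x2000u };
    fake_dma_writes = 0;
    fake_hal_dma_configure(&ch, &cfg);
    assert(fake_dma_writes == 4);

    fake_dma_writes = 0;
    dma_set_length(&ch, 123u);  /* everything else stays as configured */
    assert(fake_dma_writes == 1);
    assert(ch.count == 123u && ch.mem_addr == 0x2000u);
}
```

In an interrupt that fires at a high rate, the gap between one write and a full reconfiguration (plus filling in the struct) is where the time goes.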

As treez would put it, would you agree...?

Rudolph Riedel · « **Reply #268 on:** January 10, 2022, 08:09:07 pm »

Quote from: Siwastaja on January 07, 2022, 07:14:13 pm

I think AVRs were never even remotely popular in commercial or industrial products.

How many commercial products did you take apart?
I have an AVR in my dryer, discovered this a couple of years ago when the supply chip broke.
Whirlpool AWZ 61225, still works after the repair.

I had a dishwasher, I believe it was a Beko; when it broke I found at least three AVRs in it, one for example controlling the sump pump.
Having a working dishwasher was too important though to not immediately buy a new one.

It's not like manufacturers of white goods advertise which controllers they are using.
And who takes apart a dishwasher, a dryer or a washing machine just for kicks?

westfw · « **Reply #269 on:** January 10, 2022, 11:22:11 pm »

Quote

whereas MPLAB [compiler] intentionally cripples its output; avr-gcc is just not very good at it. I've clocked it at about half the speed of hand-optimized assembly, at least for DSP

Is there an 8bit cpu compiler that does a good job on such math? My impression is some things are pretty much crippled by standards rules (and that "mixed precision math" is just one of those things that you don't want to use C for.)

Quote

it only inlines 8x8 MULs; anything larger is sign-extended and called out to library.
The current compiler inlines 16*16bit multiplies as well, and is even smart enough to inline a 32x32 multiply when the result is only 16bits: https://godbolt.org/z/Yqsaz9PfW
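
For reference, the kind of source being talked about looks roughly like the sketch below (function names are mine, and I am not reproducing the linked code). The second function is the interesting case: since only the low 16 bits of the result are kept, only the low 16 bits of each operand can matter, which is what lets a compiler narrow the multiply. The check at the end demonstrates that equivalence on a PC; it says nothing about what any particular avr-gcc version emits.

```c
#include <assert.h>
#include <stdint.h>

/* 16x16 -> 16 multiply. The cast to uint32_t matters on hosts where int is
 * 32 bits: without it both operands promote to signed int and a large
 * product could overflow, which is undefined behaviour. */
uint16_t mul16(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * b);
}

/* 32x32 multiply whose result is truncated to 16 bits. */
uint16_t mul32_to16(uint32_t a, uint32_t b)
{
    return (uint16_t)(a * b);
}

/* The truncated 32x32 product equals the 16x16 product of the low halves. */
void demo_narrowing(void)
{
    uint32_t a = 0x12345678u, b = 0x9ABCDEF0u;
    assert(mul32_to16(a, b) == mul16((uint16_t)a, (uint16_t)b));
}
```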

Nominal Animal · « **Reply #270 on:** January 11, 2022, 12:16:05 am »

(For what it's worth, I do quite a bit of SIMD stuff, especially on x86-64; and using the intrinsics and GCC/ICC/Clang vector extensions (which work for e.g. ARM NEON too), I can easily get 2× to 4× the speed of compiler-vectorized code; sometimes even more. ICC is the best of them, but even it cannot do it as well as a human, yet; and possibly never in C or C++, exactly because of the rules of the standard abstract machine. The difference to hand-written assembly is so small that I almost never bother anymore, especially since the vector extensions are rather portable across hardware architectures. In other words, it is no surprise to me at all that math is hard for C compilers to optimize.)
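
For readers who have not seen the vector extensions mentioned above, a minimal sketch follows. It assumes GCC or Clang (the `vector_size` attribute is a compiler extension, not standard C), and the type and function names are made up for the example. It shows the syntax only and makes no speed claim.

```c
#include <assert.h>

/* Four floats treated as one value; requires GCC or Clang. */
typedef float v4f __attribute__((vector_size(16)));

/* Element-wise a*b + c on all four lanes. The compiler picks whatever
 * SIMD instructions the target offers, or falls back to scalar code. */
v4f madd4(v4f a, v4f b, v4f c)
{
    return a * b + c;
}

void demo_madd4(void)
{
    v4f a = {1.0f, 2.0f, 3.0f, 4.0f};
    v4f b = {2.0f, 2.0f, 2.0f, 2.0f};
    v4f c = {1.0f, 1.0f, 1.0f, 1.0f};
    v4f r = madd4(a, b, c);
    /* Lanes can be read with ordinary subscripts. */
    assert(r[0] == 3.0f && r[1] == 5.0f && r[2] == 7.0f && r[3] == 9.0f);
}
```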

SiliconWizard · « **Reply #271 on:** January 11, 2022, 12:34:02 am »

Quote from: Nominal Animal on January 11, 2022, 12:16:05 am

In other words, it is no surprise to me at all that math is hard for C compilers to optimize.)

Sure thing. And one should also consider whether they are using integers or floating point. On top of good optimizations being hard in general, some "optimizations" that may look obvious to the programmer may actually lead to precision problems when using FP, so compilers avoid some of them. Even when using the -ffast-math option.
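
A small, standard illustration of why reordering FP math is not "obviously safe": addition is not associative once rounding is involved. The sketch below assumes IEEE 754 doubles and default compiler flags (no fast-math style options); the function names are made up.

```c
#include <assert.h>

/* Mathematically identical, numerically not. */
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

void demo_fp_order(void)
{
    /* 1e20 and -1e20 cancel exactly, so the 1.0 survives... */
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);
    /* ...but -1e20 + 1.0 rounds back to -1e20, so here it is lost. */
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
}
```

A compiler that regrouped the first form into the second would silently change the answer, which is why this kind of rewrite is normally left alone.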

Nominal Animal · « **Reply #272 on:** January 11, 2022, 01:54:10 am »

One of the expressions that I wish C compilers could handle better is simple
((int64_t)x * (int64_t)y) >> 31
with int32_t x, y, on 32-bit architectures.

Even GCC 11.2 on x86-64 with -m32 -Og compiles this to code that uses two separate imul instructions, with the minimal one (generated with -m32 -O2) being simply
mov eax, x
mov edx, y
imul edx
shrd eax, edx, 31
although
mov eax, x
mov edx, y
imul edx
shld edx, eax, 1
is even better/shorter, although then the result is in the edx register (instead of eax). Both versions are best interleaved with other code (not using eax or edx registers).

The reason this is difficult for compilers to optimize, is the nature of the casts. Either the internal representation of variables in the compiler needs to be able to describe a variable as "32-bit signed value as a 64-bit quantity", or it internally needs different multiplication operations depending on the parameter and result widths and signednesses, and do a preliminary optimization pass that combines a cast-and-multiplication to the proper multiplication operation. To compiler writers, this is nasty.
Yet, in practice, it does make a significant difference to fixed-point multiplications and similar.

And, as in the above example, the paired register result (which I do believe occurs on both 32-bit ARM Cortex M4 and RISC-V) also needs special handling. Depending on the hardware architecture, a paired bit shift may be slower or less preferable for pipelining or other reasons, than a single-register bit shift and a register move.

(RISC-V and MIPS do need both mul-hi and mul-lo operations, because the latter provides that one further bit needed, but that cannot be avoided, when the shift is 31 and not 32 bits.)

This kind of multiplication, and the related fused multiply-add in fixed point, is the only one that I tend to implement in GCC/ICC/Clang extended inline assembly. This has the benefit that unlike a separate assembly function, these C compilers can inline the code containing the extended inline assembly, and even choose the registers used (if the instruction set permits; it does not on 32-bit x86) as long as the proper register constraints are used.
I do not believe that there is a way to describe the operation in standard C, except when the compiler implements the abovementioned optimizations itself.
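
For reference, here is the expression as a plain C function, next to a version that builds the same result from separate high and low 32-bit halves of the product (the shape it takes on machines with mul-hi and mul-lo instructions). The function names are made up. Both versions lean on behaviour that is implementation-defined in C but is what GCC and Clang do: arithmetic right shift of negative values, and wrap-around when converting an out-of-range unsigned value to `int32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* The expression from the post: 32x32 -> 64 multiply, then shift by 31.
 * With Q31 fixed-point inputs this is a Q31 multiply. */
int32_t q31_mul(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (int64_t)y) >> 31);
}

/* Same result assembled from the two halves of the product: the top bit
 * of the low word supplies the one extra bit a 31-bit shift needs. */
int32_t q31_mul_hilo(int32_t x, int32_t y)
{
    int64_t p = (int64_t)x * (int64_t)y;
    uint32_t hi = (uint32_t)((uint64_t)p >> 32);
    uint32_t lo = (uint32_t)p;
    return (int32_t)((hi << 1) | (lo >> 31));
}

void demo_q31(void)
{
    /* 0.5 * 0.5 = 0.25 in Q31 */
    assert(q31_mul(0x40000000, 0x40000000) == 0x20000000);
    /* -0.5 * 0.5 = -0.25 in Q31 */
    assert(q31_mul(-0x40000000, 0x40000000) == -0x20000000);
    /* The two formulations agree. */
    assert(q31_mul_hilo(123456789, -987654321) == q31_mul(123456789, -987654321));
}
```

The second form is essentially what the `shld`/`shrd` sequences compute in one instruction on 32-bit x86.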

If we look at 8-bit architectures, the current compiler developers just do not see the need to optimize such expressions, because of the idea that "if anyone needs that, they're better off using a more powerful, e.g. a 32-bit microcontroller, anyway".

In today's world, it is damned difficult to argue for spending more human effort, when a little bit more money thrown at the hardware works just as well or even better.
I suspect only hobbyists and similar enthusiasts actually care about this.

To circle back to the original topic, I do find it odd how much the discussion revolves around personal preferences, and especially attempts at explaining away others' differing personal preferences as stupidity, lack of experience, or lack of knowledge.

Me, I like the hardware, but dislike the company because their business practices are contrary to my current preferences/goals, like I already posted in this thread earlier.

I don't see anything odd in others liking the hardware so much that they're willing to put up with the company, or if doing commercial product development with employer-paid tools, even like to deal with the company. It's just different context, different situation, different weights/emphasis on various aspects. If my own situation were to change, my opinions would too.

The interesting bits here are the details of why a particular person likes/dislikes the hardware (or the company). It's just a bit dull/discouraging/annoying reading all the "because I'm happy with the hardware, you must be rather stupid and should change your career if you disagree or dislike the company" -type insinuations between the lines.

I'm especially unhappy with the "I feel Atmel did ..." and "I feel Microchip did ..." type posts, because the writers obviously haven't checked the actual history, company revenues, development mailing lists et cetera, and just pull those assumptions out of their feelings, without any kind of basis in reality. Please stop; it just muddies the waters, as it makes it very difficult for those who only read but do not otherwise participate in the thread to uncover the facts and personal reasons, and separate them from unfounded beliefs and feelings, which really, have negative value.
If you don't want to stop, then please just make it clear when it is your personal belief or understanding without any references, and when you have facts you have based that belief or understanding on.

Zucca · « **Reply #273 on:** January 11, 2022, 04:31:16 am »

(I want to take my first steps in the µC planet with Microchip; they look like the easiest to learn....)

rsjsouza · « **Reply #274 on:** January 11, 2022, 01:47:56 pm »

Quote from: SiliconWizard on January 11, 2022, 12:34:02 am

Quote from: Nominal Animal on January 11, 2022, 12:16:05 am
In other words, it is no surprise to me at all that math is hard for C compilers to optimize.)

Sure thing. And one should also consider whether they are using integers or floating point. On top of good optimizations being hard in general, some "optimizations" that may look obvious to the programmer may actually lead to precision problems when using FP, so compilers avoid some of them. Even when using the -ffast-math option.

The best C compilers and optimizers for mathematic operations that I worked with were for the TI C5000 and C6000 DSPs - although their implementation was highly specialized in digital signal processing, you could still write it in plain C in a specific way and you would make use of its full MAC with sign extensions and special addressing. FFT was also quite interesting, which even used the C5000's HW FFT engine. But all that is history now.

