The F303 does appear to have 4x 5Msps ADCs, so it could work for a 4-channel device. At 5Msps and 2 bytes/sample times 4 channels, that's 40Mbytes/sec in DMA operations! Depending on the memory architecture, that could have a noticeable impact on performance.
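As a quick check of that arithmetic, here is a minimal C sketch. The function name is made up for illustration, and it only multiplies the figures quoted above; it says nothing about what the bus can actually sustain.

```c
#include <stdint.h>

/* Bytes per second the DMA has to move for a given ADC setup.
 * Hypothetical helper: it just multiplies rate, sample width and
 * channel count, using 64-bit math so large rates cannot overflow. */
uint64_t dma_bytes_per_sec(uint32_t samples_per_sec,
                           uint32_t bytes_per_sample,
                           uint32_t channels)
{
    return (uint64_t)samples_per_sec * bytes_per_sample * channels;
}
```

With 5,000,000 samples/sec, 2 bytes/sample and 4 channels this gives 40,000,000 bytes/sec, matching the figure above.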
It seems that DMA speed limits total throughput to 16MSPS:
https://www.eevblog.com/forum/microcontrollers/microcontrollers-with-fastest-adcs/msg2089891/#msg2089891
Also, at 72MHz there isn't much time for processing the data if you need to search each channel for maxima and minima, average (for oversampling), look for trigger thresholds, etc. Even at 16MSPS you only get 4.5 clock cycles per sample!
4x5MSPS would be 20MSPS, so the 4-channel device may be in trouble. My 2-channel one should work as it's only 10MSPS. I had no trouble getting 2x1MSPS working on the F103, but 2MSPS is a lot less than 10MSPS.
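The budget above can be sketched in a few lines of C. The function names are hypothetical, and the 16MSPS ceiling is simply the figure from the linked forum post, passed in as a parameter rather than something this code verifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPU clock cycles available per sample at a given aggregate sample rate. */
double cycles_per_sample(uint32_t cpu_hz, uint32_t total_sps)
{
    assert(total_sps > 0u);
    return (double)cpu_hz / (double)total_sps;
}

/* True if the aggregate rate of all channels stays within an assumed
 * DMA throughput ceiling (e.g. the 16MSPS reported in the linked post). */
bool fits_dma_limit(uint32_t channels, uint32_t sps_per_channel,
                    uint32_t dma_limit_sps)
{
    return (uint64_t)channels * sps_per_channel <= dma_limit_sps;
}
```

For 72MHz and 16MSPS the first function returns 4.5. With a 16MSPS ceiling, 4 channels at 5MSPS do not fit, while 2 channels at 5MSPS do.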
You could offer 4 active channels only at lower capture rates. Not ideal, but it could still be useful.
You do everything capture related in hardware:
Use one timer to time the DMA samples. This timer is gated by an input signal which is held active for as long as a second timer is stopped. The DMA runs continuously, refilling the circular buffer over and over, so that you will have pre-trigger data.
The second timer is clocked by the first timer (sample count), and preloaded with the number of samples you wish to capture post-trigger. It is gated by an input signal from the trigger circuitry.
Triggering is done with an external comparator (the L476 has a comparator on-chip and a peripheral interconnect matrix, so it may be possible to do some of these functions and connections internally; I haven't looked that far into it yet). You need an independent comparator anyway in order to be able to do external trigger, and also if you want to be able to do any conditioning on the trigger signal.
When the trigger occurs, the gate to the second timer activates and it starts counting down for the number of samples post-trigger. When it hits bottom, it un-gates the first timer and the capture stops.
You can use an interrupt on the gate signal to the second timer to inform the software when the trigger event has occurred (capture is started), and one on the gate signal to the first timer to inform the software when the capture has completed.
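To make the sequencing easier to follow, here is a host-side C model of the scheme described above. It is only a sketch of the logic, not timer or DMA register code: all names and the buffer length are invented, the trigger is latched on first assertion, and the sample on which the trigger arrives is counted as the first post-trigger sample. Both of those are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 8u  /* hypothetical circular buffer length, in samples */

typedef struct {
    uint16_t buf[BUF_LEN]; /* circular capture buffer filled by "DMA" */
    size_t   head;         /* next write index; oldest sample once full */
    uint32_t post_count;   /* "timer 2": samples left after the trigger */
    bool     triggered;    /* gate of timer 2 (latched, an assumption) */
    bool     running;      /* gate of timer 1: sampling still active */
} capture_t;

/* Arm a capture: sampling runs freely until the trigger arrives. */
void capture_arm(capture_t *c, uint32_t post_samples)
{
    assert(post_samples > 0u && post_samples <= BUF_LEN);
    memset(c->buf, 0, sizeof c->buf);
    c->head = 0u;
    c->post_count = post_samples;
    c->triggered = false;
    c->running = true;
}

/* One tick of "timer 1": store a sample, and count down if triggered. */
void capture_tick(capture_t *c, uint16_t sample, bool trigger_in)
{
    if (!c->running) {
        return;                 /* timer 1 is un-gated: capture finished */
    }
    if (trigger_in) {
        c->triggered = true;    /* comparator output opens timer 2's gate */
    }
    c->buf[c->head] = sample;
    c->head = (c->head + 1u) % BUF_LEN;
    if (c->triggered && --c->post_count == 0u) {
        c->running = false;     /* timer 2 hit bottom: stop sampling */
    }
}
```

Once `running` goes false, `head` points at the oldest sample in the buffer, so everything between `head` and the trigger sample is pre-trigger data.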
Measurements are typically done post-capture. DSOs don't capture continuously; they capture a buffer, do processing on it, then commence another capture. In "real" DSOs the capture memory is not usually on the CPU, so it gets transferred to the CPU via a high-speed bus and another capture can be started quickly.
If you have enough RAM (the F103 doesn't) you could simulate this by toggling between two buffers so that you can commence a new capture almost immediately after one completes, but there will always be a minimum holdoff. For most of what I do a longer holdoff wouldn't be a problem, and I'd rather have capture depth.
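The two-buffer toggle could look something like the following C sketch. The type, function names and buffer length are all hypothetical; on real hardware the swap would also involve re-pointing the DMA at the new buffer, which is not shown.

```c
#include <stdint.h>

#define CAPTURE_LEN 4096u  /* hypothetical capture depth, in samples */

typedef struct {
    uint16_t data[2][CAPTURE_LEN]; /* two capture buffers */
    unsigned active;               /* index of the buffer being filled */
} pingpong_t;

/* Buffer the next capture should be written into. */
uint16_t *pingpong_capture_buf(pingpong_t *p)
{
    return p->data[p->active];
}

/* Call when a capture completes: returns the finished buffer for
 * processing and makes the other buffer the capture target. */
uint16_t *pingpong_swap(pingpong_t *p)
{
    uint16_t *done = p->data[p->active];
    p->active ^= 1u;
    return done;
}
```

The cost is obvious from the struct: the same RAM could instead hold one buffer of twice the depth.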
I'm designing this strictly for my own use, so I have the luxury of giving priority to it working the way I want. No point spending money to productize it, because it will be either a) unnoticed, in which case you've wasted your investment, or b) popular, in which case it will be immediately cloned and sold on eBay for cheap, and again you've wasted your investment.
I did experiment with "no external hardware" triggering. The STM32 ADCs have a "watchdog" feature that can raise an interrupt when the signal goes above or below a certain threshold, tested on a sample-by-sample basis. Very small interrupt handlers then start/stop the timers as described above; however, there is still some latency at fast capture rates. You can compensate for this by scanning a portion of the capture buffer just before where the trigger interrupt occurred to identify the actual trigger position. It does mean your trigger position may not be exactly the number of samples from the end of the buffer that you desire. I compensated for this by using 4096-byte circular buffers on the F103 but only showing 4000 bytes to the user. This gave me 96 bytes' worth of "wiggle room", which was more than enough.
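The back-scan for the real trigger position might be sketched like this in C. This is not the original code: the function name and parameters are invented, and it assumes a rising-edge trigger, i.e. it looks for the most recent sample at or above the threshold whose predecessor was below it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scan backwards through a circular buffer, starting at the index where
 * the watchdog interrupt was serviced, for the most recent rising
 * crossing of `threshold`. At most `window` samples are examined.
 * Returns the index of the crossing, or `irq_pos` if none is found. */
size_t find_trigger(const uint16_t *buf, size_t len, size_t irq_pos,
                    size_t window, uint16_t threshold)
{
    assert(len > 0u && irq_pos < len && window < len);
    size_t pos = irq_pos;
    for (size_t i = 0; i < window; i++) {
        size_t prev = (pos + len - 1u) % len;  /* wrap around the ring */
        if (buf[pos] >= threshold && buf[prev] < threshold) {
            return pos;                        /* first sample over threshold */
        }
        pos = prev;
    }
    return irq_pos;
}
```

The distance between the returned index and `irq_pos` is the latency being corrected, so the search window only needs to be as large as the spare room left in the buffer.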
It actually worked reasonably well, but I eventually decided on the all-hardware approach as you need an external comparator for EXT trigger anyway, and you can do conditioning on the trigger signal. It also makes the trigger system the same for either internal or external trigger. On the F103 I used a PWM DAC (good thing there's lots of timers), but the L476 has real DACs built in!
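For the PWM DAC, the trigger level is set by the duty cycle, with the filtered output being roughly the supply voltage times the duty. A minimal C sketch of that conversion follows; the function and parameter names are hypothetical, and it assumes an ideal low-pass filter and a PWM that swings rail to rail.

```c
#include <assert.h>
#include <stdint.h>

/* Compare value giving a filtered PWM output of about `level_mv`,
 * for a PWM period of `period_counts` timer counts and a supply of
 * `supply_mv`. Rounds to the nearest count. */
uint32_t pwm_compare_for_level(uint32_t level_mv, uint32_t supply_mv,
                               uint32_t period_counts)
{
    assert(supply_mv > 0u && level_mv <= supply_mv);
    return (uint32_t)(((uint64_t)level_mv * period_counts + supply_mv / 2u)
                      / supply_mv);
}
```

For example, a 1650mV level on a 3300mV supply with a 1000-count period works out to a compare value of 500.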
Remains to be seen if 10MSPS DMA running continuously will slow down the memory bus to the point where the UI becomes unbearably slow ...
You're making me want to pick it up again...
Dave