Author Topic: A High-Performance Open Source Oscilloscope: development log & future ideas (Read 81673 times)

nctnico · « **Reply #100 on:** November 19, 2020, 09:28:18 pm »

Quote from: tom66 on November 19, 2020, 09:01:57 pm

The Zynq 7014S has 40k LUTs + ARM processor + hard DDR controller fabric. 1 unit starting from $89 USD.

IMO that wins over the 65k LUTs + no hardcore CPU + no DDR3 controller in the Kintex 70T -- and in my experience the MIG eats up easily 20% of the logic area for just a single channel DDR3 controller, I'm not convinced that would be a worthwhile trade off here. By the time you've implemented everything that the Zynq gives you "for free" (including 256KB of fast on-chip RAM) you're stepping up several grades of 'pure FPGA' to get there.

Well, that was the old MIG and that made me roll my own (way more resource efficient) DDR2 controller a long time ago. The hard IP DDR3 controllers in modern Xilinx FPGA devices however don't eat any logic. Realistically you can't create a DDR3 controller running at hundreds of MHz from generic IOB cells anyway. The timing needs to be trained etc. And since the Kintex 7 series is related to the Zync series it has exactly the same (hard IP) memory controller as the Zync has.

The distributed oscilloscope design I have made for one of my customers based on a Spartan6 LX45T (which has 44k logic cells) contains an LM32 soft-core, timing logic, network interface + hardware firewall (don't ask me why that is in there), various peripherals and ofcourse the oscilloscope part. This design uses about half of the logic without doing much on optimisation. The smallest Kintex part has 65k logic cells. I really don't think running out of logic resources is going to be an issue.

tom66 · « **Reply #101 on:** November 19, 2020, 09:33:17 pm »

The Kintex-7 doesn't have a hard DDR3 controller!

Spartan-6 and such do, but that was migrated over to the fabric in the later series of devices.

I haven't prototyped it on a Kintex-7 but on an Artix-7, the MIG used 20% of the device. Admittedly that would have been a smaller 7A35T, but that is still 10% of logic used on the big Kintex device assuming it maps similarly - I don't know what the effect of moving to a 32-bit controller would be but I imagine it would further increase device utilisation. There is hardware acceleration for it on the FPGA fabric - the SERDES drivers for instance are optimised for DDR controllers - but it's still a 'soft IP' at heart.

The biggest limitation of Zynq (when it comes to RAM) is in the standard speed grade the max DDR3 freq is 533MHz. You have to go up speed grade to get to 667MHz (or write PLL registers and overclock, but that is asking for trouble.)

asmi · « **Reply #102 on:** November 19, 2020, 09:38:18 pm »

Quote from: tom66 on November 19, 2020, 09:01:57 pm

IMO that wins over the 65k LUTs + no hardcore CPU + no DDR3 controller in the Kintex 70T -- and in my experience the MIG eats up easily 20% of the logic area for just a single channel DDR3 controller, I'm not convinced that would be a worthwhile trade off here. By the time you've implemented everything that the Zynq gives you "for free" (including 256KB of fast on-chip RAM) you're stepping up several grades of 'pure FPGA' to get there.

You missed the part that Kintex fabric is significantly faster than Atrix fabric (close to 50% in my experience, one design that barely closes at 180 MHz on Artix fabric, easily closes at 280 MHz on Kintex). And MIG in 160T part in FF package can implement 64bit 933MHz DDR3 interface, while Zynq only does 32bit 533 MHz. So not only you're getting 2 times wider data bus, you also get 400 MHz more frequency too. All it all, it's about 3.5x memory bandwidth. You also get 10G transceivers as opposed to 6G.
That said, you can get both in Zynq-030 - it's got same 2 cores as your device, and you get 125K LUTs of fast Kintex fabric, can implement 64bit DDR3 *in addition* to what Zynq provides, and you get 4 10G transceivers. And it's also included in free version of Vivado.

asmi · « **Reply #103 on:** November 19, 2020, 09:50:44 pm »

Quote from: tom66 on November 19, 2020, 09:33:17 pm

I haven't prototyped it on a Kintex-7 but on an Artix-7, the MIG used 20% of the device.

MIG typically consumes 6 to 8k LUTs for DDR3 (quite a bit less for DDR2), and it obviously doesn't scale with device. Just for the hell of it I just created an absolute monster of controller with dual channel SODIMM controller for S100 device, and it took 24K LUTs. That is freaking 128 bit wide data bus!
I personally think that K160T is the best midrange device - it can be used with free tools, it allows to implement a 64/72bit DDR3 interface via both discrete components and SODIMM module, and it can run it pretty fast - up to 933 MHz for a single-rank SODIMM module. You also get up to 8 10G transceivers either for talking to really high-end ADCs via JESD204B, or to connect to your main processing block, or both.

james_s · « **Reply #104 on:** November 19, 2020, 09:53:45 pm »

Quote from: sb42 on November 19, 2020, 08:33:22 am

Internet router: mass-market consumer product, cheap commodity hardware, Linux does everything.
Oscilloscope: completely the opposite in every way?

The only way I see this happening would be for one of the A-brand manufacturers to come up with an open-platform scope that's explicitly designed to support third-party firmware, kind of like the WRT54GL of oscilloscopes. I suspect that the business case isn't very good, though.

Manufacturers of inexpensive scopes iterate too quickly for reverse engineering to be practical.

Obviously there are large differences, but the benefit is the same. Open source firmware has the ability to be improved over time as deficiencies are corrected and new features added. The market of oscilloscopes is vastly smaller than that of consumer routers, but the market for high end DIY open source designed from scratch oscilloscopes is a tiny fraction of the already small market for oscilloscopes overall.

The main thing that would appeal to me about a commercial scope is the packaging, it's a nice tidy form factor with nice buttons and knobs and everything in a professional molded housing, and all of the front end hardware is there, the place they are usually most lacking is in the software side of things. Ultimately I suppose this whole project is really only interesting to me from an academic standpoint, it's interesting to see the inner workings of a modern DSO and it's a truly impressive achievement, but if I'm going to spend $500+ I'd buy a TDS3000 and hack it to 500 MHz or a Siglent or Rigol and put up with potentially buggy firmware. Seems like this debate was also beat to death not too long ago, very few people are going to pay as much or more for an incomplete open source device than it costs to buy a ready made off the shelf instrument that is ready to use right out of the box just for the sake of it being open source. The main advantage of open source is cost, anyone can duplicate an open source project so competition drives cost down while innovation continues to result in incremental improvements to the design. If the cost is not significantly lower than a commercial product of similar performance then the market is very, very limited to that tiny segment of the population with the knowledge and desire to tinker and improve upon it. Most people don't care, most of the users of open source projects do not tinker with the source themselves even if they like the idea that they technically could if they wanted to.

nctnico · « **Reply #105 on:** November 19, 2020, 10:00:58 pm »

Quote from: tom66 on November 19, 2020, 09:33:17 pm

The Kintex-7 doesn't have a hard DDR3 controller!

Spartan-6 and such do, but that was migrated over to the fabric in the later series of devices.

I haven't prototyped it on a Kintex-7 but on an Artix-7, the MIG used 20% of the device. Admittedly that would have been a smaller 7A35T, but that is still 10% of logic used on the big Kintex device assuming it maps similarly - I don't know what the effect of moving to a 32-bit controller would be but I imagine it would further increase device utilisation. There is hardware acceleration for it on the FPGA fabric - the SERDES drivers for instance are optimised for DDR controllers - but it's still a 'soft IP' at heart.

After reading Xilinx UG586 I think you are right-ish (the implementation seems to be a hybrid) but still I think that a lot of the logic the MIG is creating can be removed especially if data is read/written (mostly) sequentially and not random. I'm not sure what the difference in complexity is between the Wishbone bus (which is simple and I know very well) versus AXI (which I know nothing about).

asmi · « **Reply #106 on:** November 19, 2020, 10:02:40 pm »

Quote from: nctnico on November 19, 2020, 09:28:18 pm

Well, that was the old MIG and that made me roll my own (way more resource efficient) DDR2 controller a long time ago. The hard IP DDR3 controllers in modern Xilinx FPGA devices however don't eat any logic. Realistically you can't create a DDR3 controller running at hundreds of MHz from generic IOB cells anyway. The timing needs to be trained etc. And since the Kintex 7 series is related to the Zync series it has exactly the same (hard IP) memory controller as the Zync has.

1. As was said, there is no hard memory controllers in 7 series (except in Zynqs).
2. Fabric in 7 series *is* fast enough to implement "soft" memory controller with the little help of some HW blocks like phasers to implement write/read leveling.
3. Different 7 series devices have different fabric, and the difference is quite drastic. Spartan-7 and Artix-7 have the same fabric, Kintex-7 has faster one, I don't know about Virtex-7 but suspect it's even faster than K7.
For Zynqs, devices -020 and below have Artix fabric, while -030 and above - Kintex one. So they are not all the same either.

tom66 · « **Reply #107 on:** November 19, 2020, 10:09:47 pm »

Does anyone know what fabric is in the Zynq UltraScale?
It doesn't seem to be the same as the other 7 series devices, being on a 20nm process (Spartan/Artix/Kintex/Virtex-7 are all 28nm)

I'll have to give the DDR3 MIG a second thought. But, I don't think memory bandwidth is the ultimate limit here unless we were looking at sampling rates above 2.5GSa/s and those start requiring esoteric ADC parts with large BOM figures attached to them.

Could build a $3000 oscilloscope but would people really buy that in enough volume to make it worthwhile?

asmi · « **Reply #108 on:** November 19, 2020, 10:13:56 pm »

Quote from: tom66 on November 19, 2020, 10:09:47 pm

Does anyone know what fabric is in the Zynq UltraScale?
It doesn't seem to be the same as the other 7 series devices, being on a 20nm process (Spartan/Artix/Kintex/Virtex-7 are all 28nm)

As far as I know it's the same as it Kintex UltraScale+, so it should be super-fast. Overall Zynq MPSoC'es are great devices, the only two problems with them are price and packages (they by and large are very big, requiring 10 layer PCBs for full breakout). I would love to use them in my projects, but the price...
Which is why I'm seriously looking at Zynq-030 - 2 cores up to 1GHz, Kintex fabric and 10G transceivers is a great combination. And you can find them too in China for reasonable amount of money.

nctnico · « **Reply #109 on:** November 19, 2020, 10:20:48 pm »

Quote from: tom66 on November 19, 2020, 10:09:47 pm

Does anyone know what fabric is in the Zynq UltraScale?
It doesn't seem to be the same as the other 7 series devices, being on a 20nm process (Spartan/Artix/Kintex/Virtex-7 are all 28nm)

I'll have to give the DDR3 MIG a second thought. But, I don't think memory bandwidth is the ultimate limit here unless we were looking at sampling rates above 2.5GSa/s and those start requiring esoteric ADC parts with large BOM figures attached to them.

Could build a $3000 oscilloscope but would people really buy that in enough volume to make it worthwhile?

You also need to think about how long it will take to process the data. In my own USB design data came in at 200Ms/s but it could process acquired data at over 1000Ms/s. Say you have 4 channels with 500Mpts of memory with a maximum samplerate of 250Ms/s. Having a memory bandwidth of 1Gs/s would be enough for acquisition purposes. However in such a case you don't want the memory bandwidth between the processing part (whether inside the FPA or external) to become a bottleneck. Especially if the memory bandwidth needs to be shared between sampling and processing (think about double buffering here). Otherwise things like decoding and full record math will become painfully slow.

tom66 · « **Reply #110 on:** November 19, 2020, 10:21:23 pm »

Indeed, but the shame about the Zynq-030 is it's the only package option there in FBG484.

So, if you go for the cheapest 484 ball part, you are stuck with the -030, no route to upgrade.

You can go for the 676 ball part which also has -035 and -045 series variants. But now the BOM is larger, and you might not need the extra IO. (The 400 ball part with the extra bank of the 7020 is enough for 2 x ADC interfaces + plenty of IO to spare, with an 8 layer board to route it all out.)

I'm still not convinced I need a Kintex part. The current 7014S solution has been tested up to 1.25GSa/s which is the max I could get from HMCAD1511 before it lost PLL lock. At least some of that is down to the PLL not having sufficient amplitude to meet the specification of the HMCAD ADC even at 1GHz. That is also without any line training on the SERDES inputs and with the internal logic running at 180MHz, ADC frontend running at 1/8th sample clock. I do encounter timing issues above 200MHz causing periodic AXI lockup conditions though, I believe that is down to the lack of timing optimisation. (Currently have a -4ns worst negative slack)

Really it depends on the goals for this project, I never considered much more than 2.5GSa/s which could be achieved with a second ADC port and 128-bit internal bus on a Zynq 7020. Moving to a Kintex might allow that to run at 250MHz with a 64-bit port, but it's not as if there is a lack of fabric capacity now for a large bus.

tom66 · « **Reply #111 on:** November 19, 2020, 10:27:52 pm »

Quote from: nctnico on November 19, 2020, 10:20:48 pm

Quote from: tom66 on November 19, 2020, 10:09:47 pm
Does anyone know what fabric is in the Zynq UltraScale?
It doesn't seem to be the same as the other 7 series devices, being on a 20nm process (Spartan/Artix/Kintex/Virtex-7 are all 28nm)

I'll have to give the DDR3 MIG a second thought. But, I don't think memory bandwidth is the ultimate limit here unless we were looking at sampling rates above 2.5GSa/s and those start requiring esoteric ADC parts with large BOM figures attached to them.

Could build a $3000 oscilloscope but would people really buy that in enough volume to make it worthwhile?
You also need to think about how long it will take to process the data. In my own USB design data came in at 200Ms/s but it could process acquired data at over 1000Ms/s. Say you have 4 channels with 500Mpts of memory with a maximum samplerate of 250Ms/s. Having a memory bandwidth of 1Gs/s would be enough for acquisition purposes. However in such a case you don't want the memory bandwidth between the processing part (whether inside the FPA or external) to become a bottleneck. Especially if the memory bandwidth needs to be shared between sampling and processing (think about double buffering here). Otherwise things like decoding and full record math will become painfully slow.

Yes so that is the goal: 32-bit DDR3 interface @ 667MHz, assuming a standard Zynq 7020 in enhanced speed grade, gets us up to 5.3GB/s.

100,000 wfm/s at ~600 points per waveform is only a read back bandwidth of 60MB/s. It's mostly the write bandwidth you need. (You need the write bandwidth for the pre-trigger, assuming you want a (pre-trigger*nwaves) bigger than the blockRAM supports.)

Read bandwidth starts trending higher, strangely enough as you go to a longer timebase and the blind time reduces as a fraction of the active acquisition time. At which point the current limitation is the CSI-2 bus to the Pi, and the Pi itself has memory bandwidth issues.

I have the capability to implement a 4 lane CSI-2 peripheral, which doubles bandwidth to around 3.2Gbit/s (400MB/s). At which point we are nearing the capacity of PCI-e or USB3, although only in one direction.

One reason to go to 32-bit interface is that we then have the performance available to do a write-read-DSP-write-read cycle -- we can start using the DSP blocks to work on the waveform data we just acquired, and then render it for the next frame. That would allow the DSP fabric to be used in a (psuedo-)pipelined manner. I'm working on a concept for the render-engine on the FPGA to see how practical it would be.

nctnico · « **Reply #112 on:** November 19, 2020, 10:55:38 pm »

Quote from: tom66 on November 19, 2020, 10:21:23 pm

Really it depends on the goals for this project, I never considered much more than 2.5GSa/s which could be achieved with a second ADC port and 128-bit internal bus on a Zynq 7020. Moving to a Kintex might allow that to run at 250MHz with a 64-bit port, but it's not as if there is a lack of fabric capacity now for a large bus.

Maybe it is just a cost versus benefit (future growth) question. A Kintex would open the option to go for a design which has 4x 1Gs/s (maybe 1.25Gs/s to break the magic 500MHz barrier) without doing a major re-design of the PCB and internal FPGA logic.

free_electron · « **Reply #113 on:** November 19, 2020, 11:39:38 pm »

cool ! . couple of ideas
-memory as dimm so i is user expandable
-cm4 , which you already mentioned.
-each acquisition channel as a pci card. ( not necessarly form factor. ) the beauty of that is that you can make a machine that has 1,2,3,4,5,6,7,8,9 whatever channels you need. just plug in more cards. there are pci hub chips avaialble.

nctnico · « **Reply #114 on:** November 19, 2020, 11:49:20 pm »

The thought of using PCIexpress to aggregate more channels over seperate boards has crossed my mind too but you'll need seperate FPGAs on each board and ways to make triggers across channels happen. You quickly end up with needing to time-stamp triggers and correlate during post processing because the acquisition isn't fully synchronous.

asmi · « **Reply #115 on:** November 20, 2020, 12:11:59 am »

Quote from: tom66 on November 19, 2020, 10:21:23 pm

Indeed, but the shame about the Zynq-030 is it's the only package option there in FBG484.

Why limit to 484? You can fully breakout FFG676 package on a 6 layer PCB as it's 1 mm pitch package, so you can fit two traces between pads and breakout vias. Take a look at device diagram. 3 rows go out on a top layer, 3 more - on the first internal signal layer, 2 more - on the second internal signal layer, and 2 final rows - on the bottom layer (since two last rows are only partially populated, you will have enough space for decoupling caps). You only need 0.09 mm traces and spacings for the top layer - because the pads are oversized to 0.53 mm as per Xilinx recommendations, but you can get away with 0.5 mm pads, this will allow using 0.1 mm traces/spacings. For other layers it's even easier because you can use 0.2/0.4 mm vias which will leave you 0.6 mm of space for 2 traces. The fab I use for 6 layer board - WellPCB - can do even 0.08 mm traces, so you aren't even at their limit - even though I've done enough boards with them to be confident that they will deliver 0.08 mm traces with no issues.

asmi · « **Reply #116 on:** November 20, 2020, 12:33:04 am »

Quote from: tom66 on November 19, 2020, 10:27:52 pm

One reason to go to 32-bit interface is that we then have the performance available to do a write-read-DSP-write-read cycle -- we can start using the DSP blocks to work on the waveform data we just acquired, and then render it for the next frame. That would allow the DSP fabric to be used in a (psuedo-)pipelined manner. I'm working on a concept for the render-engine on the FPGA to see how practical it would be.

Maybe you should consider adding another DDR3 memory interface on PL side. This way you will have acquisition memory separated from processing memory, data comes from ADC straight into acquisition RAM, and then you create a processing pipeline from acquisition RAM into your main RAM. This is how most commercial oscilloscopes work, if you will look closely on teardowns, you will see those separate memory devices.
This will also allow you to create more sophisticated triggers because they can work with potentially a lot of samples right in acquisition RAM, and in doing so they won't be consuming bandwidth of your PS-side memory, which - as you said - is a fixed quantity which can't really be easily scaled, while PL-side memory bandwidth can (CLG484 package has fully bonded out 33,34 and 35 banks, this will allow you to create 64bit memory interface). Think about stuff like triggering on receiving a certain byte(s) via some peripheral bus like SPI or UART.

rhb · « **Reply #117 on:** November 20, 2020, 04:05:39 am »

Quote from: Someone on November 19, 2020, 12:17:01 am

Again, its all dependent on the specific (as yet undisclosed/unclear) architecture. But (some, not all) scopes are dynamically aligning the channels based on the [digital] trigger, interpolating the trigger position. Which requires first determining the fractional trigger position (not trivial calculation), and then using that fractional position (at some point in the architecture, could be on sample storage or at plotting) to increase the time resolution of the digital trigger and reduce trigger jitter. This is something which is quite significant in the architecture and can't be easily bolted on later.

So far you've both just said there is a fixed filter/delay, which is a worry when you plan to have acquisition rates close to the input bandwidth.

Equally doing sinc interpolation is a significant processing cost (time/area tradeoff), and just one of the waveform rate barriers in the larger plotting system. Although each part is achievable alone its how to balance all those demands within the limited resources, hence suggesting that software driven rendering is probably a good balance for the project as described.

I consider time shifting data by fractional samples very trivial. And sinc(t) is not that expensive if you know how.

Data *must* be acquired at the fastest sample rate and downsampled in the FPGA. If this is not done the data will be aliased, a depressingly common issue at all price tiers.

Certain operations must be done at acquisition sample rate. Display is rather leisurely at 30-120 fps.

I have been over the DSP pipeline so many times I've lost count. The only thing I am sure of is I will think of another improvement the next pass.

Reg

Someone · « **Reply #118 on:** November 20, 2020, 06:31:41 am »

Quote from: rhb on November 20, 2020, 04:05:39 am

Certain operations must be done at acquisition sample rate. Display is rather leisurely at 30-120 fps.

Now you're just linking concepts which have almost nothing to do with each other. Offline (CPU or otherwise) processing can be done at any rate, very few things in a digital oscilloscope have to be done at the full throughput of the ADCs, but triggering is one of them and you're both constantly talking away from that point.

rf-loop · « **Reply #119 on:** November 20, 2020, 07:32:07 am »

Very intersting discussion and project.

Without going details and complex explanation.

Before start anything, imho, there must be clear and designed that whole digital side trigger engine need be just after ADC and always using ADC full native samplerate and capable to do it repeatedly with full designed capturing speed and this is really busy place. It need HW and it need not so complex but fast brute force to do it well.
Example Rohde&Schwarz have done small miracles in this in they prime RTO models.

This "trigger engine" need include also as perfect fine interpolation as possible between ADC raw samples as possible in real time. Fine adjusting position for display it is simple secondary thing. This architecture selection and decision need do and keep before so much more. Later it is extremely difficult or even impossible in practice.
This kind of trigger engine need include many functions of course starting from simplest possible edge trigger going to very complex advanced intelligent "shape" recognize trigger... it can be even self learning for anomalies in signal. Clever intelligent full speed trigger is only road to do The advanced oscilloscope.

Trigger is key for intelligent and powerful glitch hunting.... not so much this over advertised wfm/s speed what was originally launched perhaps by Keysight because they find how to advertise and all is put for rare "glitch hunting". If clever people have clever scope and work is hunting some rare glitch he do not need more than one time per second capturing scope IF scope is just only waiting this glitch occurs when scope know what he is hunting (waiting). Intelligence is imho better than brute force like enormous repeating capturing speed. Put scope to work and leave humans do more important things than watching desperately scope screen and waiting.

If trigger engine find match it need trig but it need still do it fast. Not find these from acquisition memory... it need do realtime from ADC continuous stream because it is way to eliminate blind time trap in glitch hunting. Of course there need be some knowledge first what kind of anomaly we are hunting... just for one example about it difficult and why whole thing need be direct ADC out stream and capable of handling this stream... there is no free lounges if want do good.
Of course it is natural there need do some compromises. More and more compromises and soon it is dropped to elcheapo scopes level.

If trigger full engine is not in this position and extremely well made, all rest is useless playing and walking with road from problems to problems. Whole good oscilloscope main heart is just this trigger engine what is fist priority to design. Things after then, acquisition memory... displaying ans so on... they are important but if all these are nice and then trigger engine is "easy made" whole end is to garbage collection... was nice project what teached lot of.

So or so... imho, trigger engine just after ADC is heart of whole scope. This make good scope and this make poor or bad scope. First time this come on the board tens of years ago between Tek and HP. This do not go so that first do some working nice image scope and after then start thinking oh... it need this trig... oh it need also this trig... No, it is FIRST thing what need design and deeply. First need be well done trigger high performance trigger engine (partially software and partially hardware) what can do all what later need.

But what I have seen now here about this project... amazing one mans work, really amazing.

Fungus · « **Reply #120 on:** November 20, 2020, 08:06:13 am »

Quote from: rf-loop on November 20, 2020, 07:32:07 am

This "trigger engine" need include also as perfect fine interpolation as possible between ADC raw samples as possible in real time.

Not true. All you need to know is that one sample is below the trigger level and the next sample is above it (for simple rising edge trigger).

You can do all the fine interpolation much later when you go to display the trace on screen.

Which approach is better? That's harder to say.

tom66 · « **Reply #121 on:** November 20, 2020, 08:10:21 am »

Quote from: Someone on November 20, 2020, 06:31:41 am

Quote from: rhb on November 20, 2020, 04:05:39 am
Certain operations must be done at acquisition sample rate. Display is rather leisurely at 30-120 fps.
Now you're just linking concepts which have almost nothing to do with each other. Offline (CPU or otherwise) processing can be done at any rate, very few things in a digital oscilloscope have to be done at the full throughput of the ADCs, but triggering is one of them and you're both constantly talking away from that point.

The trigger is running at the full sample rate (1GSa/s) on this prototype. Every sample is capable of generating a trigger.

Realignment is not done for every trigger because as stated that can be done once the waveform is captured - based on the difference between the sample and the ideal trigger point. For instance, if you want to trigger at 8'h7f but you actually got a trigger at 8'h84 then you know it is 5 counts off, so look up in your table for that given timebase for the pixel offset.

This operation only needs to be done at the waveform rate of the scope - e.g. 20k/100k times a second - and is part of the render engine, not the capture engine.

gf · « **Reply #122 on:** November 20, 2020, 08:41:38 am »

Quote from: Fungus on November 20, 2020, 08:06:13 am

All you need to know is that one sample is below the trigger level and the next sample is above it (for simple rising edge trigger).

Not generally granted, e.g. for signals close to Nyqust. If two adjacent samples are e.g. 0.0 and 0.1 then the analog ADC input signal can still raise up to 1.0 and go down again between the samples, while still not violating the sampling theorem. In this case your algorithm would miss an edge trigger at level 0.5 completely although the original signal does cross this level.

Someone · « **Reply #123 on:** November 20, 2020, 08:52:15 am »

Quote from: tom66 on November 20, 2020, 08:10:21 am

Quote from: Someone on November 20, 2020, 06:31:41 am
Quote from: rhb on November 20, 2020, 04:05:39 am
Certain operations must be done at acquisition sample rate. Display is rather leisurely at 30-120 fps.
Now you're just linking concepts which have almost nothing to do with each other. Offline (CPU or otherwise) processing can be done at any rate, very few things in a digital oscilloscope have to be done at the full throughput of the ADCs, but triggering is one of them and you're both constantly talking away from that point.

The trigger is running at the full sample rate (1GSa/s) on this prototype. Every sample is capable of generating a trigger.

Realignment is not done for every trigger because as stated that can be done once the waveform is captured - based on the difference between the sample and the ideal trigger point. For instance, if you want to trigger at 8'h7f but you actually got a trigger at 8'h84 then you know it is 5 counts off, so look up in your table for that given timebase for the pixel offset.

This operation only needs to be done at the waveform rate of the scope - e.g. 20k/100k times a second - and is part of the render engine, not the capture engine.

You keep talking about offset in counts, but trigger interpolation is in time. There isn't a lookup because the slope of the signal isn't known a-priori, interpolating the trigger point on the time axis needs at least 2 points (more is preferable) which is already an impractical size for a LUT. Yes it can be done offline (as mentioned by Fungus above), but it still needs access to the raw ADC samples that created the trigger, which are not always what is stored in the acquisition buffer.

And yes, the shift can be applied at render time, which complicates that processing (and how the acquisition filtering further changes alignment) while needing the offset value forwarded with the matching acquisition. Closely related to channel skew adjustment/trimming (which can destroy many of the naive assumptions of memory access patterns in a multichannel scope).

tom66 · « **Reply #124 on:** November 20, 2020, 09:08:31 am »

Quote from: Someone on November 20, 2020, 08:52:15 am

You keep talking about offset in counts, but trigger interpolation is in time. There isn't a lookup because the slope of the signal isn't known a-priori, interpolating the trigger point on the time axis needs at least 2 points (more is preferable) which is already an impractical size for a LUT. Yes it can be done offline (as mentioned by Fungus above), but it still needs access to the raw ADC samples that created the trigger, which are not always what is stored in the acquisition buffer.

Yes, and every raw sample is piped into RAM and stored; nothing about the trigger data is thrown away. 1GSa/s sampling rate and 1GSa/s memory write rate. The samples that generate the trigger are stored as well as pre- and post- trigger data. No filtering, no downsampling at this point.

The principle is, if the input filter has the correct response (it needs to roll off before Nyquist so that you avoid the described Nyquist headaches) you can calculate slope from the delta from the presumed trigger point (which is at t=0 - centre of the waveform) and the actual trigger point at t=? ... the actual trigger point will be offset from the point that the oscilloscope triggered at by a fraction of the sample rate (0-1ns). It will never be more than that fraction because then the next comparator would have generated the trigger instead. When you are at the described mode where 1ns < 1pixel, you can ignore this data because the waveform points aren't going to be plotted fractionally on the screen. This is only needed for sinx/x modes. You'd need LUTs for different timebases or channel configurations (1-4ch), but this is something that could be generated at 'production time' as part of the calibration process.

A more complex trigger engine could use several samples to inform a slope calculation. Since the Zynq ARM has full visibility of the samples, it's possible to do a calculation like this before each waveform is sent to the interpolator or the rendering engine. I don't think that would be terribly difficult to do either, although it would be more complex than what I've suggested it might be less sensitive to noise on the trigger point.

I would say that once you are operating at an input frequency >> than the rating of the scope, you can't rely on the trigger being reliable, just as you can't rely on the amplitude being reliable. My DS1000Z falls over on Fin > 130MHz, even though the amplitude is stable.


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

Author Topic: A High-Performance Open Source Oscilloscope: development log & future ideas (Read 81673 times)

Share me