Author Topic: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.  (Read 45008 times)

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #100 on: January 07, 2023, 02:19:04 am »
I don't understand what you are getting at.

You keep telling us to make 2 ram controllers where I say make 1 wide double speed controller.

Now, if I want to do a 2D convolution filter on an image.  Forget about displaying it.  We aren't there yet as we may want to use the result as a new texture for 3D modeling, or for image analysis.

How long will it take on a 400MHz 2x16 DDR3 VS a 400MHz 4x16 DDR3.

Say I have another 3 processing steps before I'm ready to render the frame buffer.
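
As a rough way to frame the question above, the sketch below estimates the best-case time for one pass over a frame (read every pixel once, write every result once) on a 32-bit (2x16) versus a 64-bit (4x16) DDR3 bus clocked at 400 MHz. The 1920x1080, 4 bytes-per-pixel frame and the function names are assumptions for illustration only. It also assumes the filter keeps its neighbourhood in on-chip line buffers so each pixel is fetched once, and it ignores refresh, row activation and read/write turnaround, so real figures will be worse.

```python
def peak_bytes_per_second(bus_bits: int, ck_mhz: float) -> float:
    """Theoretical DDR peak: two transfers per memory clock, bus_bits wide."""
    return ck_mhz * 1e6 * 2 * bus_bits / 8


def frame_pass_ms(bus_bits: int, ck_mhz: float,
                  width: int = 1920, height: int = 1080,
                  bytes_per_pixel: int = 4) -> float:
    """Best-case milliseconds to read one frame and write one result frame."""
    traffic = 2 * width * height * bytes_per_pixel  # one read + one write per pixel
    return traffic / peak_bytes_per_second(bus_bits, ck_mhz) * 1e3


for bits in (32, 64):
    print(f"{bits}-bit bus at 400 MHz: {frame_pass_ms(bits, 400):.2f} ms per pass")
```

With several processing steps chained before rendering, the per-pass time simply multiplies, which is why the bus width matters more as the pipeline grows.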
 

Offline free_electron

  • Super Contributor
  • ***
  • Posts: 8550
  • Country: us
    • SiliconValleyGarage
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #101 on: January 07, 2023, 03:08:31 am »
BEFORE you do ANYTHING: get the exact layer stack (core and prepreg thicknesses and material Dk's). Get the Dk values for the frequency you will be running at.

So many times people start designing this kind of stuff and then find out all the impedance calculations are out the door because they ran off an incorrectly specified stack.
Professional Electron Wrangler.
Any comments, or points of view expressed, are my own and not endorsed , induced or compensated by my employer(s).
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #102 on: January 07, 2023, 03:18:59 am »
I don't understand what you are getting at.

You keep telling us to make 2 ram controllers where I say make 1 wide double speed controller.

Now, if I want to do a 2D convolution filter on an image.  Forget about displaying it.  We aren't there yet as we may want to use the result as a new texture for 3D modeling, or for image analysis.

How long will it take on a 400MHz 2x16 DDR3 VS a 400MHz 4x16 DDR3.

Say I have another 3 processing steps before I'm ready to render the frame buffer.
I thought I already explained why we aren't going to implement a 64-bit DDR3 interface, unless nockieboy is willing to assume the risk of implementing something which I personally have never done, and which in general very few people have done (I couldn't find any board that has this implemented). The two-controller solution is a compromise I'm offering because that's something I'm fairly certain will work.

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #103 on: January 07, 2023, 03:21:34 am »
BEFORE you do ANYTHING: get the exact layer stack (core and prepreg thicknesses and material Dk's). Get the Dk values for the frequency you will be running at.
We are going to use JLCPCB for manufacturing, and they have published the stackups they offer with all the values we require.

So many times people start designing this kind of stuff and then find out all the impedance calculations are out the door because they ran off an incorrectly specified stack.
Since we're going to be running DDR3 at relatively slow 400 MHz, exact impedance match is not super-critical. It would be a completely different story if we were to implement a 933 MHz interface.

Offline nockieboyTopic starter

  • Super Contributor
  • ***
  • Posts: 1812
  • Country: england
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #104 on: January 09, 2023, 10:25:24 pm »
Just as an aside, my FPGAs arrived today. ^-^

As far as the discussion on memory buses goes, I'm staying well out of that one as there's nothing useful I can add to either side.  I will go with whatever the final consensus is, as you both (asmi and BrianHG) know faaaaaar more than I do about the subject.

@asmi - have you had a response about the SW connections on the MPM3683-7 yet?
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #105 on: January 09, 2023, 11:31:07 pm »
Just as an aside, my FPGAs arrived today. ^-^
Great!

As far as the discussion on memory buses goes, I'm staying well out of that one as there's nothing useful I can add to either side.  I will go with whatever the final consensus is, as you both (asmi and BrianHG) know faaaaaar more than I do about the subject.
Well you are the one who will "bend the metal" so to speak, i.e. will actually spend money on a physical board, so decision is yours to make. I'm actually tempted to burn one of my 35Ts to throw together a devboard with SODIMM just to try it out and see if it works. Now if only days would have more than 24 hours so that I would find enough extra time to actually design a board... ::)

@asmi - have you had a response about the SW connections on the MPM3683-7 yet?
They asked me for a company email (I sent them a request from my personal one). Sent it out, so far - nothing. They said they are going to forward this to the local FAE who covers the province I live in, so we will see. But then again, if I am to throw together a SODIMM tester, it will be super-cheap because it's only gonna contain an FPGA, PDS parts and a SODIMM connector.

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #106 on: January 10, 2023, 03:58:17 am »
Well, if the Artix7 can run the DDR3 at 800Mhz or above (1600mtps), then 2x 16bit DDR3 ram chips will be as good as running a SODIMM module at 400MHz.  Though, if you get the SODIMM working at 800MHz, then expect to be able to play Quake2/3 on your FPGA with some ultra serious HDL engineering and coding.
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #107 on: January 10, 2023, 10:36:35 am »
Well, if the Artix7 can run the DDR3 at 800Mhz or above (1600mtps), then 2x 16bit DDR3 ram chips will be as good as running a SODIMM module at 400MHz. 
No, it can't do that. Speed grade 2, which is the device we've purchased, can only go up to 400 MHz, speed grade 3 can go up to 533 MHz, but that's about it.

Though, if you get the SODIMM working at 800MHz, then expect to be able to play Quake2/3 on your FPGA with some ultra serious HDL engineering and coding.
Quake 2 was working just fine even with a simple SDRAM (not even DDR) at far lower bandwidth than what DDR3 can do.
« Last Edit: January 10, 2023, 10:39:03 am by asmi »
 

Offline miken

  • Regular Contributor
  • *
  • Posts: 102
  • Country: us
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #108 on: January 10, 2023, 09:20:06 pm »
asmi is correct. The datasheet specs are in Mbps. I made the same mistake myself at first, since the MIG interface is sized at double the memory burst rate.

Xilinx uses these "4:1" and "2:1" terms for the MIG gearing but you kind of have to work through what exactly is meant. For example a 4:1 with 200MHz EDIT:100MHz FPGA-side is 800 Mbps/pin, 400 MHz DDR.
« Last Edit: January 11, 2023, 07:57:05 am by miken »
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #109 on: January 10, 2023, 11:57:27 pm »
asmi is correct. The datasheet specs are in Mbps. I made the same mistake myself at first, since the MIG interface is sized at double the memory burst rate.

Xilinx uses these "4:1" and "2:1" terms for the MIG gearing but you kind of have to work through what exactly is meant. For example a 4:1 with 200MHz FPGA-side is 800 Mbps/pin, 400 MHz DDR.
That's not entirely correct. "4:1" and "2:1" refer to the memory-frequency-to-UI-frequency gearing ratio; in your example the memory runs at 400 MHz, and the UI on the FPGA side runs at 100 MHz. The UI data bus width is 8x the interface width, so a single UI transaction covers an entire 8n DDR3 burst (which happens at 800 MT/s in this example). The UI also includes a FIFO for write data, allowing some advance buffering (meaning you can write data into the FIFO and issue the write command later). This way it's theoretically possible to fully saturate the memory bus with back-to-back memory transactions if commanded accordingly, though of course in reality there will be some breaks due to the need to activate a row, precharge, refresh, etc.
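
The gearing arithmetic described here can be sketched in a few lines. The function name `mig_ui` is made up for illustration, and the 2:1 case is only the same arithmetic applied to the other ratio, so check UG586 for the actual port widths before relying on it.

```python
def mig_ui(ck_mhz: float, dq_bits: int, ratio: int):
    """Return (ui_clock_mhz, ui_data_bits) for a ratio:1 memory-to-UI gearing."""
    ui_clock_mhz = ck_mhz / ratio
    # DDR transfers 2 words per memory clock, so one UI cycle spans 2 * ratio words.
    ui_data_bits = dq_bits * 2 * ratio
    return ui_clock_mhz, ui_data_bits


ck_mhz, dq_bits = 400, 16
for ratio in (4, 2):
    ui_clock, ui_bits = mig_ui(ck_mhz, dq_bits, ratio)
    # Both sides of the controller must carry the same number of bits per second.
    assert ui_clock * ui_bits == ck_mhz * 2 * dq_bits
    print(f"{ratio}:1 -> UI clock {ui_clock} MHz, UI data width {ui_bits} bits")
```

For the 4:1 case the UI width comes out at 8x the interface width, so one UI word holds a complete 8n burst, as stated above.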

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #110 on: January 11, 2023, 01:18:18 am »
asmi is correct. The datasheet specs are in Mbps. I made the same mistake myself at first, since the MIG interface is sized at double the memory burst rate.

Xilinx uses these "4:1" and "2:1" terms for the MIG gearing but you kind of have to work through what exactly is meant. For example a 4:1 with 200MHz FPGA-side is 800 Mbps/pin, 400 MHz DDR.
That's not entirely correct. "4:1" and "2:1" refer to the memory-frequency-to-UI-frequency gearing ratio; in your example the memory runs at 400 MHz, and the UI on the FPGA side runs at 100 MHz. The UI data bus width is 8x the interface width, so a single UI transaction covers an entire 8n DDR3 burst (which happens at 800 MT/s in this example). The UI also includes a FIFO for write data, allowing some advance buffering (meaning you can write data into the FIFO and issue the write command later). This way it's theoretically possible to fully saturate the memory bus with back-to-back memory transactions if commanded accordingly, though of course in reality there will be some breaks due to the need to activate a row, precharge, refresh, etc.
This matches my DDR3 controller.  My Half mode and Quad mode change the user command clock in relation to the DDR3_CK clock frequency, while the data bus runs at 2x that.  However, as I discovered with Lattice, they have a 500mbps cap on their DDR 2:1 IO primitive and an 800mbps cap when using their 2xDDR, i.e. 4:1, primitive.  So for Lattice, I am stuck in quad mode unless you want to underclock the DDR3 or overclock the FPGA.  Strangely enough, Altera allows 840mbps on the run-of-the-mill 2:1 DDR IO primitive of their run-of-the-mill FPGAs, though I cannot exclude that they automatically inferred my multiple data shift registers to and from the DDR IO primitive as a larger 4:1 or 8:1 2xDDR primitive.  I easily achieved a clean 400MHz DDR3_CK clock in both Half mode (200MHz user interface clock) and Quarter mode (100MHz user interface clock), while Altera's official PHY can only do 300MHz, exclusively in half mode.

If Xilinx Spartan7 has a higher maximum DDR IO (or serdes) primitive data rate between their 2:1 primitive and 4:1 primitive, if I were to adapt my DDR3 controller to Xilinx Spartan7, then I can make use of their improved 4:1 serdes primitive's performance to go above the 800mbps cap.  With some work, if their high speed 3gbps serdes port can be an IO, work in 8:1 and use a separate read and write clock phase, then my controller can be adapted to operate a DDR3 ram chip up to 3gtps if the chip can go that fast.  However, this is the type of project for someone with too much free time on their hands.
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #111 on: January 11, 2023, 01:26:32 am »
Well, if the Artix7 can run the DDR3 at 800Mhz or above (1600mtps), then 2x 16bit DDR3 ram chips will be as good as
Though, if you get the SODIMM working at 800MHz, then expect to be able to play Quake2/3 on your FPGA with some ultra serious HDL engineering and coding.
Quake 2 was working just fine even with a simple SDRAM (not even DDR) at far lower bandwidth than what DDR3 can do.

I know.  640x480 at 60hz, only using down-sampled 256x256 textures (optional for higher speed or smaller texture size), 8-bit paletted, with frame rates between 30fps and 60fps.
Also, my Voodoo2 at the time had 2 ram controller banks of EDO ram, 128 bits each, plus my motherboard had its own 128-bit ram for the CPU geometry and game engine.  This is not what I was promising Nockieboy with the large high speed ram.  I was promising 120fps, 1920x1080, full 32bit textures at their native 512x512 & 1024x1024 scale, plus enough onboard bandwidth to also run the CPU plus geometry and game engine and audio.
« Last Edit: January 11, 2023, 01:28:47 am by BrianHG »
 

Offline miken

  • Regular Contributor
  • *
  • Posts: 102
  • Country: us
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #112 on: January 11, 2023, 07:53:33 am »
That's not entirely correct. "4:1" and "2:1" refer to the memory-frequency-to-UI-frequency gearing ratio; in your example the memory runs at 400 MHz, and the UI on the FPGA side runs at 100 MHz. The UI data bus width is 8x the interface width, so a single UI transaction covers an entire 8n DDR3 burst (which happens at 800 MT/s in this example). The UI also includes a FIFO for write data, allowing some advance buffering (meaning you can write data into the FIFO and issue the write command later). This way it's theoretically possible to fully saturate the memory bus with back-to-back memory transactions if commanded accordingly, though of course in reality there will be some breaks due to the need to activate a row, precharge, refresh, etc.
You're right, I confused myself with different clocks. That makes more sense... Funny thing is I read in some forum post that the UI interface was twice as wide but clearly that's not the case. Have to refresh my memory with hard facts before spouting bad information.  |O
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #113 on: January 11, 2023, 12:27:10 pm »
Everything encircled in purple runs at the pixel clock rate. 
Oh wow - it's more complicated than I thought!
I've got a question though - why do you have to run so much at the video pixel clock? Wouldn't it be better to run it at a faster clock and write the resulting image into DDR, and then have a totally asynchronous process, running at the video pixel clock, which would read that image from the framebuffer (in DDR) and output it via HDMI/VGA/DisplayPort/whatever? ....

Just to make one thing absolutely clear, the geometry unit I engineered with Nockieboy already has a software ram-to-ram blitter which can convert graphics of multiple source bit depths to and from any ram, window coordinate size and location into memory for onscreen display.  Not only that, but that engine can also up-sample and down-sample the source graphics to the destination graphics, and use blitter source windows as a paint-brush when running the geometry drawing commands, with the addition of rotate 90, 45, mirror and flip.  With the GPU in its current form, Nockieboy has enough to render a Doom, or his current Z80 alone can draw the full arcade Outrun car racing game, but in full 32bit quality, HD quality and at a good 30-60fps depending on the Z80's ability to send out draw commands.

I was planning to help him create a display list processor to automate some control program lists in DDR3 so the Z80 wouldn't have to do anything but load a program, point to its base address and tell it to 'GO', but Nockieboy wanted a new FPGA first.
« Last Edit: January 11, 2023, 12:41:00 pm by BrianHG »
 

Offline nockieboyTopic starter

  • Super Contributor
  • ***
  • Posts: 1812
  • Country: england
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #114 on: January 11, 2023, 06:10:53 pm »
I feel totally out of my depth with all this DDR conversation; any suggestions for a good place to start as a primer for this subject so I can start reading up on it all and at least sound like I know vaguely what you're talking about? ;)

I was planning to help him create a display list processor to automate some control program lists in DDR3 so the Z80 wouldn't have to do anything but load a program, point to its base address and tell it to 'GO', but Nockieboy wanted a new FPGA first.

I think having more room to fit that DLP and, perhaps, a soft-core processor and any other peripherals someone might want to squeeze in there is an important step.  No reason why we can't do both at the same time though, unless it would pay to finalise the next FPGA first before we develop the HDL further?

@asmi - further to our previous conversation about crystals/clock sources - would a 100MHz differential clock source be okay?  Should I be considering a second external clock of some description?
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #115 on: January 11, 2023, 08:27:37 pm »
I feel totally out of my depth with all this DDR conversation; any suggestions for a good place to start as a primer for this subject so I can start reading up on it all and at least sound like I know vaguely what you're talking about? ;)
The crux of discussion is how much memory bandwidth will you need. We need a definite answer to that question before we can proceed with design, as DDR3 interface is a major consumer of IO pins, and there are restrictions on which pins can be used for that purpose.
If you want to learn how DDR3 controller works and how to use it, download a UG586 document from Xilinx website and read through chapter 1, which has everything you need to know regarding controller.

@asmi - further to our previous conversation about crystals/clock sources - would a 100MHz differential clock source be okay?  Should I be considering a second external clock of some description?
You don't have to use a differential clock at all; all of the boards I designed used a regular single-ended 100 MHz clock and they worked just fine. As for additional clocks, it's up to you guys, you know better what you need for your design. MMCMs are quite flexible in 7 series; each of them has a fractional divider (with 0.125 step) on the first output so you can generate many different clocks. I used a 100 MHz clock to generate the 148.5 and 742.5 MHz clocks required for 1080p@60Hz HDMI output, though the latter frequency exceeds specs and causes timing violations, but it still works just fine in practice.

But, as with everything else, the ultimate call on what you need is on you. You come and tell us what you need, and we'll figure out a way to make it possible.

Offline nockieboyTopic starter

  • Super Contributor
  • ***
  • Posts: 1812
  • Country: england
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #116 on: January 12, 2023, 12:00:33 pm »
You don't have to use a differential clock at all; all of the boards I designed used a regular single-ended 100 MHz clock and they worked just fine. As for additional clocks, it's up to you guys, you know better what you need for your design. MMCMs are quite flexible in 7 series; each of them has a fractional divider (with 0.125 step) on the first output so you can generate many different clocks. I used a 100 MHz clock to generate the 148.5 and 742.5 MHz clocks required for 1080p@60Hz HDMI output, though the latter frequency exceeds specs and causes timing violations, but it still works just fine in practice.

But, as with everything else, the ultimate call on what you need is on you. You come and tell us what you need, and we'll figure out a way to make it possible.

Okay, that's cool then if I don't need to use a differential clock source then I won't make it more complicated than it needs to be.

In terms of what I need, that's where this is more a collaboration than me specifying requirements - I'd rather pick a clock frequency that minimises the use of MMCMs to cater for both the DDR3 controller and memory, and the GPU itself.  I'm likely to go with a 50MHz oscillator if it's my choice, as I have a few of those already and if I choose 100MHz, it's going to need to be divided or multiplied like a 50MHz clock would be for the various parts of the FPGA.  Plus the lower the clock frequency, the fewer problems I'll have with routing considerations etc.

@BrianHG - does the GPU or your DDR3 controller make any particular system clock frequency more desirable than another?  If I go with 50MHz, that's purely based on what we used previously for the Cyclone IV board - if there's a better choice for whatever reason, let me know. :)
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #117 on: January 12, 2023, 12:32:58 pm »
@BrianHG - does the GPU or your DDR3 controller make any particular system clock frequency more desirable than another?  If I go with 50MHz, that's purely based on what we used previously for the Cyclone IV board - if there's a better choice for whatever reason, let me know. :)
It's compatible with any clock input the FPGA can support; however, you will be using Xilinx's DDR3 controller unless you deliberately want the fun of adapting mine.

Unless Xilinx's controller gives you 16 configurable read/write ports, you will probably be using my controller's multiport front-end interface as it is doing a shit load of heavy lifting for you.  My multiport uses the user interface command clock from Xilinx's DDR3 controller unless you want to manually configure your own PLL.  In the Max10-6, its upper limit was ~200MHz, but your current project is running it at 100MHz.  The Spartan7 should achieve 200MHz with ease.  (Actually I think the only reason we went 100MHz instead of 200MHz was that the ellipse line generator was too complex.)  This will double the speed of all the rest of your GPU modules, i.e. the geometry renderer.  The new data bus's doubled width, going from 128 bit to 256 bit, will once again potentially double your maximum throughput, though except for my display raster generator, you have yet to code anything else which will make full use of this true 4x top speed.

If I were you, I would already be working on coding and simulating this part.

I do not know about Xilinx in circuit PLL reconfiguration, but I would ask asmi if 2 different PLLs can run from the same 100Mhz clock source, IE use a second one for the HDMI display port.  Fiddling with the DDR3's master pll to change res during operation may just introduce headaches.

Also, make sure that from the 100MHz, you can make (148.5Mhz or 297Mhz & 742.5Mhz) & (108Mhz or 216Mhz & 540Mhz).  Make sure they are exact, no fractions or decimals, otherwise, just add a 27Mhz, or 54Mhz, or 108Mhz oscillator to the PCB.
« Last Edit: January 12, 2023, 12:35:14 pm by BrianHG »
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #118 on: January 12, 2023, 09:43:42 pm »
In terms of what I need, that's where this is more a collaboration than me specifying requirements - I'd rather pick a clock frequency that minimises the use of MMCMs to cater for both the DDR3 controller and memory, and the GPU itself.  I'm likely to go with a 50MHz oscillator if it's my choice, as I have a few of those already and if I choose 100MHz, it's going to need to be divided or multiplied like a 50MHz clock would be for the various parts of the FPGA.  Plus the lower the clock frequency, the less problems I'll have with routing considerations etc.
I just checked and it looks like it's not possible to generate 148.5 and 742.5 MHz exactly from 50 MHz, so you will need to use 100 MHz.

I do not know about Xilinx in circuit PLL reconfiguration, but I would ask asmi if 2 different PLLs can run from the same 100Mhz clock source, IE use a second one for the HDMI display port.  Fiddling with the DDR3's master pll to change res during operation may just introduce headaches.
Yes you can, albeit with some limitations (MMCMs need to be in the same column and in the same clock region).

Also, make sure that from the 100MHz, you can make (148.5Mhz or 297Mhz & 742.5Mhz) & (108Mhz or 216Mhz & 540Mhz).  Make sure they are exact, no fractions or decimals, otherwise, just add a 27Mhz, or 54Mhz, or 108Mhz oscillator to the PCB.
I know for sure you can generate 148.5 and 742.5 MHz exactly from 100 MHz, but what do you need 297 MHz for? Also, what mode is the second set of clocks for? I just checked, and it looks like it's possible to generate them from 100 MHz as well (though not at the same time as 148.5/742.5 MHz).

You can play around with MMCM/PLL settings by invoking the "Clocking Wizard" in the IP Catalog. Each MMCM has 7 outputs, the first of which (output 0) can have a fractional divider, and of course the multiplier can be fractional as well (with the same 1/8=0.125 step IIRC).
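
For anyone who wants to explore this outside the Clocking Wizard, here is a small brute-force sketch of the same search using exact fractions. The limits in the default arguments (VCO 600 to 1440 MHz, multiplier 2 to 64 in 0.125 steps, input divider up to 106, phase detector input of at least 10 MHz) are recalled values for a speed grade 2 part and should be treated as assumptions to check against DS181 and UG472. The function name is hypothetical, and the sketch does not check output buffer or routing limits.

```python
from fractions import Fraction


def mmcm_solutions(f_in_mhz, f_bit_mhz, f_pixel_mhz,
                   vco_range=(600, 1440), m_max=64, d_max=106, pfd_min=10):
    """List (M, D, O_bit, O_pixel) settings that give both clocks exactly."""
    f_in, f_bit, f_pix = (Fraction(str(f)) for f in (f_in_mhz, f_bit_mhz, f_pixel_mhz))
    found = []
    for d in range(1, d_max + 1):
        if f_in / d < pfd_min:
            break  # phase detector input too slow; a larger D only makes it worse
        for m_eighths in range(2 * 8, m_max * 8 + 1):  # M = 2.000 .. 64.000, 0.125 steps
            m = Fraction(m_eighths, 8)
            vco = f_in * m / d
            if not vco_range[0] <= vco <= vco_range[1]:
                continue
            o_bit, o_pix = vco / f_bit, vco / f_pix
            if o_bit < 1 or o_pix < 1:
                continue
            # Both dividers must sit on the 1/8 grid, and only one output
            # (output 0) is allowed to use a fractional divider.
            on_grid = all((o * 8).denominator == 1 for o in (o_bit, o_pix))
            fractional = sum(o.denominator != 1 for o in (o_bit, o_pix))
            if on_grid and fractional <= 1:
                found.append((float(m), d, float(o_bit), float(o_pix)))
    return found


print("from 100 MHz:", mmcm_solutions(100, 742.5, 148.5))
print("from  50 MHz:", mmcm_solutions(50, 742.5, 148.5))
```

Under these assumed limits the search finds a setting for a 100 MHz input and none for 50 MHz, which is consistent with what I found in the wizard.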

Alternatively you can use 200 MHz LVDS clock generator connected to DDR3's bank pins, controller will output 100 MHz UI clock which you use to drive the interface, and place an additional 27 MHz clock just for the video out - these clock gens are cheap (about $1), so it shouldn't be that big of a deal. Or instead of 27 MHz fixed frequency you can use a programmable clock generator like SI5351A-B-GT which has 3 outputs each of which can be programmed to a wide range of frequencies via I2C interface for ultimate flexibility. That device is about $3 (+ some cents for the 25 or 27 MHz crystal), so quite a reasonable price.

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #119 on: January 13, 2023, 04:18:59 am »
Also, make sure that from the 100MHz, you can make (148.5Mhz or 297Mhz & 742.5Mhz) & (108Mhz or 216Mhz & 540Mhz).  Make sure they are exact, no fractions or decimals, otherwise, just add a 27Mhz, or 54Mhz, or 108Mhz oscillator to the PCB.
I know for sure you can generate 148.5 and 742.5 MHz exactly from 100 MHz, but what do you need 297 MHz for? Also, what mode is the second set of clocks for? I just checked, and it looks like it's possible to generate them from 100 MHz as well (though not at the same time as 148.5/742.5 MHz).

You can play around with MMCM/PLL settings by invoking the "Clocking Wizard" in the IP Catalog. Each MMCM has 7 outputs, the first of which (output 0) can have a fractional divider, and of course the multiplier can be fractional as well (with the same 1/8=0.125 step IIRC).

The 297 is optional, but offers additional options.
It depends on Spartan7's PLL core speed capabilities.
For example, if the core can do the 1.485 GHz, then for the sub-divisional outputs - /2=742.5 for the LVDS HDMI, /5=297, /10=148.5, all integer divisions.  If all the core PLL can do is the 742.5, then we will skip the 297 and just divide that by 5 to give us the 148.5 pixel clock.

With a 54MHz source, /2 = 27Mhz, * 55 = 1.485 Ghz.  All integer, no sub-fractional tricks.

On the Deca board, I did a 50Mhz /25, *27 = 54Mhz,
Then made the 148.5Mhz from the 54MHz.

This took 2 PLLs since the Max10 had no fractional dividers, so this is how I made purest possible reference without any jitter using integer only PLLs.
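
The integer-only arithmetic above is easy to double-check with exact fractions. This is just a sketch: the helper name `pll` is made up, and chaining the Deca board's 54MHz stage into the 1.485GHz stage is an illustration of the numbers quoted here, not a description of any particular PLL configuration.

```python
from fractions import Fraction


def pll(f_in, divide: int, multiply: int) -> Fraction:
    """Integer-only PLL stage: f_out = f_in / divide * multiply, kept exact."""
    return Fraction(f_in) / divide * multiply


ref_54 = pll(50, 25, 27)     # first stage: 50 MHz / 25 * 27 -> 54 MHz
vco = pll(ref_54, 2, 55)     # second stage: 54 MHz / 2 = 27 MHz, * 55 -> 1485 MHz
outputs = {n: vco / n for n in (2, 5, 10)}  # integer output dividers

assert ref_54 == 54 and vco == 1485
for n, f in outputs.items():
    print(f"VCO / {n} = {float(f)} MHz")
```

Because every step is an integer ratio, the results are exact, with no rounding hidden in a fractional divider.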

If we used Cyclone V, then we could have had access to a fractional N divider PLL for the first primary frequency output offering a direct conversion of 100Mhz to 1.485Ghz.  All other sub-divided pixel clock outputs could be made from that.  (Yes, the Cyclone V PLL can operate at 1.485Ghz)

1 PLL should do it on the Spartan7 if it has a fractional N divider PLL.  If it can do 1.485 Ghz, then we can output 4K at 30Hz, assuming the LVDS transmitter can do 3gb.

« Last Edit: January 13, 2023, 04:29:41 am by BrianHG »
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #120 on: January 13, 2023, 06:03:07 am »
The 297 is optional, but offers additional options.
Like what? HDMI is output by using a SERDES (actually a pair of cascaded SERDES, but that is irrelevant here) in a 10:1 DDR mode, so you will need a pixel (symbol) clock of 148.5 MHz and a clock at half the bit rate (742.5 MHz). No other clocks are required. You will need to perform the 8b/10b (TMDS) encoding yourself and feed the SERDES encoded 10-bit symbols.
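
To give an idea of what that encoding step involves: the 8-bit to 10-bit code HDMI uses for active video is TMDS, from the DVI/HDMI specifications. Below is a behavioural sketch of the video-data encoder as I understand the published algorithm, with a matching decoder for checking it. It covers only pixel data (no control periods or data islands), the function names are mine, and it is a software model to test HDL against, not a drop-in implementation.

```python
def tmds_encode(byte: int, disparity: int):
    """Encode one pixel byte; return (10-bit symbol, updated running disparity)."""
    bits = [(byte >> i) & 1 for i in range(8)]
    ones = sum(bits)
    use_xnor = ones > 4 or (ones == 4 and bits[0] == 0)
    # Stage 1: transition minimisation with an XOR or XNOR chain; bit 8 records which.
    q_m = [bits[0]]
    for i in range(1, 8):
        x = q_m[i - 1] ^ bits[i]
        q_m.append(x ^ 1 if use_xnor else x)
    q_m.append(0 if use_xnor else 1)
    n1 = sum(q_m[:8])
    n0 = 8 - n1
    # Stage 2: DC balance by optionally inverting the data bits; bit 9 records it.
    if disparity == 0 or n1 == n0:
        invert = q_m[8] == 0
        disparity += (n0 - n1) if invert else (n1 - n0)
    elif (disparity > 0 and n1 > n0) or (disparity < 0 and n0 > n1):
        invert = True
        disparity += 2 * q_m[8] + (n0 - n1)
    else:
        invert = False
        disparity += -2 * (1 - q_m[8]) + (n1 - n0)
    data = [b ^ 1 if invert else b for b in q_m[:8]]
    q_out = data + [q_m[8], 1 if invert else 0]
    return sum(b << i for i, b in enumerate(q_out)), disparity


def tmds_decode(symbol: int) -> int:
    """Inverse of tmds_encode for video data symbols."""
    q = [(symbol >> i) & 1 for i in range(10)]
    data = [b ^ q[9] for b in q[:8]]  # undo the optional inversion
    out = [data[0]]
    for i in range(1, 8):
        x = data[i] ^ data[i - 1]
        out.append(x if q[8] else x ^ 1)  # undo the XOR or XNOR chain
    return sum(b << i for i, b in enumerate(out))


running = 0
for value in range(256):
    symbol, running = tmds_encode(value, running)
    assert tmds_decode(symbol) == value
```

Each of the three colour channels needs its own encoder with its own running disparity, and the 10-bit symbols are what gets fed to the serializer at the pixel clock.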

1 PLL should do it on the Spartan7 if it has a fractional N divider PLL.  If it can do 1.485 Ghz, then we can output 4K at 30Hz, assuming the LVDS transmitter can do 3gb.
Where are you getting all these numbers? Maybe you should read DS181. The official limit is 1.25 Gbps per pin; unofficially it can do 1.485 Gbps, but that's about it. Where is this 3 Gbps coming from?
If you want to go beyond what regular IO pins can do, you will need to use transceivers. We're currently planning to wire 4 GTPs to a DisplayPort connector. The DP version 1.2 specification is publicly available, so it shouldn't be too hard to implement up to HBR2 (5.4 Gbps per lane), which is enough to drive up to 4k@60. But since even modern GPUs sometimes struggle to maintain a reasonable framerate at such a resolution, the utility of actually doing so kind of escapes me, not to mention that it's going to require a MASSIVE memory bandwidth for double/triple buffering alone. Even 64 bit DDR3 at 400 MHz will just about be enough for double-buffering. And frankly, I think Artix-7 is not powerful enough for that resolution; we would need something like Artix Ultrascale+ with its 1.2 GHz DDR4 support.
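
A quick sanity check of the "HBR2 is enough for 4k@60" statement, as a sketch. The 594 MHz pixel clock (the common 4400x2250 total timing at 60 Hz) and 24 bits per pixel are assumptions chosen for illustration, and the calculation ignores DisplayPort framing overhead beyond the 8b/10b line code.

```python
def dp_payload_gbps(lanes: int = 4, lane_rate_gbps: float = 5.4) -> float:
    """Usable DisplayPort main-link rate after 8b/10b line coding."""
    return lanes * lane_rate_gbps * 8 / 10


def video_gbps(pixel_clock_mhz: float, bits_per_pixel: int = 24) -> float:
    """Bit rate of an uncompressed stream; blanking is included via the pixel clock."""
    return pixel_clock_mhz * 1e6 * bits_per_pixel / 1e9


link = dp_payload_gbps()      # 4 lanes of HBR2
stream = video_gbps(594)      # assumed 3840x2160@60 timing, 24 bpp
print(f"link payload {link:.2f} Gbps, video stream {stream:.2f} Gbps")
print("fits" if stream <= link else "does not fit")
```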

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #121 on: January 13, 2023, 06:11:59 am »
Ok, no 3840x2160 support.
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #122 on: January 13, 2023, 06:17:01 am »
Even 64 bit DDR3 at 400 MHz will just about be enough for double-buffering. And frankly, I think Artix-7 is not powerful enough for that resolution; we would need something like Artix Ultrascale+ with its 1.2 GHz DDR4 support.
Yes it is.. Especially at 30hz progressive.  Though, for 60hz, I would prefer a 500MHz controller.
Now do not go putting Spartan7 below a cheap crummy CycloneV-GX.
« Last Edit: January 13, 2023, 06:19:47 am by BrianHG »
 

Offline asmi

  • Super Contributor
  • ***
  • Posts: 2839
  • Country: ca
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #123 on: January 13, 2023, 03:21:05 pm »
Yes it is.. Especially at 30hz progressive.  Though, for 60hz, I would prefer a 500MHz controller.
Now do not go putting Spartan7 below a cheap crummy CycloneV-GX.
They both are inadequate. The fact that they can technically output a stream in 4k@60 doesn't change the reality of them being too slow and too small to do much in the way of actually generating that image. A single stream at 4k@60 requires ~1.85 GBytes/s of bandwidth, and the theoretical max of a 64bit DDR3 at 400 MHz is a bit below 6 GBytes/s, so simple double-buffering is going to eat almost half of the DDR3's available bandwidth. That doesn't leave much for the actual rendering engine, which typically requires an order of magnitude more bandwidth than what's required for display, because those resources (textures, primitive streams) also need to be of higher resolution. Generating that many pixels (a single 4k frame contains over 8 million pixels!) requires a lot of hardware, and fast hardware too. To maintain a 60 Hz refresh rate, the renderer needs to generate almost half a billion pixels per second, which means at 100 MHz it needs to output 5 pixels on each clock cycle. In my opinion that is waaay beyond what low end 7 series devices can do, and Cyclone-5 is even worse than that. Even Artix Ultrascale+'s 64bit DDR4 interface running at 1.2 GHz and providing ~17.9 GBytes/s of bandwidth, while being much better than 400 MHz DDR3, I still suspect might not be enough even for a relatively simple 3D renderer generating 4k@60. For a bit of perspective, NVidia's 3080 Ti GPU has over 912 GBytes/s of memory bandwidth, so close to two orders of magnitude more than what even Artix US+ can provide.
So, unless you want to simply upscale 1080p to 4K for the output (so that you can see big and beautiful pixels in crisp detail!), I'd say we'd better abandon all this talk about 4k and focus on something that we can realistically expect to achieve.
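
The figures above can be reproduced with a few lines of arithmetic. This sketch assumes 4 bytes per pixel and that the "GBytes/s" numbers are binary (2^30 bytes); the helper names are made up, and the DDR values are theoretical peaks with no allowance for refresh or bus turnaround.

```python
GIB = 2 ** 30


def stream_gib_s(width: int, height: int, bytes_per_pixel: int, fps: int) -> float:
    """Bandwidth of one uncompressed video stream, in GiB/s."""
    return width * height * bytes_per_pixel * fps / GIB


def ddr_peak_gib_s(bus_bits: int, ck_mhz: float) -> float:
    """Theoretical DDR peak: two transfers per clock across the whole bus."""
    return ck_mhz * 1e6 * 2 * bus_bits / 8 / GIB


scanout = stream_gib_s(3840, 2160, 4, 60)    # one 4k@60 stream
ddr3 = ddr_peak_gib_s(64, 400)               # 64-bit DDR3 at 400 MHz
ddr4 = ddr_peak_gib_s(64, 1200)              # 64-bit DDR4 at 1.2 GHz
pixels_per_clock = 3840 * 2160 * 60 / 100e6  # renderer clocked at 100 MHz

print(f"4k@60 stream:     {scanout:.2f} GiB/s")
print(f"DDR3 64b/400MHz:  {ddr3:.2f} GiB/s peak")
print(f"DDR4 64b/1.2GHz:  {ddr4:.2f} GiB/s peak")
print(f"share of DDR3 for one stream: {scanout / ddr3:.0%}")
print(f"pixels per 100 MHz clock:     {pixels_per_clock:.2f}")
```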
« Last Edit: January 13, 2023, 10:13:58 pm by asmi »
 

Online BrianHG

  • Super Contributor
  • ***
  • Posts: 8139
  • Country: ca
    • LinkedIn
Re: Planning/design/review for a 6-layer Xilinx Artix-7 board for DIY computer.
« Reply #124 on: January 14, 2023, 12:10:44 am »
Oh, so you don't believe we could operate the frame buffer and render in 422 YUV mode.
 

