Just a few thoughts / ideas.
The resistor ladder DACs are popular and mostly work fine. However, I have run into their limitations in my own work. I've been doing VGA with 256 colors, but using a palette system drawn from 24-bit color. I often see obvious color differences due to the limited number of bits in the resistor ladder. If you can live with this, they work great and are simple. It sounds like you are doing 256-color output without a palette, so you should be fine, but if you extend it later this is a limitation you will run into.
A resistor ladder is the simplest way to get a colour output I guess, but it seems a little... inexact... and dependent on resistor tolerances and chosen values, so when I saw an example using a video DAC I thought I'd likely want to give that a try.
I'm not sure I understand colour encoding/generation properly, so would appreciate correction if I'm wrong, but whilst thinking about the frame buffer design the other day I gave some thought to how the data would be stored in the frame buffer. Storing discrete values for R, G and B channels using a bit (or bits) seems a little wasteful. If I want 64 colours for example, I could use 2 bits per channel for 6 bits of RGB per pixel, or one byte per pixel (with a couple of bits spare) in the frame buffer. Thing is, if I want to go to 3 bits per channel or more, I'm looking at two bytes per pixel and doubling the memory requirement for a single frame.
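To put rough numbers on the memory cost described above, here is a small sketch of the arithmetic. The 640x480 resolution is an assumption for illustration (the post doesn't fix one), and the function names are made up for this example. It rounds each pixel up to whole bytes, which is what pushes 3 bits per channel to two bytes per pixel.

```python
import math

def bytes_per_pixel(bits_per_channel, channels=3):
    """Whole bytes needed to store one pixel of direct (non-palette) RGB."""
    return math.ceil(bits_per_channel * channels / 8)

def framebuffer_bytes(width, height, bits_per_channel):
    """Total frame buffer size in bytes for one frame of direct RGB."""
    return width * height * bytes_per_pixel(bits_per_channel)

# Assumed resolution, not one stated in the post.
WIDTH, HEIGHT = 640, 480

for bits in (2, 3, 8):
    print(bits, "bits/channel:",
          bytes_per_pixel(bits), "byte(s)/pixel,",
          framebuffer_bytes(WIDTH, HEIGHT, bits), "bytes/frame")
```

At that assumed resolution, 2 bits per channel fits in one byte per pixel, while 3 bits per channel already needs two, so the frame doubles in size.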
So I hit on the idea of using a look-up table for the colours. Is that the 'palette system' you're referring to? Basically, I'd have one byte per pixel in the frame buffer. Each byte would be a value between 0 and 255, so when the 'pixel' is read from the frame buffer, the value would be used to look up the RGB value in the LUT, which could be anything up to 24-bit values that would get passed out to the DAC...
... would that work? The other thing is, I don't really know anything about FPGAs - I get the impression they're fast, so using a LUT for the RGB values shouldn't slow the FPGA down so much that it couldn't keep up with the clock?
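As a software model of the look-up idea described above (not FPGA code), the sketch below keeps one byte per pixel in the frame buffer and uses it as an index into a 256-entry table of 24-bit values. The names `make_palette`, `lookup` and `scan_out` are hypothetical, and the grey-ramp palette contents are just placeholder data.

```python
def make_palette():
    """Build a 256-entry palette of 24-bit 0xRRGGBB values (a grey ramp as placeholder data)."""
    return [(i << 16) | (i << 8) | i for i in range(256)]

def lookup(palette, index):
    """Translate one frame buffer byte into the (R, G, B) triple sent to the DAC."""
    rgb = palette[index]
    return (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF

def scan_out(framebuffer, palette):
    """Walk the frame buffer in order, as the video logic would, yielding RGB triples."""
    return [lookup(palette, index) for index in framebuffer]

palette = make_palette()
palette[1] = 0xFF8000              # any entry can be changed to any 24-bit colour
pixels = bytes([0, 1, 255])        # three pixels, one byte each
print(scan_out(pixels, palette))
```

The frame buffer stays at one byte per pixel regardless of how wide the palette entries are; only the 256-entry table grows.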
The FPGA I have currently isn't really up to what I want from it - it's an Altera Cyclone II EP2C5T144, so it doesn't have the RAM I need for a frame buffer for anything more than a straight text display. I'm wondering if using an external SRAM chip would be the way to go for the frame buffer? Would something with an access time <15 ns, for example, be fast enough? Alternatively, I'm waiting on a Xilinx Spartan 6, which I'll likely develop on instead. That has a 32MB SDRAM on its board, so I might experiment with using that for the frame buffer if the RAM in the Spartan isn't enough (though I think I can make the internal RAM dual-port, which is highly desirable for a frame buffer?).
For interfacing the two, do you need framebuffer-level access? For example, the original gameduino outputs VGA but takes in different drawing commands (over SPI). This way you aren't having to read and transmit entire framebuffers over a slow bus.
I'm intending to interface the VGA controller with my Z80-based computer running at 8 MHz. I'm not hung up on direct frame-buffer access for the Z80 at all - in fact, I'd rather not mess with the Z80's memory space at all if another interface method is fast enough. Unless anyone here tells me otherwise, I'm going the route of having the Z80 send commands to the FPGA, which will interpret them and modify the frame buffer's contents accordingly. How the Z80 does that, I haven't decided yet: I could use a serial connection from the Z80's SIO, I could use my Z80's hardware I2C port, or I could use a more direct connection using OUT instructions and the data bus, which I'm leaning towards as I feel it will be the quickest of the three. An SPI interface isn't currently an option, but may be later if I ever finish a hardware SPI interface for my system.
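For the SRAM speed question, a back-of-envelope check is possible if a video mode is assumed. The sketch below assumes standard 640x480 at 60 Hz, whose pixel clock is 25.175 MHz; the post doesn't name a mode, so treat that as an assumption. It only compares the pixel period with the raw access time, and ignores address setup, bus turnaround and FPGA I/O delays, so the real margin will be smaller.

```python
# Assumed mode: 640x480 @ 60 Hz, standard pixel clock of 25.175 MHz.
PIXEL_CLOCK_HZ = 25_175_000

def pixel_period_ns(pixel_clock_hz):
    """Time available per pixel, in nanoseconds."""
    return 1e9 / pixel_clock_hz

def accesses_per_pixel(access_time_ns, pixel_clock_hz=PIXEL_CLOCK_HZ):
    """How many back-to-back SRAM accesses fit in one pixel period (idealised)."""
    return int(pixel_period_ns(pixel_clock_hz) // access_time_ns)

print(round(pixel_period_ns(PIXEL_CLOCK_HZ), 2), "ns per pixel")
print(accesses_per_pixel(15), "accesses of 15 ns fit in one pixel period")
```

Under those assumptions a 15 ns part leaves room for roughly two accesses per pixel, which in principle allows one read for display and one slot for writes from the host side.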
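To make the command approach a bit more concrete, here is a toy software model of a byte-wide command port: each call to `out()` stands in for one Z80 OUT to the FPGA, and the receiving side collects bytes until a full command has arrived. The opcodes, their argument layout and the tiny screen size are all invented for illustration; nothing here is taken from the gameduino or any existing design.

```python
CMD_SET_PIXEL = 0x01   # hypothetical opcode, followed by: x, y, colour
CMD_CLEAR = 0x02       # hypothetical opcode, followed by: colour

WIDTH, HEIGHT = 64, 48  # tiny assumed screen so x and y each fit in one byte

# Total bytes per command, including the opcode itself.
COMMAND_LENGTH = {CMD_SET_PIXEL: 4, CMD_CLEAR: 2}

class CommandPort:
    """Models the FPGA side: receives bytes one at a time and updates the frame buffer."""

    def __init__(self):
        self.framebuffer = bytearray(WIDTH * HEIGHT)
        self._pending = []

    def out(self, byte):
        """Accept one byte, as if written by a single Z80 OUT instruction."""
        self._pending.append(byte & 0xFF)
        opcode = self._pending[0]
        needed = COMMAND_LENGTH.get(opcode, 1)  # unknown opcodes are one byte long
        if len(self._pending) < needed:
            return  # command not complete yet
        args = self._pending[1:]
        self._pending = []
        if opcode == CMD_SET_PIXEL:
            x, y, colour = args
            if x < WIDTH and y < HEIGHT:  # silently ignore off-screen writes
                self.framebuffer[y * WIDTH + x] = colour
        elif opcode == CMD_CLEAR:
            self.framebuffer[:] = bytes([args[0]]) * len(self.framebuffer)
        # any other opcode is dropped

port = CommandPort()
for b in (CMD_CLEAR, 7):
    port.out(b)
for b in (CMD_SET_PIXEL, 3, 2, 200):
    port.out(b)
print(port.framebuffer[2 * WIDTH + 3], port.framebuffer[0])
```

With a scheme like this, clearing the screen costs the Z80 two OUTs rather than one write per pixel, which is the main attraction over pushing the whole frame buffer across the bus.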