Although:
// generate sync signals (active low for 640x480)
assign hsync = ~((h_count >= HS_STA) & (h_count < HS_END));
assign vsync = ~((v_count >= VS_STA) & (v_count < VS_END));
is absolutely correct, the >= and < comparisons do eat up gates when compiled, and an output driving a pin will be an unregistered combinational logic equation derived directly from your counter bits. To get a clean output and a better FMAX, giving Quartus's Fitter more freedom to route the logic in your FPGA and meet timing constraints, I would:
A) Make hsync and vsync registers.
B) Inside your if (pix_enable), make the logic look like this:
// active low, same polarity as the assign version above
if (h_count == HS_STA) hsync <= 0;
else if (h_count == HS_END) hsync <= 1;
if (v_count == VS_STA) vsync <= 0;
else if (v_count == VS_END) vsync <= 1;
Remember, an == comparison of two 10-bit numbers needs only 10 XNOR gates feeding a 10-input AND gate. With your greater-than-or-equal and less-than comparisons, because of the fixed constants you chose, the logic may compile down to a small number of gates with a high maximum clock frequency (FMAX). But once you change all your 'localparam's into registers which your CPU can address and change, including default 'reset' values, the >= and < will eat up gates and lower your maximum possible FMAX. (You may only be thinking about 25 MHz now. However, if you want your core to run at 200 MHz with your pix_enable pulsing once every 8 clocks, which is still a 25 MHz pixel rate, such an optimization will probably be needed further down the road, and your 8-bit CPU can do any required math itself when it writes these new registers.)
*Being registers, those outputs are now delayed by 1 pixel clock, but they will be cleanly timed to the rising edge of your FPGA's internal PLL clock: the DFF register outputs don't have any logic gates to go through and feed the IO pin directly. (Though, I usually add another DFF stage/pipe to all my DAC and sync outputs. This allows the compiler/fitter to bury the logic anywhere deep in the FPGA, then use each pin's macrocell flip-flop to drive the pin itself, generating the cleanest possible outputs.)
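To put A), B) and the extra output stage in one place, here is a rough sketch, not a drop-in replacement. The module name, the port list, the rst input and the default parameter values are my assumptions (the defaults are placeholders, so use your own HS_STA/HS_END/VS_STA/VS_END). The syncs are kept active low to match the assign statements quoted at the top.

```verilog
// Sketch only: names, ports and default values are assumptions, not your code.
module sync_gen_sketch #(
    parameter [9:0] HS_STA = 10'd16,   // placeholder, use your own value
    parameter [9:0] HS_END = 10'd112,  // placeholder
    parameter [9:0] VS_STA = 10'd490,  // placeholder
    parameter [9:0] VS_END = 10'd492   // placeholder
) (
    input  wire       clk,         // FPGA internal (PLL) clock
    input  wire       rst,         // synchronous reset, acts even if pix_enable is low
    input  wire       pix_enable,  // high for one clk per pixel
    input  wire [9:0] h_count,
    input  wire [9:0] v_count,
    output reg        hsync,       // active low, registered right at the output
    output reg        vsync        // active low, registered right at the output
);
    reg hsync_int;  // first stage: may be placed anywhere in the fabric
    reg vsync_int;

    always @(posedge clk) begin
        if (rst) begin
            hsync_int <= 1'b1;  // idle (inactive) level
            vsync_int <= 1'b1;
        end else if (pix_enable) begin
            // equality compares only: set at the start, clear at the end
            if (h_count == HS_STA)      hsync_int <= 1'b0;
            else if (h_count == HS_END) hsync_int <= 1'b1;

            if (v_count == VS_STA)      vsync_int <= 1'b0;
            else if (v_count == VS_END) vsync_int <= 1'b1;
        end

        // second stage: no logic between this flip-flop and the pin
        hsync <= hsync_int;
        vsync <= vsync_int;
    end
endmodule
```

The second stage costs one more clk of delay. If you add it to the syncs, add the same stage to your RGB outputs so everything stays aligned.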
As for your assigns for x and y, again, doing all that realtime arithmetic is not a problem in itself:
// keep x and y bound within the display area
assign x = (h_count < DA_STA) ? 0 : (h_count - DA_STA); // x is zero if current pixel is before display area
assign y = (v_count >= VA_END) ? (VA_END - 1) : (v_count); // y is VA_END-1 if v_count is outside display area
However, like the hsync and vsync, I personally make these outputs single-bit flags which turn on and off using the same simple == trick I used to generate hsync and vsync. I then rely on my video graphics generator to have its own address generator, which may increment its count on every line, every second line, or every third or fourth line.
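A minimal sketch of such a 1-bit enable, again only as an illustration. The names (h_ena, v_ena, video_ena, the H_ON/H_OFF/V_ON/V_OFF parameters) are made up by me, and the default values assume the active area sits at counts 0 to 639 and 0 to 479, which is not how your current counters are laid out.

```verilog
// Sketch only: all names and default values here are assumptions.
module video_ena_sketch #(
    parameter [9:0] H_ON  = 10'd0,    // column where the active area starts
    parameter [9:0] H_OFF = 10'd640,  // first column after the active area
    parameter [9:0] V_ON  = 10'd0,    // line where the active area starts
    parameter [9:0] V_OFF = 10'd480   // first line after the active area
) (
    input  wire       clk,
    input  wire       rst,
    input  wire       pix_enable,
    input  wire [9:0] h_count,
    input  wire [9:0] v_count,
    output wire       video_ena      // high only inside the active area
);
    reg h_ena;
    reg v_ena;

    always @(posedge clk) begin
        if (rst) begin
            h_ena <= 1'b0;
            v_ena <= 1'b0;
        end else if (pix_enable) begin
            // same equality trick as the syncs: one compare on, one compare off
            if (h_count == H_ON)       h_ena <= 1'b1;
            else if (h_count == H_OFF) h_ena <= 1'b0;

            if (v_count == V_ON)       v_ena <= 1'b1;
            else if (v_count == V_OFF) v_ena <= 1'b0;
        end
    end

    // one AND gate fed by two flip-flops; register it again if it drives a pin
    assign video_ena = h_ena & v_ena;
endmodule
```

Like the registered syncs, the flags change one pixel after the counter hits the compare value, so the enable and the syncs stay lined up with each other. Extra window or sprite flags would just be more copies of this with different on/off values.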
Also, that one master image X&Y output enable is REQUIRED by all DVI transmitters, and it may also be used as an image crop mute to protect the video output borders for analog VGA DACs (outside the active video area, the RGB level must always be black). On top of it, I would usually have something like another 16 of these programmable X&Y on/off output flags, which may be used to position sprites or superimposed hardware video windows. Some of these 16 may be positioned after the active video region, allotting a time slot for memory access cycles which fill your audio DAC buffer, or signalling a load of a new set of X&Y video memory pointers from memory. Those pointers change the beginning video read address of the next line, allowing you to dynamically select a new video address and video mode on each new line. (E.g. like the Atari 800 display list, which allows for a few lines of text mode, then a few with graphics, then more text at a different resolution, or Amiga pulldown screens with a vertically scrollable window on top of another, each with a different graphics mode pointing somewhere different in memory.)
Also, in my designs, I call the 'pix_enable' pclk_ena, or pena, and I expect the reset to take effect even if pena is held low.
If you still want to use your h_count and v_count directly as reference pixel counters and get rid of the ensuing subtraction math to improve FMAX, I would move your hsync and vsync to the end of your counters, so that 0 through 639 and 0 through 479 are passed through without any math while the blank front & back porches with the syncs come at the end, starting at pixel 640 and line 480. I personally still prefer the 1-bit enables, which can actually tell your downstream logic when to start and stop reading, or when there is free time to read memory for other purposes. With your current code, your logic will keep reading a clamped address (x = 0 before the display area, y = VA_END-1 after it) whenever no video is being output.
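For the counter layout with the active area first, a sketch could look like the following. The module name, ports and the H_TOTAL/V_TOTAL parameters are my assumptions. The defaults use the usual 640x480 at 60 Hz totals of 800 clocks per line and 525 lines per frame.

```verilog
// Sketch only: names and default totals are assumptions.
// Assumed layout: 0..639 active, then front porch, sync, back porch up to 799;
// lines 0..479 active, then front porch, sync, back porch up to 524.
module pix_counters_sketch #(
    parameter [9:0] H_TOTAL = 10'd800,  // pixel clocks per line
    parameter [9:0] V_TOTAL = 10'd525   // lines per frame
) (
    input  wire       clk,
    input  wire       rst,         // acts even if pix_enable is low
    input  wire       pix_enable,
    output reg  [9:0] h_count,     // equals x while inside the active area
    output reg  [9:0] v_count      // equals y while inside the active area
);
    always @(posedge clk) begin
        if (rst) begin
            h_count <= 10'd0;
            v_count <= 10'd0;
        end else if (pix_enable) begin
            if (h_count == H_TOTAL - 10'd1) begin
                h_count <= 10'd0;
                // end of line: step the line counter, wrap at end of frame
                if (v_count == V_TOTAL - 10'd1) v_count <= 10'd0;
                else                            v_count <= v_count + 10'd1;
            end else begin
                h_count <= h_count + 10'd1;
            end
        end
    end
endmodule
```

With this layout the sync compare values move to the end of the line and frame (for the standard timing, hsync would run from 656 to 752 and vsync from 490 to 492), and x and y need no subtraction at all.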