OK, in Verilog you do not need to bother with the 'LPM_SHIFT_REG'.
At least not in SystemVerilog, but I gave it a try.
Here is what you have when feeding the outputs (I'm guessing, since you haven't posted your latest code...):
clk_out_h <= sync_inl_reg ? counter_dl[4] : counter[4];
clk_out_l <= counter[4];
Here is what I did:
input wire load_phase, // make high when loading the new phase
input wire [3:0] phase_in, // 0 through 15 phase positions
output reg sync_out // buffered H-sync output
...........
reg [3:0] phase_reg;
reg [31:0] clk_shift_h, clk_shift_l, sync_shift;
..........
if (load_phase) phase_reg <= phase_in;
//clk_out_h <= sync_inl_reg ? counter_dl[4] : counter[4]; // directly send output clock
//clk_out_l <= counter[4];
clk_shift_h[0] <= sync_inl_reg ? counter_dl[4] : counter[4]; // load output clock into beginning of shift register
clk_shift_l[0] <= counter[4];
clk_shift_h[31:1] <= clk_shift_h[30:0]; // Shift the shift register
clk_shift_l[31:1] <= clk_shift_l[30:0];
clk_out_h <= clk_shift_h[phase_reg[3:0]*2]; // use the loaded phase_reg to select every second tap (0, 2, 4 ... 30) on the shift register
clk_out_l <= clk_shift_l[phase_reg[3:0]*2];
sync_out <= sync_inh_reg; // pass the sync input to the output.
I tried this code and Quartus reported an FMAX of 256MHz, but you need 336MHz. So do not bother adding the phase shift to the soft-PLL code. Only add:
output reg sync_out
....
sync_out <= sync_inh_reg;
For your video sampling section, do not tie its H-sync reference to the h_sync input pin; tie it to this module's 'sync_out' reg.
I would recommend running your sampling code at 84MHz or 168MHz, i.e. configuring your second PLL to generate that frequency from the 10.5MHz (84MHz = 8 x 10.5MHz, 168MHz = 16 x 10.5MHz). Then, when you grab a pixel, grab on every 8th or every 16th clock. Your sample phase will be 3 or 4 bits, selecting which one of those 8 or 16 clocks you use to take in a pixel.
It should be no more complicated than this (168MHz example):
input wire [3:0] phase_selection,
input wire set_phase_selection,
output reg test_sample_point,
.....
reg [3:0] current_phase, phase_selection_reg;
always @ (posedge clk168m) begin
last_sync_in <= h_sync_in;
if (set_phase_selection) phase_selection_reg <= phase_selection;
if ( ~h_sync_in && last_sync_in ) begin // falling edge of H-Sync has been received
current_phase <= 0; // clear the phase
cord_x <= 0; // clear the video raster X coordinate
if ( ~v_sync_in ) cord_y <= 0; // clear the video raster Y coordinate
else cord_y <= cord_y + 1; // increment the raster Y coordinate
end else begin // not an hsync pulse, a video line
current_phase <= current_phase + 1;
if (current_phase == phase_selection_reg) begin // time to sample a pixel
//write input pixel to dual port ram at cord_x //
cord_x <= cord_x + 1;
test_sample_point <= 1; // pulse the test sample point
end else test_sample_point <= 0; // clear the pulsed test sample point
end // end of video line
end // always @ posedge
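(84MHz is the same, except that current_phase & phase_selection will be 3 bits instead of 4 bits. The test output pulse will also be one 84MHz clock wide, about 11.9ns, which looks like the high half of a 42MHz square wave, instead of one 168MHz clock wide, about 5.95ns, which looks like the high half of an 84MHz square wave.)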
Building this code now without the RAM, with just test_sample_point feeding an output pin, would reveal a tiny 84MHz pulse once per pixel, which you can move left and right across 16 positions depending on the 'phase_selection'.
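As a rough starting point, the RAM-less test build could look something like the sketch below. The module name 'sample_phase_test', the PHASE_BITS parameter and the testbench are my own placeholders, not something from your project. The one deliberate difference from the snippet above is that the test pulse is also cleared on the H-sync edge, so it can never be left stuck high.

```verilog
// Sketch only: stand-alone test version of the sampler, no RAM, no cord_x/cord_y.
// PHASE_BITS = 4 for a 168MHz clock (16 positions), 3 for 84MHz (8 positions).
module sample_phase_test #(
    parameter PHASE_BITS = 4                          // assumed parameter name
) (
    input  wire                  clk,                 // 168MHz (or 84MHz) from the second PLL
    input  wire                  h_sync_in,           // active-low H-sync, ideally the soft-PLL 'sync_out'
    input  wire [PHASE_BITS-1:0] phase_selection,
    input  wire                  set_phase_selection,
    output reg                   test_sample_point
);
    reg                  last_sync_in        = 1'b1;
    reg [PHASE_BITS-1:0] current_phase       = 0;
    reg [PHASE_BITS-1:0] phase_selection_reg = 0;

    always @(posedge clk) begin
        last_sync_in <= h_sync_in;
        if (set_phase_selection) phase_selection_reg <= phase_selection;

        if (~h_sync_in && last_sync_in) begin         // falling edge of H-sync
            current_phase     <= 0;                   // restart the phase counter
            test_sample_point <= 1'b0;                // differs from the snippet above: force the pulse low
        end else begin
            current_phase     <= current_phase + 1'b1; // wraps every 2**PHASE_BITS clocks
            test_sample_point <= (current_phase == phase_selection_reg); // 1-clock pulse per pixel
        end
    end
endmodule

// Minimal simulation harness, also hypothetical, just to watch the pulse in a waveform viewer.
`timescale 1ns/1ps
module sample_phase_test_tb;
    reg        clk    = 1'b0;
    reg        h_sync = 1'b1;
    reg        set    = 1'b0;
    reg  [3:0] phase  = 4'd5;
    wire       pulse;

    sample_phase_test #(.PHASE_BITS(4)) dut (
        .clk(clk), .h_sync_in(h_sync), .phase_selection(phase),
        .set_phase_selection(set), .test_sample_point(pulse)
    );

    always #2.976 clk = ~clk;                         // roughly 168MHz

    initial begin
        @(posedge clk) set    <= 1'b1;                // load the phase
        @(posedge clk) set    <= 1'b0;
        @(posedge clk) h_sync <= 1'b0;                // H-sync goes low
        repeat (8)  @(posedge clk);
        h_sync <= 1'b1;
        repeat (64) @(posedge clk);                   // a few pixels' worth of clocks
        $finish;
    end
endmodule
```

Changing 'phase' in the testbench (or 'phase_selection' on hardware) should slide the pulse by one clock per step.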
If you need to test and do not currently have an effective way of driving the 'phase_selection' with a number, and your dev board has an rs232 port, you may use my 'SYNC_RS232_UART.v' here:
https://www.eevblog.com/forum/fpga/verilog-rs232-uart-and-rs232-debugger-source-code-and-educational-tutorial/
You may also use the full RS232_debugger if you want to real-time read 4 byte ports, write 4 byte ports, and address read/write memory.
(Warning: some FTDI USB-RS232 converters interfere with the Quartus USB Blaster. You just need to close the COM port before using the programmer, then you can re-open the COM port.)
Note: if the test pulse jitters left & right, it's because of the way we are reading and converting the incoming hsync and how the clock is being generated. The fix will be to make a 10.5MHz output on the second PLL as well as the 168MHz one, and in your code first latch the sync into a register with:
always @ (posedge clk10_5) begin
h_sync_clk10_latch <= h_sync_in;
end
Then, in the main code section, replace h_sync_in with h_sync_clk10_latch. This should clean up the h_sync further. If not, you may need to change the (posedge clk10_5) to a (negedge clk10_5).
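Put together, the re-latch plus the edge detect might look roughly like this. It is only a sketch: the module name 'hsync_relatch' and the 'h_sync_fall' output are made up for illustration, and it assumes clk10_5 and clk168m both come from the same PLL so their edges keep a fixed phase relationship.

```verilog
// Sketch only: re-time H-sync on the 10.5MHz PLL output before the 168MHz logic sees it.
module hsync_relatch (
    input  wire clk10_5,          // 10.5MHz output of the second PLL (assumed)
    input  wire clk168m,          // 168MHz output of the same PLL (assumed)
    input  wire h_sync_in,        // active-low H-sync
    output reg  h_sync_fall       // hypothetical: one clk168m-wide pulse on the falling edge
);
    reg h_sync_clk10_latch = 1'b1;
    reg last_sync_in       = 1'b1;

    // Latch the sync once per 10.5MHz cycle.
    // If the test pulse still jitters, try negedge clk10_5 here instead.
    always @(posedge clk10_5) begin
        h_sync_clk10_latch <= h_sync_in;
    end

    // Same falling-edge detect as in the main code, now fed from the latched sync.
    always @(posedge clk168m) begin
        last_sync_in <= h_sync_clk10_latch;
        h_sync_fall  <= ~h_sync_clk10_latch && last_sync_in;
    end
endmodule
```

In the main always block you would not need the separate module at all; the point is only that everything on the 168MHz side looks at h_sync_clk10_latch and never at the raw pin.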