Well, using what I have listed above, say in channel B we generate an 8+8+8+8 bit ramp from 0 to 255.
In channel A, we have a bitmap 1024 pixels wide x 1024 pixels tall, read one line at a time.
We can loop B to a limit of 1024 and increment A normally for a total of 1048576 iterations, in multiply 8x8 mode, with the result right shifted by 8. With this, we would have applied a gradient to our source bitmap, dark on the left to bright on the right.
Changing the source (B) inc step size integer to 0 and the source (B) inc step size fractional to 1/1024, the same operation would generate a vertical gradient, dark at the top of the image to bright at the bottom.
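As a rough software sketch of the two gradient passes above (not the actual HDL): `multiply_pass`, the tiny 8x8 test image and the ramp table with one entry per column are all made up for illustration, and the fractional step is modelled with Python's `Fraction` instead of a real fixed-point accumulator.

```python
from fractions import Fraction

def multiply_pass(src_a, table_b, total, b_step, b_limit, shift=8):
    """Multiply each A sample by the current B entry, then right shift."""
    out = []
    b_pos = Fraction(0)              # integer + fractional B address
    for i in range(total):
        b_val = table_b[int(b_pos)]  # only the integer part addresses the table
        out.append((src_a[i] * b_val) >> shift)
        b_pos += b_step
        if b_pos >= b_limit:         # loop B back to its start
            b_pos -= b_limit
    return out

SIZE = 8                             # stand-in for 1024 (assumption)
image = [255] * (SIZE * SIZE)        # flat white test bitmap, row-major
ramp = [(x * 255) // (SIZE - 1) for x in range(SIZE)]  # 0..255 ramp

# B steps by 1 per pixel and loops every row: left dark, right bright.
horizontal = multiply_pass(image, ramp, SIZE * SIZE, Fraction(1), SIZE)

# B steps by 1/SIZE per pixel, so it advances once per row: top dark, bottom bright.
vertical = multiply_pass(image, ramp, SIZE * SIZE, Fraction(1, SIZE), SIZE)
```

With the real 1024x1024 bitmap you would swap `SIZE` for 1024 and the total becomes 1048576 iterations.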
Or, do not use function B: change the source (A) inc step size integer to 0 and the source (A) inc step size fractional to 1/3, change the total iterations to 1048576*3, and we would stretch the image by 3x. Since this is linear, you could also change the bit depths to 16 and say we resampled an audio sample to 3x its length. Change the source (A) inc step size integer to 3 and we can say we shrink the sample to 1/3 of its length.
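To picture the stretch and shrink cases in software, here is a minimal sketch assuming the A address is simply an integer + fraction accumulator and only the integer part picks the sample (so no interpolation yet). `resample` and the sample values are hypothetical, just for illustration.

```python
from fractions import Fraction

def resample(src, step, total):
    """Read src with an integer + fractional address step, no interpolation."""
    out = []
    pos = Fraction(0)
    for _ in range(total):
        out.append(src[int(pos)])    # integer part of the address selects the sample
        pos += step
    return out

samples = [10, 20, 30, 40, 50, 60]

# Step integer 0, fractional 1/3, 3x the iterations: each sample is held 3 times.
stretched = resample(samples, Fraction(1, 3), len(samples) * 3)

# Step integer 3, fractional 0, 1/3 of the iterations: every 3rd sample is kept.
shrunk = resample(samples, Fraction(3), len(samples) // 3)
```

Here `stretched` starts 10, 10, 10, 20, 20, 20 and `shrunk` comes out as 10, 40.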
Performing such a task 3 times, each with a different starting offset, with A being the just resampled data beginning at an offset of 1 and B being the previously computed data, then summing them together and left shifting the results by 3, would interpolate the results.
You can mix and multiply against a table holding one period of a sine wave, and loop, processing the results with floats, to perform complex filters over large sums of data, as input B can be a circular table or even a circular matrix for convolution filtering. Since we can mix floating and integer sources and destinations, you can do processing like FFTs on source data, even 2D ones with multiple passes.
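A minimal float sketch of the circular table idea, assuming B just wraps around a table holding one sine period while A runs straight through the data. The names `mix_with_circular_table`, `PERIOD` and the constant test signal are made up for illustration, not part of the design.

```python
import math

def mix_with_circular_table(src, table, table_step=1):
    """Multiply each source sample by the next entry of a looping table."""
    out = []
    idx = 0
    for s in src:
        out.append(s * table[idx])
        idx = (idx + table_step) % len(table)  # B wraps around its table
    return out

PERIOD = 8                                     # table length (assumption)
sine_table = [math.sin(2 * math.pi * k / PERIOD) for k in range(PERIOD)]

signal = [1.0] * 20                            # constant input, so the output is the sine itself
mixed = mix_with_circular_table(signal, sine_table)
```

Changing `table_step` changes the frequency you are mixing with, the same way the source (B) inc step size would.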
I know modern DSPs can do much of this in 1 pass, as they have lots of cache for 2D matrices each clock cycle, but we only have so much room left in this FPGA, so multipass with 1D, 2 point matrices will be what we are stuck with.