The biggest question I see with digital filters is whether we run them in real time or on post-captured data.
The DSP48E1 blocks in the Artix-7 fabric are usable up to around 300MHz in the standard speed grade; assume 250MHz to be safe. Each block does a MAC operation with a 48-bit result, so figure one tap per DSP block (the pre-adder can pack two taps of a symmetric filter into one block, though that's harder to exploit in a parallel implementation). We only need 12 or 14-bit result data, and IIRC the blocks have configurable rounding logic, so we can probably truncate results with little harm.
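To make the symmetric-filter point concrete, here's a software sketch (not the actual HDL, and the function name is mine) of the pre-adder trick: two samples that share a coefficient are added first, so one multiplier covers two taps.

```python
import numpy as np

def symmetric_fir_folded(x, coeffs):
    """Even-length symmetric FIR using N/2 multiplies per output,
    mirroring the DSP48E1 pre-adder: y += c[i] * (x[n+i] + x[n+N-1-i])."""
    n_taps = len(coeffs)
    assert n_taps % 2 == 0
    assert np.allclose(coeffs, coeffs[::-1]), "coefficients must be symmetric"
    half = coeffs[:n_taps // 2]
    y = np.zeros(len(x) - n_taps + 1)
    for n in range(len(y)):
        window = x[n:n + n_taps]
        # pre-add the two samples sharing a coefficient, then one MAC each
        y[n] = np.dot(half, window[:n_taps // 2] + window[::-1][:n_taps // 2])
    return y

# Cross-check against a plain direct-form convolution
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
c = np.array([1.0, 3.0, 3.0, 1.0]) / 8  # illustrative symmetric taps
ref = np.convolve(x, c, mode="valid")
assert np.allclose(symmetric_fir_folded(x, c), ref)
```

The inner dot product is what a chain of DSP slices would compute, one pre-add plus one MAC per slice per cycle.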
To process 1GSa/s raw data we would need to run four parallel streams of DSP blocks in ~40-tap filter chains, which would add a lot of logic complexity, and I don't know exactly how that would work when switching to multichannel modes; there's a data-dependency headache to be resolved.
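A rough software model (assumed, not from the design) of the four-stream idea: each stream computes every fourth output of the same ~40-tap filter, so each runs at a quarter of the sample rate. Note that every stream's window reaches into samples held by neighbouring streams, which is exactly the data dependency that gets awkward.

```python
import numpy as np

def parallel_fir(x, coeffs, n_phases=4):
    """Filter x by computing every n_phases-th output in a separate
    'stream', as a 4x-parallel hardware implementation would."""
    n_taps = len(coeffs)
    n_out = len(x) - n_taps + 1
    y = np.zeros(n_out)
    for p in range(n_phases):                # one loop body per hardware stream
        for n in range(p, n_out, n_phases):  # the outputs this stream produces
            # the window spans samples belonging to the other streams too
            y[n] = np.dot(coeffs, x[n:n + n_taps][::-1])
    return y

rng = np.random.default_rng(1)
x = rng.standard_normal(4096)
c = rng.standard_normal(40)  # ~40-tap filter as in the text
assert np.allclose(parallel_fir(x, c), np.convolve(x, c, mode="valid"))
```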
Whereas if filtering were done as post-processing, assuming 1232 points (the 100ns/div setting) at 50k waves/sec, that's only 61.6 MSa/s to process. That's comfortable to do after capture by reading back from the RAM and writing into another buffer before rendering (once the 32-bit interface is in place so we have sufficient bandwidth to do this; or using an FPGA-side MIG with a small external RAM, the exact architecture needs to be worked out.)
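Quick sanity check on that rate, using the figures from the text (1232-point records at 50k waves/sec):

```python
# Post-capture processing rate: points per waveform times waveform rate
points_per_wave = 1232
waves_per_sec = 50_000
samples_per_sec = points_per_wave * waves_per_sec
print(samples_per_sec / 1e6)  # -> 61.6 (MSa/s), far below one DSP chain at 250MHz
```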
In raw bandwidth terms there's enough DSP in a 7020 to do ~200-tap filters at around 250 kwaves/sec, or 400 taps at 125 kwaves/sec, or even 2000 taps at 25 kwaves/sec, although memory bandwidth might start being an issue if samples are stored in an unfavourable order, so that needs to be considered. How much correction could you do with a 200-tap filter on post-captured data? And how much benefit would there be in going up to something like a 7030, with 400 DSP slices and the faster Kintex fabric, or to an UltraScale part?
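The back-of-envelope budget behind those numbers, under the assumptions above (220 DSP48E1 slices in a 7020, 250MHz, one tap per slice, 1232-point records); it lands slightly under the round figures quoted, which is fine for an estimate:

```python
# Available MAC throughput vs. filter cost per waveform (all assumed figures)
dsp_slices = 220        # DSP48E1 count in a Zynq-7020
f_max_hz = 250e6        # conservative DSP clock
points = 1232           # record length per waveform
macs_per_sec = dsp_slices * f_max_hz  # 5.5e10 MAC/s available

def max_waves_per_sec(taps):
    # each output sample costs `taps` MACs
    return macs_per_sec / (taps * points)

for taps in (200, 400, 2000):
    print(taps, "taps:", round(max_waves_per_sec(taps) / 1e3), "kwaves/sec")
# roughly 223 / 112 / 22 kwaves/sec, in line with the ~250/125/25 above
```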