Using a band-pass filter is a good idea. I would also recommend adding TVS protection on the antenna line. Modules on the market usually include neither of these components.
To lower the cost and still limit out-of-band emissions, you can use a narrowband PCB or chip antenna (no narrower than the band you need) and arrange the impedance-matching components as a low-pass filter. You can also arrange them as a high-pass filter if you need to suppress lower frequencies. Which one fits depends on your antenna design. A dedicated chip filter is of course the surest option.
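As a rough illustration of the matching components doubling as a low-pass filter, here is a minimal Python sketch of a two-element L network (series inductor, shunt capacitor). It assumes a purely resistive antenna impedance below 50 ohms and a 2.45 GHz design frequency. Both are my assumptions for the example, not measured values, and a real antenna will have a reactive part that you need to measure and tune out. The function names are made up for this sketch.

```python
import math


def lowpass_l_match(r_low, r_high, freq_hz):
    """Return (L in henries, C in farads) for a low-pass L network.

    The series inductor sits on the low-resistance side (assumed here to be
    the antenna) and the shunt capacitor on the high-resistance side
    (assumed to be the 50 ohm port). Resistive terminations only.
    """
    if not 0 < r_low < r_high:
        raise ValueError("need 0 < r_low < r_high")
    q = math.sqrt(r_high / r_low - 1.0)   # loaded Q fixed by the ratio
    x_series = q * r_low                  # inductive reactance, ohms
    x_shunt = r_high / q                  # capacitive reactance, ohms
    w = 2.0 * math.pi * freq_hz
    return x_series / w, 1.0 / (w * x_shunt)


def input_impedance(r_low, l, c, freq_hz):
    """Impedance seen from the high-resistance port: C in parallel with (L + r_low)."""
    w = 2.0 * math.pi * freq_hz
    z_branch = complex(r_low, w * l)
    return 1.0 / (1j * w * c + 1.0 / z_branch)


if __name__ == "__main__":
    # Hypothetical example: 25 ohm antenna matched to 50 ohms at 2.45 GHz.
    l, c = lowpass_l_match(25.0, 50.0, 2.45e9)
    print(f"L = {l * 1e9:.2f} nH, C = {c * 1e12:.2f} pF")
    print("Zin =", input_impedance(25.0, l, c, 2.45e9))
```

Swapping the element types (series capacitor, shunt inductor) gives the high-pass variant. Either way, treat the computed values as a starting point and trim them on the board with a network analyzer.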
ESD test pulses have sub-nanosecond rise times, and the antenna is a good entry point for such a signal. After all, its job is to receive GHz-range signals. Although the modules on the market carry many certifications, they have no explicit protection against this. I have doubts about using a low-cost Chinese module in a promising application, so making your own design is not a bad thing in terms of reliability.
Flash memory was the first reason for me as well when I was making my project. There are many TCP/IP libraries in the ESP-IDF tree, and you see the opportunity to use many client-server protocols, remote firmware updates, data storage, etc. But with 4 MB of flash you can only fit the basics.
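To see how quickly 4 MB runs out, here is a small Python sketch of the flash budget when you want remote firmware updates (two application slots) plus a data storage partition. The reserved and storage sizes are illustrative assumptions, not my actual layout, and the function name is made up. The 64 KiB alignment reflects the ESP-IDF requirement that app partitions start at offsets aligned to 0x10000.

```python
def ota_slot_size(flash_bytes, reserved_bytes, storage_bytes,
                  slots=2, align=0x10000):
    """Largest equal app-slot size that fits, rounded down to `align`.

    reserved_bytes: assumed space for bootloader, partition table, NVS, etc.
    storage_bytes:  assumed size of a data partition (filesystem, logs, ...).
    """
    available = flash_bytes - reserved_bytes - storage_bytes
    if available <= 0:
        raise ValueError("nothing left for application slots")
    per_slot = available // slots
    return per_slot - per_slot % align


if __name__ == "__main__":
    # Hypothetical layout: 4 MiB flash, 64 KiB reserved, 1 MiB data storage.
    size = ota_slot_size(4 * 1024 * 1024, 0x10000, 0x100000)
    print(f"max app size per OTA slot: {size // 1024} KiB")
```

With these assumed numbers each firmware image has to stay under roughly 1.4 MiB, which gets tight once Wi-Fi, TLS and a few protocol clients are linked in.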