Then you arrive at the same problem as those engineers trying to make an LP filter for noise suppression: leakage, dielectric absorption and temperature dependence of the capacitor(s).
I can only second that. I have been there and tried this and that until I finally gave up, did the maths and realized that one relies either on the stability of some resistor (or capacitor) ratios or on the stability of a filtering capacitor. Relying on either is a fallacy, but let me explain.
While the white noise component of a reference can be filtered fairly well, once you get into 1/f territory you will hit a brick wall. To understand this, one needs to look at the spectral properties of 1/f noise. I have attached a little simulation that shows 1/f noise in the time domain, its PSD and the Allan deviation (I will get to that later).
The time domain plot and the PSD are likely the plots people are most familiar with. The PSD already shows what is happening. As you go to lower frequencies, the noise power density increases as 1/f (no sh** Sherlock). At the same time (pun intended) the settling time of any filter goes with 1/f as well. Since 1/f noise carries the same power in every decade of frequency, each decade you cut off with a lower filter corner is paid for with ten times the settling time, and over that longer wait the next lower decade shows up with the same power. So you will always have the same noise power at your output. Simply speaking, the filter will never settle. This brings me to the adev plot. It beautifully sums it all up. No matter how long I wait, the deviation will be the same.
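For anyone who wants to reproduce the flat Allan deviation, here is a minimal Python sketch. It is not the attached simulation. The 1/f noise is only approximated by shaping white noise in the frequency domain, and the helper names (flicker_noise, allan_deviation), the record length and the window sizes are my own assumptions.

```python
import numpy as np

def flicker_noise(n, rng):
    """Approximate 1/f noise by shaping white Gaussian noise in the frequency domain."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]             # placeholder to avoid dividing by zero at DC
    spectrum /= np.sqrt(freqs)      # amplitude ~ f^-1/2, so power ~ 1/f
    spectrum[0] = 0.0               # drop the DC term
    x = np.fft.irfft(spectrum, n)
    return x / x.std()              # normalise to unit RMS

def allan_deviation(x, m):
    """Non-overlapping Allan deviation for an averaging window of m samples."""
    k = len(x) // m
    means = x[:k * m].reshape(k, m).mean(axis=1)
    return np.sqrt(0.5 * np.mean(np.diff(means) ** 2))

rng = np.random.default_rng(0)
n = 2 ** 18                         # assumed record length
white = rng.standard_normal(n)
flicker = flicker_noise(n, rng)

windows = [16, 64, 256, 1024, 4096]  # averaging windows in samples (assumed)
adev_white = [allan_deviation(white, m) for m in windows]
adev_flicker = [allan_deviation(flicker, m) for m in windows]
for m, w, f in zip(windows, adev_white, adev_flicker):
    print(f"window={m:5d}  white={w:.4f}  flicker={f:.4f}")
```

The white column should drop by roughly a factor of two for every fourfold increase of the window, while the flicker column should stay about the same, which is the "never settles" behaviour described above.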
Filtering at frequencies below the 1/f corner frequency makes matters worse. You will have the same noise power and, on top of that, there is no filter apart from maths that has the theoretical transfer function, i.e. is flat from DC to the corner frequency and then rolls off. Every real filter introduces its own 1/f noise at low frequencies. So the harder you try, the worse you will make it. This is simple physics or maths.
So is there a way to cheat fate? Yes and no. 1/f noise has an Achilles' heel: it is correlated. In other words, its past gives you some information about its future. This can be used by a predictive filter like a Kalman filter (https://en.wikipedia.org/wiki/Kalman_filter) to suppress it. The catch? It's all digital.
The only analogue thing you can do is to sum several references. This will get the noise floor down by a factor of sqrt(N), but only if the noise is uncorrelated. This is important to keep in mind, because as you start summing references you will invariably introduce correlation (like thermal EMF on the board, supply ripple, etc.), so you will hit a dead end there as well. Life is a bitch...
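To put rough numbers on that dead end, here is a small Python sketch. It averages N simulated references that each have independent noise plus one shared component. White Gaussian noise is used for both parts purely for simplicity, and the values sigma_own and sigma_common are made-up assumptions, not measurements.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 50_000
sigma_own = 1.0      # independent noise of each reference, arbitrary units (assumed)
sigma_common = 0.3   # noise shared by all references, e.g. supply ripple (assumed)

def averaged_noise(n_refs):
    """RMS noise of the average of n_refs references that share a common component."""
    own = sigma_own * rng.standard_normal((n_refs, n_samples))
    common = sigma_common * rng.standard_normal(n_samples)
    return (own + common).mean(axis=0).std()

for n_refs in (1, 4, 16, 64):
    ideal = sigma_own / np.sqrt(n_refs)   # what fully uncorrelated noise would give
    print(f"N={n_refs:2d}  ideal={ideal:.3f}  with common part={averaged_noise(n_refs):.3f}")
```

The independent part shrinks with 1/sqrt(N) as expected, but the result levels off near sigma_common, because averaging does nothing to noise that all references share.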
Basically, the only way to get rid of 1/f noise is to not introduce it in the first place. If you want low noise, pick the lowest noise Zeners (by hand) and sum like 4 of them. Then call it a day. It's all Mother Nature will give you. Ever!
Addendum: The stuff one adds, like caps, usually has some temperature dependence, which results in random-walk behaviour. A random walk has a 1/f² spectrum. So things will get out of hand pretty quickly.
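A quick way to see how fast it gets out of hand is to look at the Allan deviation of a random walk. The Python sketch below is a minimal illustration, assuming the drift can be modelled as a cumulative sum of white Gaussian steps. The helper name allan_deviation, the record length and the window sizes are again my own choices.

```python
import numpy as np

def allan_deviation(x, m):
    """Non-overlapping Allan deviation for an averaging window of m samples."""
    k = len(x) // m
    means = x[:k * m].reshape(k, m).mean(axis=1)
    return np.sqrt(0.5 * np.mean(np.diff(means) ** 2))

rng = np.random.default_rng(2)
steps = rng.standard_normal(2 ** 18)   # white steps, a stand-in for temperature-driven drift
walk = np.cumsum(steps)                # random walk, PSD ~ 1/f^2

windows = [16, 64, 256, 1024, 4096]    # averaging windows in samples (assumed)
adev_walk = [allan_deviation(walk, m) for m in windows]
for m, a in zip(windows, adev_walk):
    print(f"window={m:5d}  adev={a:8.2f}")
```

Here the deviation grows roughly with the square root of the window, so waiting longer does not merely fail to help as with 1/f noise, it actively makes the result worse.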