I wonder what that source is. Its linearity error seems to be at about the same level as that of the 3458A, since the two errors are probably uncorrelated.
Their tests showed quite some difference in INL when the reference was changed. This is not that surprising, as those capacitive ADCs react to changes in the reference and the input drivers. So while the chips have impressive INL specs, the complete setup, including drivers and layout, still needs to be tested.
It is interesting that the AZ amplifiers seem to be good enough to drive the ADCs. That is not obvious from the datasheets, and it is good to know that relatively simple solutions work (one op-amp rather than the two op-amp solution in the CERN HPM7177).
Notice that they did the INL tests at 62.5 kSPS rather than 1 MSPS. I would guess the AZ amps can't recover from the charge kickback quickly enough to get decent linearity at the higher sample rate. It would have been cool to see how they could do with a composite amp. Honestly, the thing that surprises me most in this article is that they were able to manage 200 ppb linearity error with 5k/1k25 attenuators. Presumably with some optimization and the use of a composite amp for the ADC driver, one could bring the INL down even more.
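To put rough numbers on the sample-rate point, here is a back-of-the-envelope sketch. It assumes the driver recovers from the kickback like a single-pole RC step, that the whole sample period is available for settling (real ADCs give you less, hence the hypothetical `acquisition_fraction` parameter), and that the settling residue maps directly to error, which is pessimistic because only the nonlinear part of the residue shows up as INL. None of these values come from the article.

```python
import math

def time_constants_needed(error_ppb):
    """Time constants for a single-pole step to settle within error_ppb of final value."""
    return math.log(1e9 / error_ppb)

def max_time_constant(sample_rate_sps, error_ppb, acquisition_fraction=1.0):
    """Largest driver time constant (seconds) that still settles within one window.

    acquisition_fraction is an assumed share of the sample period that is
    actually available for settling; 1.0 is the optimistic upper bound.
    """
    window = acquisition_fraction / sample_rate_sps
    return window / time_constants_needed(error_ppb)

# Compare the two sample rates mentioned above for a 200 ppb target.
for rate in (62.5e3, 1e6):
    tau = max_time_constant(rate, 200)
    print(f"{rate / 1e3:g} kSPS: tau must be below about {tau * 1e9:.0f} ns")
```

Under these assumptions, settling to 200 ppb takes about 15.4 time constants, and going from 62.5 kSPS to 1 MSPS cuts the allowed time constant by a factor of 16, which fits the guess that the AZ amps run out of time at the higher rate.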
As others have said, being able to digitize from a low-impedance source at 24+ bits is not the same as having a proper voltmeter. Even without current and resistance measurements, no commercial op amp has everything people ask of a 7.5 digit meter input amp, and the meter needs noise immunity, isolation, input protection, etc. Then again, not everything needs to be a voltmeter.
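For a sense of scale between "24+ bits" and "7.5 digits", a quick conversion sketch. The figure of roughly 20 million counts for a 7.5 digit meter is an assumed convention (manufacturers differ), and the 200 ppb number is the linearity figure quoted above.

```python
import math

def counts_to_bits(counts):
    """Equivalent number of bits for a given number of display counts."""
    return math.log2(counts)

def ppb_to_bits(ppb):
    """Equivalent number of bits for an error expressed in ppb of full scale."""
    return math.log2(1e9 / ppb)

# Assumption: a 7.5 digit meter resolves about 2e7 counts.
print(f"7.5 digits is roughly {counts_to_bits(2e7):.2f} bits")
print(f"200 ppb is roughly {ppb_to_bits(200):.2f} bits")
```

So the resolution of a 24 bit converter is in the same ballpark as a 7.5 digit display, while 200 ppb linearity corresponds to a bit over 22 bits. That says nothing about input impedance, protection or isolation, which is the point made above.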