@med. The reason I'm not falling down that particular rabbit hole is this: who is to say that reference A, B, C and so on is correct? As you can see from your own reference modules, they do vary by a few mV, but in the grand scheme of things they pretty much agree with each other, and so do your meters. I have also got myself 2 reference sources, and 1 of those also does current. When it comes to voltages, 1 is like yours with 2.5, 5, 7.5 and 10V; the other has 2 voltage ranges, 1 to 100mV and 1mV to 10V, plus current from 0.001mA to 24mA, and I'm able to increment them in 1mV or 1mA steps.
Now if I calibrate the volts against 1 reference device and then check using the other, they do not agree precisely; they are a few mV out, and to my mind that is A/ to be expected because of engineering tolerances, and B/ because they came from 2 different factories and so were set up against 2 different reference standards. That, to my mind, explains the differences between your Fluke 8800As and the Siglent.
That is why I've said that in most cases you will see voltages on service charts given to 1dp only and, as mentioned before, the odd 10mV or so is not a deal breaker on the type of work we do anyway. Where I see the extra resolution being a distinct advantage is not in being completely accurate down to the 4th or 5th dp or whatever, but in trend spotting, i.e. is a battery charging or discharging, where those extra digits will enable a trend to be spotted that much earlier.
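To put rough numbers on the trend-spotting point, here is a minimal sketch in Python. The figures are made up purely for illustration (a battery at 12.6000V dropping by 0.2mV per sample), and first_visible_change is a hypothetical helper, not anything from a real meter. It counts how many samples pass before a display with a given number of decimal places shows a different value.

```python
from decimal import Decimal, ROUND_HALF_UP

def first_visible_change(start_v, step_v, decimals, max_samples=10000):
    """Return the sample number at which a display rounded to `decimals`
    decimal places first differs from its starting value, or None."""
    quantum = Decimal(1).scaleb(-decimals)   # e.g. 0.1 for 1 dp
    start = Decimal(start_v)
    step = Decimal(step_v)
    shown_at_start = start.quantize(quantum, rounding=ROUND_HALF_UP)
    for n in range(1, max_samples + 1):
        shown = (start + n * step).quantize(quantum, rounding=ROUND_HALF_UP)
        if shown != shown_at_start:
            return n
    return None

# Hypothetical figures: 12.6000 V battery, falling 0.2 mV per sample.
for dp in (1, 3, 4):
    n = first_visible_change("12.6000", "-0.0002", dp)
    print(f"{dp} dp display first changes at sample {n}")
```

With those assumed figures the 4dp display moves on the very first sample, while the 1dp display sits on 12.6 for a couple of hundred samples before it shows anything at all.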
One has to be, in my mind, sensible about all of this volt nuttery business, because if you gave me one of your meters to check against my voltage reference, I would obtain a completely different set of figures to you. The only way of getting every meter to agree would be to set them up using the same reference device, and even then there would still be some differences, as the specs for each meter read +/-0.00x% + x digits, so there you already have meters coming off the same production line that could be slightly different on the least significant digits. I have just done a quick test on my collection of bench meters and they all agree from cold to within 1mV, and if I wanted to measure uV then I'd be sourcing a dedicated meter for that.
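As a rough worked example of how a "+/-(% of reading + digits)" spec turns into a band of acceptable readings, here is a small Python sketch. The spec figures used (0.005% + 2 digits on a display resolving 0.1mV) are assumptions picked for illustration, not the published spec of any particular meter, and both function names are hypothetical.

```python
def tolerance_band(reading, pct_of_reading, digits, resolution):
    """Return (low, high) limits for a +/-(pct% of reading + digits) spec.

    `resolution` is the value of one count of the last displayed digit,
    in the same unit as `reading`.
    """
    err = abs(reading) * pct_of_reading / 100.0 + digits * resolution
    return reading - err, reading + err

def readings_consistent(band_a, band_b):
    """True if the two tolerance bands overlap, i.e. both meters could be
    in spec while looking at the same true voltage."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]

# Assumed spec: +/-(0.005% of reading + 2 digits), 0.1 mV resolution.
meter_a = tolerance_band(10.0000, 0.005, 2, 0.0001)
meter_b = tolerance_band(10.0010, 0.005, 2, 0.0001)  # reads 1 mV higher
print(meter_a, meter_b, readings_consistent(meter_a, meter_b))
```

Under those assumed figures each meter is allowed roughly 0.7mV either side of its reading on 10V, so two meters that differ by 1mV can both still be within spec.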
You have to take a pragmatic view, otherwise you could be spending fortunes on gear and still not get all meters agreeing and their digits rolling over in harmony, and for what? What actual benefit does it give you that you can translate into a real positive end result on your work? I bet even that poor old Fluke that you decided to make a spares donor was right up to 3dp, with maybe 1 or 2 digits out.