Navigators dealt with this centuries ago in the days of mechanical chronometers. These were at sea for 3 years at a stretch and had to be consistent (and were, as proved by successful trade and naval maneuvers).
FWIW, these were so successful the USN did not retire mech chrons until GPS was perfected in 1989 even though they had first deployed quartz during WWII. The quartz did not hold up well at sea. Now of course, the USN would like some back.
All capital ships carried three chronometers. Carry one and if it went rogue, you were SOL. Carry two and if one went rogue, you were guessing. Carry three and if one went rogue, you knew which one was bad.
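For anyone who likes to see the rule of three spelled out as logic, here is a rough Python sketch. The function name odd_one_out and the tolerance parameter are my own inventions for illustration, and "agree" here simply means two readings differ by no more than a tolerance you pick yourself. It works the same for clock errors in seconds or meter readings in volts.

```python
def odd_one_out(readings, tolerance):
    """Return the index of the one reading that disagrees with the other two.

    Returns None when all three agree within `tolerance`.
    Raises ValueError when no single culprit can be identified.
    """
    if len(readings) != 3:
        raise ValueError("the rule of three needs exactly three readings")

    a, b, c = readings
    ab = abs(a - b) <= tolerance
    ac = abs(a - c) <= tolerance
    bc = abs(b - c) <= tolerance

    if ab and ac and bc:
        return None      # all three agree
    if ab and not ac and not bc:
        return 2         # first two agree, third is the rogue
    if ac and not ab and not bc:
        return 1
    if bc and not ab and not ac:
        return 0
    # Anything else (e.g. all three disagree) cannot be resolved by voting.
    raise ValueError("readings are ambiguous; no single outlier")


# Example: three meters on the same 5 V reference, 5 mV tolerance (made-up numbers)
print(odd_one_out([5.001, 5.002, 5.120], 0.005))
```

With only two readings there is no way to write the "which one is bad" branch, which is the whole point of carrying a third.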
These were calibrated and adjusted to Greenwich at the service depot every three years. In use they were only wound, never adjusted for time.
Each carried a certificate that provided the calibration error determined over 30 days of observation. The instruments used an escapement that is remarkably stable over 3 to 5 years and did not use oil (especially important in the days when whale oil thickened over time).
The navigator did not care about the "accuracy" of the dial time. He used the calibration error to calculate the adjusted dial time so he could determine the actual time back in Greenwich.
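A small Python sketch of that arithmetic, under my own assumptions: the certificate is modeled as a fixed offset at calibration plus a steady daily rate (seconds gained per day, negative if the chronometer loses). The function and parameter names are hypothetical and the numbers are made up, not taken from any real certificate.

```python
from datetime import datetime, timedelta


def greenwich_time(dial_time, error_at_calibration_s, daily_rate_s, days_since_calibration):
    """Estimate true Greenwich time from an unadjusted chronometer dial.

    Positive values mean the chronometer runs fast (ahead of Greenwich).
    Assumes the daily rate stays constant, which is the consistency
    the navigator was relying on.
    """
    accumulated_error_s = error_at_calibration_s + daily_rate_s * days_since_calibration
    return dial_time - timedelta(seconds=accumulated_error_s)


# Example: dial reads noon, chronometer gains 1.5 s/day, 100 days out of the depot
dial = datetime(1900, 6, 1, 12, 0, 0)
print(greenwich_time(dial, 0.0, 1.5, 100))
```

Notice the dial is never touched. A chronometer that is wrong by a known, steady amount is as good as one that is right.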
Application to meters:
The absolute reading is not as important as knowing the deviation from "reality" on a consistent basis. As in chronometers, consistency is far more important.
For this and statistical reasons, I agree with Dave that if three random DMMs consistently provide readings that agree to whatever precision is "good enough", then it can be assumed they are reliable. And an aged instrument has likely reached its highest level of stability (chronometers were run for a year to allow the balance spring of the oscillator to age, and the consistency tended to improve over the 50 to 100 years of service).
Like a navigator, someone working in a high liability position needs annual calibration. If concerned about changes within that period, they can spot check readings with another calibrated instrument. Again though, the rule of three would seem to apply.
For the amateur, consistency would seem to be the goal and she/he can assume agreement among 3 DMMs means the readings are likely valid. This is one approach to assessing convergent validity.
The amateur can also check their DMM by going back to basics, using Ohm's law to determine the expected reading.
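As a rough illustration of the Ohm's law check in Python: put a known voltage across a known resistor, work out the current you should see (I = V / R), and compare it with what the meter shows. The function names and the example values are my own assumptions, and the result is only as good as your knowledge of the source voltage and the resistor's tolerance.

```python
def expected_current(voltage_v, resistance_ohm):
    """Ohm's law: I = V / R, in amps."""
    if resistance_ohm <= 0:
        raise ValueError("resistance must be positive")
    return voltage_v / resistance_ohm


def percent_deviation(measured, expected):
    """How far the meter reading is from the expected value, in percent."""
    return (measured - expected) / expected * 100.0


# Example (made-up numbers): 9.00 V across a 1 kohm resistor, meter shows 9.05 mA
expected = expected_current(9.0, 1000.0)
deviation = percent_deviation(0.00905, expected)
print(f"expected {expected * 1000:.2f} mA, deviation {deviation:.2f} %")
```

If the deviation is well inside the resistor's tolerance band, the meter is at least not lying badly on that range.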
I never paid more than $125 shipped for my DMMs, most of the time far less. All agree within a couple counts on the last 2 digits. My work range is 120 VAC and below; so all but my Agilent 1252 are at least 15 years old with the electronics well aged and stable.
For the amateur buying a used instrument, Ohm's law at delivery is your friend. Of course, a vetted second instrument is also your friend. And who does not need at least 2 or 3 DMMs?