How are oversampled bits not real bits? Almost every ADC now uses some form of oversampling, either 'real' (in the form of delta-sigma) or something like a SAR.
Oversampling requires that the signal has uniformly distributed noise of at least 1 LSB, and that the ADC is more linear than its nominal number of bits requires. Think of it like this: you have 4 comparators to detect a voltage level from 0 V to 5 V, so each step is 1.25 V. Now apply 2 V to this setup. How are you going to determine that the applied voltage is 2 V? You can only see that the input is greater than 1.25 V and smaller than 2.5 V.
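To make that concrete, here is a minimal sketch (assuming an ideal, noise-free floor quantizer and the made-up 4-step numbers above) showing that, without noise, repeating the conversion gains you nothing:

```python
import numpy as np

lsb = 1.25   # step size of the hypothetical 0..5 V, 4-step converter
vin = 2.0    # applied DC voltage

quantize = lambda v: np.floor(v / lsb) * lsb   # ideal floor quantizer

# One conversion only tells you the input lies between 1.25 V and 2.5 V...
print(quantize(vin), quantize(vin) + lsb)      # -> 1.25 2.5

# ...and averaging many conversions of a noise-free DC input tells you nothing more.
print(quantize(np.full(1000, vin)).mean())     # still 1.25, not 2.0
```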
If you look at a delta-sigma ADC you'll see it has extra circuitry to increase the resolution of each step. SAR ADCs OTOH have fixed steps.
The extra circuitry is used for noise shaping. Noise shaping helps you get the most out of oversampling because it pushes the quantization noise to higher frequencies, where it gets filtered out later when you decimate the sample rate. But it is not required: any ADC can oversample. With correct calibration, even linearity is not the biggest issue (in fact, that is a weakness of noise-shaping ADCs: because they use a feedback loop, calibrating for quantizer non-linearity is much more challenging than when you don't apply noise shaping).
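As a rough sketch of what that feedback circuitry does (a toy first-order modulator with a 1-bit quantizer, not any particular part's architecture), the loop integrates the difference between input and fed-back output, so the quantization error ends up high-pass shaped and even a crude decimation filter removes most of it:

```python
import numpy as np

def first_order_dsm(x):
    """Toy first-order delta-sigma modulator: 1-bit output stream for input in -1..+1."""
    acc, y = 0.0, -1.0
    out = np.empty_like(x)
    for i, xi in enumerate(x):
        acc += xi - y                   # integrate input minus fed-back output (the "delta")
        y = 1.0 if acc >= 0 else -1.0   # 1-bit quantizer
        out[i] = y
    return out

fs, f0, osr = 64_000, 100, 64           # assumed rates: a 100 Hz tone oversampled 64x
t = np.arange(fs) / fs
bits = first_order_dsm(0.5 * np.sin(2 * np.pi * f0 * t))

# Crude decimation filter: average blocks of 64 one-bit samples.
# The result tracks the sine with far more than 1 bit of effective resolution.
recovered = bits.reshape(-1, osr).mean(axis=1)
```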
Your example of 2 V applied to an oversampling ADC is not a good one, since that is a DC signal and oversampling cannot help you there without applying some form of dithering to push the error to higher frequencies. But say it is an AC sine wave with an amplitude of 2 V; then you can use the oversampled bits, together with the knowledge that your signal is band-limited to a lower frequency, to figure out the amplitude.
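A toy illustration of the dithering point (assumed numbers, ideal round-to-nearest quantizer): with a clean DC input every conversion returns the same code, so averaging gains nothing, but with about 1 LSB of dither the average converges on the true value:

```python
import numpy as np

rng = np.random.default_rng(0)
lsb, vin, n = 1.25, 2.0, 100_000

quantize = lambda v: np.round(v / lsb) * lsb   # ideal round-to-nearest quantizer

clean = quantize(np.full(n, vin)).mean()                            # stuck on one code: 2.5 V
dithered = quantize(vin + rng.uniform(-lsb / 2, lsb / 2, n)).mean() # averages out to ~2.0 V
print(clean, dithered)
```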
Assume I have a tone at frequency \$f_{in}\$ with an amplitude of \$Y\$. I use an ADC with a given number of bits, so I will have a certain amount of quantization error, at most 1/2 LSB. For a moment, let's assume we have a perfect anti-aliasing filter at \$f_a \geq f_{in}\$ and that I sample at exactly the Nyquist rate, \$f_{sample} = 2 \cdot f_a\$; the SNR (taken as the ratio of signal amplitude to peak error) will be \$2Y/\mathrm{LSB}\$.
If I now sample at twice that rate, the spectrum of my input signal is still exactly the same, since the tone and its amplitude remain unchanged. The quantization noise power is also the same, but it is now spread over twice the bandwidth. I can now apply a low-pass filter and decimate my sample rate by half, and this gets rid of the half of my quantization noise that sits above the new Nyquist frequency. My signal power is unchanged, so the SNR improves by 3 dB, the equivalent of half an extra bit of resolution; oversampling by 4x buys a full extra bit.
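A quick numerical check of that argument (a sketch with assumed parameters: an 8-bit quantizer with ~1 LSB of dither so the error behaves like white noise, and a 2-tap average as a crude decimation filter):

```python
import numpy as np

rng = np.random.default_rng(1)
fs, f0, nbits, n = 48_000, 1_000, 8, 1 << 16
lsb = 2.0 / (1 << nbits)                  # full scale assumed to be -1..+1

def quantization_error(rate):
    x = 0.9 * np.sin(2 * np.pi * f0 * np.arange(n) / rate)
    dither = rng.uniform(-lsb / 2, lsb / 2, n)
    return np.round((x + dither) / lsb) * lsb - x

err_1x = quantization_error(fs)           # noise when sampling at f_s
err_2x = quantization_error(2 * fs)       # same noise power, spread over twice the bandwidth

# Crude decimation back to f_s: average pairs of samples (a 2-tap low-pass),
# which barely touches the 1 kHz tone but roughly halves the white-noise power.
err_dec = err_2x.reshape(-1, 2).mean(axis=1)

print(10 * np.log10(err_1x.var() / err_dec.var()))   # roughly 3 dB, i.e. half a bit
```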
Any non-linearity in the quantizer will result in spurs, because the quantization error is then correlated with the signal instead of behaving like noise that filtering and averaging can remove.
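A small sketch of that effect (hypothetical numbers; the converter's INL bow is modelled simply as a weak cubic term in the transfer function ahead of an otherwise ideal quantizer):

```python
import numpy as np

fs, f0, nbits, n = 48_000, 1_000, 8, 1 << 14
lsb = 2.0 / (1 << nbits)
x = 0.9 * np.sin(2 * np.pi * f0 * np.arange(n) / fs)

ideal = np.round(x / lsb) * lsb                   # ideal quantizer
bent = np.round((x + 0.02 * x**3) / lsb) * lsb    # same quantizer behind a small cubic bow

# Windowed spectrum in dB; the bin at 3*f0 lands exactly on an FFT bin here.
spec = lambda y: 20 * np.log10(np.abs(np.fft.rfft(y * np.hanning(n))) / n + 1e-12)
bin3 = round(3 * f0 * n / fs)

# The cubic bow puts a third-harmonic spur clearly above where it sits for the
# ideal quantizer, and no amount of oversampling or averaging will remove it.
print(spec(ideal)[bin3], spec(bent)[bin3])
```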