-- To protect the scopes from high voltages (maybe?)
Probes which include shunt resistors can do that; however, most x10 and x1000 probes do not and instead rely on the oscilloscope's 1 megohm input termination, so usually the oscilloscope needs to be able to handle the full voltage at the probe tip whether the probe is attenuating or not.
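A minimal sketch of that divider arithmetic (the 9 megohm tip resistor is the typical value, assumed here rather than taken from any particular probe):

```python
# 10x passive probe DC attenuation: a series tip resistor against the
# scope's 1 megohm input termination completes the divider.
R_TIP = 9e6      # ohms, series resistor in the probe tip (typical value)
R_SCOPE = 1e6    # ohms, oscilloscope input termination

attenuation = (R_TIP + R_SCOPE) / R_SCOPE
print(f"attenuation: {attenuation:.0f}x")   # 10x

# Without the scope's 1 megohm shunt (e.g. a 50 ohm or open input),
# the divider ratio is wrong and the scope no longer sees 1/10.
```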
-- But mostly as a consequence of the passive probe cable being a deliberately highly lossy transmission line, because a lossless transmission line can't be built with a 1 megohm characteristic impedance, a 50 ohm lossless transmission line would hopelessly load down the circuit under test, etc, etc.
The transmission line is made lossy to suppress reflections. You can make a high impedance passive probe with 50 ohm cable; however, it will be limited to accurately reproducing lower frequencies because of reflections in the cable.
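A rough sketch of where that frequency limit comes from, using assumed but typical numbers for a 1.2 m probe lead:

```python
# Rule-of-thumb check: a mismatched cable behaves as a lumped element
# (no visible reflections) only while it stays much shorter than a
# wavelength. Values below are assumed, typical of a short probe lead.
LENGTH = 1.2        # m, cable length
VELOCITY = 2.0e8    # m/s, ~0.66c propagation in solid-PE coax

f_limit = VELOCITY / (10 * LENGTH)   # length == lambda/10 criterion
print(f"lumped-element limit: ~{f_limit/1e6:.0f} MHz")
# Above this, reflections between the mismatched ends distort the signal,
# which is why real probe cables use lossy resistive-wire centers.
```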
The main reason attenuating probes are used is to lower the capacitance at the probe tip.
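To see why, here is a back-of-the-envelope sketch of a compensated 10x probe; the 90 pF cable-plus-scope capacitance is an assumed round number:

```python
# Compensated 10x probe: the compensation capacitor across the tip
# resistor appears in series with the cable + scope capacitance,
# which sets the capacitance seen at the probe tip.
R_TIP, R_SCOPE = 9e6, 1e6   # ohms
C_LOAD = 90e-12             # F: cable + scope input capacitance (assumed)

# Compensation condition: R_TIP * C_comp == R_SCOPE * C_LOAD
C_comp = R_SCOPE * C_LOAD / R_TIP            # = C_LOAD / 9 = 10 pF

# Series combination of C_comp and C_LOAD is what the circuit sees
C_tip = C_comp * C_LOAD / (C_comp + C_LOAD)  # = C_LOAD / 10 = 9 pF
print(f"C_comp = {C_comp*1e12:.1f} pF, tip capacitance = {C_tip*1e12:.1f} pF")
```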
-- Meanwhile, oscilloscope manufacturers are pushing the limits of how low they can push the noise floor of their instruments, advertising ever lower voltages/division. But we're throwing away a factor of 10 most of the time!
Manufacturers like to advertise lower noise; however, semiconductor physics has not changed. If they are using JFET or MOSFET input stages, then their input noise is no better than that of oscilloscopes from the 1970s that used JFETs. Usually the noise is worse, because they are using either a MOSFET input stage or a higher frequency transistor so that the oscilloscope can be "upgraded" without changing the hardware.
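As a rough illustration of what the question calls "throwing away a factor of 10": noise referred to the probe tip scales directly with the probe's attenuation (the noise density below is an assumed ballpark JFET figure, not measured data):

```python
import math

# Back-of-the-envelope RMS noise: e_n * sqrt(bandwidth), then referred
# to the probe tip through the attenuation ratio.
E_N = 2e-9   # V/sqrt(Hz), front-end input-referred density (assumed)
BW = 100e6   # Hz, measurement bandwidth

scope_input = E_N * math.sqrt(BW)   # noise at the scope input
tip_with_10x = 10 * scope_input     # a 10x probe refers it up by 10

print(f"at scope input:   {scope_input*1e6:.0f} uV RMS")   # ~20 uV
print(f"at 10x probe tip: {tip_with_10x*1e6:.0f} uV RMS")  # ~200 uV
```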
Modern (since the 1980s) oscilloscopes tend to use dual-path input buffers with a second low-frequency input path, which itself has attenuation at its input to isolate its input capacitance and increase its input range; this also increases input noise at low frequencies. Oscilloscopes are not designed to be low noise, or they would not use this circuit topology.
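A crude sketch of why the low-frequency path hurts: its amplifier sits behind an attenuator, so that amplifier's noise is multiplied when referred back to the input (all figures assumed, purely illustrative):

```python
import math

# Dual-path buffer, low-frequency side only: the attenuator ahead of
# the LF amplifier multiplies that amplifier's noise at the input.
E_AMP = 5e-9   # V/sqrt(Hz), LF path amplifier noise density (assumed)
ATTEN = 10     # LF path input attenuation (assumed)
LF_BW = 1e5    # Hz, bandwidth handled by the LF path (assumed)

lf_noise = ATTEN * E_AMP * math.sqrt(LF_BW)
print(f"LF path noise referred to input: {lf_noise*1e6:.1f} uV RMS")
```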
My favorite oscilloscope inputs are high voltage differential ones (1), which inherently have roughly 5 times the input noise because of differential operation and the bootstrapping circuits needed for a wide input range. They have visibly more noise unless the bandwidth is limited to 5 MHz, but it is hardly ever a concern, and the noise rejection of a differential input often makes up for it anyway.
-- My question is this: is it really true that 10x is what 6-fingered aliens would conclude is the optimal compromise? Or are 10x probes a holdover from when scopes were analog and the attenuation had to be taken care of via mental arithmetic (for which dividing/multiplying by 10 is obviously easiest for us 10-fingered humans)?
10x is not far from optimal. (2) 100x probes do not have significantly lower input capacitance. 5x probes used to be available, but they were intended for trigger inputs on oscilloscopes and frequency counters, where signal fidelity is less important than increased sensitivity; even then, a 5x probe on an oscilloscope would only double the sensitivity. If you want lower noise measurements, there are other ways to achieve them.
(1) High voltage differential in this case means +/-10 volts, which is provided by bootstrapping the input stage. Old oscilloscope inputs typically operate with an input signal range of +/-250 millivolts. Modern cost-reduced oscilloscopes operate with an input range 10 times higher without bootstrapping, which has the disadvantages of requiring higher slew rates and creating more distortion, but it saves money by having one less input attenuator.
(2) If we used base-8 arithmetic, then our probes would be 8x for convenience. If we used base-12 arithmetic, then our probes would be 12x for convenience. There is no application where the difference in noise would matter between 8x, 10x, and 12x probes.
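For scale, the noise penalty between those ratios is under a few dB:

```python
import math

# Noise referred to the tip scales with the attenuation ratio, so the
# penalty of 10x or 12x relative to 8x is a fixed ratio in dB.
for n in (8, 10, 12):
    print(f"{n:>2}x: {20 * math.log10(n / 8):4.1f} dB vs 8x")
# 8x: 0.0 dB, 10x: ~1.9 dB, 12x: ~3.5 dB
```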