If you speed up all those frequencies in proportion so that the frame rate is exactly 30 FPS, wouldn't they still have the same relation to each other?
Yes, they would, but you would also have to move the sound carrier up by the same amount to avoid the original problem: if there were any non-linearities in the video system, the sound carrier at 4.5 MHz would cause an annoying stationary dot pattern. In that case, all NTSC receivers would have to be retuned as well, and that wasn't about to happen.
All other components of the signal were slowed down to maintain their respective relationships, of course, and this lowered the frame rate from 30 Hz to 29.97002997… Hz, a reduction of approximately 0.1%. The exact ratio of new frequency to old is 1000/1001. This seemed innocuous enough to the system designers, since their first priority was to make sure that the system stayed compatible with all of the installed receivers in America, and a shift of about 0.1% was sufficiently small that those TV sets could easily track the change in scanning rate.
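The arithmetic above can be checked with exact fractions. This is only a small sketch; the variable names are mine and are not part of any standard.

```python
from fractions import Fraction

# Nominal monochrome frame rate and the color-era scaling ratio from the text.
NOMINAL_RATE = Fraction(30)
RATIO = Fraction(1000, 1001)

# Exact color frame rate: 30000/1001 frames per second.
ntsc_rate = NOMINAL_RATE * RATIO

# Relative slowdown, as a percentage (just under 0.1%).
slowdown_percent = float((1 - RATIO) * 100)

print(ntsc_rate, float(ntsc_rate), slowdown_percent)
```

Keeping the rate as the fraction 30000/1001 rather than the rounded 29.97 avoids accumulating rounding error in later calculations.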
However, when this was done the distinction between video time and real time came into sharp focus. With the widespread usage of time code, one hour's worth of video by the count, or 108,000 frames of videotape, would last too long! A real hour now held only 60 x 60 x 29.97 = 107,892 frames, or 108 frames fewer! It was imperative for the producers (and sponsors) of an hour’s worth of programming to see that the length of a show was exactly one hour. This discrepancy was fixed in the application of the code by losing or “dropping” 108 frame counts spread over the course of the hour. So, during the progress of the frame count through the hour, two frame numbers, 00 and 01 (the counts only, NOT the actual frames of video), were skipped or “dropped” at the start of each minute, with the exception of minutes which were a multiple of ten. That leaves 54 minutes in each hour that lose two counts apiece, for a total of 108. Sounds complicated? It gets worse.
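The dropping rule can be sketched in a few lines of code. This is an illustration only, not the way any particular time code generator is built: the function name is hypothetical, frames are counted from zero, and the semicolon before the frame field is just the common way of marking a drop-frame value.

```python
def frames_to_dropframe(frame_number: int) -> str:
    """Label a 0-based frame index with a drop-frame time code string."""
    frames_per_10min = 10 * 60 * 30 - 9 * 2   # 17,982 real frames per ten minutes
    frames_per_min = 60 * 30 - 2              # 1,798 real frames in a "dropping" minute

    tens, rem = divmod(frame_number, frames_per_10min)

    # Every full ten-minute block has skipped 9 x 2 = 18 counts.
    skipped = 18 * tens
    # The first minute of a block is a full 1,800 frames (nothing dropped);
    # each later minute begins by skipping counts 00 and 01.
    if rem >= 1800:
        skipped += 2 * (1 + (rem - 1800) // frames_per_min)

    # Add the skipped counts back, then split as if the rate were exactly 30.
    n = frame_number + skipped
    ff = n % 30
    ss = (n // 30) % 60
    mm = (n // 1800) % 60
    hh = (n // 108000) % 24
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


print(frames_to_dropframe(1799))    # last frame of minute 0
print(frames_to_dropframe(1800))    # first frame of minute 1, counts 00 and 01 skipped
print(frames_to_dropframe(107892))  # one hour of counts
```

Note that only the labels change. Frame 1800 still follows frame 1799 on the tape; it is simply called 00:01:00;02 instead of 00:01:00;00.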
Now every manufacturer of equipment that generated, read, computed, or otherwise used “drop frame” time code had to take this stuttering time reference into account when dealing with programming length. Every producer, director, and editor who wished to create accurately timed commercials or programs had to be aware of this disjoint “real-time” reference.
As it happens, the above discrepancy of time is not exactly 108 frames per hour. I rounded the frame rate to 29.97 f/s, but it is actually a repeating decimal, 29.97002997002997… This means that a fraction of a frame is left over each hour. In fact a real hour is 107,892.1079… frames long. If the time code generator were to run for an entire day it would count 2,589,410.589… frames. It would NOT be a whole number of frames, due to the repeating decimal. The “drop frame” code would reach the terminal count of 24:00:00:00 with an error of 2,589,410.589 - [(30 x 60 x 60 x 24) - (108 x 24)], or 2.5894 frames, or 86.4 milliseconds. Every day a time code generator would be out of time by this amount.
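For anyone who wants to verify the daily error, here is a short check using exact fractions. It is a sketch under the assumptions stated in the text (a rate of exactly 30000/1001 f/s and 108 dropped counts per hour); the variable names are mine.

```python
from fractions import Fraction

FRAME_RATE = Fraction(30000, 1001)   # exact frame rate in frames per second

# Real frames that go by in an hour and in a 24-hour day.
frames_per_hour = FRAME_RATE * 3600
frames_per_day = FRAME_RATE * 86400

# Counts the drop-frame code uses to reach 24:00:00:00:
# 30 counts per second, minus 108 dropped counts in each of 24 hours.
counts_per_day = 30 * 60 * 60 * 24 - 108 * 24

# Leftover error per day, in frames and in seconds.
error_frames = frames_per_day - counts_per_day
error_seconds = error_frames / FRAME_RATE

print(float(frames_per_hour))        # about 107,892.1079
print(float(frames_per_day))         # about 2,589,410.589
print(float(error_frames))           # about 2.5894
print(float(error_seconds) * 1000)   # about 86.4 milliseconds
```

Worked exactly, the error comes out to 2592/1001 frames, which is 2592/30000 of a second, or 86.4 milliseconds per day.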
It drove many people nuts to have to deal with this crap until the software in the editing systems became sophisticated enough to remove much of the burden from the users. It also gave the Europeans much cause for amusement to look at the mess the TV engineers across the pond had created for themselves.