As an aside: How can a GPU take 18 in the video but a tube based navy bomb take 20 min according to the posters in this thread? That's like the DSKY running an iPhone app.
There was NEVER a "tube based" bombe. Colossus was used only to crack "Tunny", the Lorenz SZ42 cipher. Electromechanical bombes were used for Enigma, which was a vastly simpler cipher.
OK, there are several issues with Enigma. **IF** you know the wiring of the rotors that interchange the letters, it is not a very robust code. If you **DO NOT** know the wiring, it is a VERY difficult code to crack. As I understand it, Enigma was NEVER cracked without capture of a set of rotors.
The Germans changed out the set of rotors sometime in the middle of the war, and it took six months to capture a set of rotors, meanwhile all efforts at breaking the new rotor wiring with code-cracking methods were fruitless.
Since there were only seven rotors to choose from, the combinatorics limited the number of possible machine setups that had to be searched. That search only had to be done once per day. Then, you only had to find the starting position of the rotors for each message. The Bletchley Park bombes could test about 26 positions a second. When they matched an adjustable number of characters of the crib, they stopped and the position was noted. The NCR bombes tested something like 780 positions per second, and when a match was detected they had to slow down and back up so that the position and number of matched characters could be printed on an adding machine tape. Then the machine resumed the search automatically.
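To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The seven rotors and the 26 and 780 positions-per-second rates are the figures quoted above. The three-rotor machine is my assumption for illustration (a four-rotor machine would have a larger space), and the sketch ignores the plugboard, ring settings, and the time lost stopping on false hits, so treat the results as order-of-magnitude only.

```python
from math import perm

ROTORS_AVAILABLE = 7    # figure quoted in the post
ROTORS_IN_MACHINE = 3   # ASSUMPTION: three-rotor machine, for illustration only
LETTERS = 26

# Ordered choices of rotors: 7 * 6 * 5
rotor_orders = perm(ROTORS_AVAILABLE, ROTORS_IN_MACHINE)

# Starting positions for one rotor order: 26 ** 3
start_positions = LETTERS ** ROTORS_IN_MACHINE


def sweep_seconds(positions_per_second):
    """Seconds to step through every starting position of one rotor order."""
    return start_positions / positions_per_second


if __name__ == "__main__":
    print(f"rotor orders: {rotor_orders}, start positions: {start_positions}")
    # Rates are the approximate figures quoted in the post.
    for name, rate in (("Bletchley Park bombe", 26), ("NCR bombe", 780)):
        per_order = sweep_seconds(rate)
        all_orders_hours = per_order * rotor_orders / 3600
        print(f"{name}: {per_order:.1f} s per rotor order, "
              f"{all_orders_hours:.1f} h for all {rotor_orders} orders")
```

Under those assumptions a full sweep of one rotor order is a matter of minutes at 26 positions per second and well under a minute at 780, which is at least in the same ballpark as the times people have been quoting in this thread.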
I'm kind of guessing that the GPU program is either starting out not knowing the order of rotors, or maybe without even knowing the rotor wiring, although I think that search space is so huge you would not be able to find the correct hits among all the random ones that looked probable.
But, if the crib was the entire cleartext, then it could be done. I don't know how long that would take.
Jon