It's worth pointing out one more thing ... how much extra memory each algorithm uses.
I could very easily "sort" 1m numbers by making multiple double linked lists (ex element.previous - element.value - element.next ) and basically inserting each value as I loop through the million entries into one such list or by other tricks that use a lot of memory.
For example, let's say the sort key is a 32 bit number ... I could have as many 256 element "buckets" as needed, by using the top 24 bits of the sort key as an id for the bucket. In each bucket then, you can have up to keys and the positions in the array where these keys show up
When you're done parsing the whole set of numbers, you can just sort the buckets by their id and merge everything in one output array.