Regarding BCD specifically, I don't know what conditions trigger the compiler to use such features when they're available. That would have to be one of those "specific patterns" mentioned. (Anyway, for AVR in particular, you can do division by a constant very easily, by multiplying by a scaled reciprocal of the constant and shifting the result back down.)
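To make the divide-by-constant trick concrete, here is a minimal sketch for dividing a byte by 10, which is the step you need for binary-to-BCD conversion. The names div10_u8 and to_bcd are made up for illustration, and the multiplier 205 with a shift of 11 is just one workable choice for 8-bit inputs, not necessarily what any compiler emits.

```c
#include <stdint.h>

/* Hypothetical helper: divide an 8-bit value by 10 without a divide.
 * 205/2048 is slightly larger than 1/10; the excess is small enough
 * that the truncated result is exact for every x in 0..255. */
static inline uint8_t div10_u8(uint8_t x)
{
    return (uint8_t)(((uint16_t)x * 205u) >> 11);
}

/* Hypothetical helper: pack a value in 0..99 into two BCD digits. */
static inline uint8_t to_bcd(uint8_t x)
{
    uint8_t tens = div10_u8(x);
    uint8_t ones = (uint8_t)(x - tens * 10u);
    return (uint8_t)((tens << 4) | ones);
}
```

The 16-bit intermediate never exceeds 255 * 205 = 52275, so it fits in an unsigned 16-bit product, which is what the 8x8 hardware multiplier gives you.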
It seems to me that avr-gcc doesn't generate post-increment instructions very often, though I still need to try more access patterns and semantics on a recent case, to see if something's holding that up...
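For reference, this is the kind of access pattern I mean: a plain pointer walk, which maps naturally onto a load with post-increment (LD Rd, X+ on AVR). The function name sum_bytes is just an illustration, and I'm not claiming anything about what a particular avr-gcc version actually emits for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical test case: sum a byte buffer through a walking pointer. */
static uint16_t sum_bytes(const uint8_t *p, size_t n)
{
    uint16_t sum = 0;
    while (n--) {
        sum += *p++;   /* read, then advance: a post-increment candidate */
    }
    return sum;
}
```

Comparing the generated asm for this against an indexed version (p[i]) would be one way to see whether the addressing mode gets picked up.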
Compiler interaction for some internals may be special-cased or supported by libraries; for example, util/atomic.h for AVR provides macros, I believe, which simply resolve to cli/sei, or to saving and restoring SREG. Conceivably, some compilers might implement that kind of functionality at a higher or lower level; there isn't a clear threshold as to where a thing should be implemented. (That said, the clearest motivation I can think of would be: implement everything in libraries that can be. Special instructions like cli/sei aren't subject to optimization, so hard-coded asm is perfectly adequate. Whereas things like pointer arithmetic and memory access are subject to redundancy elimination, differencing, interleaving and such, so the compiler needs to be aware of them.)
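Roughly, the two flavours of atomic section look like the sketch below. This is a host-side model only: sim_sreg, sim_cli and sim_sei are stand-ins I made up so the logic can be followed without AVR headers, and it is not the actual util/atomic.h implementation. The only hardware fact assumed is that the global interrupt enable is bit 7 of SREG.

```c
#include <stdint.h>

/* Stand-in for the AVR status register; bit 7 models the I flag. */
static volatile uint8_t sim_sreg = 0x80;

static inline void sim_cli(void) { sim_sreg &= (uint8_t)~0x80; } /* models cli */
static inline void sim_sei(void) { sim_sreg |= 0x80; }           /* models sei */

/* A 16-bit value that an interrupt handler might also update. */
static volatile uint16_t shared_counter;

/* "Restore state" flavour: save SREG, disable, then put SREG back. */
static uint16_t read_counter_restore(void)
{
    uint8_t saved = sim_sreg;
    sim_cli();
    uint16_t value = shared_counter;  /* both bytes read with interrupts off */
    sim_sreg = saved;
    return value;
}

/* "Force on" flavour: disable, then unconditionally re-enable. */
static uint16_t read_counter_forceon(void)
{
    sim_cli();
    uint16_t value = shared_counter;
    sim_sei();
    return value;
}
```

The restore flavour is the safe one to call from code that may already be running with interrupts disabled; the force-on flavour saves the buffering but turns interrupts on regardless of how it was entered.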
Regarding frequency, it's quite natural that some instructions will be used more than others. Almost everything you're doing is either moving data around (MOV, LD, ST..), checking data and doing basic arithmetic (ADD/SUB, CMP, TST..), conditional bit expansion or manipulation (set/clear, sign extend, shift..), or basic state machine stuff (conditional jumps, loops, calls..). The few percent left (well, 5 to 20, say) covers everything else: more in-depth math (MUL and DIV, floating point, SIMD..), fancier bit operations (move, copy, shuffle..), IO (sometimes memory mapped, sometimes special instructions; the AVR doesn't have a separate IO space like the Z80 or x86 does, but it does have IN/OUT instructions for quick access to low addresses), API calls (INT?) and OS/kernel functions (privileged, when applicable).
Incidentally, GCC definitely doesn't generate AVR's FMUL instructions on its own; it provides them as builtins only (__builtin_avr_fmul and friends). You might be better off writing those out with MUL and shifts anyway, as it doesn't seem to perform any optimization around those instructions (again, based on very limited experience at present..). So even on the humble AVR, we have a few examples of that situation.
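Written out with a multiply and a shift, the unsigned FMUL comes down to something like the sketch below. The name fmul_u8 is made up; the operands are taken as unsigned 1.7 fixed point (0x80 = 1.0) and the result as 1.15 (0x8000 = 1.0). The bit that the hardware shifts out into the carry flag is simply dropped here.

```c
#include <stdint.h>

/* Hypothetical plain-C equivalent of unsigned FMUL:
 * 8x8 multiply, then shift the 16-bit product left by one. */
static inline uint16_t fmul_u8(uint8_t a, uint8_t b)
{
    return (uint16_t)(((uint16_t)a * b) << 1);
}
```

For example, 0x40 is 0.5 in 1.7 format, and fmul_u8(0x40, 0x40) gives 0x2000, which is 0.25 in 1.15 format. Products of 2.0 or more don't fit in the result and wrap around.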
Tim