Between architectural limitations and questionable compiler optimizations, I've been pretty unimpressed with the CM0 recently. Some of the code gets really ugly, it doesn't actually seem faster or more compact than an 8-bit CPU EXCEPT for 32-bit math, and you pay for that 32-bit address space by needing to store a bunch of 32-bit constants. ("lds r1,globalvariable" on AVR: 2 cycles, 4 bytes, since lds is a two-word opcode. Equivalent on M0: 8 bytes, somewhat indeterminate cycles. Which would probably be fine if it was one of those 72MHz chips with 128k of flash, but vendors are selling 25MHz chips with 16k of flash...) (Though I guess the 16k/20MHz chips are supposed to replace the 4k/10MHz 8-bit chips, and might, price-wise. But you need to be careful.)
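To make the 8-byte figure concrete, here's a rough sketch of where it comes from. The assembly in the comments is what I'd typically expect from gcc at -Os, not output copied from a particular build, so treat the register choices and label names as assumptions. `read_global` is just a made-up name for illustration.

```c
#include <stdint.h>

/* A plain global; volatile so the compiler can't fold the load away. */
volatile uint8_t globalvariable = 42;

/*
 * Hypothetical helper: one load of one global.
 *
 * AVR (assumed avr-gcc -Os): a single instruction with the 16-bit
 * address embedded in the opcode.
 *
 *       lds  r24, globalvariable
 *
 * Cortex-M0 (assumed arm-none-eabi-gcc -mcpu=cortex-m0 -Os): Thumb
 * instructions are 16 bits wide, so a 32-bit address can't be embedded.
 * It is parked in a literal pool and fetched first.
 *
 *       ldr  r3, .L2            @ 2 bytes: PC-relative load of the address
 *       ldrb r0, [r3]           @ 2 bytes: the actual load
 *       ...
 * .L2:  .word globalvariable    @ 4 bytes: literal pool entry
 *
 * 2 + 2 + 4 = 8 bytes. Each load is nominally 2 cycles on the M0, but
 * flash wait states can stretch that, hence "somewhat indeterminate".
 */
uint8_t read_global(void)
{
    return globalvariable;
}
```

The pool entry can be shared if the same function touches the same global more than once, so the cost is worst for code that pokes at lots of different globals a few times each.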
Fortunately, vendors seem to be jumping on CM4, and it's nicer...
And yeah, these aren't good excuses for limiting yourself to 48KB, architecturally...
(Hmm. Does the 4809 match the pinout of the smaller Xmega chips? (Nope, not even close - 48 vs 44 pins.) How about the smaller SAM chips? (Nope again.) Sigh.)