Pound for pound, the AVR core handily outperforms the historic 6502. That holds even against the "single cycle" 6502 variants, because the AVR has more registers (32 general-purpose 8-bit registers against the 6502's A, X and Y) and probably more instructions. If nothing else, computation is vastly improved by the 8x8-bit hardware multiply instruction found on the ATmega parts.
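To make the multiply point concrete, here's a rough sketch in plain C (not tuned AVR code). It builds a 16x16 multiply from four 8x8 partial products, which is the job the hardware multiplier makes cheap, and sets it next to the shift-and-add loop a CPU without a multiply instruction has to run in software. The function names are mine, purely for illustration.

```c
#include <stdint.h>

/* One 8x8 -> 16 bit product. On an ATmega this is the kind of operation
 * the hardware multiplier handles; here it is ordinary C. */
static uint16_t mul8x8(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}

/* 16x16 -> 32 bit multiply assembled from four 8x8 partial products. */
uint32_t mul16_partial(uint16_t a, uint16_t b)
{
    uint8_t al = (uint8_t)(a & 0xFFu), ah = (uint8_t)(a >> 8);
    uint8_t bl = (uint8_t)(b & 0xFFu), bh = (uint8_t)(b >> 8);

    uint32_t r = mul8x8(al, bl);              /* low  x low            */
    r += (uint32_t)mul8x8(al, bh) << 8;       /* low  x high, weight 2^8  */
    r += (uint32_t)mul8x8(ah, bl) << 8;       /* high x low,  weight 2^8  */
    r += (uint32_t)mul8x8(ah, bh) << 16;      /* high x high, weight 2^16 */
    return r;
}

/* Shift-and-add: roughly what software must do when there is no
 * multiply instruction. Up to 16 loop iterations per product. */
uint32_t mul16_shift_add(uint16_t a, uint16_t b)
{
    uint32_t acc = 0;
    uint32_t addend = a;

    while (b != 0) {
        if (b & 1u) {
            acc += addend;    /* add the shifted multiplicand for each set bit */
        }
        addend <<= 1;
        b >>= 1;
    }
    return acc;
}
```

Both functions give the same answer; the difference is four multiplies and a few adds against a loop of up to sixteen test/add/shift steps.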
As for peripherals: if you can mate the RAM, video, SID and so on to an AVR bus (older chips offered a legacy external microprocessor bus, but I think that's all but gone now), you should be able to do pretty well.
Now, comparing a stock C64 to an ATmega with whatever selection of e.g. Arduino shields you want to throw at it, you might have a challenge. The C64 was literally made for its purpose; with the ATmega, you have to bolt everything on yourself, and that's painful (using serial channels or clunky GPIO bit/byte-banging). I'm pretty sure almost everything has been done, at least individually, so it should be possible to bring it all together (keyboard, audio, video, etc.) and reproduce the same functionality.
One notable thing that you simply cannot get: RAM. ATmegas do NOT offer much RAM. I don't know offhand if the XMegas offer more. You could hack around it by keeping a live workspace in onboard RAM and swapping it, page by page, out to an external device (again via GPIO or serial, so somewhere between slow and molasses), or you could try abusing program Flash as RAM (slow to write, awkward to read, and it causes wear, not to mention the risk of overwriting the program itself, something the C64's ROM never has to worry about).
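The page-swapping hack could look something like the sketch below. It's a host-side simulation only: the "external device" is just an array (`backing_store`), where on real hardware `ext_read_page` / `ext_write_page` would be slow serial or GPIO transfers. All names and the 256-byte page size are my own assumptions for illustration, not anything from a real library.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 256u
#define NUM_PAGES 256u   /* 256 pages x 256 bytes = a C64-sized 64 KiB space */

/* Stand-in for the external memory device (assumption: simulated on a host;
 * an array this size would not fit in an ATmega's onboard RAM). */
static uint8_t backing_store[NUM_PAGES][PAGE_SIZE];

/* The single page of "live workspace" kept in onboard RAM. */
static uint8_t page_buf[PAGE_SIZE];
static uint8_t cur_page;
static uint8_t page_valid;   /* page_buf holds a real page      */
static uint8_t page_dirty;   /* page_buf differs from the store */

unsigned long page_swaps;    /* counts how often the slow path is taken */

/* Hypothetical device transfers; a real build would replace these. */
static void ext_read_page(uint8_t page, uint8_t *dst)
{
    memcpy(dst, backing_store[page], PAGE_SIZE);
}

static void ext_write_page(uint8_t page, const uint8_t *src)
{
    memcpy(backing_store[page], src, PAGE_SIZE);
}

/* Make sure the requested page is the one sitting in onboard RAM. */
static void select_page(uint8_t page)
{
    if (page_valid && cur_page == page) {
        return;                              /* fast path: already loaded */
    }
    if (page_valid && page_dirty) {
        ext_write_page(cur_page, page_buf);  /* write back before evicting */
    }
    ext_read_page(page, page_buf);
    cur_page = page;
    page_valid = 1;
    page_dirty = 0;
    page_swaps++;
}

uint8_t paged_read(uint16_t addr)
{
    select_page((uint8_t)(addr >> 8));
    return page_buf[addr & 0xFFu];
}

void paged_write(uint16_t addr, uint8_t value)
{
    select_page((uint8_t)(addr >> 8));
    page_buf[addr & 0xFFu] = value;
    page_dirty = 1;
}
```

Accesses that stay within one page cost nothing extra, but code that bounces between two pages triggers a full page transfer on every access, which is where the "molasses" comes from.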
RAM is the major defining difference between general-purpose and embedded computing. Single-purpose machines rarely need much state; the rest can be stored in ROM (data tables, or the program itself, which is the same every time). PCs need a lot of memory: an OS, the drivers to interact with diverse hardware, and the overhead of running one or more programs on top of that. Their I/O likewise tends to be richer in data, if slower to update: take 640 x 480 VGA, where on original hardware you couldn't even touch every pixel before the video controller took control of that memory again. (These days such modes are hardware buffered, so the hardware reads and displays the frame while the processor does whatever it likes with it. Such buffering would've been even more expensive back then!) Conversely, embedded systems tend to include real-time features, like serial channels, GPIOs, UARTs, etc.
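For a sense of scale, here's the framebuffer arithmetic as a tiny C helper. The function name is mine. The comparison figures are my additions too: 4 bits per pixel for 16-colour 640 x 480 VGA, 64 KiB for the C64, and 2 KiB of SRAM on an ATmega328P (the usual Arduino chip).

```c
#include <stdint.h>

/* Bytes needed to hold one full frame, rounded up to a whole byte. */
uint32_t framebuffer_bytes(uint16_t width, uint16_t height,
                           uint8_t bits_per_pixel)
{
    uint32_t bits = (uint32_t)width * height * bits_per_pixel;
    return (bits + 7u) / 8u;
}
```

`framebuffer_bytes(640, 480, 4)` comes to 153,600 bytes: more than twice the C64's entire 64 KiB, and 75 times the 2 KiB on an ATmega328P. A C64-style 320 x 200 monochrome bitmap, `framebuffer_bytes(320, 200, 1)`, is 8,000 bytes, which still doesn't fit in that 2 KiB.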
More recently, demand has pushed LCD controllers on board as well, which you might take as a sign that history is repeating itself. There's the huge, lumbering PC; the small, nimble embedded graphics system echoing generations past; and the ant-like, real-time embedded worker that might not communicate directly with a human at all (or only through the vaguest of LEDs and button presses), but is used so frequently that it is not going anywhere.
Tim