Wow!
Sorry, I only saw this now. It looks amazing!
Indeed, the other AES took 72 cycles/block; now I see it's down to 20 cycles/block!
I had already finished the project this was for, so I could not test it. As for the relatively long encryption/decryption time, I hid it :-) We also have a software SHA-256 that takes up quite a lot of processor time, so I let the two run in parallel (PL-AES / SW-SHA).
Thanks dolbeau, I'll keep this link, as I see there are many other interesting implementations there.
Best regards and keep safe,
Guy