It might be possible to pick up an old Archimedes today for a song and use it to learn the ARM instruction set. Although the hardware has changed tremendously, the core of the instruction set is essentially the same.
I actually had an Acorn A3000, and later upgraded to the much faster ARM3-based A5000, as my desktop computer while I was doing my degree. It had a version of BBC BASIC built in, which included an ARM assembler, so I used to know the instruction set very well indeed. I'm sure it's evolved since then, but coming from the 6502 it was wonderful to have so many registers to play with, and every instruction was (and presumably still is) able to be executed conditionally.
A lot has changed since then, of course. I sadly had to sell the A5000 to buy a (conventional) PC, which for all its relative disadvantages did offer much better support and price/performance than the ageing Acorn. I think that was about the time I wrote my last computer program too, right up until I picked up the PIC stuff and started learning C a couple of years ago.
It's C that I plan on using now, of course. Life is too short, and fast microcontrollers too cheap and readily available, to invest too much time and effort in learning the machine language of any one processor without a very good reason, IMHO.
That said, I'm well aware, for example, that the memory architecture of a PIC18 is a dog's dinner. I know its performance is limited too, especially with the free compiler, but given the choice of spending my time learning to program a PIC in assembler vs learning to program a much faster - and cheaper! - processor in C, I choose the latter.
Where I expect to spend the time is on learning the peripheral set of a particular MCU - how to use its timers, serial ports, clock structure, interrupts and so on, plus the associated compiler and IDE. I'm hoping not to need to spend too long on the ARM instruction set itself!
I find it always helps to have an application in mind when learning something new, and I do have a PIC18 board which I'm thinking of re-doing with something like an STM32. It has about 32k of code in it, most of which is actually graphics for the LCD, compiled in as pre-initialised structures. (This is one reason why I'm not keen on a code-size-limited compiler; I may find that simply porting across the code I already have will use it all up. The other reason is that I'd really prefer to build my own board, in my own form factor, with my own peripherals, and which will actually work as a usable product when it's done, rather than use an off-the-shelf dev board.)
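
For anyone wondering what I mean by graphics compiled in as pre-initialised structures, here's a rough sketch in C. The names (Bitmap, arrow_icon, bitmap_pixel and so on) and the 1-bit-per-pixel layout are made up for illustration, not lifted from my actual board. The point is that const data like this normally ends up in flash alongside the code, which is why it can eat into a compiler's size limit, though exactly where it lands depends on the toolchain and linker settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap descriptor: 1 bit per pixel, row-major,
 * MSB = leftmost pixel, each row padded to a whole byte. */
typedef struct {
    uint8_t width;          /* width in pixels  */
    uint8_t height;         /* height in pixels */
    const uint8_t *data;    /* packed pixel data */
} Bitmap;

/* An 8x8 "up arrow", initialised at compile time. */
static const uint8_t arrow_bits[] = {
    0x18, /* ...XX... */
    0x3C, /* ..XXXX.. */
    0x7E, /* .XXXXXX. */
    0xFF, /* XXXXXXXX */
    0x18, /* ...XX... */
    0x18, /* ...XX... */
    0x18, /* ...XX... */
    0x18, /* ...XX... */
};

const Bitmap arrow_icon = { 8, 8, arrow_bits };

/* Number of bytes of pixel data the bitmap occupies. */
size_t bitmap_bytes(const Bitmap *bmp)
{
    size_t bytes_per_row = (bmp->width + 7u) / 8u;
    return bytes_per_row * bmp->height;
}

/* Returns 1 if the pixel at (x, y) is set, 0 otherwise.
 * No bounds checking: the caller must keep x and y in range. */
int bitmap_pixel(const Bitmap *bmp, unsigned x, unsigned y)
{
    size_t bytes_per_row = (bmp->width + 7u) / 8u;
    uint8_t byte = bmp->data[y * bytes_per_row + x / 8u];
    return (byte >> (7u - (x % 8u))) & 1u;
}
```

An LCD driver would walk the bitmap with something like bitmap_pixel(&arrow_icon, x, y), or copy whole bytes straight to the display if the controller's memory layout happens to match. Nothing here is processor-specific, so in principle the same tables should port from the PIC18 to an STM32 unchanged.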