I'm not sure if there's a question in all of that, but there's no substitute for having real hardware for testing...
You can do a lot of algorithmic development and coding without the real h/w. Ultimately, yes, you need the real h/w.
I have no experience in IoT. A few examples from telecoms:
1. Two processors on a board communicated via a FIFO chip. Every now and again, under high-load conditions, the FIFO
would misbehave. Logic analyzer logs showed nothing wrong. Software review showed nothing wrong. The chip vendor's
technical support had no suggestions. With deadlines approaching, we decided to implement a low-overhead
error-detecting/correcting protocol over the FIFO channel (a rough sketch of the idea follows after these examples).
It worked fine in extended full-load tests, so we went into production with it. We never solved the FIFO problem;
we just coped with it.
2. Code developed for a processor board worked fine. We then used a slower "pin-compatible" processor on a
cost-reduced version of the board. On the cost-reduced version, we couldn't do remote firmware download and upgrade
reliably, a crucial feature. The problem was that the slower processor's on-chip Flash memory had much slower
write timing - something missed in the s/w reviews. Easily fixed, once discovered (the second sketch below shows the
shape of the fix).
3. A product had to be powered by a telephone line. It had to use less than 75 uA. It had to fit in a small left-over space in
another vendor's housing. With minimal code running, the board was drawing twice that current limit, placing the
project in jeopardy. It turned out that unused I/O pins on the processor were left floating, and that was causing the
excessive current drain. The solution was to configure all unused I/O pins as outputs (the third sketch below).
Easily fixed, once discovered.
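Since I mentioned the low-overhead protocol in example 1, here's a rough sketch of the general idea in C. It is not the
protocol we actually shipped, and all the names are hypothetical: wrap each transfer in a frame carrying a sequence
number and a CRC-16 so the receiver can detect a corrupted frame, and get "correction" by having the sender
retransmit anything that isn't acknowledged.

/* Minimal sketch only, hypothetical names.  Each payload is wrapped in a
 * frame with a sequence number and a CRC-16; the receiver ACKs good
 * frames and the sender retransmits anything unacknowledged. */

#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SOF   0x7E   /* start-of-frame marker                     */
#define MAX_PAYLOAD 64     /* short frames bound the cost of a retry    */

/* CRC-16/CCITT, bit-by-bit; small and good enough for short frames */
static uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)data[i] << 8;
        for (int b = 0; b < 8; b++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Build a frame: SOF | seq | len | payload | crc_hi | crc_lo.
 * Returns the total frame length, or 0 if the payload is too big. */
size_t frame_build(uint8_t *out, uint8_t seq,
                   const uint8_t *payload, size_t len)
{
    if (len > MAX_PAYLOAD)
        return 0;
    out[0] = FRAME_SOF;
    out[1] = seq;
    out[2] = (uint8_t)len;
    memcpy(&out[3], payload, len);
    uint16_t crc = crc16(&out[1], len + 2);   /* cover seq, len, payload */
    out[3 + len] = (uint8_t)(crc >> 8);
    out[4 + len] = (uint8_t)(crc & 0xFF);
    return len + 5;
}

/* Verify a received frame; on success copy the payload out and return
 * its length, otherwise return -1 so the caller can request a resend. */
int frame_check(const uint8_t *in, size_t frame_len,
                uint8_t *payload_out, uint8_t *seq_out)
{
    if (frame_len < 5 || in[0] != FRAME_SOF)
        return -1;
    uint8_t len = in[2];
    if (len > MAX_PAYLOAD || (size_t)len + 5 != frame_len)
        return -1;
    uint16_t rx_crc = (uint16_t)(in[3 + len] << 8) | in[4 + len];
    if (crc16(&in[1], (size_t)len + 2) != rx_crc)
        return -1;                 /* corrupted: caller NAKs / retries */
    *seq_out = in[1];
    memcpy(payload_out, &in[3], len);
    return len;
}

Keeping the frames short is what keeps the overhead "low": when the FIFO does misbehave, you only ever retransmit a
small frame, not a whole transfer.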
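Example 2's lesson was, in effect, "respect the slower part's datasheet numbers". A sketch of the shape of the fix, with
hypothetical register names and an assumed timeout: instead of a fixed delay tuned on the faster processor, poll the
flash controller's busy flag, and treat a timeout as a failed write so the upgrade code can abort cleanly.

/* Sketch only; register names and addresses are hypothetical. */

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical memory-mapped flash controller registers */
#define FLASH_STATUS   (*(volatile uint32_t *)0x40020000u)
#define FLASH_BUSY_BIT (1u << 0)

/* Loop bound chosen to exceed the slower part's worst-case word-write
 * time (assumed value here; take the real number from its datasheet). */
#define FLASH_WRITE_TIMEOUT_LOOPS  200000u

/* Wait for the current write to finish.  Returns false on timeout so the
 * upgrade code can abort instead of continuing past a half-written word. */
bool flash_wait_write_done(void)
{
    for (uint32_t i = 0; i < FLASH_WRITE_TIMEOUT_LOOPS; i++) {
        if ((FLASH_STATUS & FLASH_BUSY_BIT) == 0)
            return true;
    }
    return false;   /* still busy: treat as a failed write */
}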
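And example 3's fix was a few lines of start-up code. A sketch, again with hypothetical register names and pin
assignments: drive every unused pin as an output so no input buffer is left floating and drawing current.

/* Sketch only; GPIO register names, addresses and pin masks are hypothetical. */

#include <stdint.h>

/* Hypothetical memory-mapped GPIO port registers */
#define GPIOA_DIR  (*(volatile uint32_t *)0x40010000u)  /* 1 = output   */
#define GPIOA_OUT  (*(volatile uint32_t *)0x40010004u)  /* output level */

/* Pins actually used by the design; everything else is "unused" */
#define GPIOA_USED_PINS  ((1u << 0) | (1u << 1) | (1u << 5))

/* Park all unused pins: drive them low, then make them outputs, so no
 * floating input buffer sits near its switching threshold burning current. */
void gpio_park_unused_pins(void)
{
    uint32_t unused = ~GPIOA_USED_PINS;

    GPIOA_OUT &= ~unused;   /* set unused pins low first...        */
    GPIOA_DIR |=  unused;   /* ...then switch them to outputs      */
}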
The h/w doesn't always work the way you think. If you find out early, it's fixable. How late in development
can you afford to find out?