I've been drawn to this thread on account of a gremlin (most definitely not a bug!) that I've been struggling to resolve over the past few weeks while tuning the temperature control algorithm of a rubidium oscillator's base plate to cope with sub-10-degree ambient temperatures.
I had finally figured out how to get it to cope with ambient temperatures to within a degree of its theoretical 33 deg maximum (a three degree delta between the base plate's 36.05 deg set point and maximum ambient), and now wanted to verify that it would still cope with 4 deg, as it had seemingly done during the winter test runs in my outside workshop.
Unfortunately, this is the wrong time of year to be using the outside workshop as a low temperature test chamber, so I had the bright idea of using my fridge as an 'all seasons' low temperature test chamber. The initial test runs exceeded my wildest dreams of millikelvin stability right down to 10 degrees. Sadly, those dreams were shattered just as the temperature dropped to 8 degrees, when the controller suddenly took a deep dive into endless cycles of under/overshooting, only managing to return to a stable set point once the temperature had risen back to 10 degrees.
At first I assumed I hadn't scaled back the corrective responses enough at low fan intake temperatures to avoid this instability when dealing with a thermal time constant in the region of 35 to 40 seconds. I had been under the impression that the ATmega328P used on the Nano and (in a larger package) on the UNO could do no wrong, and that it was therefore down to me for not properly considering the increased instability that comes with fan cooling as the sole means of temperature control at these low temperatures.
Either that, or I had simply reached the limits of what was possible and was now "Asking for The Moon". My observations suggested otherwise, though: the overshoot excursions were happening with the fan just barely ticking over, rather than being brought to a total standstill as in the initial versions of my program. Indeed, it was my curiosity over just how far I could raise this minimum tick-over drive level before it would prevent overshoot and start reducing the temperature ever so slightly under such low ambient temperature conditions that prompted the next test.
To this end, I simply overrode the fan control with a set of static fanDrive test values, trying to home in on the 'magic number', and it was when I had reached a value of 255 that I spotted a rapid temperature drop rather than the expected small variation in temperature. Sure enough, the adjacent values (254 and 256) gave the expected slight change in temperature drift rate, and going back to 255 to retest this peculiar behavior confirmed that, with pins D9 and D10 configured to output 10-bit PWM, that one specific 'gremlin value' would mimic the effect of a value of 1023.
Later testing with my breadboarded "Flight Spare", followed by writing a small test program to exercise the UNO's 10-bit PWM performance, confirmed this to be a shortcoming in these microcontrollers (or possibly some cockup in the Arduino's compiler or a built-in library function).
With this 'gremlin' appearing at the rather suspicious 100% value for the 8-bit case, I did begin to wonder whether it repeated itself just below the next two multiples of 256 (at 511 and 767). In this case I resorted to the age-old "Shoot first, ask questions later" remedy, using the simple workaround of testing fanDrive for a value of 255 and substituting 254 for this gremlin value on every such occurrence.
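For anyone wanting to try the same workaround, it boils down to something like the following minimal sketch. It is written as plain C++ so the logic can be checked off the board. Only fanDrive and the values 255, 254 and 1023 come from the description above; sanitizeFanDrive and the constant names are hypothetical, chosen purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; the numbers are those quoted in the post.
const std::uint16_t kPwmMax       = 1023;  // top of the 10-bit PWM range
const std::uint16_t kGremlinValue = 255;   // value observed to behave like 1023
const std::uint16_t kSubstitute   = 254;   // nearest neighbour that behaves normally

// Return a drive level that avoids the one troublesome value.
// Every other value in the 10-bit range is passed through unchanged.
std::uint16_t sanitizeFanDrive(std::uint16_t fanDrive) {
    assert(fanDrive <= kPwmMax);  // caller is expected to stay within 10 bits
    if (fanDrive == kGremlinValue) {
        return kSubstitute;
    }
    return fanDrive;
}
```

The idea is to pass every fanDrive value through this one function immediately before it is written to the PWM output, so that no code path can reach the output without the check.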
As later testing proved, 255 was the only gremlin value over the full 10-bit range, so the substitution proved an almost perfect solution, and a complete one once I had closed a backdoor I'd forgotten I'd left open in the display update code, through which the gremlin could still have its wicked way. More extreme low temperature testing had hinted at a shadow of its presence elsewhere in my code. It would have been a more obvious presence but for the fact that I'd added a test to override whatever fanDrive value was going to be used with the tick-over level of 215 whenever the base plate temperature undershot the set point by 25 mK or more.
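The undershoot override mentioned above can be sketched roughly as follows, again in plain C++ so it can be checked off the board. The 36.05 deg set point, the 25 mK threshold and the tick-over level of 215 are the figures quoted in this post; the function name, the constant names and the use of integer millidegrees are assumptions made for the sake of the example.

```cpp
#include <cstdint>

// Hypothetical names; temperatures are held as integer millidegrees
// (an assumption) so that the 25 mK comparison is exact.
const std::int32_t  kSetPointMilliC   = 36050;  // 36.05 deg base plate set point
const std::int32_t  kUndershootMilliC = 25;     // 25 mK undershoot threshold
const std::uint16_t kTickoverDrive    = 215;    // minimum tick-over fan drive

// If the base plate has undershot the set point by 25 mK or more,
// ignore the requested drive level and fall back to tick-over.
std::uint16_t applyUndershootOverride(std::int32_t basePlateMilliC,
                                      std::uint16_t fanDrive) {
    if (kSetPointMilliC - basePlateMilliC >= kUndershootMilliC) {
        return kTickoverDrive;
    }
    return fanDrive;
}
```

Because an override like this replaces any requested value once the plate is cold enough, it also hides a stray 255 in that region, which is why the gremlin showed up only as a shadow.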
The whole point of the above description was to highlight the fact that not all bugs can be analysed using debugging tools (or even logic analysers, for that matter). More accurately put, debugging tools can't handle gremlins. In such cases, as a few in this thread have already pointed out, careful analysis of the code and of the behavior of the hardware under its control may be the only way to resolve them.
If anyone is interested in getting to the bottom of this particular gremlin, I've attached the gremlin test code I ran on the UNO (it should work just the same on a Nano). I'd be interested to know whether it's just clones that suffer from this 10-bit PWM gremlin or whether it afflicts genuine Arduino boards as well.
Visit <https://nerdytechy.com/how-to-change-the-pwm-frequency-of-arduino/#Pins_D9_and_D10_Timer_1_10_bit> for the original reference I used to set the Timer 1 (D9/D10) PWM to 10-bit mode.