Tired and worn out. Hunted down a nasty issue where a Docker container would hang on an HP server but work fine on a Cisco.
Obviously this was a TensorFlow model, it was a production system, and I had 800 of those containers.
We spent several hours trying to figure out the issue before a colleague finally had the idea to check which vector extensions the CPUs supported.
And, alas, the HP only supported AVX, whereas the Cisco, being of a relatively new breed, supported AVX2. Further digging showed that my first Cisco server had a Haswell-type CPU (2680v3 or so), whereas the HPs had 4650s (4 apiece, plus 1 TB of main memory each).
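For anyone who wants to run the same check before deploying, here is a rough sketch of how it could be scripted. It assumes an x86 Linux host where /proc/cpuinfo has a "flags" line; the function names and the list of required extensions are made up for illustration and are not what we actually ran.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux lists the features on a line starting with "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_extensions(cpuinfo_text, required=("avx", "avx2")):
    """Return the required extensions (hypothetical list) the CPU lacks."""
    flags = cpu_flags(cpuinfo_text)
    return [ext for ext in required if ext not in flags]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_extensions(cpuinfo.read_text())
        print("missing extensions:", ", ".join(missing) or "none")
    else:
        print("no /proc/cpuinfo here, cannot check")
```

Since a container sees the host's CPU, running something like this inside the image on each server type should show the mismatch right away, instead of after several hours of staring at a hung process.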
This hurts: the HPs have become obsolete overnight and will most likely be scrapped.
Got funding for 2 new ones though, and as the system scales much better than in the old days, I will get 2x96 cores with 512 GB of memory to replace them.