Author Topic: Pulling my hair out. Circuit boards stop working once shipped to client and more  (Read 12332 times)

Jackster (Topic starter)

Looking at the clues posted in this thread, I'm almost 99.9% positive there's more to your problems than just the PCB mishap. Although it's possible such "almost shorted" pads could give intermittent operation, it's unlikely it happens in multiple units. You have so many unexplained incidents of it failing, working again, then failing again.

If you want to become a professional design engineer, do yourself a favor and as soon as you have a moment of silence, don't go on to design more features, or a more advanced product, but instead, try to do a proper root cause analysis. Instead of just building products, try to build a process/a "factory" where you can robustly build these products without wasting a lot of time.

You seem to have many issues, some are likely correlated, some are not.

In a stressful situation, we tend to fall back into trying to just get things to work by whatever means. Like can't get the MCU flashed? It's not a total showstopper, swap the board and go on. But in the long run, solving the problem once and for all would pay back in time used, and, it could turn out it's connected to your other issues, so they would be solved as well.

When I was looking at this comment of yours:

"They just develop a fault where the software no longer cycles. This can happen on new boards too. It will go through the code 3-6 times and then hang. "

I thought, you are very lucky. You have a lot of specimens on your hands that do fail, and they fail consistently. You don't need to operate a well-performing product for weeks to see a failure. If I understand correctly, you have at least one (1) unit in your hands with which you can demonstrate a failure within minutes or hours. That's great.

It doesn't matter what the fault is or what you think might be causing it. Given this particular failure you can demonstrate, go for a full-blown root cause analysis and see what you find.

You just need to make your steps smaller, and lower level. Whenever you hit a wall of not knowing how to do it, Google it, learn it.

I don't personally use debuggers a lot, but this could be a case where one would give you a starting point. If you don't have one, just make your code turn an LED on/off at different points in the code. After a few iterations of moving the LED toggle around, you will have found the exact place in the code where it hangs.
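A rough sketch of that LED-checkpoint idea, written so it compiles on a PC. Everything named here is hypothetical: `debug_port` stands in for whatever GPIO output register the MCU actually has, the three stage functions are placeholders for the real loop body, and it assumes four spare pins/LEDs so the checkpoint number can be shown directly. With only one LED, keep a single on/off call and move it between builds as described above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a GPIO output register. On real hardware this
   would be the MCU's own port register driving the debug LEDs. */
static volatile uint8_t debug_port = 0;

/* Record the last point the code reached. If the firmware hangs, the LEDs
   keep showing the number of the last checkpoint that was passed. */
static void checkpoint(uint8_t id)
{
    assert(id < 16);   /* assumption: 4 LEDs, so only values 0..15 fit */
    debug_port = id;
}

/* Placeholder stages of one pass through the main loop (all hypothetical). */
static int read_sensor(void) { return 42; }
static int process(int raw) { return raw * 2; }
static void send_result(int value) { printf("result=%d\n", value); }

/* One cycle with a checkpoint before each stage and one at the end.
   Returns the last checkpoint reached. */
static uint8_t run_cycle(void)
{
    checkpoint(1);
    int raw = read_sensor();

    checkpoint(2);
    int value = process(raw);

    checkpoint(3);
    send_result(value);

    checkpoint(4);
    return debug_port;
}
```

If the LEDs freeze showing 2, for example, the hang is somewhere inside the stage that follows checkpoint 2, and the next step would be to add finer checkpoints inside that stage.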

If your MCU isn't flashing, look at the communication signals with an oscilloscope, decode the contents. It may take several hours, but then you know exactly where it hangs. Chances are, you find some analog signaling issue (stuck logic level, bad rise/fall time)... in two seconds after looking at the scope screen.

Get yourself the basic tools, a 50MHz 2-channel digital storage oscilloscope being a bare minimum to debug such a design. A $400 4-channel Rigol or similar is more than enough, but I'm sure you can get an older generation thing used for maybe $100.

Thanks for the info.

We worked out that the major flashing issue was down to the bad boards, and the occasional issues were down to the header pins I was using not making a repeatable connection across many units.

I hopefully have fixed this with the latest revision that uses a pogo pin style cable to do the programming rather than just some 2.54mm headers stuck into the PCB.


As for the software stopping: my friend worked out that there was an overflow due to the tight timings. Slight differences between the boards could explain why most were fine but some failed after a while.

