Author Topic: Pulling my hair out. Circuit boards stop working once shipped to client and more


Offline Jackster (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: gb
    • PCBA.UK
So I designed a small product a few years ago and it has been selling well.
But around 5-10% of the units that get shipped end up not working once received by the end user.

The product is pretty simple and only requires power to work.
But a small % just don't work once they are in the hands of my clients.

All they have to do is plug in power, which is either via a battery (a standard used in the industry) or via USB.

I did the whole PCB design, case CAD and offloaded the software to a freelancer.
In house (literally my house) this all works fine and passes all our QA testing.



Another problem I have just had is that I needed more PCBs the other month.

I wanted to improve one thing: when installing the PCB, this corner was getting in the way a tad.
So I chopped off this corner in PCB design software and re-poured the ground plane.

And once the PCBs arrived, none of them would allow me to burn the boot loader onto the ATMEGA via the programming header.
Had to pull all the ATMEGAs off and flash them in a socket.
Put them back on and 90% of the boards I made just don't work as expected.
This is after P&P and hand soldering over £300 worth of components >.<


I just don't know what to do any more.

Would love to just hand off the PCB design to someone else but I don't have enough money to fund that right now.
I don't understand why one minute the thing works but then dies as soon as it arrives at the end user.

Any advice on what I should do?



[edit]
For clarification as I know I have not added a lot of details here.



Any boards that have failed have been replaced with working boards. Full warranty was provided.

The two issues above are not the same issue/related.
 I have done 3 batches of boards.
 Batch 1: out of a total of 50 boards, 2-3 failed and were replaced.
 Batch 2: 3-4 have failed and were replaced. One has failed again; we are investigating.
 Batch 3 (with a slight PCB change): all but a handful have failed to work as expected. None have shipped.

Boards that arrived back here were examined but I don't have the tools or knowledge to go deep into scoping pins or anything like that.

The flashing of the boot loader is only done once, which is why we can flash them on a socket and then P&P.
The USB interface is used to flash firmware and updates.
I am aware that there might be underlying issues with batch 3, which is why we are not shipping any of these boards.

My clients are aware that this is a project being done out of my garage and that I am not a "pro" at this.




Offline Psi

  • Super Contributor
  • ***
  • Posts: 10151
  • Country: nz
Note: I am adding to this list as I think of things, so you may want to re-read it in case I have added more since you last read it.

Questions
- Have you got any units back that a client says don't work? If so, do they work for you or are they dead?
- Is your QC test automated? If so are you sure it is not faulty and letting dead units out the door? Maybe a manual test is needed so you know for sure all units leaving you are 100% working.
- Is there anything unusual about your location or the places you ship to? Some places irradiate all their mail with high-energy X-rays and this can destroy electronics.
- Does the MCU/software system interface with or talk to something that differs between parts of the world? Here are examples of what I mean: maybe Bluetooth to a phone, or maybe comms to a desktop PC app. Maybe some people have their phone/desktop set to a different timezone/country/language and maybe your app is incompatible with this.
- Do you have the source code for your ATmega or does the freelancer hold that?

A few possibilities I can think of:
- Are users connecting the battery the wrong way around and damaging the device? (9V the wrong way around for a split second, that sort of thing.)
- Do you have any floating input pins on the MCU? Maybe the code only works if an input is read as either low or high but it keeps changing with ambient noise. When built it might stay in one state, but in noisy environments maybe it floats high and stops the code running. (Floating inputs should have MCU pull-ups enabled in software, but maybe they are not set in your code?)
- Have you tried powering the device from 4.5V with, let's say, a 50mA current limit? Not all USB ports are created equal. Maybe your product is quite critical on power and not all USB ports can power it.
- Where are you getting your parts from? Maybe you are getting lots of fake ICs.
- Could be a PCB track routing issue where tracks run too close to a hole or board edge and sometimes get cut by the drill/router. Some PCBs work, some don't, some are intermittent.
- Are you sure you have the ATmega fuse bits set correctly? Maybe the startup delay, brownout detector or crystal settings are wrong and this is making it run intermittently.
- Does the product have protection from ESD or PSU spikes, like a TVS? Does the product get used in a location where it might need this, e.g. automotive/industrial?
- How are you programming the MCUs? I once had a crappy USBASP programmer that would brick 2 out of 5 AVRs it flashed. Not sure why, maybe the clock was out of spec and it kept erasing fuse bits.
- Does your MCU programming system include a verify check?
- There is one AVR MCU, can't remember which, that comes with a fuse bit set to put it into a compatibility mode where it pretends to be a different AVR chip. Some of the IO/peripherals don't work until you get it out of that mode.   :palm:  (My guess is they have a supply agreement to sell a compatible chip for 25 years for MIL/MED/AERO.) EDIT: All ATmega128 parts pretend to be an ATmega103 until you change the M103C fuse bit.
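
For the fuse bit question in the list above, here is a rough sketch of how the three fuse bytes read back from the chip could be decoded and eyeballed. It assumes the part is an ATmega328P (the chip named elsewhere in this thread) and uses the bit layout from that datasheet, where a programmed fuse reads as 0. The example values 0xFF/0xDE/0xFD are the commonly used Arduino Uno bootloader settings, which is an assumption and not necessarily what these boards use. The function name is made up for illustration.

```python
# Sketch only: decode ATmega328P fuse bytes (assumed part).
# On AVR a *programmed* fuse bit reads back as 0.

BODLEVEL_328P = {
    0b111: "disabled",
    0b110: "1.8 V",
    0b101: "2.7 V",
    0b100: "4.3 V",
}


def is_programmed(byte: int, bit: int) -> bool:
    """Return True when the given fuse bit is programmed (reads 0)."""
    return ((byte >> bit) & 1) == 0


def decode_atmega328p_fuses(low: int, high: int, ext: int) -> dict:
    """Hypothetical helper: pull out the fields most likely to matter here."""
    return {
        "CKDIV8": is_programmed(low, 7),     # divide clock by 8
        "CKOUT": is_programmed(low, 6),      # clock output on pin
        "SUT": (low >> 4) & 0b11,            # start-up time selection
        "CKSEL": low & 0b1111,               # clock source selection
        "RSTDISBL": is_programmed(high, 7),  # reset pin disabled
        "SPIEN": is_programmed(high, 5),     # serial (ISP) programming enabled
        "BOOTRST": is_programmed(high, 0),   # reset vector points at bootloader
        "BOD": BODLEVEL_328P.get(ext & 0b111, "reserved"),
    }


if __name__ == "__main__":
    # Assumed example values (typical Arduino Uno bootloader fuses).
    for name, value in decode_atmega328p_fuses(0xFF, 0xDE, 0xFD).items():
        print(f"{name}: {value}")
```

Comparing the decoded fields between a board that works and one that doesn't is a quick way to rule the fuses in or out.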

And once the PCBs arrived, none of them would allow me to burn the boot loader onto the ATMEGA via the programming header.
Had to pull all the ATMEGAs off and flash them in a socket.
Put them back on and 90% of the boards I made just don't work as expected.

This makes me lean towards a PCB/SCH issue.
Are you using the MISO, MOSI and SCK pins for anything other than programming?
You can use them for other things too, but you need to make sure you don't load the lines so much that programming is affected.
It can become intermittent if you load them or have caps on the line to gnd.

Also, grab one of those boards that doesn't program and use a DMM to check the tracks between the programming header pins (GND, VCC, MOSI, MISO, SCK, RESET) and the ATmega pads for those pins. Also check that none are shorted together.



Help
- Are your PCB files in Altium? If so, I'm happy to take a look at your SCH/PCB/CODE and see if I can spot any potential problems.


« Last Edit: June 01, 2019, 12:16:23 pm by Psi »
Greek letter 'Psi' (not Pounds per Square Inch)
 
The following users thanked this post: mcinque, Joarek

Offline vk6zgo

  • Super Contributor
  • ***
  • Posts: 7671
  • Country: au
I remember many years ago, a separate State branch of the organisation I worked for were tasked with making boards that would automatically ring various phone numbers  when required.

We duly received our portion of those devices, but unfortunately they didn't work.
When we complained to the other State, they protested:
 "But we tested them & they all rang up who they were supposed to!"

Yup! They dutifully programmed the whole number needed to call those sites from their State into the PROMs.
Those additional numbers, of course, weren't needed in the State they were intended for, & would "freak the exchange out".
 

Offline iMo

  • Super Contributor
  • ***
  • Posts: 4989
  • Country: cv
Hard to help without details. Anyhow, the fact that 5-10% of the units that get shipped end up not working is an indication that there is something wrong with the product or the processes around it, and the manufacturer should have stopped shipping such a product.
« Last Edit: June 01, 2019, 12:13:37 pm by imo »
 

Offline dmills

  • Super Contributor
  • ***
  • Posts: 2093
  • Country: gb
What does your power input stage look like? Ceramic cap and LDO by any chance?

If so, try plugging in the power with the supply already switched on and with reasonably long power leads. You might be blowing the regulator due to ringing in the LC circuit formed by the power cable and the input cap. (The cure is a jellybean electrolytic about 10 times the value of the input ceramic in parallel with it; the ESR damps the ringing.)
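
To put rough numbers on the hot-plug ringing described above, here is a back-of-envelope sketch for an ideal series LC circuit. The lead inductance (1 µH), the input ceramic (10 µF) and the 16 V supply are assumed example values, not measurements from this board, and the function name is made up. Real ceramics lose capacitance under DC bias, so treat the output as an order-of-magnitude guide only.

```python
import math


def lc_ringing(inductance_h: float, capacitance_f: float, supply_v: float) -> dict:
    """Ideal series-LC estimate for a hot-plugged supply (sketch, no losses)."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))
    z0 = math.sqrt(inductance_h / capacitance_f)
    return {
        "f0_hz": f0,                         # ringing frequency
        "z0_ohm": z0,                        # characteristic impedance
        "critical_r_ohm": 2.0 * z0,          # series R for critical damping
        "worst_case_peak_v": 2.0 * supply_v  # undamped step overshoot limit
    }


if __name__ == "__main__":
    # Assumed values: ~1 uH of lead inductance, 10 uF ceramic, 16 V battery.
    for key, value in lc_ringing(1e-6, 10e-6, 16.0).items():
        print(f"{key}: {value:.4g}")
```

The point of the electrolytic is that its ESR is in the same ballpark as the critical damping resistance, whereas a ceramic on its own has almost none.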

Are ALL of your external IO lines fitted with some form of ESD protection?

No floating inputs on the micro?
Is everything run well within datasheet ratings?

I had an issue with a production board once where we suddenly started getting a very high failure rate. It turned out the SPI bus was being driven on the wrong clock edge and the first batch of the peripheral chips just happened to work with zero hold time, the next ones not so much.

You need to get a few duds back and investigate.

Regards, Dan.
 

Offline Psi

  • Super Contributor
  • ***
  • Posts: 10151
  • Country: nz
turned out the spi bus was being driven on the wrong clock edge and the first batch of the peripheral chips just happened to work with zero hold time, the next ones not so much.

hehe
 

Offline AndyC_772

  • Super Contributor
  • ***
  • Posts: 4263
  • Country: gb
  • Professional design engineer
    • Cawte Engineering | Reliable Electronics
It sounds like you've got two problems here.

1) Difficulty programming the MCU, for reasons unknown

2) Customers reporting that products don't work, when they passed your own in-house testing

It's possible these are related, but not necessarily. For now, I'd treat them as separate issues, and if fixing one leads to a solution for the other, that's a bonus. I'd also concentrate on exactly one example of each problem; don't get bogged down with the fact that there are many boards involved, even if their symptoms aren't identical, they may all have the same root cause.

Pick one board that doesn't program, check carefully all the signals required to program it, and find the definitive root cause for why that specific board doesn't work. There really aren't many things it can be; power supplies, clocks and timing, logic thresholds are about it.

Also, make sure you get back from customers at least some of the boards which are reported as faulty. Test them the exact same way you test boards prior to shipment, and see if they now fail. This tells you whether the difference is that the boards have stopped working, or if they have an inherent flaw which your testing has failed to pick up.

This sort of forensic fault-finding and testing is one of the things I do for a living. Feel free to PM me if you need some more detailed, specific advice.

Online wraper

  • Supporter
  • ****
  • Posts: 17420
  • Country: lv
And once the PCBs arrived, none of them would allow me to burn the boot loader onto the ATMEGA via the programming header.
Had to pull all the ATMEGAs off and flash them in a socket.
Put them back on and 90% of the boards I made just don't work as expected.
This is after P&P and hand soldering over £300 worth of components >.<
First of all, you just brute-forced a workaround instead of solving the actual problem. You should find the root cause of the problem and then take measures to fix it. The same goes for the boards delivered DOA. You should get some of them back from the customers and find out exactly what causes them not to work.
Quote
I just don't know what to do any more.

You apparently did not even try to find the problem but already don't know what to do next.  |O
 
The following users thanked this post: Siwastaja, ogden, Ysjoelfir

Online mariush

  • Super Contributor
  • ***
  • Posts: 5119
  • Country: ro
  • .
Maybe the flux you're using needs to be cleaned off the boards, otherwise it may be causing some resistance or short circuits?
Maybe you're accidentally joining two pins of your microcontrollers during soldering?

Do you have screw holes near traces? maybe you're shorting traces with screws or breaking them with friction over traces?

If usb powered... are you assuming you're getting clean 5v? Maybe the guys at the other end have too long usb leads causing voltage drop, or maybe they have stupid unregulated phone charger style usb things pumping 5.5-6v in your boards?
Inductance on the long usb cable causing voltage spikes? Not enough capacitance on input and output of regulators that could damage the regulators or cause them to reset/glitch?  Bad output capacitors on regulators?

Where do they install these products? are there powerful magnets or some induction things or something that could be picked up by your circuit and affect it


maybe share at the very least a picture of the assembled board ... if it's too much of a secret to show a schematic or something more complex
 

Offline typematrix

  • Contributor
  • Posts: 24
  • Country: ie
    • Github
Hi

Have you gotten boards back from customer?
i.e. customer returns.

If you get a customer return board back on your bench does it work?
« Last Edit: June 01, 2019, 04:28:08 pm by typematrix »
 
The following users thanked this post: ebastler

Offline Jackster (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: gb
    • PCBA.UK
Note: I am adding to this list as I think of things, so you may want to re-read it in case I have added more since you last read it.

Questions
- Have you got any units back that a client says don't work? If so, do they work for you or are they dead?
- Is your QC test automated? If so are you sure it is not faulty and letting dead units out the door? Maybe a manual test is needed so you know for sure all units leaving you are 100% working.
- Is there anything unusual about your location or the places you ship to? Some places irradiate all their mail with high-energy X-rays and this can destroy electronics.
- Does the MCU/software system interface with or talk to something that differs between parts of the world? Here are examples of what I mean: maybe Bluetooth to a phone, or maybe comms to a desktop PC app. Maybe some people have their phone/desktop set to a different timezone/country/language and maybe your app is incompatible with this.
- Do you have the source code for your ATmega or does the freelancer hold that?

A few possibilities I can think of:
- Are users connecting the battery the wrong way around and damaging the device? (9V the wrong way around for a split second, that sort of thing.)
- Do you have any floating input pins on the MCU? Maybe the code only works if an input is read as either low or high but it keeps changing with ambient noise. When built it might stay in one state, but in noisy environments maybe it floats high and stops the code running. (Floating inputs should have MCU pull-ups enabled in software, but maybe they are not set in your code?)
- Have you tried powering the device from 4.5V with, let's say, a 50mA current limit? Not all USB ports are created equal. Maybe your product is quite critical on power and not all USB ports can power it.
- Where are you getting your parts from? Maybe you are getting lots of fake ICs.
- Could be a PCB track routing issue where tracks run too close to a hole or board edge and sometimes get cut by the drill/router. Some PCBs work, some don't, some are intermittent.
- Are you sure you have the ATmega fuse bits set correctly? Maybe the startup delay, brownout detector or crystal settings are wrong and this is making it run intermittently.
- Does the product have protection from ESD or PSU spikes, like a TVS? Does the product get used in a location where it might need this, e.g. automotive/industrial?
- How are you programming the MCUs? I once had a crappy USBASP programmer that would brick 2 out of 5 AVRs it flashed. Not sure why, maybe the clock was out of spec and it kept erasing fuse bits.
- Does your MCU programming system include a verify check?
- There is one AVR MCU, can't remember which, that comes with a fuse bit set to put it into a compatibility mode where it pretends to be a different AVR chip. Some of the IO/peripherals don't work until you get it out of that mode.   :palm:  (My guess is they have a supply agreement to sell a compatible chip for 25 years for MIL/MED/AERO.) EDIT: All ATmega128 parts pretend to be an ATmega103 until you change the M103C fuse bit.

And once the PCBs arrived, none of them would allow me to burn the boot loader onto the ATMEGA via the programming header.
Had to pull all the ATMEGAs off and flash them in a socket.
Put them back on and 90% of the boards I made just don't work as expected.

This makes me lean towards a PCB/SCH issue.
Are you using the MISO, MOSI and SCK pins for anything other than programming?
You can use them for other things too, but you need to make sure you don't load the lines so much that programming is affected.
It can become intermittent if you load them or have caps on the line to gnd.

Also, grab one of those boards that doesn't program and use a DMM to check the tracks between the programming header pins (GND, VCC, MOSI, MISO, SCK, RESET) and the ATmega pads for those pins. Also check that none are shorted together.



Help
- Are your PCB files in Altium? If so, I'm happy to take a look at your SCH/PCB/CODE and see if I can spot any potential problems.


Thanks for the long reply and the offer to check boards. I'll export them and DM you the files.

I have the boards that have failed from when clients used them. More than half worked as expected. Some did fail testing.
QC is done by hand. I run the boards on a test program for 24 hours. We also test after this that the PWM input, switches and firmware all work.
We ship from the UK to all over. I'll need to dig into the locations that we have shipped to and have had units fail, but France and Sweden have had boards fail. The boards are encased in an aluminium block with a >1.5mm wall all around and a few cutouts for buttons and display.
The device has a PWM input from a sensor and displays it on some 7-segment displays. It has an NRF24L01 that talks to another one of the same device that I make.
I have source code.


I have a diode protecting reverse polarity though I don't know if I did it the correct way. But the power is only able to be plugged in one way using a LEMO socket and cable, which we provide.
We have 1 pin that is pulled to ground if it is a receiver and not a transmitter.
USB is only for updates, though the device works using it. The actual power source is 16v batteries and I use an off the shelf power regulator board to take 13-20v down to 12v, which then gets taken down to 5v with a normal REG.

This might be a good point: all but a few of the new boards I have made up from batch 3 have used parts from LCSC. I will check this tonight.

I have used the default Circuit Maker design rules. But the outside edge I made quite far in. At least 1-2mm.
Not sure.
This product gets used all over, but most boards that have failed arrived at the user already bad. My hunch was X-ray but I have nothing to back that up with.
I use an Arduino to load the boot loader onto the ATMEGA and then an FTDI for flashing firmware.
AVRdude checks and we make sure it confirms all is good.
Not sure about the ATMega328p doing this or not.



I do use those programming pins for the NRF24L01.
I'll send you the Altium files later but here is top and bottom.
Board on left is batch 1 and 2. Board on right is batch 3.





Offline vk6zgo

  • Super Contributor
  • ***
  • Posts: 7671
  • Country: au
What does your power input stage look like? Ceramic cap and LDO by any chance?

If so, try plugging in the power with the supply already switched on, and with reasonably long power leads, you might be blowing the regulator due to ringing in the LC circuit formed by the power cable and the input cap (Cure is a jellybean electrolytic about 10 times the value of the input ceramic in parallel with it, the ESR damps the ringing).

Are ALL of your external IO lines fitted with some form of ESD protection?

No floating inputs on the micro?
Is everything run well within datasheet ratings?

I had an issue with a production board once where we suddenly started getting a very high failure rate, turned out the spi bus was being driven on the wrong clock edge and the first batch of the peripheral chips just happened to work with zero hold time, the next ones not so much.

You need to get a few duds back and investigate.

Regards, Dan.

Reminds me of the Transmitter remote control system we had in to modify.
While doing so, I mislaid one of a number of monostables which were used in it.
No worry,-- plenty in stock!
Fitted a new one-- damn thing wouldn't work!

It turns  out that with that particular IC, people had been having problems getting stable operation at short "on" times.
The manufacturer's answer was to split the line into two devices, one with the original part number, which they redesigned to be optimised for short "on" times, sacrificing its performance (which had been perfectly satisfactory) for long "on" times.

For long "on" times, they produced a new device & type number

Not knowing this, we got caught out when the thing didn't work.
In the end, we had to get a stock of the new devices in, replace them in both remote control systems in use, & change the documentation.

I wonder how many "young (& old) players" got caught by that, & "tore their hair out" over the years?
 

Offline Jackster (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: gb
    • PCBA.UK
Maybe the flux you're using needs to be cleaned off the boards, otherwise it may be causing some resistance or short circuits?
Maybe you're accidentally joining two pins of your microcontrollers during soldering?

Do you have screw holes near traces? maybe you're shorting traces with screws or breaking them with friction over traces?

If usb powered... are you assuming you're getting clean 5v? Maybe the guys at the other end have too long usb leads causing voltage drop, or maybe they have stupid unregulated phone charger style usb things pumping 5.5-6v in your boards?
Inductance on the long usb cable causing voltage spikes? Not enough capacitance on input and output of regulators that could damage the regulators or cause them to reset/glitch?  Bad output capacitors on regulators?

Where do they install these products? are there powerful magnets or some induction things or something that could be picked up by your circuit and affect it


maybe share at the very least a picture of the assembled board ... if it's too much of a secret to show a schematic or something more complex

I can check the flux idea later.

No screws near traces. Only ground planes.

Not USB powered but can be powered via USB. USB only really for firmware updates.

The product is not installed near anything like that. Just near other electronics. But issues have been happening before being used.









Hard to help without details. Anyhow, the fact the 5-10% that get shipped end up not working is an indication there is something wrong with the product or processes around and the manufacturer should have stopped shipping such a product.

Looking at it, it is less than 5%. Total of around 100 boards, only 5-6 have failed. One just needed a repair after user error.
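
As a sanity check on what a figure like "5-6 out of around 100" can actually tell you, here is a small sketch that puts a 95% Wilson score interval around the observed failure rate. The counts are the approximate ones quoted above; the helper name is made up, and this is a rough illustration rather than a proper reliability analysis.

```python
import math


def wilson_interval(failures: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if total <= 0 or not 0 <= failures <= total:
        raise ValueError("need 0 <= failures <= total and total > 0")
    p = failures / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Approximate counts from the post above: 5-6 failures in ~100 boards.
    for failures in (5, 6):
        low, high = wilson_interval(failures, 100)
        print(f"{failures}/100 failed: {low:.1%} to {high:.1%}")
```

With only about 100 units the interval is wide, so "less than 5%" and "5-10%" are both consistent with the same data.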




What does your power input stage look like? Ceramic cap and LDO by any chance?

If so, try plugging in the power with the supply already switched on, and with reasonably long power leads, you might be blowing the regulator due to ringing in the LC circuit formed by the power cable and the input cap (Cure is a jellybean electrolytic about 10 times the value of the input ceramic in parallel with it, the ESR damps the ringing).

Are ALL of your external IO lines fitted with some form of ESD protection?

No floating inputs on the micro?
Is everything run well within datasheet ratings?

I had an issue with a production board once where we suddenly started getting a very high failure rate, turned out the spi bus was being driven on the wrong clock edge and the first batch of the peripheral chips just happened to work with zero hold time, the next ones not so much.

You need to get a few duds back and investigate.

Regards, Dan.

Power regulation is a bit of a hack tbh
I am using a drone regulator board to take 12-20v input down to 12v, I then have a 5v REG for the IC.
I did this for a few reasons, the main one being heat. I found that the LDO and other REG SMD packages got too hot with upwards of 20v.
The other being height. I was able to get these drone power regulators on a PCB less than 4mm high, which was important.
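
The heat problem described above follows from the fact that a linear regulator burns off (Vin - Vout) x Iload as heat. Here is a quick sketch of that arithmetic. The 150 mA load current is an assumed example value (the real draw of the displays, MCU and radio is not stated in this thread), and the function name is made up.

```python
def linear_reg_dissipation_w(v_in: float, v_out: float, load_a: float) -> float:
    """Heat dissipated in a linear regulator: (Vin - Vout) * Iload."""
    if v_in < v_out:
        raise ValueError("a linear regulator cannot step up")
    return (v_in - v_out) * load_a


if __name__ == "__main__":
    load_a = 0.15  # assumed load current, not a measured figure
    for v_in in (20.0, 16.0, 12.0):
        watts = linear_reg_dissipation_w(v_in, 5.0, load_a)
        print(f"{v_in:.0f} V -> 5 V at {load_a * 1000:.0f} mA: {watts:.2f} W")
```

Dropping to 12v first roughly halves the heat in the 5v regulator compared with feeding it 20v directly, which fits with the small SMD packages running hot.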

Offline OwO

  • Super Contributor
  • ***
  • Posts: 1250
  • Country: cn
  • RF Engineer.
The first thing you really have to do before considering anything else is to triage the failures and get to the bottom of it: exactly which part failed and in what way? Otherwise all we can do is speculate and brainstorm 1000s of unrelated and probably irrelevant possibilities.
Email: OwOwOwOwO123@outlook.com
 

Offline OwO

  • Super Contributor
  • ***
  • Posts: 1250
  • Country: cn
  • RF Engineer.
You did mention half of the reported failed boards then subsequently passed testing on your end; in these cases what did the customer observe? Wireless comms not working? device not responding to any input and appearing "dead"? Is each customer different or is there a pattern of a type of failure? We still need far far more info before we can be helpful.

I have the boards that have failed from when clients used them. More than half worked as expected. Some did fail testing.
Action items right now: debug the boards that do fail testing, try to find the exact root cause, record all observations. Dig out all failure reports with devices that subsequently DID pass tests, record them somewhere and look for patterns. Ask customers for more info if necessary.
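
One way to keep those records so that patterns show up is a simple tally by symptom and batch. Below is a minimal sketch. The field names and the example records are placeholders made up for illustration, not real data from this product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureReport:
    serial: str             # hypothetical serial number
    batch: int              # production batch the board came from
    symptom: str            # what the customer observed
    failed_on_retest: bool  # did it also fail the in-house test when returned?


def summarise(reports: list) -> dict:
    """Count reports by symptom and batch, and count no-fault-found returns."""
    return {
        "by_symptom": Counter(r.symptom for r in reports),
        "by_batch": Counter(r.batch for r in reports),
        "no_fault_found": sum(1 for r in reports if not r.failed_on_retest),
    }


if __name__ == "__main__":
    # Placeholder records only, to show the shape of the log.
    example = [
        FailureReport("A001", 1, "dead on arrival", True),
        FailureReport("A002", 2, "no radio link", False),
        FailureReport("A003", 2, "dead on arrival", False),
    ]
    summary = summarise(example)
    print(summary["by_symptom"].most_common())
    print(summary["by_batch"].most_common())
    print("no fault found:", summary["no_fault_found"])
```

The no-fault-found count is worth tracking separately, since boards that pass on the bench point at the customer's setup or the test coverage rather than at dead hardware.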
« Last Edit: June 01, 2019, 05:16:03 pm by OwO »
 
The following users thanked this post: Jackster, mcinque

Offline mcinque

  • Supporter
  • ****
  • Posts: 1129
  • Country: it
  • I know that I know nothing
First thing you really have to do before considering anything else is to triage the failures and get to the bottom of it

Quote
Action items right now: debug the boards that do fail testing, try to find the exact root cause, record all observations.
Exactly. Investigation of dead boards is essential. There are too many causes and considerations to find the problem without proper debugging. Absolutely analyze where those boards failed and report your findings, possibly together with a schematic.
 

Offline Psi

  • Super Contributor
  • ***
  • Posts: 10151
  • Country: nz
I have the boards that have failed from when clients used them. More than half worked as expected. Some did fail testing.
QC is done by hand. I run the boards on a test program for 24 hours. We also test after this that the PWM input, switches and firmware all work.
...
The boards are encased in an aluminium block with >1.5mm wall around with a few cutouts for buttons and display.

Do you by any chance put screws into this aluminum case? If so are they pre-threaded?
Thread forming screws (or just tight screws) into aluminium creates lots of metal filings!
The metal flakes may not cause a problem initially but after being shaken around in transport maybe they get all over the place and short IC pins.
Test: Put down a clean sheet of copy paper on a desk. Grab a finished unit and carefully open it up on the paper. Tap out the board & case and see if any metal flakes come out. The paper will give a good contrast to make them easy to see.

This product gets used all over, but most boards that have failed, arrived to the user bad. My hunch was xray but I have nothing to back that up with.
Normal airport X-ray is totally fine, that won't do anything. Only the high-power X-rays used to sterilize mail are a concern. Those are usually found at government buildings.

I use an Arduino to load the boot loader onto the ATMEGA and then a FTDI for flashing firmware.
AVRdude checks and we make sure it confirms all is good.
Not sure about the ATMega328p doing this or not.
FYI - There's a new chip out, the ATmega328PB, which is not the same as an ATmega328P.
It's easy to think 'oh that's just the lead version' but no, it's a different chip with some different pinouts.

I do use those programming pins for the NRF24L01.
Right, so the ATmega328 SPI pins are used for flash programming of the MCU and also for talking to the NRF24L01 chip over SPI?
How are you handling the reset line on the ATmega328? Is it pulled high externally? Is it connected to an external button or something?
I just wonder if it's possible for the ATmega to go into the reset state for some reason while comms to the NRF24L01 are active and somehow get garbage sent to the ATmega while it's in the reset-low state (program mode).
I'm not sure this is actually possible, because there should be no SPI clock once the MCU goes into reset.
I'm just thinking out loud. Maybe someone else will have a thought reading this.
 

Offline Psi

  • Super Contributor
  • ***
  • Posts: 10151
  • Country: nz
Have a look at these 2 areas on dead PCBs.
There may be issues where the track has broken or shorted etc.

« Last Edit: June 02, 2019, 01:33:15 am by Psi »
 

Offline Psi

  • Super Contributor
  • ***
  • Posts: 10151
  • Country: nz
How are you handling merging of the USB Vbus power onto the 5V from the voltage regulator output?
Normally you would diode OR the two sources, but from the pcb layout it looks more like connecting 5V usb to reg output?

Voltage regulators do not like a higher voltage on their output than their input. They tend to die.
That can happen if you connect 5V from USB onto the output of a 5V reg and then remove the battery that's powering the input!

I could see you doing all your QC tests with a battery always connected, but a user connecting USB first because they have a shiny new toy and can't wait to plug it in before they can source a battery.
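
To make the back-feed scenario above concrete, here is a toy model that walks through the battery/USB combinations and flags the cases where the regulator's output pin sits above its input. It is idealised (no diode drops, no leakage), the rail voltages are assumed example values, and the function name is made up.

```python
def regulator_is_backfed(battery_present: bool, usb_present: bool, diode_or: bool,
                         v_reg_in: float = 12.0, v_reg_out: float = 5.0,
                         v_usb: float = 5.0) -> bool:
    """Return True when the regulator output pin is held above its input pin."""
    v_in = v_reg_in if battery_present else 0.0
    v_from_reg = v_reg_out if battery_present else 0.0
    if diode_or:
        # Each source feeds the 5V rail through its own diode, so USB
        # cannot push voltage back onto the regulator output pin.
        v_out_pin = v_from_reg
    else:
        # USB 5V wired straight onto the regulator output.
        v_from_usb = v_usb if usb_present else 0.0
        v_out_pin = max(v_from_reg, v_from_usb)
    return v_out_pin > v_in


if __name__ == "__main__":
    for diode_or in (False, True):
        for battery in (False, True):
            for usb in (False, True):
                risk = regulator_is_backfed(battery, usb, diode_or)
                print(f"diode_or={diode_or} battery={battery} usb={usb} -> backfed={risk}")
```

The only flagged case is USB connected with no battery and no diode OR, which is exactly the combination a bench test with the battery always fitted would never exercise.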
« Last Edit: June 02, 2019, 01:34:08 am by Psi »
 

Offline Jackster (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: gb
    • PCBA.UK
I have the boards that have failed from when clients used them. More than half worked as expected. Some did fail testing.
QC is done by hand. I run the boards on a test program for 24 hours. We also test after this that the PWM input, switches and firmware all work.
...
The boards are encased in an aluminium block with >1.5mm wall around with a few cutouts for buttons and display.

Do you by any chance put screws into this aluminum case? If so are they pre-threaded?
Thread forming screws (or just tight screws) into aluminium creates lots of metal filings!
The metal flakes may not cause a problem initially but after being shaken around in transport maybe they get all over the place and short IC pins.
Test: Put down a clean sheet of copy paper on a desk. Grab a finished unit and carefully open it up on the paper. Tap out the board & case and see if any metal flakes come out. The paper will give a good contrast to make them easy to see.

This product gets used all over, but most boards that have failed, arrived to the user bad. My hunch was xray but I have nothing to back that up with.
Normal airport X-ray is totally fine, that won't do anything. Only the high-power X-rays used to sterilize mail are a concern. Those are usually found at government buildings.

I use an Arduino to load the boot loader onto the ATMEGA and then a FTDI for flashing firmware.
AVRdude checks and we make sure it confirms all is good.
Not sure about the ATMega328p doing this or not.
FYI - There's a new chip out, the ATmega328PB, which is not the same as an ATmega328P.
It's easy to think 'oh that's just the lead version' but no, it's a different chip with some different pinouts.

I do use those programming pins for the NRF24L01.
Right, so the ATmega328 SPI pins are used for flash programming of the MCU and also for talking to the NRF24L01 chip over SPI?
How are you handling the reset line on the ATmega328? Is it pulled high externally? Is it connected to an external button or something?
I just wonder if it's possible for the ATmega to go into the reset state for some reason while comms to the NRF24L01 are active and somehow get garbage sent to the ATmega while it's in the reset-low state (program mode).
I'm not sure this is actually possible, because there should be no SPI clock once the MCU goes into reset.
I'm just thinking out loud. Maybe someone else will have a thought reading this.



I have the screw holes pre-threaded on the cnc.
The whole case is then cleaned, bead blasted then anodised. They are super clean.


I am aware of the ATMega328PB. I make sure not to order or use them.


I burn the bootloader before installing the WiFi board. But I have not had any problems burning it with the board fitted either.
The reset pin on the ATMega328p is shared between ICSP header and the FTDI chip.
There is a 0.1uF cap to ground. This is all to Arduino spec I believe.




How are you handling merging of the USB Vbus power onto the 5V from the voltage regulator output?
Normally you would diode OR the two sources, but from the pcb layout it looks more like connecting 5V usb to reg output?

Voltage regulators do not like a higher voltage on their output than their input. They tend to die.
That can happen if you connect 5V from USB onto the output of a 5V reg and then remove the battery that's powering the input!

I could see you doing all your QC tests with a battery always connected, but a user connecting USB first because they have a shiny new toy and can't wait to plug it in before they can source a battery.

So the power is always delivered via the 16v input. The only time users use USB is to test the device and update firmware.
Both are not used at the same time and we make that very clear in the documentation.

The power is as per Arduino spec for the Arduino Nano.


Have a look at these 2 areas on dead PCBs.
There may be issues where the track has broken or shorted etc.



I'll check, thanks for spotting.

Offline Jackster (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 476
  • Country: gb
    • PCBA.UK
Ignore how dirty it is but the board on the left has an ATMega328p from LCSC and the one on the right is from RS.
The text on the LCSC one is barely visible; the photo actually shows it better than I can see it with my eyes.

The RS one burned the boot loader just fine.
Will add all the other components to see if it runs my firmware without issue.


Online wraper

  • Supporter
  • ****
  • Posts: 17420
  • Country: lv
The text on the LCSC one is barely visible; the photo actually shows it better than I can see it with my eyes.
Text is not visible because IC is dirty.
 

Offline viperidae

  • Frequent Contributor
  • **
  • Posts: 306
  • Country: nz
LCSC part looks counterfeit.
 

Offline bd139

  • Super Contributor
  • ***
  • Posts: 23059
  • Country: gb
Yeah. Package is different as well. Only slightly.

There’s a company out there, the name I forget (green something) which sells “compatible” mega328p clones. Wonder if some of them got rebranded and chucked in the supply chain.
 

Offline mikerj

  • Super Contributor
  • ***
  • Posts: 3301
  • Country: gb
LCSC part looks counterfeit.

Counterfeit 328P devices certainly exist, but it's a little premature to say this one looks counterfeit when the text can't even be seen through the crud on the board.
 
The following users thanked this post: MadScientist, wraper

