Hi smart people of EEVBlog, I've reached the point where I'm stumped about what to do next. I accidentally purchased a dead GPU off eBay (I mixed up the listings) and figured I could at least use it as a learning opportunity and attempt to repair it. I'm not an electrical engineer by any means and, as a mechanical guy, I only have a basic knowledge of electrical engineering principles.
So a quick rundown: When slotted into a motherboard the card does power on and the fan spins, but there is no video output. If left running for a little while (~1-2 min), the card will get rather hot as the fan spins up and the heat exhausted out the back is pretty substantial. This is a Dell OEM version of an Nvidia GTX 1080 Ti, manufactured by MSI (very similar to the MSI 1080 Ti aero, but without branding). So it's close to the Nvidia reference / FE design, but probably with cheaper components hence the failure.
Here is a hi-res link of the front of the Founder's Edition PCB:
https://cdn.wccftech.com/wp-content/uploads/2017/03/NVIDIA-GeForce-GTX-1080-Ti-Founders-Edition-PCB_Front.jpg
Here's a link to the back side:
https://cdn.wccftech.com/wp-content/uploads/2017/03/NVIDIA-GeForce-GTX-1080-Ti-Founders-Edition-PCB_Back.jpg
I've probed around a little bit and found a few things, and I'll start with the "simpler" one on the back side. Here's an album of photos that I can eventually add to, as requested:
https://imgur.com/a/MfsNS86
Q563 appears to have a short between Base and Collector (1st pic). I had it de-soldered at work and found unfortunately that even without it installed, those 2 pads are shorted together so not sure what's going on there or if it's intentional. I of course don't have a schematic for the GPU, so not much I can do there. I've since had a replacement soldered back on. Datasheet here:
https://my.centralsemi.com/datasheets/CMPT2222A.PDF
Not sure if there are any surrounding components worth investigating there, or components on the opposite side of the board that may be part of the same circuit.
According to this video:
the side opposite Q563 appears to include the VRMs responsible for the GPU PLL management, and 1.8V supply to the vBIOS.
The 2nd (and probably bigger) issue is apparent overheating, probably from a short, in one of the GPU power delivery phases. I used a thermal camera and found that upon powering up, the L6 inductor and the Q26 and Q30 MOSFETs heat up significantly faster and hotter than the surrounding components and the other 6 phases. Additionally, capacitors C190 and C191 run hotter than similar capacitors nearby (2nd pic). All these capacitors, however, measure OL and then switch to ever-increasing resistance values as they charge up, which I understand is normal behavior.
The offending MOSFETs are marked ON DJ27BZ (pic #3), but that part number doesn't appear to exist anywhere, while the Founder's Edition cards appear to use Fairchild FDPC8016S FETs. It turns out ON makes that PN and the datasheet is identical to Fairchild's, provided here:
https://datasheetspdf.com/pdf-file/1090214/FairchildSemiconductor/FDPC8016S/1
and here for ON:
https://www.onsemi.com/pub/Collateral/FDPC8016S-D.PDF
The TI 53603A VRM controller (position U11) does not appear to be an offender, or at least it doesn't appear to heat up abnormally on startup. I've not been able to identify the inductors labeled as R22, but they all measure ~0.2 Ohm with my DMM which probably doesn't go down to whatever value they actually measure.
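On the R22 marking: my best guess is that it follows the common SMD inductor code where "R" stands in for the decimal point and the value is in microhenries, which would make these 0.22 µH parts. Here's a minimal sketch of that decoding, assuming that convention applies to this board (I haven't confirmed it against a manufacturer datasheet, and the function name is just something I made up):

```python
import re


def decode_inductor_marking(code: str) -> float:
    """Decode a common SMD inductor marking into microhenries (assumed unit).

    Assumed conventions (not confirmed for this specific board):
      - 'R' marks the decimal point: R22 -> 0.22, 4R7 -> 4.7
      - three digits mean two significant digits plus a power-of-ten
        multiplier: 100 -> 10 * 10**0 = 10, 101 -> 10 * 10**1 = 100
    """
    code = code.strip().upper()
    if "R" in code:
        # Replace the single 'R' with a decimal point and parse the number.
        return float(code.replace("R", ".", 1))
    if re.fullmatch(r"\d{3}", code):
        return int(code[:2]) * 10 ** int(code[2])
    raise ValueError(f"unrecognised marking: {code!r}")


if __name__ == "__main__":
    for marking in ("R22", "4R7", "100"):
        print(marking, "->", decode_inductor_marking(marking), "uH")
```

If that reading is right, the ~0.2 Ohm I see on the DMM has nothing to do with the marking. It would mostly be test lead and contact resistance, since the real DC resistance of an inductor like this should be far below what a handheld meter can resolve.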
My assumption is that this Q26 is the offending part since it runs the hottest, but I'm at a loss on how to actually test it and/or replace it. What's a possible common failure for these chips?
On all these DJ27BZ FETs on the board, the GR(2)/SW(5-7) pins have continuity with each other and with PCB ground. Looking at the schematic in the datasheet, this seems normal? But this is where my knowledge ends, so I'm hoping you fine people can shed some light on this and recommend further troubleshooting.
I've attached some general photos of my board, for your reference.
Thanks so much in advance for the help! The test equipment I readily have access to at home is a Fluke 114 and a Fluke 279FC, but if needed I can use current/voltage limited lab power supplies and other measurement devices at work.