I have a Lenovo ThinkCentre M900 Tiny (10FL) which has been working fine. Recently, I had an HDD failure, and replaced it with an SSD. This time, I decided to go with ZFS (single drive, kinda pointless, but I get scrubbing). Every once in a while, I’m getting errors while scrubbing. It’s always 1-2 read or write errors. And it never reappears if I clear the error and run another scrub.
The data isn’t important, and it’s backed up, so I’m not too worried about it. But, the symptoms make me think that it’s an issue of the SATA port inside. Is it possible to replace the SATA port inside this device? I wasn’t able to find anything like part number etc. online and it looks like I need to replace the whole board. Any help is appreciated.
The physical connector? Like this thing:
If it’s an issue with that you’d likely be able to straight up see it, bent or missing pin. Or check with a meter for shorts. And clean the pins I guess just in case they’re oxidized.
But this is wayyyyy down the list. Is the same LBA bad every scrub? Then it’s the drive, probably. If not, did you check the cable (most likely culprit)? The SATA controller would be next most likely, then the port, then solder joints (which you’d have to redo to replace that port anyway, so moot point I guess). Any UDMA CRC errors?
If you do replace it you’d need to solder. I don’t know how to source that SATA connector, that would be the hardest part. There is likely a part number on the actual SATA port somewhere, though it may not be visible until it’s removed (and even then it may not be available; it may have been a custom part or it may be legacy at this point).
The soldering is actually not that tough though, if you can find one. Add flux, add leaded solder, hot air, remove, wick to clean pads, fresh solder to pads, more flux, hot air the new one on. (Second tricky part: patience, with the temperature low enough not to melt the plastic but high enough to melt the solder. It can be a pain for connectors like that.)
If you do replace it you’d need to solder. I don’t know how to source that SATA connector, that would be the hardest part.
Imho, if that’s the culprit (I would still suspect the drive(s) at this point), it could be much simpler to replace the entire motherboard, as one will probably need to source one anyway in order to get that spare connector.
My own experience fixing laptops and consoles is generally this: unless it’s something crazy popular like a Nintendo Switch or M1 MacBook, you’re best off getting a donor board, both to figure out values of discrete components (good luck getting a schematic) and, more importantly, to scavenge parts like this: custom ICs, ASICs, connectors, etc.
Even if it is a MacBook or Switch, a donor board means you have guaranteed quality parts (assuming they’re not dead on the donor board, ha) and you don’t have to wait 12 weeks for shipping from Shenzhen.
Then it’s the drive, probably. If not did you check the cable (most likely culprit),
I’m a pretty experienced PC guy, but I got that wrong this week.
An SSD I’d re-used for writing CCTV images in a server gave classic disk failure errors. It’s one I’d had similar issues with years before, so I didn’t spend much time diagnosing before switching it out for another. I then physically cut up the SSD before disposing of it.
Next day - replacement drive gave identical errors.
Yep - of course, it was the SATA cable. Swapped that and everything’s fine again.
I’m miffed that I binned a (most likely) perfectly good SSD.
Is the same LBA bad every scrub?
Not sure. I’ll keep it in mind and keep a record.
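For keeping that record, a rough Python sketch could look like the one below. It assumes the kernel log lines for the failed I/O contain the text "sector <number>" (the exact wording varies between kernel versions). The sample lines and the names sectors_in_log and recurring_sectors are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each failed I/O shows up in the kernel log with "sector <number>".
SECTOR_RE = re.compile(r"\bsector (\d+)")


def sectors_in_log(log_text):
    """Return the set of sector numbers mentioned in one scrub's log excerpt."""
    return {int(m.group(1)) for m in SECTOR_RE.finditer(log_text)}


def recurring_sectors(scrub_logs):
    """Return, sorted, the sectors that appear in more than one scrub's excerpt."""
    counts = Counter()
    for log_text in scrub_logs:
        # A set per scrub, so a sector logged twice in one scrub counts once.
        counts.update(sectors_in_log(log_text))
    return sorted(sector for sector, n in counts.items() if n > 1)


# Hypothetical excerpts, one string per scrub (not real output from this machine).
scrub_logs = [
    "I/O error, dev sda, sector 1050624 op 0x0:(READ)",
    "I/O error, dev sda, sector 88213504 op 0x1:(WRITE)",
    "I/O error, dev sda, sector 1050624 op 0x0:(READ)",
]

print(recurring_sectors(scrub_logs))
```

If the same sector keeps coming back, that points toward the drive. If every scrub reports different sectors, the list stays empty and the drive media looks less likely.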
No UDMA CRC errors. Here’s the whole SMART output in case anything pops out.
I don’t think there’s a cable, it seems to be attached directly. And I’m no good with soldering. So I guess I’m out of options.
That log is fine and I bet your SATA connector is too. And duh, of course no SATA cable (my build is a rack mount, so I jump to that because it’s always the first thing to check).
I would guess you have data corruption from lack of redundancy and an I/O error. Run zpool status -v to show impacted files, restore from backup, then scrub again.
Shouldn’t that show up as checksum errors though, instead of read/write errors? I never see checksum errors.
Ahhh right right
R/w would suggest media (especially if it’s the same sectors), but not necessarily that the drive is toast
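To make the read/write versus checksum distinction concrete, here is a small Python sketch that pulls the READ, WRITE and CKSUM columns out of zpool status text and labels each device. The pool name, device name and counts in SAMPLE are hypothetical, and the function names and labels are just one way of phrasing it, not anything ZFS prints itself.

```python
# Hypothetical `zpool status` excerpt; pool name, device and counts are made up.
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

\tNAME        STATE     READ WRITE CKSUM
\ttank        ONLINE       0     0     0
\t  sda3      ONLINE       2     1     0

errors: No known data errors
"""


def parse_error_counters(status_text):
    """Return {name: (read, write, cksum)} from the config table."""
    counters = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True
            continue
        if in_table:
            if not fields:
                break  # a blank line ends the table
            if len(fields) >= 5:
                try:
                    counters[fields[0]] = tuple(int(x) for x in fields[2:5])
                except ValueError:
                    continue  # non-integer counts are skipped in this sketch
    return counters


def describe(read, write, cksum):
    """Give a rough label for one device's counters."""
    if cksum and not (read or write):
        return "checksum only: data came back but failed verification"
    if (read or write) and not cksum:
        return "read/write only: the I/O request itself failed"
    if read or write or cksum:
        return "mixed: failed I/O and bad checksums"
    return "clean"


for name, (r, w, c) in parse_error_counters(SAMPLE).items():
    print(name, describe(r, w, c))
```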
zpool status -v | grep -A 20 "errors:" may output files associated with bad blocks (the file list follows the "errors:" line)
zpool status -v might just give you filenames though, worth it to try that first. It might just give object IDs tho and I forget what to do at that point. You have to map them to file paths but I don’t remember how to do this.
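As a sketch of telling the two cases apart, the Python below splits the entries listed after the "errors: Permanent errors have been detected" line into plain file paths and dataset:<0x...> object references. The sample text is hypothetical and affected_entries is a made-up helper name. Object references would still need to be mapped to paths separately.

```python
import re

# Entries that are object references look like "pool/dataset:<0x1a2b>".
OBJECT_REF = re.compile(r"^(?P<dataset>.+):<(?P<obj>0x[0-9a-fA-F]+)>$")
MARKER = "errors: Permanent errors have been detected"

# Hypothetical tail of `zpool status -v`; names and IDs are made up.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        /tank/data/video.mkv
        tank/data:<0x1a2b>
"""


def affected_entries(status_text):
    """Return (paths, object_refs) listed after the permanent-errors line."""
    paths, object_refs = [], []
    seen_marker = False
    for line in status_text.splitlines():
        entry = line.strip()
        if entry.startswith(MARKER):
            seen_marker = True
            continue
        if not seen_marker or not entry:
            continue
        match = OBJECT_REF.match(entry)
        if match:
            # Keep the dataset name and the object number as an integer.
            object_refs.append((match.group("dataset"), int(match.group("obj"), 16)))
        else:
            paths.append(entry)
    return paths, object_refs


paths, object_refs = affected_entries(SAMPLE)
print("files:", paths)
print("object refs:", object_refs)
```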
If the errors are in different spots though, totally different story. Back to SATA I/O issues: controller issues, kernel/driver issue, etc. Technically it could be the SATA port, but I seriously doubt it.