Manufacturers have been playing this game with DWPD/TBW numbers too --- by reducing the retention spec, they can advertise a drive as having higher endurance with the exact same flash. But if you compare the numbers over the years, it's clear that NAND flash has gotten significantly worse; the only thing that has gone up, multiplicatively, is capacity, while endurance and retention have both gone down by a few orders of magnitude.
For a long time, 10 years after 100K cycles was the gold standard of SLC flash.
Now we are down to several months after less than 1K cycles for QLC.
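For reference, the two endurance figures are tied together only by capacity and the warranty period (illustrative numbers, not from any particular datasheet):

    # TBW = DWPD x capacity (TB) x 365 x warranty years
    # e.g. a hypothetical 1.92 TB drive rated for 1 DWPD over a 5-year warranty:
    echo "1 * 1.92 * 365 * 5" | bc    # ~3504 TBW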
Like, does an SSD do some sort of refresh on power-on, or every N hours, or do you have to access the specific block, or...? What if you interrupt the process, e.g. an NVMe drive in an external case that you only plug in once a month for a few minutes to use it as a huge flash drive - is that a problem?
What about the unused space, is a 4 TB drive used to transport 1 GB of stuff going to suffer anything from the unused space decaying?
It's all very unclear what any of this means in practice and how a user is supposed to manage it.
The more interesting thing to note from those standards is that the required retention period differs between the "Client" and "Enterprise" categories.
The Enterprise category only has a power-off retention requirement of 3 months.
The Client category has a power-off retention requirement of 1 year.
Of course there are two sides to every story...
The Enterprise category standard assumes power-on active use of 24 hours/day, while the Client category is only intended for 8 hours/day.
As with many things in tech... it's up to the user to pick which side they compromise on.
[1]https://files.futurememorystorage.com/proceedings/2011/20110...
We treated NVMe drives like digital stone tablets. A year later, we tried to restore a critical snapshot and checksums failed everywhere. We now have a policy to power-cycle our cold storage drives every 6 months just to refresh the charge traps.
It's terrifying how ephemeral "permanent" storage actually is. Tape is annoying to manage, but at least it doesn't leak electrons just sitting on a shelf.
The theory is that operating system files, which rarely change, are written once and almost never re-written. So the charge begins to decay over time, and while the blocks might not become unreadable, reads from them require additional error correction, which reduces performance.
There have been a significant number of (anecdotal) reports that a full rewrite of the drive, which does put wear on the cells, greatly increases overall performance. I haven't personally experienced this yet, but I do think an "every other year" refresh of data on SSDs makes sense.
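If you do want to do such a refresh, one way to force every cell to be reprogrammed is badblocks' non-destructive read-write mode, which reads each block, writes test patterns, and then restores the original contents. A minimal sketch (drive unmounted, backups in place, /dev/sdX is a placeholder):

    # -n: non-destructive read-write test, -s: show progress, -v: verbose
    sudo badblocks -nsv /dev/sdX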
My desktop computer is generally powered except when there is a power failure, but among the million+ files on its SSD there are certainly some that I do not read or write for years.
Does the SSD controller automatically look for used blocks that need to have their charge refreshed and do so, or do I need to periodically do something like "find / -type f -print0 | xargs -0 cat > /dev/null" to make sure every file gets read occasionally?
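If you'd rather not rely on the controller, a host-side read pass over the raw block device (instead of cat-ing every file) touches every LBA, including filesystem metadata. A minimal sketch, assuming the device node is /dev/nvme0n1:

    # one sequential pass over the whole device; sectors with weakened charge at least
    # get run through the controller's error correction, and hard read errors get reported
    sudo dd if=/dev/nvme0n1 of=/dev/null bs=16M iflag=direct status=progress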
This article just seems to link to a series of other xda articles with no primary source. I wouldn't ever trust any single piece of hardware to store my data forever, but this feels like clickbait. At one point they even state "...but you shouldn't really worry about it..."
do I just plug it in and leave the computer on for a few minutes? does it need to stay on for hours?
do I need to run a special command or TRIM it?
Should I pop them in an old server? Is there an appliance that just supplies power? Is there a self-hosted thing I can use to monitor disks that I have zero day-to-day use for and don't want connected to anything, but want to keep "live"?
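One low-effort option along the "old server" route, assuming SATA drives and smartmontools, is to let the drives test themselves on a schedule; a sketch:

    # start the drive's extended self-test, which reads the full surface internally
    sudo smartctl -t long /dev/sdX
    # hours later: check the self-test log and overall health
    sudo smartctl -l selftest /dev/sdX
    sudo smartctl -H /dev/sdX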
There was a guy on reddit who took about 20 cheap USB flash drives and checked one every 6 months. I think after 3 years nothing had gone bad yet.
I've copied OS ISO images to USB flash drives and I know they sat for at least 2 years unused. Then I used it to install the OS and it worked perfectly fine with no errors reported.
I still have 3 copies of all data and 1 of those copies is offsite but this scare about SSDs losing data is something that I've never actually seen.
If not, that feels like a substantial hole in the market. Non-flash durable storage tends to be annoying or impractical for day-to-day use. I want to be able to find a 25 year old SD card hiding in some crevice and unearth an unintentional time capsule, much like how one can pick up 20+ year old MiniDiscs and play back the last thing their former owners recorded to them perfectly.
The difference between slc and mlc is just that mlc has four different program voltages instead of two, so reading back the data you have to distinguish between charge levels that are closer together. Same basic cell design. Honestly I can’t quite believe mlc works at all, let alone qlc. I do wonder why there’s no way to operate qlc as if it were mlc, other than the manufacturer not wanting to allow it.
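A toy illustration of just the arithmetic behind that (ignoring every real-world effect):

    # levels per cell double with each extra bit, so the spacing between adjacent
    # charge states shrinks to roughly 1/(levels-1) of the usable voltage window
    for bits in 1 2 3 4; do
        levels=$((2 ** bits))
        echo "$bits bit(s)/cell: $levels levels, spacing ~1/$((levels - 1)) of the window"
    done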
Um. Backups seem like exactly why I might have data on an unpowered SSD.
I use HDDs right now because they're cheaper, but that might not be true some day. Also, I would expect someone less technically inclined than I am to just use whatever they have lying around, which may well be an SSD.
It will trigger reads in random areas of the flash and try to correct any errors found.
Without it, the same issue as in the original article will happen (even if the device is powered on): areas of the NAND that have not been read for a long time will accumulate more and more errors, eventually becoming unrecoverable.
How many people have a device that they may only power up every few years, like something used only on vacation? In fact, I have a device that I only use on rare occasions these days (an arcade machine), and I now suspect I'll have to reinstall it since it's been 2 or 3 years since I last used it.
This is a pretty big deal that they don't put on the box.
Flash storage is apparently cheaper (especially for smaller production runs) and/or higher density these days, so these cartridges just use that and make it appear ROM-like via a controller.
ZFS, as one of these filesystem-specific parity-RAID implementations, also auto-repairs corrupted data whenever it is read, and the scrub utility provides an additional tool for recognizing and correcting such issues proactively.
This applies to both HDDs and SSDs. So, a good option for just about any archival use case.
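With ZFS, for example, the scrub is a one-liner (pool name "tank" is a placeholder); it walks every allocated block, verifies checksums, and repairs from redundancy where it can:

    zpool scrub tank
    zpool status -v tank    # shows scrub progress plus any repaired or unrecoverable errors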
so it's as if the data... rusts, a little bit at a time
# read the whole device to force every block through the controller's error correction,
# then dump the SMART data to see whether anything went wrong
dd if="$1" of=/dev/null iflag=direct bs=16M status=progress
smartctl -a "$1"
If someone wants to properly study SSD data retention they could encrypt the drive using plain dm-crypt, fill the encrypted volume with zeroes, and check at some later point whether there are any non-zero blocks. This is an accessible way (no programming involved) to write random data to the SSD and "save" it without actually saving the entire thing - just the key. It also ensures maximum variance in the charge levels of all the cells, and prevents the SSD from potentially playing tricks such as compression (see the sketch below).

Really? I could have sworn that primary storage was the one place they weren't going to replace HDDs. Aren't they more of a thing for cache?
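A rough sketch of the dm-crypt approach described above - the device name and passphrase are placeholders, and this of course destroys whatever is on the drive:

    # open the raw SSD as a plain dm-crypt mapping (any throwaway passphrase works,
    # as long as you remember it for the re-check)
    cryptsetup open --type plain --cipher aes-xts-plain64 --key-size 512 /dev/sdX retention_test
    # fill the encrypted volume with zeroes; the SSD sees incompressible random data
    dd if=/dev/zero of=/dev/mapper/retention_test bs=16M status=progress
    sync
    cryptsetup close retention_test

    # ...months later: reopen with the same passphrase and look for flipped bits
    cryptsetup open --type plain --cipher aes-xts-plain64 --key-size 512 /dev/sdX retention_test
    cmp -n "$(blockdev --getsize64 /dev/sdX)" /dev/zero /dev/mapper/retention_test \
        && echo "no corruption detected"
    cryptsetup close retention_test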
I've aged and busied myself beyond keeping track of this stuff anymore. I'm going to buy a packable NAS in the next couple months and be done with it. Hopefully ZFS since apparently that's the bee's knees and I won't have to think about RAIDs anymore.
I religiously rotate my offline SSDs and HDDs (I store backups on both): something like four at home (offline onsite) and two (one SSD, one HDD) in a safe at the bank (offline offsite).
Every week or so I rsync to the offline disks at home (a bit more advanced than plain rsync: I wrap rsync in a script that detects potential bitrot using a combination of an rsync "dry-run" and known-good cryptographic checksums before doing the actual rsync [1]), and then every month or so I rotate by swapping the SSD and HDD at the bank with those at home.
Maybe I should add to the process, for SSDs, once every six months:
... $ dd if=/dev/sda | xxhsum
I could easily automate that in my backup'ing script by adding a file lastknowddtoxxhash.txt containing the date of the last full dd to xxhsum, verifying that, and then asking, if an SSD is detected (I take it on an HDD it doesn't matter), whether a full read to hash should be done. Note that I'm already using random sampling on files containing checksums in their name, so I'm already verifying x% of the files anyway. So I'd probably be detecting a fading SSD quite easily.
Additionally I've also got a server with ZFS in a mirror, so this, too, helps keep a good copy of the data.
FWIW I still have most of the personal files from my MS-DOS days so I must be doing something correctly when it comes to backing up data.
But yeah: adding a "dd to xxhsum" of the entire disks once every six months in my backup'ing script seems like a nice little addition. Heck, I may go hack that feature now.
[1] otherwise rsync shall happily trash good files with bitrotten ones
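A rough sketch of that six-monthly "dd to xxhsum" check - the state-file location and the threshold are arbitrary placeholders, not anything from the script above:

    #!/bin/bash
    # full-device read hashed with xxhsum, but only if the last run is old enough
    disk="$1"                                         # e.g. /dev/sda
    state="$HOME/.last_full_read_$(basename "$disk")"
    six_months=$((182 * 24 * 3600))

    last=$(cat "$state" 2>/dev/null || echo 0)
    now=$(date +%s)

    if (( now - last >= six_months )); then
        dd if="$disk" bs=16M status=progress | xxhsum
        date +%s > "$state"
    fi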
This is somewhat confused writing. Consumer SSDs usually do not have a data retention spec; even in this very detailed Micron datasheet you won't find one: https://advdownload.advantech.com/productfile/PIS/96FD25-S2T... Meanwhile, the data retention spec for enterprise SSDs applies at the end of their rated life, which is usually a DWPD/TBW intensity you won't reach in actual use anyway - that's where numbers like "3 months @ 50 °C" or whatever come from.
In practice, SSDs don't tend to lose data over realistic time frames. Don't hope for a "guaranteed by design" spec on that though; some pieces of silicon are more equal than others.