I have news: 1.1.1.1, aka CloudFlare's DNS, resolves archive.is to 127.0.0.5.
Live proof: https://digwebinterface.com/?hostnames=archive.is&useresolve...
Archived proof: http://archive.is/utJfW
Of course, the moment I switched 1.1.1.1 out for another resolver in resolv.conf I was able to access the site again.
See here: https://news.ycombinator.com/item?id=19828317
I don't remember details, I'm afraid, but it is an unfortunate situation.
Would be cool if archive.is implemented this comment https://news.ycombinator.com/item?id=19832572 (from previous HN discussion on this)
> The way the file system handles this is incompatible with the workings of NAND flash.
That's true of most conventional filesystems, but log-structured filesystems are much more flash-friendly. That's why there has been a resurgence of interest in them, and also why a typical flash translation layer bears a striking resemblance to a log-structured FS. There are also flash-specific filesystems.
> to an HDD all sectors are the same.
This is not true because of bad blocks. Every disk has a reserve of blocks that can be remapped in place of a detected bad block, transparently, much like flash blocks are remapped. Beyond that, it's also useful for a disk to know which blocks are not in use so it can return all zeroes without actually hitting the media. There are special flags to force media access and commands to physically zero a block, for the cases where those are needed, but often they're not. Trim/discard actually gets pretty complicated, especially when things like RAID and virtual block devices are involved.
Also, I believe some people (and filesystems?) intentionally stored certain data toward the inside or outside of the HDD, because the simple cylinder geometry allowed faster reads in those regions. However, I'm not seeing conclusive proof that modern HDDs show performance variation with respect to radius.
They definitely still do; it's fundamental to drives that run at fixed RPM but maintain high areal density across the entire platter. One of the 1TB drives I have lying around does about 183MB/s at the beginning of the disk, 148MB/s in the middle, and 97MB/s at the end.
At smaller scales, the layout of individual tracks can vary quite a bit, but that doesn't have as much impact on overall sequential transfer speed. See http://blog.stuffedcow.net/2019/09/hard-disk-geometry-microb...
What I find a bit bizarre, though, is that some defragmenters have an option to move files to the end of the drive. Wouldn't you want to move files to the beginning in that case?
1) start of drive partition
3) middle of drive partition
5) end of drive partition
Make 1, 3, and 5 the same size, and then run raw disk benchmarks against them. The usual pattern: like a record, the data begins at the outer edge of the platter, where you get higher transfer speeds, and the end of the disk is at the middle, so it's slower there.
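If you'd rather skip the partitioning, here's a rough Python sketch of the same measurement (a hypothetical /dev/sdX stands in for your disk; run it read-only as root, and note the page cache will flatter repeat runs unless you drop caches or use O_DIRECT):

    # Time sequential reads at the start, middle, and end of a raw disk to
    # see the zoned-recording falloff. Assumes a Linux block device.
    import os, time

    DEV = "/dev/sdX"          # hypothetical device name; substitute your own
    CHUNK = 64 * 1024 * 1024  # read 64 MiB at each position

    fd = os.open(DEV, os.O_RDONLY)
    size = os.lseek(fd, 0, os.SEEK_END)
    for label, frac in (("start", 0.0), ("middle", 0.5), ("end", 1.0)):
        os.lseek(fd, min(int(size * frac), size - CHUNK), os.SEEK_SET)
        t0, remaining = time.monotonic(), CHUNK
        while remaining > 0:
            remaining -= len(os.read(fd, min(remaining, 1 << 20)))
        print(f"{label:>6}: {CHUNK / (1 << 20) / (time.monotonic() - t0):.0f} MB/s")
    os.close(fd)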
Also worth pointing out that sticking to the front of a spinning disk doesn't just help throughput, it also helps latency as the head doesn't have to move as far.
You'll end up with a graph like:
Unless of course you put a consumer disk in a server with many high-RPM fans causing substantial vibration, in which case you get:
A server/enterprise/RAID edition disk handles the vibration MUCH better, around 3x the bandwidth:
You can benchmark disks, including seek time and speed at different points, using the tools "bogodisk" and "bogoseek": https://djwong.org/programs/bogodisk/
Windows has had similar tricks built in (at various levels of cleverness, depending on the Windows variant you run) for a while now: https://en.wikipedia.org/wiki/Prefetcher
No doubt there are similar options for Linux and other OSs.
A similar trick I've seen is using a small SSD as a manual cache: copy in the files that need to be fast and mount it with the larger filesystem (on slower drives) as a unioned filesystem. Though just using block-device-based automatic caching may be easier and safer than rolling your own mess. There are a few options for Linux, though some are not currently maintained (see https://serverfault.com/questions/969302/linux-ssd-as-hdd-ca... amongst other places), and some I/O controllers support it directly (even some motherboards have this built in) without needing to bother your OS with the details at all.
For optical media the layout of assets can be critical to the user experience. You probably want your startup assets located on the outer, faster tracks.
You could get some decent gains then if you made your scheduler take rotation into account. A long seek that arrived just before the target sector came under the head could be faster than a short seek that would arrive just after the sector passed the head.
On the other hand, taking rotation into account could make the scheduler quite a bit more complex. You needed a model that could predict seek time well, and you needed to know the angular position of each sector in its cylinder.
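To make that concrete, here's a toy sketch of such a scheduler (every parameter is invented for illustration; a real model would be calibrated against the drive):

    ROT_MS = 8.33  # one revolution at 7200 rpm

    def seek_ms(from_cyl, to_cyl):
        # hypothetical seek curve: settle time plus a distance term
        d = abs(from_cyl - to_cyl)
        return 0.0 if d == 0 else 1.0 + 0.01 * d

    def rot_wait_ms(now_ms, angle):
        # angle is the sector's angular position as a fraction of a revolution
        current = (now_ms % ROT_MS) / ROT_MS
        return ((angle - current) % 1.0) * ROT_MS

    def next_request(now_ms, head_cyl, pending):
        # pending: list of (cylinder, angle) pairs; pick the cheapest overall,
        # counting rotation that happens while the seek is in progress
        def cost(req):
            cyl, angle = req
            s = seek_ms(head_cyl, cyl)
            return s + rot_wait_ms(now_ms + s, angle)
        return min(pending, key=cost)

A long seek can win here whenever the short-seek target has just slipped past the head, exactly the situation described above.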
I don't think that there were any drives that would tell you this. SCSI drives wouldn't even tell you the geometry. IDE drives would tell you a geometry, but it didn't necessarily have anything to do with the actual geometry of the drive.
At the time I worked at a company that was working on disk performance enhancement software (e.g., drivers with better scheduling, utilities that would log disk accesses and then rearrange data on the disk so that the I/O patterns in the logs would be faster, and that sort of thing).
We had a program that could get the real disk geometry. It did so by doing a lot of random I/O and looking at the timing of when the results came back. If there were no disk cache, this would be fairly easy. (Well, it didn't necessarily get the real geometry, but rather a purported geometry and seek and rotational characteristics that could predict I/O time well).
For instance, read some random sector T, then read another random sector, then read T again. Look at the time difference between when you started getting data back on the two reads of T. This should be a multiple of the rotation time.
If the disk has caching that can still work but you need to read a lot of random sectors between the two reads of T to try to get the first read out of the cache.
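A sketch of that trick in Python, under the stated assumptions (hypothetical /dev/sdX, 512-byte sectors, no remapped sectors in the way):

    import os, random, time

    def read_sector(fd, lba, sector=512):
        os.lseek(fd, lba * sector, os.SEEK_SET)
        os.read(fd, sector)

    def measure_gap(dev="/dev/sdX", evict_reads=10_000):
        fd = os.open(dev, os.O_RDONLY)
        total = os.lseek(fd, 0, os.SEEK_END) // 512
        t = random.randrange(total)
        read_sector(fd, t)                     # first read of T
        t0 = time.monotonic()
        for _ in range(evict_reads):           # push T out of the disk cache
            read_sector(fd, random.randrange(total))
        read_sector(fd, t)                     # second read of T
        os.close(fd)
        return time.monotonic() - t0

Because T passes under the head once per revolution, the returned gap should sit very close to an integer multiple of the rotation period; collect a bunch of gaps and their common divisor gives you the rotation time.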
Anyway, we had to give up on that approach because the program to analyze the disk took a few days of constant I/O to finish. Management decided that most consumers would not put up with such a long setup procedure.
Yes, that could mean that it would purposefully make files more fragmented. A fairly common pattern was for a program to open a bunch of data files and read a header from each. E.g., some big GUI programs would do that for a large number of font files. Arranging that program and those data files on disk so that you have the program code that gets loaded before the header reads, then the headers of all the files, and then the rest of the file data, could give you a nice speed boost.
The flaw in this method is that, to use the above example, if another big GUI program also uses those same font files, the layout that makes the first program go fast might suck for the second program. If you've got a computer that you mostly only use for one task, though, it can be a viable approach.
> I don't think that there were any drives that would tell you this
Old MFM/RLL drives would not only tell you this, they'd let you alter it during the low-level format procedure.
There was a parameter called "sector interleave" that would let you deliberately stagger the sector spacing, so it would be like 1,14,2,15,3,16,4,17,5,18,6,19,7,20,8,21,9,22,10,23,11,24,12,25,13,26 or something.
This was because controllers didn't do caching yet, and PIO mode and CPUs of the era were so slow they couldn't necessarily keep up with data coming off at the full rotation rate. If you missed the start of the next sector, you had to wait a whole rev for it to come around again, a nearly-26x slowdown. Whereas a 2:1 interleave would virtually guarantee that you'd be ready in time for the next sector, for only a 2x slowdown. (Really crap machines could even need a 3:1 interleave, the horrors!)
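For the curious, that pattern is easy to regenerate; a little sketch (26 sectors and 2:1 are just the example numbers from above):

    def interleave(sectors=26, step=2):
        # physical slot i ends up holding logical sector order[i]; logically
        # consecutive sectors land `step` physical slots apart
        order = [0] * sectors
        slot = 0
        for logical in range(1, sectors + 1):
            while order[slot]:               # slot taken, find the next free one
                slot = (slot + 1) % sectors
            order[slot] = logical
            slot = (slot + step) % sectors
        return order

    print(interleave())  # [1, 14, 2, 15, 3, 16, ... 13, 26]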
I thought the point of native command queuing was precisely to enable the drive itself to make these lower-level scheduling decisions, while the OS scheduler would mostly deal with higher-level, coarser heuristics such as "nearby LBAs should be queued together."
BTW, discovering hard drive physical geometry via benchmarking was extensively discussed in an article that's linked in the sibling subthread. I've linked the HN discussion of that as well.
For example, even something as trivial and linear as a database log or filesystem log can benefit from placement optimisation.
Each time there's a transaction to commit, instead of writing the next commit record to the next LBA number in the log, increment the LBA number by an amount that gives a sector that is about to arrive under the disk head at the time the commit was requested. That will leave gaps, but those can be filled by later commits.
That reduces the latency of durable commits to HDD by removing rotational delay.
Command queueing doesn't help with that, although it does help with keeping a sustained throughput of them by pipelining.
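A toy sketch of that placement policy, with invented geometry (a real implementation would need the drive's actual track layout and a calibrated clock):

    ROT_MS = 8.33             # one revolution at 7200 rpm
    SECTORS_PER_TRACK = 500   # made-up figure for illustration

    def next_commit_lba(now_ms, free_slots):
        # free_slots: unwritten LBAs on the current log track; choose the one
        # that will rotate under the head soonest after "now"
        def wait_ms(lba):
            angle = (lba % SECTORS_PER_TRACK) / SECTORS_PER_TRACK
            current = (now_ms % ROT_MS) / ROT_MS
            return ((angle - current) % 1.0) * ROT_MS
        return min(free_slots, key=wait_ms)

Later commits fill the slots that got skipped, so the log stays dense while each individual commit avoids most of the rotational delay.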
Isn't it 31 or 32 commands in the queue? That's a worst case of around a quarter second for a 7200 rpm drive (32 commands × ~8.3 ms per revolution ≈ 0.27 s), which sounds like an awfully long time horizon to me.
But for ideal scheduling, you need something to deal with the short timings as well.
For example, if you have 1024 x 512-byte single-sector randomly arriving reads, of which 512 sectors happen to be in contiguous zone A and 512 sectors happen to be in contiguous zone B, all of those reads together will take about 2 seek times and 2 rotation times.
Assuming the generators of those requests are some intensively parallel workload (so there can be perfect scheduling), which is heavily clustered in the two zones (e.g. two database-like files), my back-of-the-envelope math comes to <30ms for 1024 random access reads in that artificial example, on 7200rpm HDD.
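Spelling out my assumptions, since the arithmetic is the whole argument here (all figures invented but plausible for a 7200 rpm drive):

    seek_ms = 8.0               # one average seek to reach each zone
    half_rot_ms = 8.33 / 2      # average rotational latency at 7200 rpm
    zone_bytes = 512 * 512      # 512 contiguous 512-byte sectors = 256 KiB
    transfer_ms = zone_bytes / 150e6 * 1e3   # at ~150 MB/s sequential

    total_ms = 2 * (seek_ms + half_rot_ms + transfer_ms)
    print(f"{total_ms:.1f} ms")  # ~27.8 ms for all 1024 reads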
Generally that's what the kernel I/O scheduler is for.
The main purpose of NCQ (and SCSI command queuing which came way before it) was to allow higher levels of parallelism at the drive interface. This does allow the drive to do some smart scheduling, but still only within that fairly small queue depth. Scheduling across larger numbers of requests, with more complicated constraints on ordering, deadlines, etc., remains the OS's job. And once it's doing that, the incremental benefit of those on-disk scheduling smarts becomes pretty small.
I could swear I read such in 2010 -- did that actually happen in the last ten years?
I didn't find anything but bunnie's blog entry on hacking/re-flashing the firmware on the card's controller.
Given how important a good FTL is, and how naturally log-structured data formats suit flash (formats like those based on RocksDB or InfluxDB, both quite plausible for data handled and stored on RPi-like SBCs), it'd be much better for reliability and performance to let these LSM engines deal with NAND flash's block-erase/sector-write/sector-read behavior directly.
Since the firmware is more complicated than a hard drive's, they are far more likely to brick themselves completely instead of degrading gracefully. Manufacturers can also ship nasty firmware bugs like https://www.techpowerup.com/261560/hp-enterprise-ssd-firmwar... . I'd recommend using a mix of SSDs at different lifetimes, and/or from different manufacturers, in a RAID configuration.
How different manufacturers deal with running SMART tests under load varies drastically. Samsung tests always take the same amount of time. The length of Intel tests varies depending on load. Micron SMART tests get stuck under constant load. Seagate SMART tests appear stuck reporting 90% done or complete, but the tests do actually run.
Different SSDs also are more or less tolerant to power changes. Micron SSDs are prone to resetting when a hard disk is inserted in the same backplane power domain, and we have to isolate them accordingly.
Manual overprovisioning is helpful when you aren't able to use TRIM.
What a drive does with an enhanced secure erase can differ too. Some drives only change the encryption key and then return garbage on read. Some additionally wipe the block mappings so that reads return zeroes.
Oof, that's extremely obvious yet it never crossed my mind. Nice tip!
000webhost lives up to its name!
In the meantime:
even if I do
nslookup archive.is 1.1.1.1
edit; here you go: https://jarv.is/notes/cloudflare-dns-archive-is-blocked/
Hopefully one of them will work for you.
Now I'm no nuclear scientist so please be forgiving with that description, but that's how I understand them to work :)
I can't even begin to explain how an SSD works, but I know there are no moving parts besides electrons.
edit: moved the "(ideally)"
* At the "lowest" level, there's a little cell that it's very much an EEPROM (but better, because newer tech). This little cell can hold 1, 2, 3 or 4 bits, depending on gen/tech.
* You group a bunch on those cells together and they form a page. Usually it's 1024 cells a page.
* You group a bunch of pages together and they form a block (don't confuse with "block" as in "block oriented device"). Blocks are usually made of 128 pages.
* You group a bunch (1024 usually) of blocks together and you get a plane.
* You get your massive storage by grouping a lot of planes together. Think of it as small (16-64 MB) storage devices that you connect in a RAID-like manner.
* Operations are restricted by the technology. At an individual level, cells can only be "programmed", that is, a 1 can be flipped into a 0, but a 0 cannot be made a 1.
* If you need to turn a 0 into a 1, then you must do it on a block level (yep, 128 pages at a time).
* That's where the Flash Translation Layer kicks in: it's a mapping between the (logical) 512-byte or 4096-byte sectors and the underlying mess. The FTL tells you how you form the sectors (which would be the blocks of a "block-oriented device", but I'm trying to avoid that word).
* You also have "overprovisioning" at work - that is, if your SSD is 120gb, it's actually 128GB inside, but there's 8GB you don't get access (not even at the OS level), that the device uses to move things around.
* Wear Leveling/Garbage Collection mechanisms work to prevent individual cells from being used too much. Garbage Collection makes sure (or tries) that there are always enough "ready to program" cells around.
* The firmware makes everything work transparently to the world above it.
That would be a very (very, very) simple explanation of how Flash storage works; there's a sketch of the write path below. Things like memory cards and thumb drives usually get neither overprovisioning nor wear leveling.
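To make the FTL idea concrete, here's a toy model in Python of the write path described above (geometry numbers are illustrative, not from any real part):

    PAGES_PER_BLOCK = 128

    class ToyFTL:
        def __init__(self, blocks=16):
            self.free = [(b, p) for b in range(blocks)
                                for p in range(PAGES_PER_BLOCK)]
            self.map = {}    # logical sector -> (block, page)
            self.live = {}   # (block, page) -> logical sector, used by GC

        def write(self, sector):
            old = self.map.get(sector)
            if old is not None:
                del self.live[old]      # old copy goes stale; GC reclaims it later
            loc = self.free.pop(0)      # always program a fresh page, never in place
            self.map[sector] = loc
            self.live[loc] = sector

        def gc_erase(self, block):
            # copy live pages forward, then the whole block is erased at once
            self.free = [loc for loc in self.free if loc[0] != block]
            for page in range(PAGES_PER_BLOCK):
                sector = self.live.pop((block, page), None)
                if sector is not None:
                    self.map.pop(sector)
                    self.write(sector)
            self.free += [(block, p) for p in range(PAGES_PER_BLOCK)]

Note how an overwrite never touches the old page; that's the whole reason wear leveling works, and why "deleted" data can linger until GC gets around to the erase.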
Having only 2 or 4 planes per die, with per-die capacities of 32GB or more, is a big part of why current SSDs need to be at least 512GB or 1TB to make full use of the performance offered by their controllers. 256GB SSDs are now all significantly slower than larger models from the same product line.
But as soon as I dive into the details, I get lost. How exactly can you control the nuclear decay? How exactly do the gears in the transmission move around and combine with each other to create a specific gear ratio? These concepts are probably pretty simple for a lot of people, but they just make my head spin.
That's what the control rods are for. The uranium in one fuel rod in isolation decays at whatever natural rate, which would warm water but not boil it, and placing the rods near each other allows for the decay products (high energy particles) to interact with other fuel rods and induce more rapid decay.
The control rods slot in between the fuel rods, and absorb the decay products without inducing further nuclear decay. Usually these are graphite rods.
> How exactly does do the gears in the transmission move around and combine with eachother to create a specific gear ratio?
It really depends on the specific transmission. A manual transmission uses the shift lever movement to move the gears into place. An automatic transmission most likely uses solenoids to move things. (A solenoid is basically a coil of wire around a tube with a movable metal rod inside; when you put current through the wire, the rod is pulled into the tube. You attach the larger thing you want to move to the end of the rod, sometimes with a pivot or whatnot, and use a spring, another solenoid, gravity, etc. to make the reverse movement.) A solenoid by itself gives you linear movement; if you need rotational movement, one way is to put a pivot on the end of the solenoid rod, then a rod from there to one end of a clamp on a shaft, so that when the solenoid pulls in its rod, the shaft rotates (this is the basic mechanism of pinball flippers).
AFAIU graphite rods increase fission by slowing (not capturing) neutrons, which in turn have a better chance of propagating further fission, because... physics.
Quite nifty actually: without the moderator, the fuel won't burn.
The primary substance of the control rods is (usually) a neutron absorber, and most reactors with control rods have a passive safety system, so gravity and springs will force the control rods in to significantly slow the reaction unless actively opposed by the control system.
The Chernobyl rods had graphite ends, so that when fully retracted, the reactor output was higher than if there were simply no neutron absorber present. Unfortunately, this also meant that going from fully retracted to fully inserted would increase the reactivity at the bottom of the reactor before reducing it. In the disaster, this process overheated the bottom of the reactor, damaging the structure; the control rods got stuck, and then really bad things happened.
Long story short, most control rods don't have graphite. ;)
Scott Manley explains things so well. Highly recommended channel.
In a nuclear reactor, things are a bit different. You start with uranium that is only slightly radioactive and does not produce any usable quantity of heat. You place it in the correct geometry and start a controlled nuclear chain reaction: neutrons split uranium, which produces heat and more neutrons.
Controlling this reaction so that it keeps running, but not so vigorously as to melt your reactor, is what, as far as I understand it, makes nuclear reactor design hard.
Also politics, corruption etc.
The web is like: look ma, new shiny stack, gotta use it. The amount of tooling involved in the creation of even simple things is often staggering, without any real need for it. And often, if you don't approach webdev in the "politically correct" way, you can be laughed out of the door.
Indeed, there's this pile of people who don't know how computers actually work and only have experience as 'web developers', so they pile on level after level of abstraction without concern for performance or even common sense. Don't get me wrong, some of the abstractions are very good, but most are founded in ignorance of the technology that has gone before, and so they eventually collapse due to the problems inherent in their architecture, mostly things that were discovered in the '80s or earlier.
It may be a bit off-topic, but could you elaborate on this a little more?
Although modern automatics are probably a bit easier to understand than old ones, especially CVTs? As long as you're OK with "the computer just triggers this solenoid..." rather than understanding a big hydraulic computer.
The Howstuffworks article is great. Have fun.
I think you're referring to dual-clutch transmissions: odd gears on one clutch, even gears on the other. So the transmission can switch from eg. gear 2 to 4 with the even-numbered clutch disengaged while transferring power through the odd-numbered clutch in gear 3. When it's time to move up to gear 4, one clutch is disengaged as the other is engaged, instead of having to leave a single clutch disengaged while the gear change happens.
The whole show isn't like that, but it does try to show that so much of the issue was caused by a desire to be seen as infallible (of course the reactor design wasn't flawed, because the people's greatest minds worked on it, etc., etc.) and that was something that had to be dealt with atop the actual disaster.
Or 30, if you believe the govt. report.
They did finally fix the cause of the explosion in the other reactors of that design, remarkably many years later.
Nobody here will defend coal, but graphite-moderated reactors are not the tech you want to be defending.
That seems a tad unlikely.
Exactly. And the reason we're stuck with 60-year-old reactor technology is...
I remember this page but I don't know of a modern update: http://lkcl.net/reports/ssd_analysis.html
These days, if I want an SSD for my desktop and want to minimise the chance I have a disk problem and have to restore from backup, would I be better off with one "data centre" drive (eg Intel D3-S4510), or two mirrored "consumer" drives (perhaps from two different manufacturers)?
The prices look similar either way.
They do make NVMe RAID solutions now, the advantage being that NVMe can be faster than SATA. And there are various price points for NVMe drives depending on speed.
This one from 2018 (not sure if it has full raid or uses VROC)(EDIT: it requires software raid)
This one is cheaper but relies upon Intel VROC (which has been hard to get working on some mobo's apparently)
In either case you're looking at a max throughput of 11 gigabytes per second, which is roughly 20 times faster than SATA 3's 6 gigabits per second (about 600 MB/s after encoding overhead).
You're saying the performance gains stop at two drives in RAID striping. Would RAID 10, with two stripes and two mirrors, still bottleneck at 8 total lanes?
I also need to see about the PERC being limited to 8 lanes - no offense - but do you have a source for that?
Edit: never mind on the source, I think you are exactly right: "Host bus type: 8-lane, PCI Express 3.1 compliant"
To be fair, they have 8GB of NV RAM, so it's not exactly clear-cut how obvious a bottleneck would be.
No idea if 7.68TB RAID 1 over two drives with software RAID is much worse than a theoretical RAID 10 over four 3.92TB drives... apparently all the RAID controllers have a tough time with this many IOPS.
Cards with PLX switches are a way to fix this if you can't upgrade your whole platform, but the price point is a multiple of simple bifurcation cards, since you have to integrate a whole PCIe switch on-card.
The architecture diagrams here are quite helpful:
1. Basically no software is prepared to be "graceful" about its storage suddenly going read-only, especially when the OS is trying to run off that drive.
2. Intel drives at end of life go read-only until the next power cycle, whereupon they turn into bricks.
3. The threshold at which Intel drives go read-only is when the warrantied write endurance runs out, not when the actual error rate becomes problematic. This makes sense if the drive is trying to ensure the flash is still in good enough condition to have long data retention, so that you can reliably and easily recover data from the drive. But (2) already rules that out.
That's inconvenient, given that the first instinct of many people at the first sign of read trouble is to power-cycle the drive. Maybe not in enterprise scenarios, but certainly in consumer ones.
This guy seems to say the opposite, in that the files are "simply not there anymore", contrary to everything I've read: who's right here?
Not really “easily”. At the very least you’ll need a modded flash controller that can bypass the flash translation layer. And on SSDs with TRIM support you’re racing against the garbage collector, which will erase any unused (i.e. deleted) blocks.
(which is one great promise of zfs on linux/openzfs - cross-platform encrypted removable storage).
and by using AES they can probably claim to satisfy some security standards that will make their marketing people happier.
That's why a secure erase takes seconds, rather than the drive-size-divided-by-bandwidth seconds a full overwrite would take.
From my limited second-hand experience, it's the complete opposite. Data recovery services can't reassemble data from memory blocks of dead SSDs.
Maybe it works for the author's very specific system or use case, but on my personal MBP laptop with a very occasional usage pattern (some days I don't use it at all), I end up with 10 GB of writes per day on average. At that rate it's 11.4 years, not so many. And I don't do anything very disk-expensive on it like torrent downloading or database testing, only general development tasks, watching online videos, web surfing, docs, etc.
I guess what I’m saying is that for modern SSDs I don’t think write endurance is a binding constraint in most cases.
I check "Data written" under the Disk tab in Activity Monitor.
Currently my troubles with SSDs in a PC/server/NAS environment are somewhat more practical: NVMe/SATA compatibility, M.2 key types, PCIe port bifurcation support vs. PLX switches, none of which are even mentioned. Advice on this is notoriously hard to find; resorting to trusting rare and random forum posts is the current state of my knowledge there.
OK, he’s talking about SSDs, but I want to mention that I’ve easily recovered many large deleted files from SD or micro-SD cards (formatted as FAT32 or exFAT) using Norton Unerase or an equivalent utility.
Are the controllers for SSDs that different from the controllers for SD cards? Has anyone tried Norton Unerase or an equivalent program on an SSD? I’d like to hear a first hand account to help confirm (or deny) what the author claims.
In other words, when you "delete" a FAT file, you are not erasing the whole directory entry; you merely overwrite one byte of it (the first character of the filename is replaced with the 0xE5 "deleted" marker).
The data blocks get actually "erased" when they are reused by the FS software.
Recovery of basic deletion is therefore pretty much guaranteed as long as you didn't write something else on the disk.
What the author is talking about is more "serious" deletion, sometimes called "shredding". To actually erase the data from the disk, you use software that overwrites your file with random data before deleting it. This is supposed to work if the filesystem is only as "clever" as FAT, that is, somewhat dumb (but so simple that SoCs such as ARM Cortex-A parts can boot from it directly).
SDs and SSDs add another challenge because they constantly lie to the filesystem software: they have their own internal controller, mainly to manage bad sectors and do wear leveling, so when the FS requests to fill a sector with zeros, they might say "ok" but actually just remap the sector internally to another empty sector. So the data is still somewhere on the chip, and an evil scientist with lots of pointy probes can in theory read it back.
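For the FAT part at least, the mechanics are simple enough to show; a sketch of the directory-entry trick (0xE5 is the real FAT "deleted" marker; the helper names are mine):

    DELETED = 0xE5   # first byte of a 32-byte FAT directory entry

    def mark_deleted(entry: bytearray):
        # this is all 'deleting' a FAT file does to the entry itself;
        # the FAT chain is freed but the data clusters are untouched
        entry[0] = DELETED

    def looks_recoverable(entry: bytes) -> bool:
        # an unerase tool scans for 0xE5 entries and then checks whether
        # the clusters they pointed at have been reused yet
        return entry[0] == DELETED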
If it doesn't, the deleted files should retain their data until the filesystem reuses the sector, as normal. If trim is used, the SSD doesn't have to retain the data, but it doesn't necessarily make it unreadable immediately, there are many implementation strategies for trim.
It was once explained to me that writing to an HDD is like painting: new data can simply be painted over the top of the old, discarded data. Writing to an SSD is like writing on a chalkboard: the old data has to be wiped off before you can write over it.
The non recognition issue goes away after I power off and power on.
It's been this way for about 8 months, happening roughly 1 in 15 times I power on. I've heard it may have something to do with "sleep mode" or something like that, though I always shut down via software.
It wasn't a myth, that was the idea all along.
> SSD Defragmentation [...]
An important factor left out is wear leveling: it doesn't make as much sense to arrange data in file-system order when the bits on the drive move around anyway.
I find myself copying entire partitions between SSDs from time to time, is there a utility to clear the destination SSD before copy?
Is it possible to do the same for an SD card, so that writing a new Raspi OS to it doesn't do unnecessary garbage collection?
If you want to trim an entire device, see blkdiscard (take care!).
The way TRIM works, you don't need to trim blocks right before overwriting them. TRIM is for blocks that you aren't going to care about for quite some time, it essentially returns them to the SSD management layer for use as overprovisioning.
In my experience "blkdiscard /dev/DESTINATION_DISK" does improve the speed of dd'ing a disk to another one quite a bit though.
And that does make some sense IMHO:
If the SSD's internal data structure that tracks which pages hold user data is empty, then each write spends less time looking up whether the specific sector is present in that structure.
For most SSDs, this lookup is a single DRAM fetch per 4kB of user data, and therefore much faster than even a single NAND flash read, let alone a NAND flash program or erase operation.
The reason that imaging a SSD is faster after it's been trimmed or secure erased is that every erase block is empty and ready to accept new data. When you overwrite data on a used disk, the drive has to free up erase blocks and that generally involves moving old data elsewhere to preserve it—because the drive doesn't know that the commands to overwrite that are also coming soon.
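For a sense of scale on that mapping table, a quick back-of-envelope (assuming the common scheme of one 32-bit NAND address per 4 KiB of user data):

    capacity_bytes = 10**12              # a 1 TB drive
    entries = capacity_bytes // 4096     # one map entry per 4 KiB logical chunk
    table_bytes = entries * 4
    print(f"{table_bytes / 1e9:.2f} GB") # ~0.98 GB of DRAM

which is where the familiar ~1 GB-of-DRAM-per-1 TB-of-flash rule of thumb comes from.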
So I finally kicked it into Reader View, only to find a lot of questionable spelling and grammar issues.
These kinds of basic things go a long way to making an article valuable.
Edit: Upon further inspection I see that this page was designed to be hard to read. Very curious.
in old-fashioned, tree-based paper, unfortunately
Worse, some of them I'd sent to people dozens of times to read, and there were these horrible typos. Cringe.
Is it just me, or has anyone else noticed that the font is really tough to read?
It is apparent from the first sentence and the website itself that this article is, or arises from, the musings of, shall we say, an amateur. It's not professional because I'm not a professional. Whilst some of the comments in this thread are helpful, some are baffling. To the person who couldn't get past the header because Comic Sans offended him: the following 8058 words were in Century Gothic, which is perhaps not so offensive. I have however accepted the comments on readability and changed the header and all the rest of the text to Calibri, made the font larger, and changed the line spacing to web standards. I hope it is now more soothing and readable.

As for the claim that the text is 'spoiled by having dozens of typos and spelling mistakes', I'm not sure what he is reading (or smoking). Apart from the removal of one stray colon, the current text is entirely unchanged - it is the same file. I don't claim to be perfect, and I'm sure that in such a large essay there are some typos and other errors, but to say that there are dozens is patently untrue. Just let me know where they are and I will happily correct them.

Apart from all that, I am surprised that there are so few comments about the veracity or otherwise of the contents. I had great difficulty finding material that was up to date and actually delved into the technicalities of SSDs without baffling me, as much of it did.
I shall end by thanking those who found the article interesting and I hope that some of it at least has been of some help.