Hacker News

I've been HDD-free for a few months now; I can't imagine going back. It's not just latency that's an issue for me; it's also reliability (I've had 6 HDD failures in the last two years in various devices around the house). I also worry less about damaging something if I drop my laptop.

For my latest work project (I do scientific computing), I realized it's easier to do it on my laptop instead of the workstation. Since my laptop has an SSD, I can just use the filesystem as my database. This means that I can have millions of files (literally, millions) lying around and process them using the good old Unix shell. It greatly reduces development time compared to using a database. Just for giggles I tried doing this on a machine with a hard drive, and it was more than one hundred times slower.
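To make "filesystem as database" concrete, here is a minimal sketch of the idea in Python. It is not the commenter's actual setup, which isn't described; the function names `put`, `get`, and `keys` and the one-file-per-record layout are assumptions made for illustration.

```python
import os

def put(root, key, value):
    """Store `value` (bytes) as a plain file named `key` under `root`.

    Assumption: `key` is already a valid file name (no slashes).
    """
    os.makedirs(root, exist_ok=True)
    with open(os.path.join(root, key), "wb") as f:
        f.write(value)

def get(root, key):
    """Return the bytes stored under `key`."""
    with open(os.path.join(root, key), "rb") as f:
        return f.read()

def keys(root):
    """List all stored keys, sorted by name."""
    with os.scandir(root) as it:
        return sorted(entry.name for entry in it if entry.is_file())
```

Because every record is an ordinary file, the usual shell tools (`ls`, `grep`, `find`, `wc`) work on the "database" directly, which is presumably the appeal.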




I just had an Intel SSD die on me that was less than 2 months old. Without any warning it became unreadable. Hooking it up to an external device that I've used to retrieve data from faulty spinning disks didn't get any data off it either.

Right now I am back to a spinning disk and Time Machine for hourly backups. I'm seriously considering selling the replacement Intel SSD when I get it back. Some things, like loading programs, starting up, and shutting down, are much faster on the SSD. When it comes to installing stuff, extracting files, or writing data to disk, this 7200 rpm 500 GB drive is faster than the SSD.

Also worth noting: the $300+ that the 80 GB Intel SSD costs could buy four 500 GB laptop drives.
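Worked out per gigabyte, the comparison above looks like this. The sketch treats "$300+" as exactly $300 (an assumption, so the SSD figure is a lower bound) and assumes the four laptop drives cost $300 in total, as the comment implies.

```python
def dollars_per_gb(price_usd, capacity_gb):
    """Price per gigabyte for a drive (or a set of drives)."""
    return price_usd / capacity_gb

# Figures from the comment above; "$300+" is taken as exactly $300.
ssd = dollars_per_gb(300, 80)        # one 80 GB Intel SSD
hdd = dollars_per_gb(300, 4 * 500)   # four 500 GB laptop drives

print(f"SSD: ${ssd:.2f}/GB")         # SSD: $3.75/GB
print(f"HDD: ${hdd:.2f}/GB")         # HDD: $0.15/GB
print(f"ratio: {ssd / hdd:.0f}x")    # ratio: 25x
```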


Have you tried it on a box with an HDD and an adequate amount of RAM? I would think that on modern file systems with ordered metadata writes (or journaling), like ext3 or FFS with soft updates, disk speed wouldn't matter all that much, so long as you have enough RAM to keep everything in cache.

SSD is great; the problem is that a good SSD costs something like $15 per gigabyte, and good registered ECC DDR2 RAM costs just over $20 per gigabyte. Sure, in applications where consistency across power-loss events is a huge deal, SSD is the right answer, but for most applications, buying a whole lot of RAM is often faster and not that much more expensive.


RAM was not the issue. The reason for the slowness, as I understand it, is rather that different files are spread out over different areas of the disk, even if they are in the same directory. This is considered a feature, and I guess it makes sense under normal access patterns. So accessing a million files (even just to load them into memory for the first time) requires on the order of a million disk seeks, and takes forever. I might be simplifying a little bit, but this is my understanding.

Could I have re-written the code by messing around with inodes and other low-level details so that it accessed the files in physical order? Probably. Was it worth my time, rather than using an SSD? Hell no.
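For the curious, the inode trick could look roughly like the sketch below. It is not something from the original project; the helper names are hypothetical, and it rests on the assumption that inode order roughly tracks on-disk placement, which tends to hold on ext2/ext3-style filesystems but is not guaranteed anywhere.

```python
import os

def paths_in_inode_order(directory):
    """Return regular-file paths in `directory`, sorted by inode number.

    Assumption: reading in inode order approximates reading in physical
    order, which should cut down on seeks on a spinning disk.
    """
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                entries.append((entry.inode(), entry.path))
    entries.sort()
    return [path for _inode, path in entries]

def read_all(directory):
    """Read every file once, in inode order; return {file name: bytes}."""
    data = {}
    for path in paths_in_inode_order(directory):
        with open(path, "rb") as f:
            data[os.path.basename(path)] = f.read()
    return data
```

Whether this helps at all depends on the filesystem and on how the files were created, so it would need measuring on the actual data.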

I agree that SSDs are still a tad expensive for the average Joe. For most hackers, considering that we spend most of our work hours in front of a computer, I feel that the added productivity from an SSD is easily worth the investment.


http://en.wikipedia.org/wiki/Page_cache

The idea is that if you have enough RAM, you only need to read the files from disk once. After that, the files are in the RAM cache. Once the files are in cache, at least for reads, it doesn't matter how spread out on disk they were.

And yeah, you do need to read the files from disk once, and that is slow; thus you often see a 'warm up' effect on servers. Hitting a new page is often slower for the first person than it is for the second person who hits that same page.
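A rough way to see the warm-up effect for yourself is to time two back-to-back reads of the same file. This is only a sketch: the function names are made up for illustration, and the first read is only really "cold" if the file isn't already in the page cache (for example, right after a reboot).

```python
import time

def timed_read(path):
    """Read the whole file and return (contents, elapsed seconds)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return data, time.perf_counter() - start

def first_vs_second_read(path):
    """Read the same file twice; return (first_seconds, second_seconds).

    On a spinning disk with a cold cache, the first read pays for the
    seeks and the second is typically served from RAM. No particular
    speedup is guaranteed; it depends on the OS and on what is cached.
    """
    first, t_first = timed_read(path)
    second, t_second = timed_read(path)
    assert first == second  # same bytes either way; only timing differs
    return t_first, t_second
```

Run over a directory of many small files rather than one file, the gap between the two passes is where the seek cost shows up.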


RAM might not hold all the information, and reading the files isn't the problem; finding them is. SSDs have virtually no seek time, so in that respect they behave like RAM. Tossing in a 64 GB SSD essentially puts the entire file system in RAM, or at least that's what it feels like; it doesn't need warming up. It feels like everything's in the page cache all the time.


If the file is in cache, you can 'find' it in cache without hitting disk. Seek time (which I assume is what you mean by 'finding it') is only a problem if the file isn't cached in RAM.

Yes, running on an SSD takes the entire filesystem much closer to RAM speeds. However, you are doing so at almost RAM prices. (I'm speaking of good SSDs, like the X-25E, which comes to something like $15 per gigabyte; the not-so-good SSDs have problems of their own. I have an SSD in my laptop right now that is branded by one of the gamer RAM companies, I forget which one. It was pretty cheap, under $2 per gigabyte. It's pretty nice for reads; for writes, sometimes it is good, but often writes are worse than on spinning disk.) The advantage of just buying the RAM is that a good virtual memory management system can automatically optimize to keep the data you access most often in RAM.

Like I said, I use an SSD in my laptop. It's a cheap brand and it's small, but my laptop doesn't need a lot of storage, so the cost is reasonable, and I use a journaling file system, so writes are cached and the slow sub-cell-size write speeds of the cheap SSD aren't a huge problem. I'm just explaining why, in my servers, I prefer to go with a whole lot of RAM and then slow, cheap, and large SATA, rather than less RAM and expensive SSD.


I don't disagree about stuffing RAM in servers; that's obviously the best approach, but it isn't always an option. I'm talking about the desktop experience. Many of us are limited by time and circumstance and are still using 32-bit OSes on the desktop. I can't stuff it with RAM, but my Intel X-25M SSD makes my desktop smoke like no other hardware upgrade ever has.

Many of us are also stuck with mission-critical legacy 32-bit servers that we can't just take down and upgrade so easily, and licenses for the enterprise versions of some DBs that can handle assloads of RAM don't come cheap. SSDs are much cheaper and a no-brainer upgrade for that aging DB server that just needs to be faster. The X-25E smokes here, letting you get that speed without needing that expensive license.


What're you storing in files, and why is it in files versus in something more structured?



