Unusual disk latency (sun.com)
106 points by there on Jan 2, 2009 | 22 comments



I used to write the firmware that runs the tracking systems for disk drives (specifically, some of my stuff was in the Quantum Atlas III, IV, 10K, and 10K II). This does not surprise me at all. Cabinet vibrations were always a HUGE component of track mis-registration (i.e., the heads going off the tracks). It used to really annoy us when the cabinet and workstation manufacturers would stick our drives in flimsy, crappy components and wonder why the performance would suffer.


Enterprise drive makers like Seagate and WD now claim to have features that limit the damage caused by cabinet vibrations. I was intrigued to note that these features are implemented purely in firmware.

Any idea what's happening behind the scenes? I assume Seagate measured the common vibration frequencies and counteracts them with head movements?


While shouting like that is unlikely, I wonder if some data centres are affected by traffic rumble or other sources of vibration. (Having your server farm performance dip when you switch to diesel generation might be a problem.)

I guess that might manifest as "slow racks" or "slow data centres". It also suggests that rubberised disk mounts (as are used for 'silent' desktop PCs) might make sense on server systems too (or are they standard already?)


What's coolest is that the analytics tools were available to discover it. No other storage solution I've seen has that level of instrumentation.


I was at a conference with a Sun evangelist banging on about this stuff. I was pretty skeptical. One of our Hadoop grid sysadmins used to work at Sun. He was raving about things like DTrace and a bunch of features of Solaris that other *nixes don't have.

I guess I'm not 100% sold on OpenSolaris but they do have some pretty badass stuff in there.


It's nice that they release the fruit of their labours in OpenSolaris.

I've rolled my own hybrid storage and played with it in ZFS using a RAM disk and flash storage and it really does work - it can make a RAID of slow SATA disks look like a bigger RAID of fast 15krpm disks, provided all your workload fits in the various ramdisk/flash caches.
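To put rough numbers on the "provided all your workload fits" caveat, here is a minimal sketch of average read latency as a weighted mix of cache hits and disk misses. The latency figures are hypothetical and chosen only for illustration; they are not measurements from any ZFS setup.

```python
def effective_latency_ms(hit_rate, cache_ms, disk_ms):
    """Average read latency for a cache sitting in front of slower disks."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms


# Hypothetical figures, assumed for illustration only.
CACHE_MS = 0.1   # assumed flash read latency
DISK_MS = 10.0   # assumed SATA disk read latency

for hit_rate in (1.0, 0.99, 0.9, 0.5):
    latency = effective_latency_ms(hit_rate, CACHE_MS, DISK_MS)
    print(f"hit rate {hit_rate:.2f}: {latency:.2f} ms average")
```

Under these assumed figures, even a small miss rate pulls the average toward the slow disks, which is why the working set needs to fit in the caches.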


Sun went out of their way to make an amazing storage product. Interestingly, they started with Perl, then rewrote everything in JavaScript/C instead[1].

You can even download a VMware image to test their UI and provisioning: http://www.sun.com/storage/disk_systems/unified_storage/reso...

NetApp should go out of business already.

[1]: http://news.ycombinator.com/item?id=390000 (yes, item 390,000)


One of those completely random discoveries that almost never happened in this instance of the universe. Sorta like vulcanized rubber. Or waffle cones.


How much overhead does it take to record disk latency data in realtime? Does it slow down the disks? I read on the guy's blog that ZFS's write caching meant that end users didn't experience increased latency when the guy yelled at the disks, but surely there must be some performance cost to all this instrumentation.


It's possible thanks to the magic of DTrace.

In a storage appliance, you wouldn't expect the load to be overwhelmingly CPU bound[1], so performing real time instrumentation isn't a noticeable performance hit: http://www.solarisinternals.com/wiki/index.php/DTrace_Topics...

[1]: assuming you aren't using ZFS gzip compression: http://blogs.smugmug.com/don/2008/10/13/zfs-mysqlinnodb-comp...


Highly wild idea: if specific vibration causes specific disk latency, then given a high-precision instrument, what are the chances of being able to eavesdrop on sysadmins at a colo site? ;)


Substantially lower than if you put a mic inside one of the servers, preferably next to the concealed webcam ;)

Also, to directly answer your question: The latency demonstrated is in response to high amplitude noise. For talking you'd need it to be sensitive to the frequency as well.

Perhaps you could tell if things weren't going well (lots of yelling)?


You know, I have been suffering from unusual disk latency recently. I'm going to send someone into the DC to check for this. Not howling lunatics mind, other potential sources.


Does anybody make a vibration sensor that could be used to correlate cabinet vibration with latency?


Dunno. I plan to use a network engineer with a mobile phone and have him walk around the datacentre putting his hands on things while I kick off batch jobs. Tho' now that you mention it, a metal ruler held in his teeth would probably be better.
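If some vibration sensor were available, the correlation asked about above could be checked with a plain Pearson coefficient over time-aligned samples. This is only a sketch: the sensor, its units, and the readings below are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-second samples: vibration amplitude (arbitrary units)
# and average disk latency (ms) over the same seconds.
vibration = [0.1, 0.1, 0.9, 1.2, 0.2, 0.1]
latency_ms = [5.0, 5.2, 48.0, 61.0, 7.5, 5.1]

print(f"correlation: {pearson(vibration, latency_ms):.2f}")
```

A value near 1 would suggest latency rises with vibration. A low value would not rule it out, since the effect may depend on frequency rather than amplitude alone.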


I wonder if the kind of "anti-noise" tech used in high end Japanese cars around the wheel arches would be useful in a data center.


I want to see the latency graphs during an earthquake.



That would make a seismograph that actually records the data on itself.


For those of us unable to watch the video - what is the explanation?


He screamed at the disks to recreate the graph. Disk vibrations led to latency.


Don't hard drives use voice coils to move the arm back and forth?



