
AWS random system failures - taf2
Anyone else noticing random system failures on AWS? Starting two days ago we've been noticing servers randomly die. At first it was a simple matter of forcing the host to stop and then start. Now we're getting disk corruption as well, e.g.:

```
tting hostname localhost.localdomain:  [  OK  ]
Setting up Logical Volume Management:   /var/run/lvm/lvmetad.socket: connect failed: Connection refused
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
[  OK  ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext4 (1) -- /] fsck.ext4 -a /dev/xvda1
/ contains a file system with errors, check forced.
/: Inodes that were part of a corrupted orphan linked list found.

/: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
	(i.e., without -a or -p options)
[FAILED]

*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
*** when you leave the shell.
/dev/fd/9: line 2: plymouth: command not found
Give root password for maintenance
(or type Control-D to continue): [   18.428094] random: crng init done
```

That's from the system log... lots of fun to recover these...
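For anyone hitting the same thing: the force-stop/start and the manual fsck the boot log asks for can be scripted against the EC2 API. Below is a minimal boto3 sketch of one way to do it, not necessarily what we ran; the instance IDs, volume ID, and attach device are placeholders, and the device names assume the /dev/xvda root layout shown in the log above.

```
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

BAD_INSTANCE = "i-0123456789abcdef0"     # placeholder: the wedged host
RESCUE_INSTANCE = "i-0fedcba987654321f"  # placeholder: a known-good helper box
ROOT_VOLUME = "vol-0123456789abcdef0"    # placeholder: the corrupted root EBS volume

# A plain reboot won't bring these hosts back, so force-stop first.
ec2.stop_instances(InstanceIds=[BAD_INSTANCE], Force=True)
ec2.get_waiter("instance_stopped").wait(InstanceIds=[BAD_INSTANCE])

# Detach the corrupted root volume and hang it off a rescue instance
# as a secondary device so fsck can run against it offline.
ec2.detach_volume(VolumeId=ROOT_VOLUME, InstanceId=BAD_INSTANCE)
ec2.get_waiter("volume_available").wait(VolumeIds=[ROOT_VOLUME])
ec2.attach_volume(VolumeId=ROOT_VOLUME, InstanceId=RESCUE_INSTANCE, Device="/dev/sdf")

# On the rescue instance, run the manual check the boot log asked for:
#   sudo fsck.ext4 -f /dev/xvdf1
# ...then move the volume back and start the original host.
ec2.detach_volume(VolumeId=ROOT_VOLUME, InstanceId=RESCUE_INSTANCE)
ec2.get_waiter("volume_available").wait(VolumeIds=[ROOT_VOLUME])
ec2.attach_volume(VolumeId=ROOT_VOLUME, InstanceId=BAD_INSTANCE, Device="/dev/xvda")
ec2.start_instances(InstanceIds=[BAD_INSTANCE])
```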
======
taf2
It looks like one of the us-east-1 availability zones is having issues with the
new m5-series hosts... they die and can't be rebooted.

