I think the only "secure" way to erase the contents of a hard drive is to repeatedly overwrite the disk surface with a mix of random/patterned data (like Darik's Boot & Nuke does).
Also, for those wondering about the blocking of /dev/random: it limits how much data you can copy with dd, but you won't notice unless you try to read more from the entropy pool than is available for random number generation. For more information, see this question on Super User:
http://superuser.com/questions/520601/why-does-dd-only-copy-...
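To make that concrete, here is a small illustration (a sketch; exact behaviour depends on the kernel and on how much entropy is available at the time):

# /dev/random can return short reads once the kernel's entropy estimate is
# exhausted, so dd ends up copying fewer bytes than requested:
dd if=/dev/random of=/dev/null bs=1M count=1    # often reports something like "0+1 records in"
# /dev/urandom never blocks and delivers the full amount:
dd if=/dev/urandom of=/dev/null bs=1M count=1   # "1+0 records in"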
The ATA secure erase command is faster and should be better than overwriting: overwriting can miss sectors that have been remapped as bad, while secure erase will get those too.
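For reference, the usual hdparm invocation looks roughly like this (a sketch: /dev/sdX and the password are placeholders, and the drive must not be in the security-frozen state):

hdparm -I /dev/sdX                                          # check that security is supported and "not frozen"
hdparm --user-master u --security-set-pass pass /dev/sdX    # set a temporary password
hdparm --user-master u --security-erase pass /dev/sdX       # issue the ATA secure erase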
Multiple overwrites are pointless. There's the Gutmann stuff, but that's ancient, and the 35 passes were for covering multiple drive controllers when you didn't know which one was being used.
But then sometimes you don't get to do what works, only what other people tell you. So if you're working to a standard, it doesn't matter whether the DoD specification is actually any more secure than a single secure erase; you do what the spec calls for. And if you have to persuade other people that the data is provably gone, it's easiest to just grind the drives.
Secure erase depends on the technology of the drive and the prowess of your attacker.
If you fear the NSA seizing your disks, consider the tradeoffs of explosive disposal.
If you fear a technically-savvy reporter going through your trash bin, overwriting your disk three or four times with patterns will be fine. But it's probably faster to take a drill and make a couple of holes. Make sure you hit the platters.
If you are selling your old hardware and just don't want your unencrypted stuff to be recovered by a sixteen year old with no budget but lots of time, overwrite the disk once.
If you're trashing an SSD, make sure any patterns you use for overwriting are not compressed out of existence by the controller. Or pull off the controller and crunch it.
The whole "overwrite xx times" is a myth. Overwriting it once with random data makes it completely unreadable. Overwriting it more than that is a complete waste of time.
"e" is an alias for emacsclient ; "veille" is a script which toggles between "xset s 5" and "xset s default" ; "x" is an alias for "xinit" ; "todo" is a script which manage a text file which I use as a todo list.
"ls", "cd", and "git" are far more used than any other commands : 14808, 13256 and 10078 times respectively, against 3919 times for "ssh" which is just behind.
I obtained these data from my .bash_history. Here are the ranks of the commands that are listed in the article:
"tr" is 76th
"sort" is 66th
"uniq" is 120th
"split" is there only one time like many other command so its rank is not relevant
Substitution operations are what most of my "for" loops do, so "for" is in my top 42. However, see (1) below about the article.
File sizes are a mix of "ls" for individual files and "du" for multiple files; "du" is 65th
"df" is 63rd
"dd" is 473rd
"zip" is 123rd (and funnily "gzip" is 122nd)
I didn't use "hexdump"
(1) About the following line
for i in *.mp4; do ffmpeg -i "$i" "${i/.mp4}.mp3"; done
I have two remarks. First, using ffmpeg's "-vn" option would speed up the conversion by making ffmpeg ignore the video entirely. Second, substituting with '/' in the Bash expansion is not the right way to do that: "${i/.mp4}" would be "$i" without the first occurrence of ".mp4", wherever it appears. It is '%' (which strips a suffix) that you want here:
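Putting both remarks together, the corrected loop would be:

for i in *.mp4; do ffmpeg -i "$i" -vn "${i%.mp4}.mp3"; done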
I second this; I think it is good practice to rotate the history file manually when it reaches several MB. (Using zsh, the main symptoms of a large history file are sluggishness when using ^R and a lag of a few tenths of a second when closing a terminal.)
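The rotation itself is nothing fancy; something like this is enough (a sketch, assuming zsh's default ~/.zsh_history location):

mv ~/.zsh_history ~/.zsh_history.$(date +%Y-%m)   # archive the old history
touch ~/.zsh_history                              # start a fresh one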
For now I don't feel any annoyance: the shell starts instantly and closes instantly as well. If it starts to feel slow I might rotate the log, but I don't expect that to happen soon.
Turns out my .bash_history is clipped at its default size limit (500 lines), so I'll change that to gather more data for next year. My results started with:
147 clang++
77 ls
54 cd
15 gs
14 rake
13 vim
...
where gs is short for "git status"
...and thanks for the hint about the substitution! I updated it on my page.
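If anyone wants to reproduce counts like these, a rough one-liner (it only looks at the first word of each history line, so commands used after a pipe are undercounted, as pointed out elsewhere in this thread):

awk '{print $1}' ~/.bash_history | sort | uniq -c | sort -rn | head -20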
This would print the size of all files in the current folder and subfolders.
So it works like awk, but with full Perl: no need to learn awk syntax, and you can do conditionals, loops and whatnot. I write 30% of my one-time throwaway scripts directly on the command line.
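For example, summing the size column of "find -ls" output (a sketch; the size is the 7th whitespace-separated field, so $F[6]):

find . -type f -ls | perl -lane '$s += $F[6]; END { print $s }'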
This list is definitely more interesting to me. I discovered a few of these already this year and have been using them a lot (mtr, pv, curl for inspecting headers) and several others that I know I'm going to start messing with immediately (siege, multitail).
Another VERY useful tool I didn't see on this list is iperf. From the Debian package description:
Iperf is a modern alternative for measuring TCP and UDP bandwidth performance, allowing the tuning of various parameters and characteristics.
Features:
* Measure bandwidth, packet loss, delay jitter
* Report MSS/MTU size and observed read sizes.
* Support for TCP window size via socket buffers.
* Multi-threaded. Client and server can have multiple simultaneous connections.
* Client can create UDP streams of specified bandwidth.
* Multicast and IPv6 capable.
* Options can be specified with K (kilo-) and M (mega-) suffices.
* Can run for specified time, rather than a set amount of data to transfer.
* Picks the best units for the size of data being reported.
* Server handles multiple connections.
* Print periodic, intermediate bandwidth, jitter, and loss reports at specified intervals.
* Server can be run as a daemon.
* Use representative streams to test out how link layer compression affects your achievable bandwidth.
I use iperf initially when I'm troubleshooting poor file server transfer speeds, for example. There's a pretty Java GUI too if you want that.
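Basic usage is just a server on one end and a client on the other (a sketch; the hostname is a placeholder):

iperf -s                        # on the server
iperf -c fileserver -t 30       # on the client: a 30-second TCP throughput test
iperf -c fileserver -u -b 100M  # UDP test at a target rate of 100 Mbit/s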
Not really a bash utility, but delegating to this script from Vim has been a huge timesaver for long-running scripts. For instance, I can trigger my Ruby specs from Vim without Vim getting blocked. I wrote a blog post about it here: http://minhajuddin.com/2012/12/25/run-specs-tests-from-withi...
#!/bin/bash
#~/.scripts/runinbg
#Author: Khaja Minhajuddin
#Script to run a command in background redirecting the
#STDERR and STDOUT to /tmp/runinbg.log in a background task
echo "$(date +%Y-%m-%d:%H:%M:%S): started running $@" >> /tmp/runinbg.log
cmd="$1"
shift
$cmd "$@" 1>> /tmp/runinbg.log 2>&1 &
#comment out the line above and use the one below instead to get a notification
#when the command is complete (note: $rawcmd is not set anywhere above; something
#like rawcmd="$cmd $*" after the shift would be needed for the message text)
#($cmd "$@" 1>> /tmp/runinbg.log 2>&1; notify-send --urgency=low -i "$([ $? = 0 ] && echo terminal || echo error)" "$rawcmd")&>/dev/null &
The for loop works fine. Performing word splitting/wildcard expansion/etc on a variable will, well, split it into words/expand wildcards/etc, whether it was introduced by a for loop or not.
I am not sure I understand what you mean. Word-splitting is performed by the "for x in y" construct, using the IFS variable, so by default if you have a file called "foo bar.mp4" the command line from the article:
for i in *.mp4; do ffmpeg -i "$i" "${i%.mp4}.mp3"; done
will result in executing:
ffmpeg -i foo foo.mp3
ffmpeg -i bar.mp4 bar.mp3
Which is obviously not what was meant. So it's a good habit to learn to loop over files in a directory in a different way.
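For what it's worth, a common alternative that also descends into subdirectories is to have find hand the loop NUL-delimited names (a sketch):

find . -name '*.mp4' -print0 |
while IFS= read -r -d '' i; do
    ffmpeg -i "$i" -vn "${i%.mp4}.mp3" < /dev/null   # </dev/null keeps ffmpeg from eating the loop's stdin
done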
Except for those cases where you want counts, only-unique or only-duplicate lines, skipping fields, case-insensitive duplicate detection, or comparing only a chunk of each line. All but the last two I've used, and now that I've found those in the man page (GNU uniq) I can replace some silly shell hacks that I've used over the years. It's like saying that tail is pointless unless you're doing tail -f: you're missing the actually useful functionality by presuming the tool only does what you've done with it in the past.
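For the record, the options in question (file names are placeholders):

sort names.txt | uniq -c     # prefix each line with its count
sort names.txt | uniq -d     # print only duplicated lines
sort names.txt | uniq -u     # print only lines occurring exactly once
sort names.txt | uniq -i     # case-insensitive comparison
sort data.txt  | uniq -f 1   # skip the first field when comparing
sort data.txt  | uniq -w 10  # compare at most the first 10 characters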
And, by the way, the sort + uniq combination is something that could be done in O(n) rather than O(n*log(n)); but this little procedure is so damn easy to write that on relatively thin inputs, less than 20M lines or so, I usually just do this.
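The O(n) version is just a hash in awk (or Perl), for instance:

awk '!seen[$0]++' file                                 # unique lines, first occurrence kept
awk '{c[$0]++} END {for (k in c) print c[k], k}' file  # counts, like sort | uniq -c (output unsorted)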
Does anybody know of great guides to mastering the majority of the useful CLI tools available on Unix and Linux? For me the challenge is that I don't even know that some of these fantastic solutions are readily available. I'd really benefit from knowing they're there in the first place instead of hacking up a homebrew solution every time.
Learn Linux the Hard Way talks about them quite a bit; is that the go-to guide in 2012, or can I do better?
I also learned a lot from "The Linux Cookbook" (Second Edition) by Michael Stutz (this might be the first edition online: http://dsl.org/cookbook/cookbook_toc.html).
From what I recall, xxd is actually part of the vim distribution, so it's common but not everywhere. An approximately standard (coreutils) alternative is `od -x' (although it doesn't include the readable ASCII column on the right, which can be annoying).
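If you do want the printable-character column, GNU od's 'z' type modifier adds it (the file name is a placeholder):

od -A x -t x1z file   # hex offsets, hex bytes, plus a >...< column of printable characters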
That whole "find -ls | awk" is wicked slow anyway; try wc and xargs...
$ time find -ls | awk '{s += $7} END {print s}'
15970582120
real 0m27.721s
user 0m1.256s
sys 0m1.780s
$ time find | xargs wc -c 2> /dev/null | tail -1
604260969 total
real 0m0.332s
user 0m0.068s
sys 0m0.204s
I was wondering about that, but is it really wrong?
From what I found in the "BSD General Commands Manual":
Any arguments specified on the command line are given to utility upon each invocation, followed by some number of the arguments read from the standard input of xargs. The utility is repeatedly executed until standard input is exhausted.
and
-P maxprocs
Parallel mode: run at most maxprocs invocations of utility at once.
The way I interpret that is that you could run xargs in parallel mode, but by default the "utility is repeatedly executed" sequentially, one invocation at a time.
Even in sequential mode, for a sufficiently long input xargs will invoke the command multiple times to stay within kernel limits on command-line length, so wc prints a separate "total" line per batch and "tail -1" only reports the last one; that's why the two totals above differ.
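One way to get a correct total despite that (a sketch, assuming GNU find and no files literally named "total"):

find . -type f -print0 | xargs -0 wc -c | awk '$2 == "total" {s += $1} END {print s}'
# or skip wc/xargs entirely:
find . -type f -printf '%s\n' | awk '{s += $1} END {print s}'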
Hey, calm down, I didn't mean to be aggressive; sorry if my comment came across that way.
You can count scripts as commands (I did that in my other comment elsewhere on this page), but the way you do it you will miss a lot. For instance you won't count any "uniq", "sort", … that are almost exclusively used as filters rather than as the first command in a pipeline, and you will also miss a lot of "less" and "grep".
Some issues:
- Don't forget that /dev/random blocks
- It's easier to use dd_rescue to track progress than to signal dd
- Using dd to zero out a hard drive repeatedly doesn't increase security[1]. Using ATA secure erase does[2]
- An alternative for summing file sizes (one possibility is sketched below)
[1] http://en.wikipedia.org/wiki/Data_erasure#Number_of_overwrit...
[2] https://ata.wiki.kernel.org/index.php/ATA_Secure_Erase
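One common way to do that (a guess at what was meant; this uses GNU du's apparent-size mode):

du -sb .   # apparent size, in bytes, of everything under the current directory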