
Show HN: Available .com's in /usr/share/dict/words - gamegoblin
http://pastebin.com/raw.php?i=fY8xbZwB
======
jedberg
Histogram of word length. Keep in mind that the median word length in English
is 5:

    
    
                  5     1 #
                  6    14 ##
                  7   104 ###########
                  8   326 #################################
                  9   414 ##########################################
                 10   380 ######################################
                 11   320 ################################
                 12   200 ####################
                 13   133 ##############
                 14    73 ########
                 15    35 ####
                 16    12 ##
                 17     8 #
                 18     1 #
                 19     1 #
                 20     1 #
                 22     1 #
    

edit: Someone points out below I had an off by one error, which I've now
fixed.
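
For anyone who wants to reproduce a histogram like this, here is a minimal Python sketch. It assumes one `#` per ten words, rounded up, which is what the bars above appear to use; the function name and the `scale` parameter are made up for illustration.

```python
from collections import Counter

def length_histogram(words, scale=10):
    """Return one 'length count bar' line per word length, shortest first."""
    counts = Counter(len(w) for w in words)
    lines = []
    for length in sorted(counts):
        n = counts[length]
        # Ceiling division, so a length with a single word still gets one '#'.
        bar = "#" * -(-n // scale)
        lines.append(f"{length:>2} {n:>5} {bar}")
    return lines

# Tiny in-memory sample; a real run would read the posted list instead.
sample = ["livia", "dourer", "lorded", "shlocky", "pledges"]
print("\n".join(length_histogram(sample)))
```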

~~~
jloughry
Sorting by length is an interesting programming problem; it can't be done by a
regular expression. One way is to use a mask:

    
    
        % sed -e "s/./x/g" < original.txt > mask.txt
        % paste mask.txt original.txt | sort | cut -f 2
        % rm mask.txt
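
The mask trick is a decorate-sort-undecorate pattern. A rough Python sketch of the same idea follows; `sort_by_length` is a name invented here, and ties are broken alphabetically the way a plain `sort` on the pasted lines would under a byte-wise locale.

```python
def sort_by_length(words):
    # Decorate: pair each word with a mask of x's as long as the word,
    # like the mask.txt column that sed produces.
    decorated = [("x" * len(w), w) for w in words]
    # Sort: tuples compare on the mask first, then on the word itself.
    # A shorter mask is a prefix of a longer one, so it sorts first.
    decorated.sort()
    # Undecorate: drop the mask, like `cut -f 2`.
    return [w for _, w in decorated]

print(sort_by_length(["shlocky", "livia", "dourer", "lorded"]))
```

In everyday Python, `sorted(words, key=len)` does the job without the mask, though it keeps the input order for words of equal length instead of alphabetizing them.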

~~~
binarymax
Golf time! in js:

    require('fs').readFileSync('list.txt','utf8').split('\n').sort(function(a,b){return b.length-a.length});

~~~
jloughry
Eliminate the temporary file:

    
    
        sed -e "s/./x/g" < list.txt | paste - list.txt | sort | cut -f 2

~~~
jloughry
We can eliminate the paste by using sed's hold buffer, but it costs another
regexp to get rid of the spurious newline:

    
    
        sed -e "h;s/./x/g;G;s/\n/\t/g" list.txt | sort | cut -f2

------
joshu
The vast majority of the good ones are taken, unfortunately. I am guessing
he's not doing whois but instead looking at DNS records.

I have a daily cron job that looks at available domains sorted by word
popularity. The vast majority are in some status that makes them not show up
in DNS but are registered.

That said, I've been thinking of making some of the domain name tools I'm
working on public...
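
To make the DNS-versus-whois point concrete, here is a small Python sketch of a DNS-only check. It is not the script used for the posted list; `resolves` and its `lookup` parameter are hypothetical names, and the lookup is injectable so the function can be tried without a network.

```python
import socket

def resolves(domain, lookup=socket.getaddrinfo):
    """Return True if the name resolves to at least one address."""
    try:
        return bool(lookup(domain, None))
    except socket.gaierror:
        # No DNS answer. The name may still be registered, for example
        # while it sits in a redemption or hold status.
        return False
```

A `False` here only means "no DNS answer", so a list built this way will overstate availability, which matches the registered-but-not-in-DNS cases described above.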

~~~
gamegoblin
I'm using a modification of
[https://gist.github.com/peterc/63893](https://gist.github.com/peterc/63893)

~~~
mappu
If you're interested in doing more of this stuff, I recommend getting in touch
with verisign - they can give you free FTP access to copies of the .com and
.net root zonefiles.

~~~
gamegoblin
Not really at all. This was just a 20 minute morning hack for fun. Thanks for
the info, though.

------
shortformblog
Best one I saw in here was "pledges," which is screaming for a fraternity-run
social network.

~~~
shittyanalogy

        Domain Name: PLEDGES.COM
        Updated Date: 19-feb-2014
        Creation Date: 11-jan-2000
        Expiration Date: 11-jan-2014
    

So it's probably in redemption

------
bigtunacan
These two would be good for specialized dating sites.

homeliest.com - A dating site where ugly people meet other ugly people.

outfoxes.com - Great name for a gay dating site.

~~~
bigtunacan
So I tried registering outfoxes.com. It was available and in my cart... I get
to check out and it is no longer available, but is now "up for auction" by
name.com. WTF?

------
justizin
<same program returns empty list tomorrow>

~~~
gamegoblin
I was planning on running it again tonight just for fun. There were 2024 to
start with. We'll see what happens.

~~~
kiliankoe
I'm curious, how many of them were snatched?

------
aiiane
Some interesting numbers:

    
    
        $ wc -l domains.txt
        2024
        $ grep -Ev "(ing|s|ed|ly|est|ier)$" domains.txt | wc -l
        127
        $ grep -E "^.{1,7}$" domains.txt | wc -l
        119
        $ grep -E "^.{1,7}$" domains.txt | grep -Ev "(ing|s|ed|ly|est|ier)$" | wc -l
        15
    

The 15 words from the last command:

    
    
        almohad
        ayyubid
        boneyer
        chaitin
        copulae
        dourer
        eagerer
        layamon
        livia
        muawiya
        neglig
        shlocky
        unriper
        unsafer
        vilyui
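
The same two filters can be sketched in Python with the `re` module. The patterns are copied from the grep commands above; the function name is made up for illustration.

```python
import re

# Same patterns as the grep -E commands above.
SUFFIX = re.compile(r"(ing|s|ed|ly|est|ier)$")
SHORT = re.compile(r"^.{1,7}$")

def short_unsuffixed(words):
    """Keep words of at most 7 characters that lack the listed suffixes."""
    return [w for w in words if SHORT.search(w) and not SUFFIX.search(w)]

sample = ["livia", "shlocky", "tinglier", "unsays", "surfboarded", "dourer"]
print(short_unsuffixed(sample))
```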

~~~
oakwhiz
chaitin.com: a service that solves the halting problem

------
jloughry
Here is the same list sorted by length:

[http://applied-math.org/words_sorted_by_length.txt](http://applied-math.org/words_sorted_by_length.txt)

~~~
jere
Nice! Just a reminder of what's not here (and thus is registered):

All but one of the English words in a 100k word list that are shorter than 6
characters. Average English word length is about 5 characters. Kind of
mind-boggling.

------
hnha
chancel.com - get a 50:50 chance of cancelling your process or killing pid 0.

cuddlier.com - our cushions are

marauded.com - rebuild your home in this upcoming iOS free-to-play.

besieging.com - the long-awaited sequel to Marauded. Spend the rest of your
money on useless hats.

gawkiness.com - exposing the shady business practices of the unmentionables

offstages.com - your stars in authentic interviews

------
binarymax
Great hack!

Deeply trying to resist buying shlocky.com

~~~
gamegoblin
Was looking for an excuse to play with Haskell threads, so I read in the dict,
put it into groups of 500 words each, and then distributed each group to one
of 200 threads to whois everything.
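
The grouping scheme is easy to sketch. This version is in Python rather than Haskell, and `is_available` is a hypothetical stand-in for the real whois lookup, which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor

def chunks(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def check_group(group, is_available):
    # One worker walks one whole group of words.
    return [w for w in group if is_available(w)]

def find_available(words, is_available, group_size=500, max_workers=200):
    groups = chunks(words, group_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in group order, so the output stays sorted
        # the same way as the input word list.
        results = pool.map(lambda g: check_group(g, is_available), groups)
        return [w for group in results for w in group]
```

With roughly 100K words and groups of 500, `chunks` yields about 200 groups, which lines up with the 200 threads mentioned above.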

------
adamboulanger
A music notation startup's field day. Multiple music terms: decrescendoes,
diminuendoes.

What's your retention rate at decrescendoes? <add joke here>

~~~
jaredsohn
If you're creating music-related startups or software, you're probably better
off getting a .fm domain, since it is easier to find good domains there and
users are accustomed to it.

Funny enough, I actually considered some of the terms you suggested (my
software fades out music when you play a video) before I started thinking
about .fm. The obvious problem with the terms you mentioned is that they are
long, difficult to spell, and obscure to non-musicians.

~~~
adamboulanger
No .fm necessary for sextettes - the online dating site for music
professionals.

------
bwooce
New words for my vocabulary:

congaed - past tense & participle of conga

scrod - young cod/haddock/white fish, split and boned (also schrod, also
available).

------
drakaal
symmetricly isn't a correct spelling as far as I can tell. This was just one
that jumped out at me, looks like others might have similar issues.

Several of these had already been registered for a long time, so I'm not sure
how the check was done. I did pick up UNSeats and UNsays, though I will likely
use them as "United Nations Says", not "unsays", which I don't think is
something you can do.

------
mickeyp
What's interesting is how many of the words are "negative" or have negative
connotations.

~~~
e28eta
I'm struck by how many are pluralizations or modified in some other way (-ier,
-est, ...)

------
shitlord
I wonder who will purchase urethrae.com. I thought fetish porn creators would
have bought it by now.

------
WoodenChair
Why can't I find a clear explanation anywhere of when I'll be able to buy .app
domains?

------
gamegoblin
This is using the ~100K American English wordlist.

------
thom801
These are some pretty shit domain names hehe. Here is everything with fewer
than 8 chars. Lorded.com seems promising though.

    with open('domains.txt') as domains:
        for line in domains:
            if len(line) < 8:
                print line

befogs

cagily

dourer

drolly

gusted

livia

lorded

neglig

scrods

shirrs

soughs

tyroes

unmans

unsays

vilyui

~~~
addandsubtract
You seem to be counting \0 as a char, or meant to find words less than 7
chars. Either way, there are just over 100 names available with 7 chars.

~~~
adamcanady
Probably counting \n as a character since Python is pretty good about never
dealing with \0's.
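
A quick way to see the newline effect is to compare the raw length check with one that strips the line ending first. This is only a sketch with invented function names; the lines iterated from a file keep their trailing `\n`, which is simulated here with literal strings.

```python
def shorter_than_raw(lines, limit):
    # Mirrors the loop above: len(line) still includes the trailing newline.
    return [line.rstrip("\n") for line in lines if len(line) < limit]

def shorter_than(lines, limit):
    # Strip the newline before measuring, so only real characters count.
    return [line.rstrip("\n") for line in lines if len(line.rstrip("\n")) < limit]

lines = ["livia\n", "lorded\n", "shlocky\n"]
print(shorter_than_raw(lines, 8))  # 7-letter words are dropped
print(shorter_than(lines, 8))      # 7-letter words are kept
```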

------
BorisMelnik
can't believe "livia" wasn't taken as a misspelling of "Libia"

~~~
gamegoblin
There are several good ones that I could see possible businesses wanting, such
as "surfboarded"

------
saimey
tingliest.com - The 9GAG alternative the world has been so eagerly waiting
for!

tinglier.com - oh shoot!

------
jbyers
livia.com would be the best of the bunch; whois shows it taken.

