
Domain name search with regular expressions and curated sets - ssarah
http://namegrep.com/#(visit%7Chotels?)-(:geo/countries:%7C:geo/world/cities)
======
mikejarema
Who's behind this anyways?

I'm genuinely curious because (1) it doesn't seem to be presented in their FAQ
nor on-site and (2) the WHOIS info for namegrep.com is private.

While this isn't a huge red flag, it is a little suspicious that I could be
handing over my branding strategies (if one could call regexps "strategies") to
some unknown 3rd party.

(FWIW, I built a similar tool, but I don't go to any lengths to hide my
involvement)

EDIT: removed link to my similar tool, wasn't intending to hijack any clicks,
but rather get the discussion going on whether it's important to you as a user
of this tool to know who you're dealing with.

~~~
mikejarema
Got a few downvotes, is this not a relevant concern?

Perhaps due to listing my similar tool (removed from above comment).

~~~
larrys
It's a reasonable question you asked but I've noticed a tendency to get
downvoted for asking similar questions.

There is domain name frontrunning for sure:

[http://en.wikipedia.org/wiki/Domain_name_front_running](http://en.wikipedia.org/wiki/Domain_name_front_running)

My speculation is that downvotes come from these cases:

a) Somebody that is well known on HN posts something and you are just supposed
to know that they are well known.

b) General trusting and good nature of hackers and lack of cynicism (on the
part of the people who downvote, that is, not all hackers).

But it is definitely a reasonable thing to ask to be clarified and instead of
downvoting people should really just explain why they feel the site should be
trusted.

------
gametheoretic
Really, really great. Very ambitious. I see no viable solution to the issues
people are griping about without forking over buttloads of money every month.
Maybe someone on HN does, though - we can only hope you'll detect that helpful
comment in this storm of pedestrian shittiness.

Are you on postgres? One thing you could do-- and I only suggest this
cumbersome idea because you might just be crazy enough to try it-- would be to
use the pg_trgm (trigram) extension with the following in mind: a) Theory
being, when someone greps /[a-z]{4,8}/, they're either interested in {anthem,
aardvark, ambition, ...} or {nltk, xkcd, json, zzxx, xxzz, ...}, likely not
both. b) Neither (nor any third set you might come up with) is so inherently
superior that it deserves default status over the other. c) Even with limiting
results, half are bound to be totally uninteresting to the user. So what does
that even accomplish?

So my pg_trgm suggestion is to take that same /[a-z]{4,8}/ result set and
offer the user a relative sliding-scale by which they can push their visible
1,000 closer to/further away from a predefined set of dictionary words.

[http://www.postgresql.org/docs/9.3/static/pgtrgm.html](http://www.postgresql.org/docs/9.3/static/pgtrgm.html)
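To make the sliding-scale idea concrete, here is a minimal sketch in plain Python rather than SQL. It approximates pg_trgm-style similarity for single lowercase words (shared trigrams divided by all distinct trigrams, with the word padded by two leading spaces and one trailing space). The function names, the tiny dictionary, and the `slider` parameter are hypothetical and only illustrate the ranking; this is not how namegrep works.

```python
from typing import Iterable, List, Set


def trigrams(word: str) -> Set[str]:
    # Lowercase and pad with two leading spaces and one trailing space,
    # mirroring how pg_trgm extracts trigrams from a single word.
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a: str, b: str) -> float:
    # Shared trigrams divided by all distinct trigrams of both words.
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def dictionary_score(name: str, dictionary: Iterable[str]) -> float:
    # How close the name is to its nearest dictionary word (0.0 to 1.0).
    return max((similarity(name, w) for w in dictionary), default=0.0)


def rank_by_slider(names: Iterable[str], dictionary: List[str], slider: float) -> List[str]:
    # slider = 1.0 pushes dictionary-like names to the top,
    # slider = 0.0 pushes names unlike any dictionary word to the top.
    return sorted(names, key=lambda n: abs(dictionary_score(n, dictionary) - slider))


if __name__ == "__main__":
    words = ["anthem", "aardvark", "ambition"]   # hypothetical dictionary
    results = ["xkcd", "anthems", "zzxx", "ambit"]
    print(rank_by_slider(results, words, 1.0))
    print(rank_by_slider(results, words, 0.0))
```

The slider only reorders a result set that has already been generated, so it would sit on top of whatever produces the candidates.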

You may also consider tech acronyms - maybe steal those from StackOverflow
tags. Human names would be too big a hassle, IMO.

Again, I love the ambition of the damn thing. Kicks ass.

~~~
alixaxel
Thanks for the kind feedback. =)

Actually, we started by experimenting with SQLite (which should be faster than
PgSQL I believe since it has no protocol overhead), but it was kinda slow for
bulk queries. We then ended up switching to LMDB and LevelDB with a bitmask to
represent the availability of all TLDs and the performance improved greatly.
As an added benefit, this also made the JSON responses way lighter.
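As a rough illustration of the bitmask idea (not our actual code), the sketch below packs the status of several TLDs into one integer per name. The TLD list, its bit order, the convention that a set bit means "registered", and the in-memory dict standing in for LMDB/LevelDB are all assumptions made for the example.

```python
from typing import Dict, Iterable, List

# Hypothetical TLD order; each TLD gets one bit position.
TLDS = ["com", "net", "org", "info"]
BIT = {tld: 1 << i for i, tld in enumerate(TLDS)}


def pack(registered_tlds: Iterable[str]) -> int:
    # Assumed convention: a set bit means the name is registered under that TLD.
    mask = 0
    for tld in registered_tlds:
        mask |= BIT[tld]
    return mask


def available_tlds(mask: int) -> List[str]:
    # Every TLD whose bit is clear is reported as available.
    return [tld for tld in TLDS if not mask & BIT[tld]]


# A plain dict stands in for the key-value store: one small integer per name.
store: Dict[str, int] = {
    "example": pack(["com", "net", "org"]),
    "zip": pack(["com"]),
}


def lookup(name: str) -> List[str]:
    # Names missing from the store have no bits set, so every TLD is available.
    return available_tlds(store.get(name, 0))


if __name__ == "__main__":
    print(lookup("example"))
    print(lookup("some-unseen-name"))
```

One integer per name is also why the JSON gets lighter: the client can receive the mask instead of one entry per TLD.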

The main problem I see with the pg_trgm approach is that it would only return
domains that exist in the database (or in the zone files) and thus they would
have to be registered, which totally defeats the purpose of the tool. We
couldn't possibly store every alphanumeric combination of up to 63 characters in a database,
that's like a gazillion gazillion possibilities! =P
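For a sense of scale, here is a back-of-the-envelope count. It assumes labels of 1 to 63 characters drawn from a-z, 0-9 and the hyphen, with no hyphen in the first or last position, and it ignores any further registry rules, so treat it as an estimate only.

```python
def count_labels(max_len: int = 63) -> int:
    # 36 choices (a-z, 0-9) for the first and last character,
    # 37 choices (those plus "-") for every character in between.
    total = 0
    for n in range(1, max_len + 1):
        if n == 1:
            total += 36
        else:
            total += 36 * 37 ** (n - 2) * 36
    return total


if __name__ == "__main__":
    print(count_labels(3))               # every label of up to 3 characters
    print(len(str(count_labels())))      # number of digits in the full count
```

Under these assumptions the total has 99 digits, per TLD.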

StackOverflow tags is a neat idea for a set, I don't know how we missed that!
Thanks!

------
ericb
This is great, but the "please keep it under.." issue made it unusable.

Limited to .com, I couldn't use this pattern: [a-z]{4,8}coin

If that's not workable, I'm not sure I can come up with a pattern that works.
Why not just cap the results returned?

~~~
ssarah
It yields the results of each regex element separately, so it's hard to come
up with a functional implementation of capped results. Still, I'll look into it.
But even then, the main question is if you really want to see the head of a
list of billions of combinations like you propose?
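To illustrate the point, here is a minimal sketch of what a capped result list for [a-z]{4,8}coin could look like if the candidates were generated lazily. It is a hypothetical stand-in written for this one pattern shape, not the site's regex engine, and the function names are made up.

```python
import itertools
import string
from typing import Iterator, List


def expand(min_len: int, max_len: int, suffix: str,
           alphabet: str = string.ascii_lowercase) -> Iterator[str]:
    # Lazily yields every match of [a-z]{min_len,max_len}<suffix>,
    # shortest first, without building the full list in memory.
    for n in range(min_len, max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            yield "".join(chars) + suffix


def head(candidates: Iterator[str], limit: int) -> List[str]:
    # Cap the output by consuming only the first `limit` candidates.
    return list(itertools.islice(candidates, limit))


def total_matches(min_len: int, max_len: int, alphabet_size: int = 26) -> int:
    # Size of the full result set, computed without enumerating it.
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))


if __name__ == "__main__":
    print(head(expand(4, 8, "coin"), 5))
    print(total_matches(4, 8))
```

The head of that list starts at aaaacoin and walks the alphabet from there, out of a total of more than 200 billion names, which is why a plain cap is not very informative without some ordering or filtering on top.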

~~~
corobo
(Not OP)

No, not the billions, but yes the results of other searches when filtering by
".com available" such as (:colors:)((:words/adverbs:)|(:words/verbs:)) and
then maybe if I also limit the length of the domain from there - there won't
then be billions of results.

As an aside, could :colors: include more of them? There are definitely more
than 12 usable colour names!

~~~
ssarah
Since the results are stacked on the browser, for that particular search you
typed, you could always search (:colors:)(:words/verbs:) and
(:colors:)(:words/adverbs:) separately and I believe you will get the full
list you want. But I really have to generate the regex and availability
results before being able to apply the filters. As for the colors: yes the
sets need to be improved. (: You got some suggestions?

------
nkozyra
This is actually pretty spectacular, but it would be really nice if it
searched at least the geographic TLDs.

~~~
ssarah
Working on it. Hard to get all those zones (; Thanks for your comment.

------
mnx
Something's broken: it shows zip.com as available, but it was registered in
1997 and is valid until 2015.

~~~
eli
zip.com doesn't resolve. I would guess they're just using DNS to check
registration.
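A minimal sketch of what a DNS-only check looks like, and why it misfires. This is a guess at the approach, not the site's code; the `resolver` parameter is a hypothetical hook added so the function can be exercised without network access.

```python
import socket
from typing import Callable


def resolves(domain: str, resolver: Callable = socket.getaddrinfo) -> bool:
    # True if the name resolves to at least one address record.
    try:
        resolver(domain, None)
    except socket.gaierror:
        return False
    return True


def looks_available(domain: str, resolver: Callable = socket.getaddrinfo) -> bool:
    # Naive rule: "does not resolve" is treated as "not registered".
    # A registered domain with no address records is misreported as available.
    return not resolves(domain, resolver)


if __name__ == "__main__":
    print(looks_available("example.com"))
```

A domain can be registered and still resolve to nothing, so this kind of check produces exactly the false positive described above.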

~~~
mnx
yeah, looks like it. Oh well, it gave me brief hope of having an awesome
domain.

~~~
eli
I'll save you some time: there are no three character .com domains available,
not for a long time :)

------
maj0rhn
Looks very useful, but would it be possible to choose different colors for the
available/not available tiles? It's very difficult for those with subnormal
color vision to distinguish the red and the green. Thanks.

~~~
ssarah
Hmm, the color scheme is hard to change, but maybe I can implement a parallel
symbolic scheme to help those users out. Do you have some suggestions I could
use?

------
harvestmoon
Looks very nice, cool. As a new programmer/web app maker, I'm quite curious
how this tool might work. It's so fast - and it must be performing a lot of
calculations. Anyway, thanks again for sharing!

~~~
gukov
Probably a local WHOIS database that's periodically updated.

~~~
alixaxel
That's right. We gather our data from the TLD zone files.

We also run Go on the backend. <3

------
lcnmrn
It would have been useful if it had "English words starting with" and "English
words ending with". Or a simple way of filtering sets and using them as
subsets.

~~~
ssarah
That's a nifty idea. Will think about it. Thanks for the feedback.

------
randunel
Does not work properly, many domains listed as available, when they're
actually not. Would have been useful if it worked properly.

~~~
ssarah
Because of performance and privacy concerns, we are not able to resolve a
WHOIS request for all domains. Hence, there will be a few false positives. You
can check our FAQs for a bit more info on this.

------
MasterScrat
The concept is good but there are too many false positives.

otan.com/otan.net is available? Yeah right...

~~~
aseidl
The data might be coming from the .com/.net/.org zonefiles, where a domain
will only show up if it has an NS record configured.

I'm not sure if there's a free and authoritative source for registered
domains, short of grep'ing zonefiles and checking whois.
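Roughly, grep'ing a zonefile for delegations could look like the sketch below. The line layout (owner name, optional TTL and class, then NS and the nameserver) is an assumption about the file format, and the parser is deliberately simplified, so it is only meant to show why a registered domain without NS records would look available.

```python
from typing import Iterable, Set


def delegated_names(zone_lines: Iterable[str], origin: str = "com") -> Set[str]:
    # Collect every owner name that has at least one NS record.
    names = set()
    for raw in zone_lines:
        line = raw.split(";", 1)[0]           # drop trailing comments
        if not line.strip() or line[0].isspace():
            continue                          # skip blanks and continuation lines
        fields = line.split()
        if len(fields) < 3:
            continue
        if "NS" not in (f.upper() for f in fields[1:-1]):
            continue                          # not an NS record
        owner = fields[0].lower()
        if owner.endswith("."):
            owner = owner[:-1]                # absolute name: strip the root dot
        else:
            owner = owner + "." + origin      # relative name: append the origin
        if owner != origin:
            names.add(owner)
    return names


def listed_as_available(domain: str, names: Set[str]) -> bool:
    # Anything absent from the zone is reported as available,
    # including registered domains that simply have no NS records.
    return domain.lower() not in names


if __name__ == "__main__":
    sample = [
        "EXAMPLE NS NS1.EXAMPLE.NET.",
        "example NS ns2.example.net.",
        "foo A 192.0.2.1",
    ]
    zone = delegated_names(sample)
    print(sorted(zone))
    print(listed_as_available("otan.com", zone))
```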

~~~
ohashi
There's no other way to do quick bulk lookups besides zone files. Maybe if you
are a registrar, and even then I am not sure (depending on how many results
you need).

------
troels
Is "curated sets" something I should know about?

~~~
alixaxel
[http://namegrep.com/faq/#what-are-sets](http://namegrep.com/faq/#what-are-sets)

~~~
troels
Thanks. Great idea actually.

------
secondhandvape
The question mark in the example regex isn't necessary, in case anyone was
wondering.

~~~
primo44
It's necessary. It makes the "s" in "hotels" optional, so that the regex can
match either "hotel" or "hotels" and then the rest.
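A quick way to check this with Python's `re` module. The two city names are a hypothetical stand-in for the :geo/...: sets, which are a namegrep feature rather than regex syntax.

```python
import re

# Tiny stand-in for the geo sets used in the example search.
with_optional_s = re.compile(r"(visit|hotels?)-(paris|tokyo)")
without_question_mark = re.compile(r"(visit|hotels)-(paris|tokyo)")

for name in ["hotel-paris", "hotels-paris", "visit-tokyo", "hotelss-paris"]:
    print(
        name,
        bool(with_optional_s.fullmatch(name)),
        bool(without_question_mark.fullmatch(name)),
    )
```

With the `?`, both "hotel-paris" and "hotels-paris" match; without it, only the plural form does.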

