

Ask HN: Transmoggit (aka "HTF is trim in Python?"); What next?  - s3graham

Hey all,

After inerte's post a few days ago: http://news.ycombinator.com/item?id=446507, I thought I'd use it as an excuse to fiddle with GAE and jQuery.

With props to Bill Watterson [1], I banged up "transmoggit": http://www.transmoggit.com/

(try something like "trim" from PHP -> Python, as an example chosen not-at-all-at-random)

It does the obvious mapping thing, and lets you add your own. That's all fine, I guess, but it's not super-useful, until the 2.0 bit kicks in and "people" start adding clever translations. Of course, no one will bother to amend translations until there's some reason to visit first.

So, my question for the wise crowd: any suggestions on how to automate getting sensible mappings for, say, the easy 50-80%?

(any other suggestions are quite welcome too :)

thanks,
scott

1. http://www.geocities.com/Hollywood/Theater/9876/transmog.html
======
s3graham
And linkage for the lazy:

<http://news.ycombinator.com/item?id=446507>

<http://www.transmoggit.com/>

<http://www.geocities.com/Hollywood/Theater/9876/transmog.html>

------
inerte
Hey :)

\- A submit button; I typed "trim" and it did nothing :p Had to change the
language dropdown or click a suggestion;

\- SEO! There's no unique url, with a proper title, for the entries. No one
searching for "trim in python" will find it... my blog post is getting 2 or 3
visits a day from people searching "python trim" and/or "trim in python". And
I am not even on the first page. Actually, depending on my browser and
language settings, I am (5th result).

\- Filter duplicates (I entered string.strip as PHP's trim twice)

\- Suggest the relationship the other way around. If you know that trim ==
strip, why not say "hey we think strip == trim"?
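The duplicate filter and the "other way around" suggestion could both fall out of how the mappings are stored. A minimal sketch, assuming an in-memory store keyed by (language, function) pairs; `MappingStore` and its methods are hypothetical names, not anything from transmoggit itself:

```python
from collections import defaultdict


class MappingStore:
    """Hypothetical store of function translations between languages."""

    def __init__(self):
        # (lang, name) -> set of (lang, name) it was mapped to
        self._forward = defaultdict(set)

    def add(self, src, dst):
        # Using a set means a repeated submission is silently ignored.
        self._forward[src].add(dst)

    def lookup(self, src, target_lang):
        """Return (confirmed, suggested) names for src in target_lang."""
        confirmed = {
            name for lang, name in self._forward.get(src, ()) if lang == target_lang
        }
        # Reverse direction: entries in target_lang that were mapped *to* src.
        suggested = {
            name
            for (lang, name), targets in self._forward.items()
            if lang == target_lang and src in targets
        } - confirmed
        return sorted(confirmed), sorted(suggested)


store = MappingStore()
store.add(("php", "trim"), ("python", "string.strip"))
store.add(("php", "trim"), ("python", "string.strip"))  # duplicate, no effect

print(store.lookup(("php", "trim"), "python"))          # confirmed mapping
print(store.lookup(("python", "string.strip"), "php"))  # reverse suggestion
```

The reverse result is kept separate from confirmed mappings so the UI can present it as "we think strip == trim" rather than as fact.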

How to automate? Do some kind of text-similarity search. For example, take
the function definition for trim() from PHP's manual and match it against
sections of Python's manual; then offer your visitors the suggestions and let
them approve a connection with one click. Easy to do?...

Anyway, it's soooo nice to see someone implemented HTFITIP... and I really
liked the clean design! And the name!

~~~
s3graham
Oops, no, "submit" obviously wasn't well tested. Dumb! I was always selecting
an item from the autocomplete.

SEO + filter, good ideas, will do.

reversed mapping: it does try to do this a little, can you describe what
didn't work for you? For example, "str.lstrip" as Python->PHP doesn't have the
exact answer, but does look in the other direction and you get a suggestion.

~~~
inerte
So I just entered len() for PHP's sizeof(), and when I searched the other way,
I got a suggestion...

But I am pretty sure that when I searched strip in PHP, nothing came up. I
don't remember if I clicked a suggestion item or I selected a language from
the second dropdown. That might be the problem, since I think you're mapping
string.strip() to trim(), and not strip() to trim(), which is understandable
:) Maybe a LIKE %% search instead of =?
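To make the `=` vs `LIKE` difference concrete, here is a small sketch using an in-memory SQLite table. The table layout and the `exact`/`fuzzy` helpers are assumptions for illustration only; the thread doesn't say how transmoggit actually stores its mappings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mapping (src_lang TEXT, src_name TEXT, dst_lang TEXT, dst_name TEXT)"
)
conn.execute("INSERT INTO mapping VALUES ('python', 'string.strip', 'php', 'trim')")


def exact(name):
    """Equality match: 'strip' does not find 'string.strip'."""
    rows = conn.execute(
        "SELECT dst_name FROM mapping WHERE src_lang = 'python' AND src_name = ?",
        (name,),
    )
    return [r[0] for r in rows]


def fuzzy(name):
    """Substring match via LIKE: 'strip' finds 'string.strip'."""
    rows = conn.execute(
        "SELECT dst_name FROM mapping WHERE src_lang = 'python' AND src_name LIKE ?",
        ("%" + name + "%",),
    )
    return [r[0] for r in rows]


print(exact("strip"))  # nothing stored under the bare name
print(fuzzy("strip"))  # matches the qualified name
```

Note that `%` and `_` in user input act as wildcards inside a `LIKE` pattern, so a real version would want to escape them.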

~~~
s3graham
Ah, that's probably it. Thanks, I'll look into it more carefully.

~~~
inerte
Hi scott!

I thought a little bit more about the SEO part and I think the best thing to
do would be to drop the Ajax which loads the similar functions and go with
simple GET requests, which will give you a nice url and possibilities to treat
it as directories.

For example, submitting PHP's trim in Python should load this url:

<http://www.transmoggit.com/php/trim/python/>

Loading <http://www.transmoggit.com/php/trim/>

Would show trim() in every target language.

Loading <http://www.transmoggit.com/php/>

Would show the list of functions, maybe with the languages that you have the
alternative.

Loading <http://www.transmoggit.com/trim/python/>

Would show the equivalent on "from" languages.
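A rough sketch of how those four URL shapes could be told apart. `LANGUAGES` and `parse_path` are made-up names, and the language list is an assumption. One thing the sketch makes visible: `/php/trim/` and `/trim/python/` both have two segments, so the parser needs to know the language slugs to decide which is which.

```python
LANGUAGES = {"php", "python"}  # assumed set of known language slugs


def parse_path(path):
    """Map a URL path to a dict of from/function/to, or None if unrecognised."""
    parts = [p.lower() for p in path.strip("/").split("/") if p]

    if len(parts) == 3 and parts[0] in LANGUAGES and parts[2] in LANGUAGES:
        # /php/trim/python/ : one function, one target language
        return {"from": parts[0], "function": parts[1], "to": parts[2]}

    if len(parts) == 2:
        if parts[0] in LANGUAGES:
            # /php/trim/ : trim() in every target language
            return {"from": parts[0], "function": parts[1], "to": None}
        if parts[1] in LANGUAGES:
            # /trim/python/ : equivalents across all "from" languages
            return {"from": None, "function": parts[0], "to": parts[1]}

    if len(parts) == 1 and parts[0] in LANGUAGES:
        # /php/ : list of functions for that language
        return {"from": parts[0], "function": None, "to": None}

    return None


for url in ("/php/trim/python/", "/php/trim/", "/php/", "/trim/python/"):
    print(url, parse_path(url))
```

When both segments of a two-part path are language names, this sketch picks the "from language + function" reading; that tie-break is arbitrary.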

I don't know how serious you are about this project :) I really think it
can make you some money, maybe not enough to make you quit your EA job :) but
certainly some to help pay the bills.

I also don't know how much SEO you know, but basically you need to think about
what content your webpage should have to answer typical queries in your market.
For example:

q: trim python

\- Put "trim in python" in the title and in an H1 on
<http://www.transmoggit.com/trim/python/>.

q: python remove whitespace

This is what I've used a couple times, many moons ago, when "python trim"
didn't return good results.

You can answer this query if, on your
<http://www.transmoggit.com/trim/python/> page, you list excerpts from the
manuals of the "from" languages. Upon loading /trim/python/, you select the
"from" languages and those of their functions where you know the similarity,
and show something like:

<ul>
<li> <h2>C#</h2>
<p>transmoggit doesn't know how <a href="/c%23/trim/python/">C#'s trim in Python</a></p> </li>
<li> <h2>PHP</h2>
<p><a href="/php/trim/python/">PHP's trim in Python</a> is <a href="python_manual_for_string.strip">string.strip</a>.</p>
<p>Here is PHP's manual definition for trim:</p>
<p>This function returns a string with whitespace stripped from the beginning and end of str. Without the second parameter...</p> </li>
</ul>

I know that's a lot of stuff! But you seem to have experience with software
development... so you know software is always 90% complete :)

Good luck!

~~~
s3graham
Hey inerte

Thanks for all your suggestions. I'm not too knowledgeable in all-things-SEO,
but this all sounds reasonable. (The anal-retentive programmer in me doesn't
like the vagueness in the method name.. what if there's a language called
trim? :) but something along those lines makes sense. I remember reading that
/php-trim-python was recommended for urls (vs php_trim_python or something).
Is there a consensus on whether dir/ecto/ries are better or worse than
hyphens?

The only thing I'm not sure about, is that I want to make sure the main user
experience of looking up one function is as fast as possible. I thought it
would be a bit faster to not have the search redirect to another page. On the
other hand, google.com is obviously "fast", so maybe it doesn't matter too
much.

I don't personally think it's a money maker, my general feeling is that
programmers are too "tech-savvy" to click on many ads. But who knows, I guess.
Either way, I'm just doing it because it's fun, and somewhat useful for me. I
love my "real" job anyway, so, not really looking to quit. :)

Thanks again for all the feedback, scott

~~~
inerte
Dashes vs. underscores are different because search engines see dashes as
spaces, so /my-cool-blog is actually /my cool blog while /my_cool_blog is
/my_cool_blog (a single word :)
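One way to picture the dash/underscore difference is a tokenizer that splits on anything that isn't a "word" character, where underscore counts as a word character (as it does in most regex engines). This is only an illustration of the idea; `url_words` is a made-up helper, not a description of how any search engine is actually implemented.

```python
import re


def url_words(path):
    """Split a URL path into runs of word characters (letters, digits, underscore)."""
    return re.findall(r"\w+", path)


print(url_words("/my-cool-blog"))  # the dashes act as separators
print(url_words("/my_cool_blog"))  # the underscores glue it into one token
```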

Here's a page where Matt Cutts, head of Webspam at Google, explains:
<http://www.mattcutts.com/blog/dashes-vs-underscores/> Several other people
working with SEO have independently made the same observation. And I am
pretty sure I've seen a page from a Yahoo guy saying the same thing (but
can't find it right now).

dir/ecto/ries are used to divide content into "sections". One good example is:

/news/sports/basketball - News about basketball

/news/sports - Sports news

/news - All news

So you can have /news/politics and /news/economy too.

While if you do:

/news-sports-basketball it doesn't look like the division is there. Think back
a few years, when url mapping/rewriting wasn't much used. The / character was
an actual physical folder/directory on the machine. Therefore, if you were a
search engine, you would assume that everything under /news/sports/basketball
falls into a "cluster" of related content. You would think that probably
/news/sports/basketball/1 and /news/sports/basketball/2 are related.

Programmers don't click ads, FACT. Maybe programmers looking for travel
information ;) but not when researching (usually) work related stuff. My HTF
post got over 3k hits in 2 days. I really can't tell for sure, but I don't
think there was a _single_ click on my ads. Maybe one click, actually. The
situation is worsened because the hn crowd is not simple programmers, we live
and breathe the net. So we're ad blind. Even the poor guy coding Java on a
consulting gig at a branch of a financial firm is more ad-blind than
soccer moms, so he's not someone you can base your income on (at least not
with Adsense alone).

Now, I didn't optimize my ad placement, so I could do better. But from a
~0.70% CTR on my main domain to almost _zero_... that's quite a lot.

Of course, while programmers are generally ad-blind, it all depends. For
example, selling .NET books on a .NET oriented website works...

While I agree that looking for a function on transmoggit is fast because of
Ajax, people have to know about transmoggit first :) And the truth is,
depending on your market and your brand, 60% to 100% of your visitors will
come from search engines. Sure, after a while, you'll start to get referrals,
but how did the person who linked to you find the page? Using search engines.

Don't see SEO as a way to trick the search engines ;) I know there's a lot of
bad karma around its practices, but if you think that by applying SEO to your
pages, you're just making it easier for people to find your content, then you'll
be happy to do it.

I don't know if you can ask around EA for its website statistics. I am pretty
sure you'll see that >75% of its visitors come from search engines, even in
obvious cases. People type the domain name into search engines, for fsck's
sake. Or they type what would be the domain name if they just added .com
(searching for Youtube to go to... youtube.com).

There are actually some very good explanations for this behaviour, for example
if I want to see the webpage for the Dead Space game, I don't have to guess if
it is deadspace.com, deadspacegame.com, deadspace.ea.com,
deadspace.games.ea.com, etc... Typing "dead space game" on Google is 100% more
effective.

I am talking too much :) And I don't want to complicate things further for you.
But just make it easier for people to find the webpages :)

~~~
s3graham
Thanks for all the info, much appreciated. On the off chance you see this old
thread, the "category"-type links from a few posts above are implemented now.
I'm not sure if it'll help people find the pages, but I'll find out soon
enough I guess. :)

The best type of ad that came to mind was a programming-job-ad, since that
would be very targeted and possibly useful, but I suppose Joel and 37s have
that market covered pretty well.

Haven't fixed all the UI bugs people noticed or added too much more
data/languages yet... next weekend I guess. :/

~~~
inerte
Hey scott, I click my own comments link here on HN regularly :p

Yeah, just saw "categories" <http://www.transmoggit.com/PHP/trim/Python>

If you go to Google and type "site:www.transmoggit.com" you'll see what pages
it has indexed. I just did and there's no "sub-page" indexed, just your domain
name... but this is normal! In my experience, Google will show the domain index
first and other pages in 1-2 weeks.

And since the pages Google shows sometimes differ between its
datacenters, here's something that I use to compare across all of them:

<http://www.seochat.com/seo-tools/multiple-datacenter-google-search/>

About Job ads, I don't remember the company name, but I did see someone who
works with a referral program. You show their ads, and if it works, you get
paid.

But anyway, if your site gets somewhat popular, you can ask for a lower price
than Joel and 37s. If companies have to pay $250/$350 to advertise on these
sites, a $10 fee on yours will be an impulse buy :) And you can always raise
the price if it works...

------
mseebach
First impression: Nowhere near enough swearwords :)

Also, if I type "trim" and hit enter, I get imagick_trimimage. If there's an
exact match, I almost certainly want that.
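A sketch of the ordering being asked for: exact match first, then prefix matches, then other substring matches. `rank_suggestions` is a hypothetical helper written in plain Python; it is not part of jQuery's autocomplete or of the site's code.

```python
def rank_suggestions(query, names):
    """Order matching names: exact match, then prefix, then any substring."""
    q = query.lower()

    def tier(name):
        n = name.lower()
        if n == q:
            return 0
        if n.startswith(q):
            return 1
        return 2

    matches = [n for n in names if q in n.lower()]
    # Within a tier, prefer shorter names, then alphabetical order.
    return sorted(matches, key=lambda n: (tier(n), len(n), n))


php_names = ["imagick_trimimage", "rtrim", "trim", "ltrim", "count"]
print(rank_suggestions("trim", php_names))
```

With this ordering, hitting enter on the first result for "trim" gives `trim` itself rather than `imagick_trimimage`.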

~~~
s3graham
Thanks, will do (the second part anyway :). I'm using jQuery's autocomplete
but it's maddeningly broken in those sorts of subtle ways.

------
mattmcknight
It looks like a hard part is going to be method scoping. For things that
aren't in the global namespace, it's sort of hard to give a single answer,
because you are asking two questions (e.g., which object or module and which
method or function?). It might be better to pick the object/module first, and
then the method/function.

~~~
s3graham
I felt like it was probably always method/function name that you would care
about. That was mostly based on the assumption that similarly named methods on
any object do similar things, at least in the language's standard libraries.

Did you have a specific language/module/class/method (or set of them) in mind?

------
tow21
Any chance of adding Javascript as an option? I spend half my life tripping up
over Javascript/Python inconsistencies, this would be immensely useful.

Also, I'm not convinced your google account sign in thing works right; I
clicked on it to login to google, and then google didn't know where to
redirect me to.

~~~
s3graham
Yeah, JS vs Py drives me crazy too. JavaScript is high on my todo list. Just
need to find the "canonical" reference for it.

I can't reproduce the login problems? Non-gmail.com acct? Browser/OS matter?

------
jgfoot
this is awesome.

Automating the first 50-80%? Try some very simple natural language processing,
with a human being acting as a sanity check. If the PHP documentation for a
function is "upper case" and the Python documentation for something also is
"upper case," then the two might be similar...

~~~
s3graham
Hmm, sounds interesting.

In practice, I'm not sure what it looks like. I wonder if something like
"minimum number of edits" would work? Probably not, I guess since that's more
like a "diff" and the docs are not likely to be similar in that way.

Perhaps just "the function whose documentation shares the most words",
after stripping common words? Perhaps that could be improved a bit via synonyms too.
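Something like the following is one way that idea could look: strip common words, fold a few synonyms together, and score by the fraction of shared keywords (Jaccard similarity). Everything here is an assumption for illustration: the stopword and synonym lists are tiny and hand-picked, and the Python summaries are loose paraphrases rather than quotes from the manual. The PHP sentence is the trim description quoted earlier in the thread.

```python
import re

STOPWORDS = {
    "a", "an", "and", "copy", "from", "in", "of", "the", "this", "to", "with",
    "function", "return", "returns", "string", "str",
}
# Hand-picked synonym folding; a real list would need to be much larger.
SYNONYMS = {
    "stripped": "strip", "removed": "strip",
    "leading": "beginning", "trailing": "end",
}


def keywords(doc):
    """Lowercase, drop stopwords, and fold synonyms into one canonical word."""
    words = re.findall(r"[a-z]+", doc.lower())
    return {SYNONYMS.get(w, w) for w in words if w not in STOPWORDS}


def similarity(a, b):
    """Jaccard similarity of the two keyword sets (0.0 to 1.0)."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def best_match(source_doc, candidates):
    """Return the candidate name whose doc text is most similar to source_doc."""
    return max(candidates, key=lambda name: similarity(source_doc, candidates[name]))


php_trim = ("This function returns a string with whitespace stripped "
            "from the beginning and end of str.")
python_docs = {  # paraphrased summaries, for illustration only
    "str.strip": "Return a copy of the string with leading and trailing whitespace removed.",
    "str.upper": "Return a copy of the string converted to upper case.",
    "len": "Return the number of items in a container.",
}

print(best_match(php_trim, python_docs))
```

The synonym table does a lot of the work in this toy case, which is why a human sanity check on each suggested pair still seems necessary.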

------
SapphireSun
Awesome!

I couldn't figure out how to submit it though :(

I know it's probably a bad idea, but I feel wistful for the original "How the
fuck is ___ in ___?"

------
revorad
This is awesome. What about VB? :-P

------
jonursenbach
This is pretty sick. Nice job.

------
ctice
"Shucks, the transmogrifier doesn't know what PHP's count is in PHP."

hehe

~~~
s3graham
heh, I tried to fix that in the UI (the wrong place, obviously). There's some
browser that doesn't do <option disabled> I guess?

Perhaps a sarcastic answer is the best solution. ;)

------
andr
What about Scala?

~~~
s3graham
Sure, I'll try to add it.

I don't personally use Scala: what reference would you like to see?
<http://scala-tools.org/scaladocs/scala-library/2.7.1/> or is there something
better?

