

Ask HN: Review my project, Factolex - the fact lexicon - akirk
http://www.factolex.com/

======
matt1
Some thoughts:

Maybe write a script that goes through the web, starting with Wikipedia, and
intelligently extracts facts from sentences. From there, pass it along to
Mechanical Turk folks to gauge whether or not it makes sense. You could
cheaply populate your database with all sorts of relevant information this
way.

Maybe focus on a particular field at first, such as tech industry or the stock
market or... whatever. Once you've got a grip on that, start expanding to
others. I'm not suggesting you stop gathering other types of facts, but I think
right now it's better to focus than to spread yourself too thin. I think your
visitors would rather have your site tell them a lot about one topic than a
little about a lot.

It's a small detail, but capitalizing the first letter of the descriptions
would make it look sharper. Also, on my browser, IE7, the "More information"
link drops down below the "Welcome" link. I think you could do better with a
different, unique color scheme which people would associate with Factolex. (At
first I thought HN was ugly, but now I wouldn't have it any other way.)

Overall, very well done. I could see a lot of search engine traffic getting
directed to Factolex someday down the road.

------
akirk
Factolex.com is a fact lexicon. We split up knowledge into small sentences:
facts.

You can use the checkboxes to remember the facts that you find relevant and by
doing so, we create your personal lexicon (which can be exported using the API
or a widget). Also, the more people select a fact as relevant to them, the
higher it is ranked within the term.
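
The ranking rule described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not Factolex's actual code: the
function name `rank_facts` and the `(user, fact)` pair format are hypothetical,
and ties are broken alphabetically purely so the output is deterministic.

```python
from collections import Counter


def rank_facts(selections):
    """Rank the facts of one term by how many distinct users selected them.

    `selections` is an iterable of (user, fact) pairs (a hypothetical format).
    Returns a list of (fact, user_count) tuples, most-selected first.
    """
    # set() drops repeated selections of the same fact by the same user.
    counts = Counter(fact for _user, fact in set(selections))
    # Sort by descending count, then alphabetically to break ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    picks = [
        ("ann", "capital of Austria"),
        ("bob", "capital of Austria"),
        ("ann", "city on the Danube"),
        ("ann", "capital of Austria"),  # duplicate, counted once
    ]
    for fact, votes in rank_facts(picks):
        print(votes, fact)
```

Counting distinct users rather than raw clicks is a design choice made here so
that one person cannot push a fact up by selecting it repeatedly.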

I'd love to get some feedback from the HN community. Thanks for checking it
out.

~~~
glazz
Very interesting. Could you tell me how it's made?

~~~
akirk
What do you mean? Like what it is built with (i.e. programming language,
server, etc.)?

~~~
glazz
Like the idea of the project. How were the definitions obtained?

~~~
akirk
The idea of the project is a new approach to collaborating on knowledge, by
splitting the knowledge into smaller parts, and then being able to handle them
in an easier way.

Most of the definitions come from Wikipedia. We started off trying to have
people enter facts manually, but then I figured that we should leverage all
the knowledge of Wikipedia that is freely available.

So I wrote a bot that is not particularly clever, but most of the time good
enough for fetching facts from Wikipedia.
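
To give a rough idea of what a "not particularly clever" extractor could look
like, here is a minimal sketch. It is an assumption for illustration only, not
the actual bot: the pattern, the name `extract_fact`, and the example sentence
are all made up, and it only handles simple "X is/was/are/were Y." sentences.

```python
import re

# Hypothetical pattern: "<term> is/was/are/were <description>."
_DEFINITION = re.compile(
    r"^(?P<term>.+?)\s+(?:is|was|are|were)\s+(?P<fact>.+?)\.?$"
)


def extract_fact(sentence):
    """Split a definition-style sentence into (term, fact).

    Returns None when the sentence does not match the simple pattern.
    """
    match = _DEFINITION.match(sentence.strip())
    if match is None:
        return None
    return match.group("term"), match.group("fact")


if __name__ == "__main__":
    print(extract_fact("Illusion is a Polish band founded in 1992 in Gdańsk, Poland."))
    print(extract_fact("Hello world"))
```

A pattern this naive will misfire on longer or nested sentences, which is one
reason a human review or voting step on top of it makes sense.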

By the way, it can also extract facts from the German Wikipedia (there is a
German version of Factolex as well, <http://de.factolex.com/>, as my mother
tongue is actually German).

------
avinashv
Very clever--it's a nice way to integrate fact-searching across many different
websites. The interface is effective and doesn't get in the way. I like the
branding you've created for yourself as well. The avatar of the Factobot
incorporates the logo and that awesome chimpy-looking bot really well.

I like how you let me add to my personal lexicon without having to register,
but will that be saved, and be carried over to any account I create if and
when I do?

A few nags:

On the tab bar, if "Home" has a bar on the very left, shouldn't FAQ have one
on the very right?

On the "ongoing votings" page, it's not immediately clear what I am supposed
to do, and the yellow hint box towards the right says the same thing it does
when I search for something--hit a checkbox. There aren't any checkboxes. This
could be confusing for some.

Great job overall, I see myself using this.

~~~
akirk
Thanks for the kind words.

Your personal lexicon survives your registration (of course :) For the
navigation bar: we tried it, but settled for leaving the bar out (as the menu
actually continues on the far right). A matter of taste, I suppose. The
ongoing votings section is still sort of under construction, thanks for
bringing that up. There is quite some room for improvement there ;)

------
zain
What's the difference between this and searching for "define: word" on Google?

[http://www.google.com/search?hl=en&safe=off&q=define...](http://www.google.com/search?hl=en&safe=off&q=define%3Avienna&btnG=Search)

~~~
gsmaverick
Personalization!

------
Jakob
Excellent, should be part of Google! (e.g. "fact:Hackernews")

On index pages no inline links are visible, e.g. "Illusion a Polish band
founded in 1992 in Gdańsk, Poland and defunct since 1999".

You can only click "Gdańsk" if you’re on the single fact page.

~~~
akirk
As a matter of fact, there is a Greasemonkey extension for Firefox that
achieves integration into Google search:
<http://userscripts.org/scripts/show/32352>

------
kbrower
Does your grabber get things from Urban Dictionary?

~~~
akirk
Not yet, but I am not actually sure what to use from there. There are just so
many definitions that have weird up vs. down ratios that I wouldn't dare to
let a program decide what is legitimate or not.

------
rw
Suggestion: include the Cyc database.

~~~
akirk
Thanks for the idea. I am not sure if I can extract useful facts from it, but
I can definitely see this helping for "see also" purposes.

Integrating with the semantic web is a natural further step for Factolex.

------
jgilliam
One simple idea: put a big number in the header or on the homepage with how
many terms you have.

------
gojomo
This is somewhat like an idea I've been considering. So I think it's a great
concept... and have strong opinions on possible directions.

If your primary model is a ranked listing of 'facts' by a major 'term' key, it
will be hard to outperform Wikipedia (or even Mahalo). When those sites'
single-topic articles/pages are well-written, they already lead with the core
facts, and then proceed through the rest in a well-organized fashion. Even the
Google 'snippets' in natural search hits then turn out to be pretty strong for
answering people's questions/queries. So I think to differentiate you need to
break out of that linear model somehow.

The multiple-license situation is confusing and may prevent you from getting
proper credit for openness.

You may hope voting and reputation will be mechanisms for gradual quality
improvements, but they often backfire. Any benefit from people seeking to
'climb' for the right reasons can be offset by gaming.

It's unavoidable that you'd need to use some bulk processes (like scraping
Wikipedia or hiring contributors) to bootstrap, but that may then
undermine the organic growth of 'community spirit' and norms that will be
essential for the long term. The right balance will be hard to find.

Good luck!

