

Ask HN: Please review my OpenCourseWare Search Engine - pierrefar

Hello all,

I've been hacking at this for many weeks and I'd love your feedback and guidance on how to improve it.

URL: http://www.ocwsearch.com/

What is it: A search engine for OpenCourseWare. Right now only MIT courses are indexed, but more universities and textbooks are planned.

Edit: It indexes all the lecture notes, the PDF files in the zip files, not just the course descriptions.

The blog has a poll for you to vote for what should come next.

What I'd like feedback on: Anything you see broken/improvable/missing. HN readers are the perfect target market so your feedback is going to carry a lot of weight.

Please get in touch: I've seen a few HN posts about OCW projects, and I'm very happy to connect with like-minded people. The email is pierre@(the domain name).

For the curious: The technology stack warrants its own blog post, so briefly: PHP, MongoDB, Redis, and Sphinx.

Thanks!
======
nl
I've done some work in this area, although more aimed at the school sector
than higher ed.

One thing we found was that providing alternate forms of navigation and
filtering is important.

For example, from memory the OCW metadata contains subjects. If the query
result pages displayed those as facets then it would allow users to refine
results by subjects. "School" and "Year" would be other good facets.

Solr does faceting really well, BTW
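
To make the facet idea concrete, here is a minimal sketch in Python (not the site's PHP stack, and not how Solr or Sphinx compute facets internally). The result records and the field names `subject`, `school` and `year` are hypothetical, chosen only to mirror the facets suggested above.

```python
from collections import Counter

# Hypothetical search results; the field names are assumptions for illustration.
results = [
    {"title": "Linear Algebra", "subject": "Mathematics", "school": "MIT", "year": 2005},
    {"title": "Multivariable Calculus", "subject": "Mathematics", "school": "MIT", "year": 2007},
    {"title": "Classical Mechanics", "subject": "Physics", "school": "MIT", "year": 2007},
]

def facet_counts(results, field):
    """Count how many results carry each value of `field`."""
    return Counter(r[field] for r in results if field in r)

def apply_filter(results, field, value):
    """Keep only the results whose `field` equals `value`."""
    return [r for r in results if r.get(field) == value]

# Show the facet next to the result list, then narrow when a value is clicked.
print(facet_counts(results, "subject"))
print([r["title"] for r in apply_filter(results, "subject", "Physics")])
```

The counts tell the user how many results each refinement would leave before they click it, which is the interactive feedback that faceting provides.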

~~~
pierrefar
Others have asked for an advanced search interface, and I think your idea of
filters fits right in with that. Thanks!

What I'm imagining now is actually at least two new user tools to help them do
their searches: a query builder that does the advanced operator searches for
them (like Google's advanced search form) and filters on the search results.

You got my brain in over-drive :)

~~~
nl
My view is that advanced search interfaces don't add a lot of value, but
filtering does.

People like to be able to explore result sets interactively. Properly
implemented filters give you all the functionality of advanced search, but
with the benefit of seeing exactly what you are getting dynamically.

Have a look at the filters on a search like this:
[http://shopper.cnet.com/1770-5_9-0.html?query=Core+i5&ta...](http://shopper.cnet.com/1770-5_9-0.html?query=Core+i5&tag=srch)

It gives much better feedback than an advanced search does.

------
kees
Very useful product. This is a service I'm actually going to use. By the way,
if you change your name to something more generic and add a couple of other
educational sources to your engine as well, you will be the Google of
educational content, just as Academic Earth is its YouTube.

~~~
pierrefar
Good point, and one I debated without a good resolution. For the launch, I
stuck with what I had just to get something in the hands of users and get
feedback.

Any ideas for different names? A fresh perspective would be appreciated.

------
Cinnamon
Cool project! How did you get Sphinx to index MongoDB?

Link: <http://www.ocwsearch.com/>

~~~
pierrefar
Thanks!

I used the xmlpipe2 data source for Sphinx. A PHP script reads MongoDB and
prints out the XML. Works a treat.
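
For readers who have not used xmlpipe2, here is a rough sketch of what such a script emits. It is in Python, not the PHP the author used, and takes plain dictionaries where the real script would read from MongoDB. The field names `title` and `body` are assumptions. Sphinx expects positive integer document ids, so this sketch simply numbers the documents; a real script would need a stable mapping from MongoDB's ids.

```python
from xml.sax.saxutils import escape

def to_xmlpipe2(docs):
    """Render an iterable of {"title", "body"} dicts as an xmlpipe2 stream."""
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        '<sphinx:field name="title"/>',
        '<sphinx:field name="body"/>',
        "</sphinx:schema>",
    ]
    # Sequential ids are an assumption made for this sketch.
    for doc_id, doc in enumerate(docs, start=1):
        lines.append(f'<sphinx:document id="{doc_id}">')
        # Escape &, < and > so the lecture text cannot break the XML.
        lines.append(f"<title>{escape(doc['title'])}</title>")
        lines.append(f"<body>{escape(doc['body'])}</body>")
        lines.append("</sphinx:document>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"

# Hypothetical documents standing in for records read from the database.
sample_docs = [
    {"title": "Signals & Systems", "body": "Lecture 1: x < y implies ..."},
    {"title": "Linear Algebra", "body": "Eigenvalues and eigenvectors."},
]
print(to_xmlpipe2(sample_docs))
```

In Sphinx's configuration, a source of type `xmlpipe2` runs a command and indexes whatever XML that command prints to standard output, which is why a script that just prints is enough.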

------
justliving
nice! looks like a useful service! The search was pretty snappy! Was
impressed.

regarding the overall design, I think that a new make-up might not hurt ;)

What's your business model (if there is one), or are you doing it just for the
fun of it?

Either way, keep up the good work! cheers

~~~
pierrefar
Thanks!

Yes, a new make-up is needed. Clearly I am not a designer and I know that. If you
have ideas, I'm keen to hear them. Also, if you have favorite search engines
in terms of design and/or functionality, I'd like to see them for inspiration.

Business model: there are different directions this could go. Of course the
Creative Commons license is the main constraint, and I have what I think are
compatible ideas. For now, it's just for fun because I actually really wished
I had an OCW search engine recently, so I built one.

~~~
justliving
(unfortunately) I am not a designer either! When I opened the page, the design
just felt a bit "old", e.g. not very "web2.0" like, if you know what I mean.

I think that a bit less clutter and a better focus on the important parts
of the site (the search bar, and a clear message about what value the site provides)
would definitely help.

Further, something like "example searches" or "last searches" might be
beneficial as well, e.g. to showcase the application. Just my 2cts ...

hth, cheers

------
Serene
Indexing lectures instead of courses is more useful.

~~~
pierrefar
It indexes the PDFs in the zip files. It's not just the courses, but the full
text of all the lectures in each course.

~~~
Serene
What I meant is that I would prefer to see where exactly my keyword surfaced
in the course - in what lecture or assignment vs lists of all courses
mentioning the keywords.

~~~
pierrefar
Ah, sorry I misunderstood. Yes I completely agree. The "snippets" shown right
now are just the official course descriptions. Better snippet generation is
something I need to research before I deploy anything. It's a complicated
problem to solve in a way that will work across all searches. Google used the
meta description tags for years before they figured out how to best show
snippets from page contents.
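
As a starting point for that research, the simplest form of query-dependent snippet is keyword-in-context: find the first query term in the lecture text and show a window around it, falling back to the official description when nothing matches. The sketch below is in Python, not the site's PHP. The function name, the `radius` parameter and the sample text are all made up for illustration; it is a naive baseline and not how Google or Sphinx build excerpts.

```python
import re

def make_snippet(text, query, radius=40, fallback=""):
    """Return a window of `text` around the first query term found.

    If no term from `query` occurs in `text`, return `fallback`
    (for example, the course description).
    """
    terms = [re.escape(t) for t in query.split() if t]
    match = None
    if terms:
        # Case-insensitive search for any of the query terms.
        match = re.search("|".join(terms), text, flags=re.IGNORECASE)
    if match is None:
        return fallback
    start = max(0, match.start() - radius)
    end = min(len(text), match.end() + radius)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end].strip() + suffix

# Hypothetical lecture text and course description.
lecture = ("Lecture 12 covers diagonalization. We compute each eigenvalue "
           "of a symmetric matrix and discuss applications.")
print(make_snippet(lecture, "eigenvalue", radius=20))
print(make_snippet(lecture, "quantum", fallback="Official course description."))
```

Running this per lecture or assignment, not per course, would also show where in the course the keyword surfaced. The hard part mentioned above remains: choosing the best window when several terms match in different places.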

------
bwelford
Even though it only has the MIT OCW data so far, this is a most impressive and
user-friendly interface. Congratulations, I'm sure it will appeal to a wide
audience. Sorry but I'm not in your market niche so I do not know what's
available from other institutions.

