
Impressive! Really fast, full-featured code search across a huge corpus.

1. How did you build the index? Did you use a GitHub dump of some sort? How often do you refresh it?

2. Is it Elasticsearch or similar or a completely custom engine?

3. What kind of RAM/CPU are you using to power it?

4. Any plans to open source the code or commercialize the technology?

I could absolutely imagine paying for a private code search engine like this to run against a large internal company codebase spread across many repositories.



Thanks! It's built on top of Solr. It fetches the repos from GitHub and should pick up any updates to repos within a few days. It's running on a couple of servers with 20 cores each, which is not really enough for the traffic it's getting right now.


Have you seen livegrep?

Blazing fast multi-repo regex code search. May be more expensive to run in prod, not sure.


This is so good I imagine you're gonna need more.


Cool! I love it.


I'd be curious how you built the step from regex to ElasticSearch. My guess would be an n-gram (3-gram) index in ElasticSearch and then translating the regexes to that, but just curious if you built that custom or used something off-the-shelf. Love the site!
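To make the guess concrete, here is a rough Python sketch of the trigram idea. It is only my illustration, not how the site works. The `TrigramIndex` class and its method names are made up, and the hard part (working out which literal a regex requires) is skipped: the caller passes that literal in by hand.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy in-memory trigram index (hypothetical, for illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> ids of docs containing it
        self.docs = {}                    # doc id -> full text

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def candidates(self, literal):
        """Docs containing every trigram of `literal` (a superset of real matches)."""
        grams = trigrams(literal)
        if not grams:
            # Literal shorter than 3 chars: nothing to filter on, scan everything.
            return set(self.docs)
        return set.intersection(*(self.postings.get(g, set()) for g in grams))

    def search(self, pattern, required_literal):
        """Filter with the index first, then run the real regex on the survivors."""
        regex = re.compile(pattern)
        hits = (d for d in self.candidates(required_literal)
                if regex.search(self.docs[d]))
        return sorted(hits)


index = TrigramIndex()
index.add("a.py", "def parse_config(path):")
index.add("b.py", "def parse_args(argv):")
index.add("c.py", "config = load()")

# Assumption: we already know any match must contain the literal "parse_".
print(index.search(r"parse_\w+\(path\)", "parse_"))
```

The index only narrows down which documents get the expensive regex pass, so a pattern with no usable literal (say `.*`) falls back to scanning everything.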


> I'd be curious how you built the step from regex to ElasticSearch. My guess would be an n-gram (3-gram) index in ElasticSearch and then translating the regexes to that, but just curious if you built that custom or used something off-the-shelf. Love the site!

I'm pretty sure Elasticsearch supports regex search, it's just that it's horrendously slow and can blow up the system.
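For reference, a minimal sketch of what the body of such a request looks like, built as a plain Python dict. The field name `content` and the helper `regexp_query` are made up for illustration. `max_determinized_states` is, as far as I know, the knob that caps how large the compiled automaton may get, which is the "blow up" guard.

```python
import json


def regexp_query(field, pattern, max_states=10000):
    """Build the JSON body for an Elasticsearch `regexp` query (illustrative helper)."""
    return {
        "query": {
            "regexp": {
                field: {
                    "value": pattern,
                    # Upper bound on automaton states; larger patterns are rejected.
                    "max_determinized_states": max_states,
                }
            }
        }
    }


# "content" is a hypothetical field name.
body = regexp_query("content", "parse_[a-z]+")
print(json.dumps(body, indent=2))
```

As I understand it, the pattern is matched against whole indexed terms rather than the raw document text, so how the field is analyzed matters a lot for what this can actually find.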



