
I'm curious what your queries look like, because these performance numbers are awful.

I'm currently running an index of 96 million documents (393GB) on a single shard, with a response time of 18ms.

If you're comfortable with it, I'd suggest profiling Solr. We found that we were spending more time garbage collecting than expected, and spent some time speeding it up and minimizing its impact. Most of this was related to our IO though.

Second, don't use the default settings. Adjust the cache sizes, RAM buffer, and other settings so they are appropriate for your application.

I'd also start instrumenting your web application so that you can test removing query options that may be causing your CPU usage issue. You get a lot of bang for your buck this way, and you may find the options you were using provide no meaningful improvement in search. A metric like mean reciprocal rank can go a long way toward improving your performance.
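
To make the mean reciprocal rank idea concrete, here is a rough sketch in Python. It assumes you have, for each test query, the ranked list of document IDs the search returned and a set of IDs you consider relevant. The function names and the sample data are made up for illustration; nothing here is Solr-specific.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(results: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, doc_id in enumerate(results, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over (ranked results, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(results, relevant) for results, relevant in runs) / len(runs)


# Hypothetical data: the same two queries run with and without an
# expensive query option. The IDs and judgments are invented.
with_option = [
    (["d1", "d2", "d3"], {"d1"}),
    (["d7", "d4", "d9"], {"d4"}),
]
without_option = [
    (["d1", "d3", "d2"], {"d1"}),
    (["d7", "d4", "d8"], {"d4"}),
]

print("MRR with option:   ", mean_reciprocal_rank(with_option))
print("MRR without option:", mean_reciprocal_rank(without_option))
```

If the two numbers come out about the same over a reasonable sample of real queries, the option is probably costing you CPU without buying any relevance, and you can drop it.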

Re GC - were you using standard or parallel GC?
