Hacker News new | past | comments | ask | show | jobs | submit login
PostgreSQL: A full text search engine, part 1 (2014) (shisaa.jp)
87 points by kodekracker on Aug 5, 2015 | hide | past | favorite | 13 comments

Anyone interested in this piece, may also be interested in my Demystifying PostgreSQL slides from some time ago. Though most if not all is still relevant today.

Full text search starts on slide 32: http://www.slideshare.net/noloh/demystifying-postgresql

I've occasionally been tempted to struggle with postgres's full-text options... and always regretted it. It's idiosyncratic and quickly hits a ceiling, in terms of ranking-quality or query features, which is way below what's possible with specialized solutions.

Even if you despise Java, SOLR or ElasticSearch aren't that hard to get up as a service. Each hour invested into getting those working is likely to return more value than an hour with PG's full-text search.

I wish someone could fix Postgres's text search to work like SQLite or Solr. I'm teaching myself C in hopes of fixing it, but it ain't easy.

Let me define my own lexer as a UDF. Let me define a pipeline of filters like Solr. Let me use BM-25 text similarity like SQLite.

> Let me define my own lexer as a UDF.

You can. CREATE TEXT SEARCH PARSER does essentially that. (Yes, it's called parser, not lexer, but in this case the difference essentially doesn't exist).

> Let me define a pipeline of filters like Solr.

Hm. Essentially that should be possible using dictionaries.

> Let me use BM-25 text similarity like SQLite.

Hm. It'll not be possible to write bm25 directly like in sqlite (matching to the current row without specifying it IIUC), but generally it's easy to define additional functions and use them for sorting and whatnot.

I think the postgres text search functionality unfortunately has lacked somebody with interest in developing it for a couple years now. It can be useful, but it could be much better. The default text search parser is pretty much useless in my opinion :(

I meant a non-C user defined function. CREATE TEXT SEARCH PARSER requires functions that handle the `internal` type.

Being able to write one in plperl would rock

You can create the text search index over the result of arbitrary function. I have done it for a language that lacks stemming/ispell support. In pl/perl.

Could you give a small example to illustrate how/what you mean? Would you make a function that massage tables/views in "StrangeLanguage", and then spits out "StrangeLanguage" that's been stemmed/normalized etc -- and then feed that to pgs normal full text index system?

Our company had switched from Postgres FTS to Elasticsearch and saw huge performance benefits without loss of relevancy.

Maybe without loss of relevancy, but maybe with data loss. That may be fine (or not). If you haven't done it yet, I'd recommend that you check Jepsen's analysis of ES: https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-...

Is it ok to post old articles? if so, I have many in this field:)

Yeah. Just put the year in parens at the end of the title if you're worried about it.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact