
PostgreSQL: A full text search engine, part 1 (2014) - kodekracker
http://shisaa.jp/postset/postgresql-full-text-search-part-1.html
======
asnyder
Anyone interested in this piece, may also be interested in my Demystifying
PostgreSQL slides from some time ago. Though most if not all is still relevant
today.

Full text search starts on slide 32:
[http://www.slideshare.net/noloh/demystifying-
postgresql](http://www.slideshare.net/noloh/demystifying-postgresql)

------
gojomo
I've occasionally been tempted to struggle with postgres's full-text
options... and always regretted it. It's idiosyncratic and quickly hits a
ceiling, in terms of ranking-quality or query features, which is way below
what's possible with specialized solutions.

Even if you despise Java, SOLR or ElasticSearch aren't that hard to get up as
a service. Each hour invested into getting those working is likely to return
more value than an hour with PG's full-text search.

------
elchief
I wish someone could fix Postgres's text search to work like SQLite or Solr.
I'm teaching myself C in hopes of fixing it, but it ain't easy.

Let me define my own lexer as a UDF. Let me define a pipeline of filters like
Solr. Let me use BM-25 text similarity like SQLite.

~~~
anarazel
> Let me define my own lexer as a UDF.

You can. CREATE TEXT SEARCH PARSER does essentially that. (Yes, it's called
parser, not lexer, but in this case the difference essentially doesn't exist).

> Let me define a pipeline of filters like Solr.

Hm. Essentially that should be possible using dictionaries.

> Let me use BM-25 text similarity like SQLite.

Hm. It'll not be possible to write bm25 directly like in sqlite (matching to
the current row without specifying it IIUC), but generally it's easy to define
additional functions and use them for sorting and whatnot.

I think the postgres text search functionality unfortunately has lacked
somebody with interest in developing it for a couple years now. It can be
useful, but it could be _much_ better. The default text search parser is
pretty much useless in my opinion :(

~~~
elchief
I meant a non-C user defined function. CREATE TEXT SEARCH PARSER requires
functions that handle the `internal` type.

~~~
baudehlo
Being able to write one in plperl would rock

~~~
karavelov
You can create the text search index over the result of arbitrary function. I
have done it for a language that lacks stemming/ispell support. In pl/perl.

~~~
e12e
Could you give a small example to illustrate how/what you mean? Would you make
a function that massage tables/views in "StrangeLanguage", and then spits out
"StrangeLanguage" that's been stemmed/normalized etc -- and then feed that to
pgs normal full text index system?

------
chishaku
Related: [http://blog.lostpropertyhq.com/postgres-full-text-search-
is-...](http://blog.lostpropertyhq.com/postgres-full-text-search-is-good-
enough/)

HN Discussion:
[https://news.ycombinator.com/item?id=8381748](https://news.ycombinator.com/item?id=8381748)

------
AnonNo15
Our company had switched from Postgres FTS to Elasticsearch and saw huge
performance benefits without loss of relevancy.

~~~
ahachete
Maybe without loss of relevancy, but maybe with data loss. That may be fine
(or not). If you haven't done it yet, I'd recommend that you check Jepsen's
analysis of ES: [https://aphyr.com/posts/323-call-me-maybe-
elasticsearch-1-5-...](https://aphyr.com/posts/323-call-me-maybe-
elasticsearch-1-5-0)

------
postila
Is it ok to post old articles? if so, I have many in this field:)

~~~
jdc
Yeah. Just put the year in parens at the end of the title if you're worried
about it.

