
ElastiQuill: A modern blog engine built on top of Elasticsearch - indigodaddy
https://github.com/BigDataBoutique/ElastiQuill
======
stephbu
__hits tiny staple with enormous sledgehammer __

While it is possible to use ES as your primary store, that doesn’t mean it is
a great use case. Pushing ES to durability takes resources (people, time,
compute) and rigor.

For a blog site this feels like overkill. Your priorities are durable,
low-frequency writes and more frequent reads, hands-free day-to-day
maintenance, and a simple recovery mode such as restore. Caching (and
optionally queuing) can absolve the sins of many topologies to fulfill these
requirements.

ES brings a much larger surface area than just storing blobs and metadata.
That is a big tax to pay for free-text search, which, while interesting, most
users will almost certainly never use (thank Google for that).

Focus on a cheaper, lighter database engine that is durable, checkpoints
regularly, and has simple, testable backup/restore processes. Ship the backups
somewhere, e.g. S3.
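
To make "simple, testable backup/restore" concrete, here is a minimal sketch using SQLite as the lighter engine. SQLite is only an assumed stand-in (the comment names no specific database), and the `posts` table and the helper names are hypothetical. It relies on the standard library's `sqlite3.Connection.backup`, then checks that the copy matches the source.

```python
import sqlite3


def create_blog_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical posts table for the example."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.commit()


def backup_blog_db(source: sqlite3.Connection, dest: sqlite3.Connection) -> None:
    """Copy every page of the source database into dest (online backup)."""
    source.backup(dest)


def verify_backup(source: sqlite3.Connection, dest: sqlite3.Connection) -> bool:
    """Return True when both databases hold the same posts."""
    query = "SELECT id, title, body FROM posts ORDER BY id"
    return source.execute(query).fetchall() == dest.execute(query).fetchall()


if __name__ == "__main__":
    live = sqlite3.connect(":memory:")
    create_blog_db(live)
    live.execute(
        "INSERT INTO posts (title, body) VALUES (?, ?)", ("Hello", "First post")
    )
    live.commit()

    copy = sqlite3.connect(":memory:")  # use a file path for a real backup
    backup_blog_db(live, copy)
    print("backup verified:", verify_backup(live, copy))
```

In practice the destination would be a file that gets uploaded to object storage afterwards; that upload step is left out here because it depends on the chosen provider and client.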

~~~
badrabbit
I disagree. From both a programming and an admin perspective, I found ES to be
the easiest option. With ES, I don't have to explicitly create tables
(indexes), and I find ES queries and results a lot easier to deal with. A
single-node ES cluster is easy to maintain: just give it enough RAM and disk
space and you shouldn't even have to think about it again. And if you want to
visualize the data as an admin, you can slap Kibana on it. You can even go a
bit crazy here and feed ES your web server logs to have even better
correlation for stats. ES also has a very simple, testable backup process (as
simple as a curl command!).
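
For readers wondering what that curl command amounts to, here is a rough sketch of the two HTTP requests behind the Elasticsearch snapshot API: registering a shared-filesystem repository and then taking a snapshot. The host, repository name, snapshot name, and location are all assumptions for illustration. The code only builds the requests and does not send them.

```python
import json
import urllib.request


def register_repo_request(
    base_url: str, repository: str, location: str
) -> urllib.request.Request:
    """Build PUT /_snapshot/<repository> for a shared-filesystem repository."""
    body = json.dumps({"type": "fs", "settings": {"location": location}})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/_snapshot/{repository}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def snapshot_request(
    base_url: str, repository: str, snapshot: str
) -> urllib.request.Request:
    """Build PUT /_snapshot/<repository>/<snapshot>, waiting for completion."""
    url = (
        f"{base_url.rstrip('/')}/_snapshot/{repository}/{snapshot}"
        "?wait_for_completion=true"
    )
    return urllib.request.Request(url, method="PUT")


if __name__ == "__main__":
    # All names below are placeholders, not values from the project.
    req = snapshot_request("http://localhost:9200", "blog_backups", "snapshot_1")
    print(req.get_method(), req.full_url)
    # To actually run it against a live node: urllib.request.urlopen(req)
```

Note that a filesystem repository location generally has to be allowed in the node's `path.repo` setting first, so the one-liner assumes some setup has already happened.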

To your point, it does feel like "using a cannon to kill a mosquito". But I
still had a better experience with it than even sqlite.

------
hombre_fatal
Surely the data store is the least interesting part of a blog. Or, rather,
what competitive advantage does Elastic give to a blog that warrants having
"Elastic" in the product name?

It reminds me of [https://nodebb.org](https://nodebb.org), the "forum built
with Node.js" (though they've rebranded away from that) which always struck me
as the opposite place to focus when selling software.

For example, none of my problems with Wordpress are related to its choice of
database, just as porting Wordpress to Elasticsearch wouldn't fix any of my
issues with Wordpress. A blog is almost entirely defined by its UX.

On a better note, the platform seems pretty polished.

~~~
oefrha
> A blog is almost entirely defined by its UX.

A blog is almost entirely defined by its content. I read a lot of run-of-the-
mill average to crappy Wordpress or Blogger blogs that neither load fast nor
look flashy (or sometimes even pleasant). I don’t give a shit because I read
them for the content, mostly from an RSS reader.

~~~
NewsAware
Understood the GP to mean blogging-engine/CMS and UX to mean admin&editing
ergonomics

------
vore
I've heard that Elasticsearch has had some durability issues in the past and
the recommendation has been to have a more reliable source of truth (e.g. an
SQL database) and periodically import data into Elasticsearch for indexing. Is
this still a problem in practice, or is storing data directly in Elasticsearch
like this still unsafe?

~~~
jillesvangurp
They re-implemented the clustering algorithm for v7 and fixed numerous bugs
between v1/v2 and v7, including essentially all of the ones reported in the
infamous "Call Me Maybe" article
([https://aphyr.com/posts/323-jepsen-elasticsearch-1-5-0](https://aphyr.com/posts/323-jepsen-elasticsearch-1-5-0)).

However, it's not a database. I've actually abused it as such and it's fine.
You get optimistic locking but no real transactions. Search is eventually
consistent unless you call _refresh (but you shouldn't), etc. Bearing in mind
it is not a database, it is not intended to be used as a database, and
probably will never be a database, it actually works fine as a database
provided you don't do a lot of updates (write model is append only).
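
To illustrate what "optimistic locking but no real transactions" means, here is a toy in-memory model. It is not the Elasticsearch client and all names are hypothetical; it only mimics the idea of a conditional write that succeeds when the caller's sequence number still matches the stored one (Elasticsearch exposes a similar condition through the `if_seq_no` and `if_primary_term` request parameters).

```python
class VersionConflict(Exception):
    """Raised when a conditional write sees a newer version than expected."""


class ToyDocumentStore:
    """In-memory stand-in that models sequence-number based optimistic locking."""

    def __init__(self) -> None:
        self._docs = {}      # doc_id -> (seq_no, source)
        self._next_seq = 0   # monotonically increasing write counter

    def get(self, doc_id: str):
        """Return (seq_no, source) for a stored document."""
        return self._docs[doc_id]

    def index(self, doc_id: str, source: dict, if_seq_no=None) -> int:
        """Write a document; with if_seq_no set, fail if someone wrote first."""
        current = self._docs.get(doc_id)
        if if_seq_no is not None:
            if current is None or current[0] != if_seq_no:
                raise VersionConflict(f"{doc_id}: expected seq_no {if_seq_no}")
        seq_no = self._next_seq
        self._next_seq += 1
        self._docs[doc_id] = (seq_no, dict(source))
        return seq_no


if __name__ == "__main__":
    store = ToyDocumentStore()
    first = store.index("post-1", {"title": "Draft"})
    store.index("post-1", {"title": "Edited by A"}, if_seq_no=first)
    try:
        # Editor B still holds the old sequence number, so this is rejected.
        store.index("post-1", {"title": "Edited by B"}, if_seq_no=first)
    except VersionConflict as exc:
        print("conflict:", exc)
```

Each write is checked on its own, so there is no way to make two documents change together atomically, which is the "no real transactions" part.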

If safety is a big concern for you, obviously use something else. But for a
blog it's completely fine.

------
vikingcaffiene
Sorry, but this is a terrible idea. ES is not meant as a durable store. It's
designed as an ephemeral abstraction on top of a more durable storage
mechanism like Postgres or the like. Using it like this is just asking for
trouble.

~~~
aprdm
How is something meant as a durable store or not? I've worked in companies
that had only ES as a database for years and they didn't have complaints. I've
seen it working mostly well in these orgs, and because of my experience with
it I would now strongly consider it as a database, just like Mongo and
friends, with the benefit of having a very large ecosystem (ELK, Beats, etc.),
which can make it a one-stop shop for all things in an org.

~~~
vikingcaffiene
ES is an _eventually consistent_ data store meaning that it makes no
guarantees about data availability when you put stuff in there. That alone
should be a deal breaker for you.

If you use ES for what it's designed to be and make it a _projection_ of your
underlying durable data store, you can do things like rebuild your index or
change your schema without fear of data loss.

It's a document store, not a database. MongoDB is too, but they try to hide it
by bolting on all the stuff traditional relational DBs have, like ACID and
transactions, so you won't realize what a poor choice it is as a database when
compared to, say, Postgres.

~~~
aprdm
That's not entirely true. It has its guarantees; its way of working is just
not the same as that of more traditional SQL DBs, and that's OK for tons of
applications.

~~~
vikingcaffiene
Are you referring to the real time flag? My understanding is that that is per
shard and if you try to pull from another shard that hasn't updated its index,
you'd get stale data. Maybe you are referring to something else or I am wrong
about that?

It doesn't change the larger point that using ES as your primary database is
not playing to its strengths. You're better served using a transactional data
store and building your indexes from that.

------
whalesalad
This will be less reliable than a one-click WordPress install running on a
$5.99/year GoDaddy shared host.

------
pensatoio
This project is the epitome of, “how not to use Elasticsearch.” Comments in
this thread can be divided perfectly into two groups: people that have
experience with Elasticsearch and people that don’t. Only the latter will
suppose this project could ever be a good solution.

------
sagichmal
Elasticsearch is not an appropriate choice for an authoritative store of data.

~~~
kbumsik
Could you elaborate? I've just started learning Elasticsearch, so that sounds
quite interesting to me.

~~~
jaytaylor
Elasticsearch should not be used as the sole source of truth. I've been using
it for years professionally, and it's an amazing tool but not safe with regard
to data integrity / durability.

While dramatic progress and improvements have been made over the past 9 years,
sometimes things still go bad and indices get corrupted. When this happens,
it's necessary to reindex the data. There are also additional situations where
reindexing is required. So the safe advice is: Always have the authoritative
data source elsewhere (in a reliable data store or database of some kind) and
then load the data into Elasticsearch from there.

Postgres or even MySQL will be a safer bet when data integrity is key. Then
it's only a matter of indexing the data from the DB into ES.
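
As a sketch of what "indexing the data from the DB into ES" can look like, the snippet below reads rows from a relational store and turns them into a payload for the Elasticsearch `_bulk` endpoint (one action line plus one source line per document, newline-delimited). SQLite stands in for Postgres or MySQL so the example is self-contained, and the `posts` table and the index name are assumptions.

```python
import json
import sqlite3


def bulk_payload(conn: sqlite3.Connection, index_name: str) -> str:
    """Render every post as an index action in newline-delimited JSON."""
    lines = []
    rows = conn.execute("SELECT id, title, body FROM posts ORDER BY id")
    for post_id, title, body in rows:
        # Action line: which index and document id to write.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(post_id)}}))
        # Source line: the document itself.
        lines.append(json.dumps({"title": title, "body": body}))
    # The bulk format requires a newline after every line, including the last.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    db.execute("INSERT INTO posts (title, body) VALUES ('Hello', 'First post')")
    print(bulk_payload(db, "blog-posts"), end="")
    # The payload would be POSTed to /_bulk with
    # Content-Type: application/x-ndjson.
```

Because the index is derived entirely from the database, a corrupted or outdated index can be dropped and rebuilt by rerunning this step.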

Using Elasticsearch for a blog is a fun idea, but ultimately is likely to be
overkill for a personal blog site.

~~~
alexnewman
durability means something specific in database land. And it ain't how you are
using it.

~~~
sagichmal
He's saying ES isn't D-in-ACID durable. That's correct. What's your objection?

~~~
alexnewman
I'm saying that the D in durable is not the issue in Elasticsearch. That's
just being reliably written to stable storage, which modern ES does pretty
well.

------
big_chungus
Why would you do this? PHP is awesome compared to this. Or go with a static
site. This reminds me of the VS opening JSON meme:
[http://devhumor.com/content/uploads/images/September2019/vs_...](http://devhumor.com/content/uploads/images/September2019/vs_json_file_overkill.jpg)

------
inertiatic
I don't see how ES could be an issue for a blog. It's not like Lucene indexes
blow up randomly, and keeping backups should be standard practice at scale
even if you used a proper DB.

Meanwhile, if you want a small deployment that also does full text search with
a bunch of "smarter" features and some analytics, it would make sense to only
use a single data store.

All of this is in theory, as I haven't checked out the actual project.

------
PudgePacket
I like the irony of being built on a search engine but not having search
functionality :)

------
NicoJuicy
A blog engine built on top of a 1.5 GB instance is a bit much.

------
idclip
bless you, why not. ever considered applying to logz.io ? they could use a guy
like you.

