OK, so maybe someone can tell me what I (we) did wrong. At my job we tried using the ELK stack, and it's probably still running, but it is such a resource hog. I do not understand why they built Elasticsearch. I've read in a couple of places that you need something like 32GB of RAM[0] just to run queries, and having crashed Kibana/Elasticsearch a dozen times, I believe it's designed poorly. I had hoped I could drop in MongoDB instead, but saw no indication that this would be a smooth change. How many resources are any of you allocating towards your 'ELK' stack? (I say 'ELK' because now they have other software in the mix.)
Needless to say, having experienced all of this, I'd rather build my own logging solution than keep using a database that wasn't built in-house.
Oh wow, now it says 64GB of RAM is the sweet spot... What the heck is this thing doing that couldn't have been accomplished with MongoDB or PostgreSQL? I've got busier data sets that don't need 16GB of RAM, and yes, we pound the database with logs of sorts and query them in all sorts of ways, and I still don't get it. I wouldn't recommend this stack to a friend unless they've got plenty of hardware to spare.
> tell me what I (we) did wrong. At my job we tried using the ELK stack, and it's probably still running, but it is such a resource hog.
Part of the problem is what it's promoted for. It's a great drop-in, horizontally scalable, full-text search engine that's inexplicably become popular for log ingestion and analysis.
To those ends, I hate it, I hate every bit of it, from the atrocious JSON-based query DSL (seriously thought it was a joke at first) to its unpredictable timeouts, shard storms, mapping conflicts and other problems at scale. Elementary SQL concepts aren't possible ('select someone_elses_poorly_named_key as first_name', nope, you gotta reindex). High-cardinality aggregations fail in spectacular ways. High-anything aggregations fail in spectacular ways. The scroll API returns results unordered. There's no way to properly spec your cluster; the docs explicitly take a trial-and-error approach to design.
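For anyone who hasn't seen the DSL, here's roughly what a trivial "select ... where ..." turns into. This is only a sketch against a hypothetical local instance; the index and field names are made up:

    import requests  # plain HTTP; the official clients wrap the same JSON DSL

    # Roughly: SELECT * FROM people WHERE last_name = 'smith' AND age > 30
    query = {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"last_name": "smith"}},
                    {"range": {"age": {"gt": 30}}},
                ]
            }
        }
    }
    resp = requests.post("http://localhost:9200/people/_search", json=query)
    print(resp.json()["hits"]["total"])

    # And there's no "AS first_name": renaming a field means defining a new
    # mapping and copying everything over with the _reindex API.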
It's not just you. Elasticsearch does me no favors with the task of log analysis. I'd sooner normalize and grep a pile of gzipped log files than keep dealing with this mess, but this is the second job I've been at that's built their logging infrastructure on top of it.
> I do not understand why they built Elasticsearch.
"You know, for search."
It's great for searching proper text. Documents, comments, blog posts, etc.
> I've read in a couple of places that you need something like 32GB of RAM[0] just to run queries, and having crashed Kibana/Elasticsearch a dozen times, I believe it's designed poorly.
You don't necessarily need 32GB just to run it. The required heap size scales with your intended workload, but it's not like you can do a back-of-the-napkin calculation to figure out the relation. I run a 1GB instance for development.
It uses the memory to do a lot of caching so the queries you throw at it are lightning fast. Mongo does something similar (but crappier, IMO) with its concept of working sets.
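If you want to see where that memory actually goes, the stats API exposes it. A quick sketch, assuming a node on localhost; the field paths are what I've seen on recent versions:

    import requests

    stats = requests.get("http://localhost:9200/_nodes/stats/jvm,indices").json()
    for node_id, node in stats["nodes"].items():
        heap = node["jvm"]["mem"]
        idx = node["indices"]
        print(node.get("name"), "heap:",
              heap["heap_used_in_bytes"], "/", heap["heap_max_in_bytes"])
        print("  query cache:", idx["query_cache"]["memory_size_in_bytes"],
              " fielddata:", idx["fielddata"]["memory_size_in_bytes"])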
I've deployed it a few times. In some cases it randomly spikes to 100% CPU usage until you restart it.
Is there a way to know how much RAM you are going to need for your dataset? I think I was using it to look up restaurant names from a database of 100,000 restaurants, and I wanted to factor in misspellings and partial matches.
Not really; it depends on the size of the dataset and the complexity of the operations. When did you use it? It sounds like you were on a dodgy build or setup.
I think Elasticsearch's policy on sizing is "pay us or a partner to have a look and give you a guesstimate", which is pretty standard.
We ran into this same problem trying to build and run a data management platform on it... IDK if you want to try what we built, but it's a hell of a lot faster and less cumbersome. We went from 30 ES servers down to 4 Pilosa servers.
I spent so much time resurrecting Elasticsearch, and then developers wouldn't even use it, because the query language is based on ngrams and they wanted grep-like capabilities instead.
Unless there are hundreds of GB/day to index, it's much simpler to forward the logs using syslog or journald and then use grep on the collector.
I definitely used it on 8GB of RAM, but only with about 10M documents or so. It's pretty kick-ass for ad-hoc queries over data that you would typically have to set up a star schema for. I tell you, probably the best thing you can do is set up a Kafka queue, Apache Spark, and Elasticsearch (do the research around these 3); you'll love the ability to find out things like how many M(ale) patients above the age of 30 with diabetes have died shortly after a surgery. Trying to set all that up with complicated star schemas etc. really sucks compared to just building a JSON format that you pipe through Kafka or Spark.
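To make that concrete, the kind of ad-hoc question above is a single filtered query plus an aggregation. A sketch only; the index and field names are hypothetical:

    import requests

    # "Male patients over 30 with diabetes who died within a week of surgery"
    query = {
        "size": 0,  # only counts/aggregations, skip the documents themselves
        "query": {
            "bool": {
                "filter": [
                    {"term": {"gender": "M"}},
                    {"range": {"age": {"gt": 30}}},
                    {"term": {"diagnosis": "diabetes"}},
                    {"range": {"days_from_surgery_to_death": {"lte": 7}}},
                ]
            }
        },
        "aggs": {"by_hospital": {"terms": {"field": "hospital"}}},
    }
    resp = requests.post("http://localhost:9200/patients/_search", json=query).json()
    print(resp["hits"]["total"], resp["aggregations"]["by_hospital"]["buckets"])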
Edit: And yes, I originally used it for log processing, but people should really try it out for replacing very expensive BA stacks. For log analysis there's something called Graylog, which actually uses Elasticsearch internally, or just go with Splunk, which gives you out-of-the-box primitives for session length calculations. In ELK, if you want to do something that sounds as simple as session length, you'll end up having to reprocess documents using Kafka or Spark with background jobs that reprocess documents more than an hour old (or something to that effect).
There are so many of us who have been through this, just like you.
My realization was also what others have mentioned, that I'm trying to use a search engine backend for log storage.
Specifically 3 months of retention, which is the law here. That meant that ES had to keep 3 months of logs readily searchable. It's just not feasible when you're generating 25-30GB of logs each day.
I still use ES for search engines but I've stopped using ELK for logs unless it's for small environments.
This is why I never got into the ELK craze, especially coming from old-school syslog/syslog-ng/rsyslog/journald/etc., usually searched from the command line.
For some reason, though, in the past few years companies and people have become obsessed with GUIs, and Nagios/Cacti were falling out of favor, so people started just dropping in ELK/Graylog so anyone could quickly and easily get that GUI... without considering the resource requirements and lack of scalability. (Graylog2 fixed a lot of the Graylog problems.)
It seems a lot of managers also started wanting GUI dashboards, which is probably the big reason behind the push. Regardless, I don't think the trend is going away, so the market is ripe for disruption by something that does the same thing but faster and with fewer resources. The real problem is that most of the competitors for some reason decided proprietary was the way to go, and the people using these tools don't want more proprietary bullshit in their stack.
The feature that makes them different from other web GUI graphing tools, though, is the search/query customization.
I'm not sure why the Elasticsearch documentation recommends that much memory, but for small apps I've had Elasticsearch (version 1.x) running on a shared 2GB virtual machine for the past 4 years. I've had to restart it a few times, but seriously, you don't need 32GB or 64GB. It depends on your use case.
Originally the most popular (and only) use case for Elasticsearch was full-text search. Then a bunch of people thought it was good for analyzing logs and decided to hijack the brand and market it as a log analyzer.
But at its core, Elasticsearch is meant for full-text search and analytics, not logging. Logging is not even an interesting use case.
You don't actually need 64GB or 32GB or whatever of RAM. The docs should be clearer: what they mean is that if you have a large enough dataset in production, 64GB of RAM per node is the ideal maximum. That's because roughly 32GB is the largest Java heap that can still use compressed 32-bit object pointers, so you give 32GB to the Java heap and leave 32GB for everything else. More RAM is generally still better, though, because the OS file cache also gets used, making queries faster overall.
I think ES's Java heap default is 1 or 2GB, and that is more than enough for many use cases. Heap-heavy operations like sorting and aggregations may need more RAM depending on the index size. As far as I know, plain search isn't heap-heavy, so you only need more RAM as query volume or index size grows.
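If you want to check whether a node's heap is still under the compressed-pointer threshold, the node info API reports it, at least on the versions I've used (treat the exact field name as an assumption):

    import requests

    info = requests.get("http://localhost:9200/_nodes/jvm").json()
    for node_id, node in info["nodes"].items():
        jvm = node["jvm"]
        print(node["name"],
              "max heap:", jvm["mem"]["heap_max_in_bytes"],
              "compressed oops:", jvm.get("using_compressed_ordinary_object_pointers"))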
For crashes, what version? Version 6.x has never crashed on me, but previous versions did have a tendency to crash for me.
Elasticsearch is a search engine built on Apache Lucene and basically makes it much more usable. It's similar to Apache Solr but easier to get up and running.
It's good at text but can be used as a similarity search system, so you can index and find similar images, audio waveforms, binary data, etc.
Over the years, search queries have also become useful for log analytics, and so the ELK stack was developed into a single solution that does text search as well as ingesting logs and running all kinds of queries, aggregations, and even machine learning.
I think that's what fundamentally hurts it, as these are very separate use cases being served by the same system. We still use ES for search, but I wish it just focused on that, with much better performance, reliable clustering, and proper transactions instead.
By default Elasticsearch stores data in 3 ways: a row store (the original doc, kept in _source), an inverted index (for fast queries), and doc values (a columnar store for aggregations). So you need to configure which of these you want, for each field.
While you can do all of that with plain Postgres, having no indexes and doing table scans, it will be slow when you search (it depends on how often you need to search and the type of queries).
So ES is very good at some things compared to every other type of DB. For logging, yeah, there are some companies who just do a full scan on each query and are fine.
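As a sketch of what "configure it per field" looks like, this is roughly the mapping you'd send when creating an index (hypothetical index and field names; syntax as in the recent single-mapping-type versions):

    import requests

    mapping = {
        "mappings": {
            "properties": {
                # full-text searched: analyzed into the inverted index
                "message": {"type": "text"},
                # filtered/aggregated on: keyword, doc_values on by default
                "level": {"type": "keyword"},
                # searchable but never sorted/aggregated: drop the columnar copy
                "request_id": {"type": "keyword", "doc_values": False},
                # kept in _source and returned, but not searchable at all
                "raw_payload": {"type": "keyword", "index": False},
            }
        }
    }
    resp = requests.put("http://localhost:9200/app-logs", json=mapping)
    print(resp.json())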
> dejavu is the missing web UI for Elasticsearch. Existing web UIs leave much to be desired or are built with server side page rendering techniques (I am looking at you, Kibana).
When did server-side rendering become a bad thing? It tends to be much faster. Kibana feels slow to me, and that's because of the JavaScript it does have.
I don't agree that server-side rendering tends to be faster. I think it largely depends on the use case. I am going to use the word "responsive" rather than "faster" since it is more descriptive of how a UI feels. If you don't need to go to the server, performing an action will be more responsive if done client-side in general. If you do have to go to the server, then you are at the mercy of the network. However the less data transferred, the better in terms of responsiveness. In many cases, asking the server for JSON rather than HTML/JS/CSS is more responsive. Render time is also a factor. If client render time is slower than network/server time, then rendering on the server is more responsive in general.
I agree that client-side UI modifications tend to be faster if no server round-trip is necessary.
Rendering HTML will, however, tend to be faster than having a client parse and interpret JSON, and then manipulate the (shadow) DOM accordingly.
And then it always felt wrong to me that I have to force countless clients to do the exact same view transformations, which could have been done once, on the server, and cached.
For a new project, I'm prerendering HTML fragments via AJAX with IntercoolerJS. It's a pleasure to work with, and both the developer and the user get the best of those two worlds: a dynamic, blazingly fast UI that feels like an SPA, but degrades gracefully even for the Emacs-w3m user and Googlebot.
I also found it's really worth it to move servers or reverse proxies closer to the client - independent of client/server rendering. Users shouldn't wait more than 200ms for an Ajax request to finish.
This right here. My applications are typically used 8-10 hours a day by a small team of 5-10 users. Everything is done with Ember because Rails could not provide the user experience of a desktop app with constant server rendering. Everyone loved the SPA when it rolled out, and it's still going strong 3 years later.
I recently finished a small project at work for another business unit (a straightforward Tableau dashboard), only for them to turn around and say "our laptops aren't powerful enough to run Tableau, thanks but we're going back to Excel".
They were the ones who insisted I use Tableau. I had originally planned and suggested a nice server-side rendered dashboard that they could have used on any grade of laptop, but no, they wanted something _right now_ and any kind of proper solution was simply "too hard".
The real irony is that they'll waste more time doing and redoing these Excel reports every time they want one than it would have taken me and my teammate to build a solution for them.
It's pretty obvious that server side rendering has more potential bottlenecks and IO overhead. But with that said, nothing says it must be slower. It just has more potential to be slower.
Yes, we don't focus on admin related features atm. Elasticsearch head or Elasticsearch HQ [1] are good options for doing this.
Fun fact: we actually started out with elasticsearch-head, but realized it was really hard to hack on (this was ~2 years ago) and decided to create Dejavu. IMO, the key difference is that we have focused a lot on how to get the data indexed (mappings, bring your own CSV/JSON files) and on visualizing the raw data (via UI-based filters or DSL queries).
> dejavu is the missing web UI for Elasticsearch. Existing web UIs leave much to be desired or are built with server side page rendering techniques (I am looking at you, Kibana).
So less so "missing" UI and more "outdated" UI.
I personally like Kibana but I am happy to see some innovation in the space. Competition can only help.
I love that this can be run as a Docker container or as a Chrome extension. With the quick setup I'll probably give it a try. It looks nice.
It's complementary to Kibana, I would say. Dejavu gives you the raw data view as well as the ability to add/update data along with mappings. It also comes with an importer view where you can add JSON/CSV files to import directly into ES. Doing this any other way is currently a major PITA.
Kibana is more for dashboarding and visualization, Dejavu gives you control of the raw data views and indexing operations.
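For comparison, the "otherwise" route for getting a file of records into ES is the _bulk endpoint, which wants alternating action/document lines in NDJSON. A rough sketch; the file name and index are made up, and a CSV would need converting to JSON first:

    import json
    import requests

    lines = []
    with open("restaurants.json") as f:      # assumed: one JSON object per line
        for raw in f:
            doc = json.loads(raw)
            lines.append(json.dumps({"index": {"_index": "restaurants"}}))
            lines.append(json.dumps(doc))

    body = "\n".join(lines) + "\n"           # _bulk requires a trailing newline
    resp = requests.post(
        "http://localhost:9200/_bulk",
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
    )
    print(resp.json()["errors"])             # True if any document failed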
One thing to note for anyone running Elasticsearch in production on their own cluster: it might be wise to put nginx in front as a reverse proxy. Just doing that reduced the number of crashes we saw when the server got hammered, since nginx helps with request queueing.
I will have to take another look at this; the comment is from a year ago, when they weren't using React.
On a cursory look, it still seems to use a Node.js-based server [1]. One of the key side effects of needing a server is a more involved distribution and installation process. You just can't run it on a static host like GitHub Pages or as a browser extension.
I've stumbled on a couple of bugs with Dejavu and wanted to see if I could submit a bug fix. And, well, submit and fix them I can, but boy oh boy is the code terrible. Use this only at your own risk and on a throwaway ES instance, as the code quality implies a major risk of screwing things up.
> dejavu uses a websockets based API and subscribes for data changes for the current filtered view. For this to work, the Elasticsearch server needs to support a websockets based publish API
Is websockets a requirement just for the real-time updates, or for the entire UI?
There’s no open source websockets-based ES publish api, you have to use the closed source private service made by the company which created this project.
Sometimes you want to access the application from outside, through some corporate proxy, and not all of them support WebSockets; even those that do on paper are so complicated to set up that some admins abandon the idea (looking at you, MS Forefront TMG). So it's good to know that you'll get the baseline functions working without WS.
Not that I know of, but given the similarities in the underlying APIs, we would be open to supporting this and helping guide the implementation if someone is interested in sending a PR. Feel free to file an issue: https://github.com/appbaseio/dejavu/issues/.
Yeah, when I looked at the title, I was confident I'd seen that before, but I hadn't.
You might say I had a case of, ah, never mind, bad pun.
On a serious note, I understand there's only a finite number of words but a limitless number of software projects, and yet surely there are still enough names that aren't already used for something well-known and widely used. Yes, DjVu and Dejavu are spelled differently, but it is the same word.
[0]: https://www.elastic.co/guide/en/elasticsearch/guide/current/...