Most of that is attributable to a drop in sales & marketing expenses from 69% of turnover in 2010 to 60% for 2011 and 2012. If we scale their 2011 and 2012 S&M expenses up to 69% their net margin for the three years becomes {-21%, -14%, -16%} instead of its present {-21%, -6%, -9%}. The rest is largely cutting R&D. (The 2% blip for FYE 2012 is attributable to other expenses, which "consists primarily of the changes in the fair value of our preferred stock warrants, interest expense on our outstanding debt and interest income on our cash balances" - IPO-related.)
Given that they expect "operating expenses to increase over the next several years as we hire additional personnel, particularly in sales and marketing," I don't know if that is sustainable and thus indicative of any strategic brilliance.
It's enterprise software, and they are the market leaders for the whole "unstructured data analysis" segment (which is a real segment, and has real demand).
That makes the business essentially a sales organization with an R&D wing. The CEO "is well known for taking Hyperion from $500M in 2001 to revenues of almost $1B in 2007"[1], so he seems to understand that process well.
I also used to have doubts about it because I had heard it categorized as "log analysis", which is very unsexy from a developer's point of view. How easily we can be tricked by names! After using it for a while just yesterday, I found it not bad at all.
In a nutshell, Splunk is valuable because it turns your unstructured, untyped data into something much easier to analyze and Business Intelligence (BI) ready. IMHO, the strength of Splunk is twofold. First, it provides a central place to view and mine your logs across hundreds or thousands of machines. This is very useful in organizations doing large-scale distributed computing, because you don't need to ssh into your database server in one terminal and compare the results in another terminal connected to your web server - if you could ever figure out which of the 100 database servers to look at. Second, instead of just grepping, it treats each log entry as a list of key-value pairs and provides a simple yet powerful query language over them. On top of that, it gives you a library of visualization components. I was able to build a fully customized Google Analytics-style dashboard for my application in 20 minutes. I would call that empowerment, and empowerment is what makes a tool valuable.
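A rough sketch of the key-value idea in Python (only an illustration of the concept - Splunk's real field extraction handles far more formats, and the log line here is invented):

```python
import re

def parse_log_entry(line):
    """Turn one 'key=value' style log line into a dict of fields.

    A sketch of the concept only; real tools extract fields from
    many formats, not just explicit key=value pairs.
    """
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}

entry = parse_log_entry('ts=2012-04-19T10:00:00 level=ERROR host=db17 msg="timeout"')
# entry is now queryable by field, e.g. entry["host"] == "db17"
```

Once every entry is a dict of fields like this, filtering and charting become dictionary lookups instead of fragile regexes per log format.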
And if you are very into technologies, they are a Python house, which is always a plus :)
That's mostly because their target customers are big enterprises that run on buzzwords they can feed to upper management. It's similar to the websites of SAP, Oracle and Intel, which all have the same qualities:
- No up-front pricing, you need to get a quote
- No screenshots, you need to see a prepared demo
- Any contact with the company results in the assignment of an "account manager", the used-car salesman of the software industry.
- Lots of testimonials from people you've never heard of talking about software you've never heard of using the very same buzzwords found on the website.
That's because they're a log analysis company trying to ride the wave of Big Data hype. It doesn't really fit but if you squint and look at it in the right light you can imagine that it does...
Splunk in a nutshell: it runs a daemon on all your boxes, that tails all your logfiles (you can configure which ones) and pipes them all to a central logging server. On the logging server, another daemon runs that does pattern matching/filtering on the incoming logs, and fires an alert when it gets a match. And umm, that's about it really.
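The tail-and-ship core of that description fits in a few lines. A toy Python sketch (a real forwarder also handles log rotation, batching and network transport, which this ignores; the file contents are invented):

```python
import os
import re
import tempfile

def read_new_lines(path, offset):
    """Return lines appended to `path` since byte `offset`, plus the new offset."""
    with open(path) as f:
        f.seek(offset)
        return f.readlines(), f.tell()

# Tiny demo standing in for the forwarder/server pair.
fd, log_path = tempfile.mkstemp()
os.close(fd)
with open(log_path, "w") as f:
    f.write("GET /index 200\nGET /admin 500\n")

lines, offset = read_new_lines(log_path, 0)               # forwarder ships these
alerts = [l for l in lines if re.search(r"\b500\b", l)]   # central server matches
os.remove(log_path)
```

Run in a loop, the saved `offset` means each pass only ships what was appended since the last one.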
The one useful thing it can do is give (say) devs access to logfiles from prod servers that they aren't allowed for whatever reason to log into themselves. But you could do this yourself with a periodic rsync to an internal webserver...
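For example, a hypothetical crontab entry along those lines (hostnames and paths are made up):

```
# Every 5 minutes, mirror app logs from a prod host to a directory
# that an internal web server exposes to devs (read-only).
*/5 * * * * rsync -az --delete prod-web01:/var/log/app/ /srv/www/logs/prod-web01/
```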
Not so - Splunk has a very sophisticated index for huge amounts of data, and a rather fancy DSL for analyzing it. I've looked at it in some depth as one of our products (http://zetetic.net/software-combine) does log aggregation (and does a much better job with Windows and SNMP than does Splunk), but Splunk is not at all a trivial piece of software.
Honestly, you're competing with in-house scripts people spend a few weeks a year maintaining. Which is not to say you can't make a lot of money doing so. Ideally, you're saving people significant time writing these scripts, but tossing out buzzwords just alienates your users.
Splunk's query language is not like regex - it's somewhat more like SQL or LINQ, with groupings, aggregates, pivots, etc. It is definitely a DSL.
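For what it's worth, a rough Python analogy of what a grouping/aggregate query expresses in a single pipeline (the event fields and values are invented for illustration):

```python
from collections import Counter

# Events modeled as field dicts, the way Splunk-style tools expose log entries.
events = [
    {"host": "web01", "status": "500"},
    {"host": "web02", "status": "200"},
    {"host": "web01", "status": "500"},
]

# Rough equivalent of a filter-then-"stats count by host" pipeline,
# or SQL's SELECT host, COUNT(*) ... WHERE status = '500' GROUP BY host.
errors_by_host = Counter(e["host"] for e in events if e["status"] == "500")
```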
I am not a Splunk developer or serious user of it, but you greatly underestimate the problem space if you suggest this class of products addresses the same problems solved by even a very skilled scripter's scripts.
It's much more than that. It lets you query logs, and the queries can be very complex, which then gives you a real-time result with charts and all. I've used it in production, and it was the best damn tool to debug production problems. The licensing costs are insane though.
It really depends on the size of the logs, the number of users, and the deal you can negotiate. We paid around $60K/year, but we had quite a lot of logs.
As a guy who used to resell Splunk, it offers a lot more in comparison to its open-source alternatives. Being data agnostic with decent log compression, an extremely rich search syntax, some very good (HTML5) reporting dashboards, decent reporting and a verbose API makes it very attractive for users (companies) who have compliance requirements or require some form of performance monitoring. (I know it's a mega sentence)
I have since left working with Splunk directly but I would still advocate its use because it's one of the better commercial (albeit expensive) log management/SIEM products around.
* the largest data source in any enterprise
* increasingly must be stored for long-term analysis ...
* by corporate policy or ...
* by government regulation/demand
Regular open-source tool stacks can't do this without astronomical cost in storage hardware. Various log analysis players have managed to make this data storable and some have done well on the analysis side.
Splunk will do well. I hope this boosts acquisition interest in my old company, http://www.sensage.com, another log analysis company.
Splunk is one of the best pieces of software ever made. It's critical to almost everything Shopify does. It's pricey but easy to justify, because it makes any ops and many dev tasks 10x easier, and therefore there are some real savings in required manpower.
The push away from something very well understood and easily explained (syslog and SIEM aggregation) to calling themselves "big data" and "business analytics" is kind of lame, IMO. I assume they were pushed to do this by the investment market, since big data is hot.
I use Splunk fairly heavily at work with a fairly significant amount of data going through it daily.
It's pretty good software, and isn't as trivial as some on this thread seem to think (I've built large Solr implementations too, so I know search reasonably well).
The strong points: good interface, excellent data import, decent search language, decent docs & community, good APIs, a good set of mostly decent drop-in applications that run on top of it.
The weaknesses: While indexing is MapReduce-based and scales fairly well, querying is single-threaded. That limits it to the performance of a single CPU core plus IO limitations. This also applies to things like sub-queries: in a database they could be run separately, but in Splunk they aren't.
It is also fairly expensive at large scale, although the licensing model is fair (it is licensed by data volume, so you can install it on as many machines as you like and share the license between them).
Looking through their site, it looks like a Splunk search (or query) can include a large set of non-trivial, CPU-intensive operations. Therefore, perhaps, it is a process that does not lend itself well to multithreading: http://splunk-base.splunk.com/answers/12027/singlemulti-thre...
There is no reason why that second search couldn't be executed simultaneously, which would approximately halve the time for the whole query to run (assuming sufficient CPU power etc).
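As a sketch of that idea, two independent searches can be dispatched together with a thread pool. The `run_search` function here is a stand-in, not anything Splunk exposes; and for genuinely CPU-bound work in CPython you would want processes rather than threads because of the GIL:

```python
from concurrent.futures import ThreadPoolExecutor

def run_search(query):
    """Stand-in for one independent search; pretend it returns a hit count."""
    return len(query)  # placeholder for actually scanning an index

# Dispatch both independent searches at once; total wall time tends toward
# the slower of the two rather than their sum.
queries = ["status=500", "status=404 host=web01"]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_search, queries))
```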
Congratulations to Splunk! Operational Analytics is an extremely interesting area rich with many complex problems: performance (processing complex queries over very large data sets), prediction (proactive diagnostics), and visualization (packing dense information in a human-friendly manner).
We at Pattern Insight are currently working on the next generation of logging software, called Log Insight. If you are a Splunk customer or thinking of becoming one, take a look at our product page [http://patterninsight.com/products/log-insight/] for information on a more sophisticated and complete solution.
Don't hesitate to contact me if you are on the job market and want to work on interesting data-mining problems (full-time only please).
Go straight to scale with http://www.sensage.com. Used by governments, health-record and insurance companies, telcos, etc. When commodity solutions like Splunk no longer work for you, take it to the only other level you need.
Is this type of comment considered OK in this community? I'm all for a little competition, but it feels a bit dirty to me to congratulate a company while calling their product a "[less] sophisticated and complete solution."
If it were a startup talking about how they're better for specific types of problems (e.g. "if you're an EC2 fan but need to host in Russia, contact me..."), I think that's totally ok. Big company vs. startup, general vs. niche, non-HN company vs. HN company (where "HN company" means people who are on Hacker News regularly, or YC companies), traditional vs. open source/free, etc. are all points in favor of making it ok.
I'm not sure in this case (I know lots of people at Splunk, Adammark/sensage, and other SIEM companies, and IMO they're all useful in some cases, and too expensive for a lot of cases).
Question for those who have used Splunk and alternatives:
How much of what can be done in Splunk, is not possible with alternatives like Graylog2? http://www.graylog2.org/
I considered Splunk at some point, until I read a review of their EULA at http://blog.hacker.dk/
"
Upon at least ten (10) days prior written notice, Splunk may audit your use” …. ” Any such audit will be conducted during regular business hours at your facilities“ … “You will provide Splunk with access to the relevant records and facilities“
"
Why is that them screwing their customers? They sell licenses. Customers agree to use only what they're paying for. This language gives them the right to check and see how many licenses the customers use just to make sure everything's square. Seems fair to me.
The only clause I see worth objecting to is the bit that forbids publishing benchmarks or reviews. That part's BS, but I imagine that it's BS that they would drop rather than lose a sale.
This is entirely par for the course, and quite reasonable if you understand how it happens in practice and why it exists.
There are Splunk customers who do not allow outbound connections to the Internet and so Splunk can't use automated means of auditing license compliance. So they reserve the right to audit you on site. So if you're the CIA and you are paying for 1 petabyte and you are using 2, they want to know and charge you appropriately.
As a matter of good corporate governance, they are actually doing you a favor and preventing you from being a thief. :-)
Are you serious?
I would never give some random supplier unlimited access to my server and internal network. Such a requirement is an absolute show-stopper.
Is that really screwing their customers? They license by GB/day usage. As expensive as Splunk is, an audit if they think you're really underreporting usage doesn't seem too far fetched.
There's a free version of Splunk that does an internal count (something like 100 MB/day) for demo purposes. If one is savvy enough to override that, then I'd also assume one is savvy enough to fake the records when the license police come calling.
I can't be the only one that finds it very disquieting that you have to give a company access to your internal network so they can perform antipiracy checks.
Do they have a viable business or is this just another "inflate and escape" company?