

Logging: Unsexy, Important, and now Usable. - gsteph22
http://www.roadtofailure.com/2010/01/25/logging-unsexy-important-and-now-usable/

======
n8agrin
"We want to be like Geordi in Star Trek, who can see everything on the
Enterprise with a few finger presses. I wish there was such a thing! The
closest we can get is ganglia, which monitors OS-level events (it can be
hacked for apps)."

I disagree. Large software companies already exist in this space: Splunk,
LogLogic, Arcsight, etc. It sounds like the author reinvented Splunk in
particular. (disclaimer: I work for Splunk and can see everything in your
datacenter with a few finger presses. Call me Geordi LaForge.)

~~~
eru
I hope it's not as painful as Geordi's visor.

~~~
RyanMcGreal
The client app is called BananaClip.

------
jawn
Hi, I work in the log importation services business.

From my perspective the main hurdle to log aggregation/correlation is not
scalability. If Splunk doesn't cut it for your performance needs, you have
probably hit the price point where you can afford a LogLogic or similar
appliance.

Instead the barrier to entry is in the number of applications supported by a
particular log archival product, and the ability to correlate across the
different applications.

As I'm sure you know at this point, adding support for log types is a
painstaking task. Most vendors punt on this and tell customers to do it
themselves.

If there is a niche available to you as a startup I would think that it would
be in offering a very low turnaround time in supporting new log types. For
example: give us some log sources and we'll support and categorize your logs
with our service.

As for running in the cloud on large datasets, I think you'll find that most
customers are not going to want to double or triple their outgoing bandwidth,
quite apart from security compliance concerns.

That being said, good luck in your venture. Logging is a mess, and could
certainly use some clean up. :)

------
xal
For 99% of all companies a simple unix box that collects log files and moves
them along for permanent storage a few days later (S3/Tape) is enough. At
Shopify we use Clarity, a web frontend for grep and tail that we
released as open source: <http://github.com/tobi/clarity>

------
ajross
FTA: _This means in order to find what you want, you need to explore the data:
you need to search it. The only tool most of us can use is grep_

Um... what? Have the authors gotten stuck in a time vortex and been dropped
off before, y'know, awk? Much less perl, or any of those new-fangled toys.

I mean: writing scripts to do log analysis is a pretty fundamental problem for
server-side development, and lots of very smart people have spent the last
two^H^H^Hthree decades working on tools to address the issue.

I don't even see how this (indexing the entries across a Hadoop cluster) is
all that useful. In general, you don't do log analysis by asking "give me all
the entries that match this pattern", you do it by walking them in order and
extracting one or two fields from each line and building some kind of result
data structure. This thing would be fine if you were asking for all the log
messages that mentioned "coffee", I guess. But what if you wanted a histogram
of hit counts per page per day-of-week?
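
For what it's worth, the "walk the entries in order, extract a field or two,
build a result structure" style described here really is only a few lines in a
modern scripting language. A minimal Python sketch of the hit-counts-per-page-
per-day-of-week histogram; the Apache-style log format and field positions are
assumptions for illustration, not something from the article:

```python
from collections import Counter
from datetime import datetime

def hits_per_page_per_weekday(lines):
    """Walk log lines in order, building (path, weekday) -> count.

    Assumes Apache common-log-format lines such as:
    127.0.0.1 - - [25/Jan/2010:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 2326
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip malformed lines
        # fields[3] is the bracketed timestamp, fields[6] the request path
        ts = datetime.strptime(fields[3].lstrip("["), "%d/%b/%Y:%H:%M:%S")
        counts[(fields[6], ts.strftime("%a"))] += 1
    return counts
```

Piping a log file into this and printing the Counter is the whole job; no
cluster required until the file stops fitting on one box.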

~~~
gsteph22
Thanks for providing this valuable perspective. I sort of group grep, Perl
scripts, and everything else together as manual processes. What I was getting
at is that rolling your own scripts is a royal pain in the ass.

For analytics, you're right, search is only part of the equation. That's why
we make MapReduce easy to use on a cluster. You can write Pig or Hive scripts
to generate interesting analytics.

We also have templates for common data formats (and ways to roll your own) so
you can turn unstructured log text into structured data, so that a histogram
of hit counts per day-of-week is just a few lines of a script (or maybe even a
search).
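
A "template" for a common data format, in the sense described above, can
amount to little more than a named-group regex that turns each raw line into a
structured record. This is a hypothetical sketch, not LogSearch's actual
mechanism; the pattern and field names assume Apache common log format:

```python
import re

# Illustrative "template" for Apache common log format; the regex and
# field names are assumptions, not a real product's parsing rules.
COMMON_LOG = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<bytes>\d+)'
)

def parse(line):
    """Return a dict of named fields, or None if the line doesn't match."""
    m = COMMON_LOG.match(line)
    return m.groupdict() if m else None
```

Once every line is a record like this, a histogram or a Pig/Hive-style group-by
is a one-liner over the parsed stream.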

------
randolph_carter
Out of curiosity, can you be more specific on what scalability, analytics, and
ease-of-use improvements LogSearch offers vs. Splunk in particular? We use
Splunk internally and have not found it to be significantly lacking in these
areas but are always open to alternatives. Additionally, what is the advantage
of relegating log management to a cloud application? Typically log data is
relatively sensitive; do you provide assurances that the data won't be
misused, lost, repurposed, etc.?

~~~
gsteph22
Thanks! Actually our software runs in the cloud _or_ in your datacenter. If
it's in the cloud, it can be encrypted in transmission (or stored encrypted if
you're willing to take the performance hit).

If it's sensitive data, I'd recommend just spinning up your own cluster and
installing the tool.

~~~
gsteph22
Oh, and as far as scale goes: we run at "web scale", which means a dozen to
hundreds or more of computers, each generating tons of logs. That's why we're
built on Hadoop. Scale is the main complaint our customer research indicated
people have with tools like Splunk.

~~~
n8agrin
Cool. Splunk also does this, and it's good to know that people don't perceive
Splunk as capable of operating this way, so we can fix that perception.

~~~
nwatson
Or you can use a tool that has already scaled for ages and has been in
production since 2002. SenSage ... <http://www.sensage.com/customers>.

~~~
randolph_carter
Also not wanting to sound like a mouthpiece, but choice of log management
system is a very interesting topic to me... scalability is only one concern
and may not trump everything. We actually looked at both LogLogic and Sensage
during our initial evaluation both of which scaled better than Splunk did at
the time. My memory of the evaluations is dim (2+ years ago), but I recall
that both products seemed more brittle in our environment (education), where
we have very little control over incoming log formats due to somewhat chaotic
organizational structure and infrastructure. Splunk seemed more able to accept
unstructured or unexpected data without complaint and to deal with the parsing
at search time rather than index time, and was a more natural fit. The other
decision points/tradeoffs we ended up making were a) no additional licensing
cost for Splunk agents, b) prioritization of real-time perusal for
troubleshooting purposes over long-term reporting and canned reports. Sensage
excels at the latter since it amortizes the cost of reporting by caching
periodic results, and provides a large set of prebuilt reports, if I recall
correctly. Splunk's "summary indexing" serves sort of the same purpose as the
caching feature of Sensage and has been improved in v4.0. And c) no additional
cost for users (we have over 50 people looking at data indexed by Splunk). The
"search" metaphor is also very easy for people to grasp which was an added
benefit of the Splunk GUI.

My sense has been that at least the low end of the market is increasingly
preferring a search-oriented log management architecture versus a database-
backed, query-heavy architecture because organizations are familiar with the
search metaphor, the overhead of managing a search index is less than the cost
of database administration, and the unsophisticated use case (i.e., free-text
search rather than advanced query syntax) is increasingly common. Smaller
organizations also rarely have the maturity to deal with logging
systematically, since it requires a fairly disciplined approach to
infrastructure. That said, the tradeoffs we made may or may not apply in a
terabytes-per-day environment; I unfortunately can't speak to that directly.

------
gsteph22
The main thing is: we focus on bringing a cost and performance advantage to
the small and medium companies, to make their lives a little better.

------
gsteph22
It's true that LogSearch is similar, but we focus on cloud, analytics, ease-
of-use, and scalability -- each of which we've heard Splunk lacks.

~~~
n8agrin
I don't mean to come off sounding like a mouthpiece for Splunk, but Splunk
does work with "cloud" data (hurray for buzz words), it is dead simple to use
out of the box and recognizes many special fields like timestamps without any
configuration (and also provides extensive configurability when needed) and
scales like the distributed monster it is, capable of handling 10-15 MB of
data / sec. on a single machine and GBs worth of data per day when scaled out
across multiple machines. You should try it before you judge it on hearsay
(and I'll maintain that you should also try any Splunk competitor before
judging them as well).

~~~
matrix
I have tried Splunk. It's a great concept and I commend the Splunk team for
building such an easy-to-use, polished product. However, we found the query
performance to be... ahem... not well geared towards large data sets. This
combined with the licensing model meant it wasn't an option for us.

I just wish Yahoo would open source Everest (their multi-PB column store DB
based on PostgreSQL) -- this would be ideal for building an open source Splunk
competitor.

~~~
n8agrin
Interesting. Do you know which version of Splunk you were using? Our latest
version has vastly improved query performance over large datasets.

Re. an open source log indexer: Agree, this is a space that will eventually
become dominated by open source tools, used particularly by startups and small
businesses. I think most people overlook this use of a MapReduce-like
framework: they conceptually understand how it could be used, but 99% of the
work is in the implementation, not the idea. And as of yet, I don't believe
there has been a specific implementation beyond what companies like Shopify
are doing, adding nice GUI tools on top of awk and grep (which admittedly is
probably good enough for most people / businesses on this forum).

------
spudlyo
See also: <http://developers.facebook.com/scribe/>

------
bravura
I worry, like the other posters, that cloud logging + search is a solution to
a problem that doesn't exist.

~~~
gsteph22
It's not cloud-only; it also runs in your datacenter :)

