Linux Performance (brendangregg.com)
536 points by pablode on Jan 27, 2017 | 64 comments

I recommend Brendan's book on Systems Performance[1]. It describes a structured approach to solving performance problems and is a good reference for various types of resource performance profiles.

[1] http://www.brendangregg.com/sysperfbook.html

Has anybody else found this book a little too high-level? I've had a hard time getting useful, practical advice out of it but haven't read it cover-to-cover yet. For instance there's a bit about the impacts of using a GC'd language, but there's nothing new, just common knowledge. Does anybody know of a lower-level/more advanced resource like this book?

I wanted to give an overview & awareness of all performance areas, not just the focus of the book (systems). So topics like runtimes (and GC), queueing theory, etc, are summarized in less than a page in a 600+ page book. A big problem with performance is gaps in people's knowledge (unknown unknowns -- which I referenced in the intro), so I wanted to at least create awareness of them (to become known unknowns). But I bet that's frustrating if I've hit an area you want to know a lot more about.

For areas like systems performance methodology, I've written a lot of new material in the book, in more depth.

IMHO: I use it more as a reference book. "Oh, I'm having I/O problems, let me look up the sections on Filesystems and Disks"

It is very dry, and hard to read from cover to cover, but it is filled with a mountain of info. I unfortunately don't know of a more introductory book to recommend, but would also be curious.

The parent asked for a more advanced book, not a more introductory one.

You are correct. <insert joke about me needing more coffee here>

A couple of months ago I was looking for courses covering the new tools available on Linux to measure performance, in both kernel and user space. My main frustration is that there was no sign of any structured course that teaches all these tools and, just as importantly, when to use each of them and when not to. Before you tell me again: no, YouTube videos and blog posts are not good enough. I need to be able to sign up multiple members of different teams and be sure we all receive the same knowledge. Can anyone share pointers to where I could find this kind of training?

I used to work at a consulting firm called EfficiOS[0], specialized in OS and application efficiency and performance. They maintain the LTTng[1] kernel and userspace tracers and related tools, and offer training. I'm not sure whether that responds to your specific needs, but if you get in touch they'll certainly be able to guide you.

[0] http://www.efficios.com/ [1] http://lttng.org/

Say how much you're willing to pay for it, and I'm sure someone will build the course for you.

>> Before you tell me again: no, youtube videos and blog posts...

I've been hunting for such a thing as well, though I'm guessing your hunt is for much higher quality than mine. I've not been able to find anything, BUT based on what I've been watching and reading (yes, YouTube and blogs), it wouldn't be much work to assemble a reading/viewing list from them, and that might just work for you given that there might not be a perfect solution out there. The market for high-quality dedicated training on really advanced things is probably rather small (just my guess).

That "Linux Performance Observability Tools" diagram detailing the utils to inspect every part of the stack is excellent!

This site is a great resource and I hope Brendan keeps posting amazing content.

It is interesting to see a former Solaris developer embrace Linux like this, when others seem to consider it an abomination.

It's a tremendous boon to Linux that it happened. It would have been very easy for him to transition into Illumos while with Joyent or FreeBSD at Netflix. Both already ran DTrace. He stuck with Linux and spent time making its performance tools competitive.

I love Brendan's writeups and scripts, and use them every day, multiple times a day, but he didn't "make Linux performance tools competitive".

He's spreading and making available knowledge about perf, ftrace, and eBPF that would otherwise remain confined to Linux kernel development circles. He's also contributing to bcc (a tool in the eBPF ecosystem), but his main merit in my opinion is marketing, with his books and blog (I'm using the term in a positive sense).

The lack of DTrace-like tools has long been a criticism of Linux. The entire BPF team deserves a lot of credit for their work, but his decision to highlight this problem and help solve it rather than move to FreeBSD likely got it into 4.9 rather than some kernel in the future.

Well, it's the #1 platform, not sure how you can avoid using it. And I don't see why it would be such an "abomination". Linux is still very clean imo.

You avoid it by using the alternatives, the BSDs and Illumos. The post is probably comparing Gregg to Bryan Cantrill, who is very vocal about what he thinks is wrong with Linux.

Except that most server side apps run on Linux.

You are correct, but I don't see how that's relevant to the point? Am I misunderstanding you?

Windows is much more popular and has many more apps, but I can come up with a handful of reasons why I don't like it or think it is objectively worse.

They often also run on BSD and Illumos. They'd need to depend on some specific kernel function to be difficult to port, and several of the alternatives offer Linux emulation layers for those situations.

This is awesome, I haven't ever seen all the perf tools so clearly laid out by use case.

If you think this is awesome, search YouTube for his talks and prepare to be blown away. Incredible engineer and funny guy.

One thing that I learned from him and have never forgotten is that you can reduce the performance of a hard disk (the spindle kind) just by screaming at it: https://www.youtube.com/watch?v=tDacjrSCeq4

This is awesome! Reminds me of how "IBM Tech Uses Hard Drives to Predict Earthquakes" [1]. And then it's just one guy screaming at the HDDs.

[1] http://www.pcmag.com/article2/0,2817,2371337,00.asp

Yup, that is what Brendan is famous/infamous for and it is pretty hilarious. His talks at conferences like, say, Monitorama are always equal parts amusing and deeply technical. I've been lucky enough to learn a ton from his work on bcc / perf to find a DTrace equivalent on Linux.

503, seems we torched the place...

I'm wondering why HN doesn't keep a cached copy of each page that makes it to the front page (or maybe automatically post it to archive.is). I don't think it would be hard to implement, and it would avoid this quite frequent problem.

Because of copyright issues.

People like traffic to their actual domain, because it often means better search engine ranking in the future. On top of that, some websites serve ads, which means that traffic is proportional to revenue.

Ah, don't you just love it when there's a simple technical problem with a simple and valid technical solution, but there's a legal problem that stops it dead?

Well, legal and whatever domain the "you viewed the ads but they weren't tracked so we don't get money" problem falls into. Stupid artificial barriers that are there because they solve an economic problem that wouldn't exist in a perfect world. Game theory, maybe.

This reminds me of the bird complaining: "if only there was no air slowing me down, I would fly a lot faster".

I have the technical ability to steal lots of stuff. But yes, there are legal (and moral) obstacles that keep me from doing it.

The solution described above is pretty much what Google's AMP does. It turns out that a lot of people don't like AMP because they see it as a monopolistic way of keeping users in Google's walled garden. HN doesn't quite have a monopoly on startup/tech news, but I think the same reasons apply.

Sites opt-in to AMP, so copyright is less of an issue there.

You'd be OK with putting a ton of work into something, and then having it cached/ripped off somewhere else?

Google and the Internet Archive already cache it. Also, if your work goes down for whatever (perhaps legal) reason, you know that people can still benefit from it.

Caching it for when a site is overloaded is not ripping it off, just like sending copies of works to the British Library for archival purposes is not ripping the original work off.

The trick in this solution is deciding when to flip the switch for the cached version.

Perhaps a simple GET to the URL every 60s. If you receive 5 non-200 responses in a row, then the HN link points to the archive.is version. Same method in reverse for bringing it back.
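A minimal sketch of that switch, assuming the probe itself (one GET per minute) lives elsewhere and just reports the status code it got. The class name `LinkSwitch`, its fields, and the example URLs are all hypothetical; this is one way the idea could be wired up, not anything HN actually does.

```python
from dataclasses import dataclass


@dataclass
class LinkSwitch:
    """Decides which URL a story link should point to (hypothetical helper)."""

    original_url: str
    archive_url: str
    threshold: int = 5        # consecutive contradicting probes needed to flip
    use_archive: bool = False
    streak: int = 0

    def record(self, status: int) -> str:
        """Feed in one probe result and return the URL to link to."""
        healthy = status == 200
        # A probe only counts if it contradicts the current state:
        # a failure while linking to the original, or a 200 while
        # linking to the archive.
        contradicts = healthy if self.use_archive else not healthy
        if contradicts:
            self.streak += 1
            if self.streak >= self.threshold:
                self.use_archive = not self.use_archive
                self.streak = 0
        else:
            # Any agreeing probe resets the count, so only
            # uninterrupted runs can flip the link.
            self.streak = 0
        return self.archive_url if self.use_archive else self.original_url
```

Requiring an uninterrupted run in both directions keeps the link from flapping when the site is only intermittently failing.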

That sounds like it would DDoS the site. Even better: monitor the comments for "site is down", or just provide a link to Google's cached version as a button (even better than the 'web' link we have at the moment).

> That sounds like it would DDoS the site

How would one request per minute from a single server DoS the site, let alone DDoS it?

Both of those sites can be instructed not to store copies of your pages for their caches or archives; and they will dutifully follow your instructions.

Also, the library analogy doesn't work in copyright law. Copyright protects the act of copying, not the act of transferring an already-authorized copy of a work to someone else.

Could we pull up the real page from the actual domain in a full-window iframe and then replace it with the cached version if it fails to load?

That will fail for a lot of websites since X-Frame-Options is becoming more common: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-...
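For illustration, a rough sketch of how one might tell up front whether a page would refuse to load in a third-party iframe, assuming you already have its response headers as a mapping. The function name `blocks_third_party_framing` is made up, and it only looks at X-Frame-Options; sites can also restrict framing via a Content-Security-Policy `frame-ancestors` directive, which this does not check.

```python
from typing import Mapping


def blocks_third_party_framing(headers: Mapping[str, str]) -> bool:
    """Return True if X-Frame-Options forbids embedding by another origin.

    Hypothetical helper: it ignores Content-Security-Policy entirely.
    """
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("x-frame-options", "").strip().lower()
    # DENY forbids all framing; SAMEORIGIN forbids framing by other sites.
    return value in ("deny", "sameorigin")
```

A page for which this returns True would show up blank in the iframe, so the cached copy would have to be linked directly instead.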

This post has only 30 upvotes right now, so I'm surprised the website couldn't handle it. My WP blog, hosted for ~3 euro/month, was at the top here a few times with 600UV, and on reddit, with 3600 people online at the same time according to GA... and to my surprise it didn't die. I would rather say that the hosting is... not performing well.

Upvotes are probably not a good estimate of traffic from HN hitting a site for a multitude of reasons, for example:

- not everyone upvotes

- there's probably a substantial number of users who are content to read articles but have never had a login (I did for two years)

- social media amplification.

I know, that's why I mentioned reddit too.

That 503 looked like apache mod_security throttling.

That hinges on the assumption that most (or even a non-negligible amount of) visitors will upvote the article.

Static pages vs something like WordPress?

Try using fullhn.com

This is useful.

Archive.is is not related to web.archive.org, and the short URL is not a 'tiny' version of the latter. (That said, I agree with the general advice and think that tiny-fied URLs break the web.)

Shortened URLs are God's way of giving you surprise goatse. Avoid. Always.

As if non-short URLs can be trusted...


Or all the phishing domains...

Apparently not enough performance.

The HN hug of death.

Seems like posting to HN is a good way to trigger a DDoS.

HN, the new /.

/. was the original, but since then sites like Reddit and HN, never mind Facebook or G+, can also trigger a traffic spike.

I seem to recall that the blog of some physicist hit the front of multiple such sites on the same day, effectively taking down the whole hosting service.

Previous discussion on Hacker News: https://news.ycombinator.com/item?id=8205057

His site is a great information source. I used some of his work to diagnose a JVM performance problem (discussed in https://news.ycombinator.com/item?id=12505517).
