Hacker News | nkurz's comments

This is a nice summary of Linux performance monitoring tools.

What does he mean by his repeated question "Why isn't it 10x?"

If you benchmark a product and it does 13,000 req/sec (or whatever), then answer the question of why it isn't doing 130,000 req/sec. It's a way of encouraging you to debug it further.

Got it, thanks! For me the better question is often simply "Why is it X?" Knowing why the speed is X is often of much greater utility than simply knowing that X is the speed. I like to focus on determining the current single limiting factor before considering the harder question of what would be necessary to achieve 10x.

If you know the limiter, you know why it isn't 10x. It's CPU bound. It's disk bound and we're on a single disk. etc.

How do we actually make it 10x? is a follow-on question, but as you say, not one that may be easy to answer!

Please stick around despite this. You write well, and have a perspective that isn't well represented here.

Summarizing, the paper suggests that you can generate high quality pseudorandom numbers by hashing a counter with a cryptographic hash (such as AES). Since AES has hardware support in modern processors, this is fast: less than 2 cycles per byte of randomness. If you do fewer "rounds" of encryption than used for cryptography, you can pass all existing tests for randomness with run time of less than 1 cycle per byte. Using a counter as the state makes it very easy to skip ahead in the sequence, and to distribute "streams" across multiple processors.
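The counter-hashing idea can be sketched in a few lines. This is an illustrative toy using SHA-256 from Python's stdlib as a stand-in for the (reduced-round) AES the paper actually uses; the function names are mine, not the paper's. It shows the key structural property: because the state is just a counter, block k is computable directly from k, so skipping ahead is free.

```python
import hashlib

BLOCK = 32  # SHA-256 yields 32 bytes per counter value

def prng_block(counter: int) -> bytes:
    """Hash a 64-bit counter to get one block of pseudorandom bytes.

    Sketch only: SHA-256 stands in for the AES rounds used in the paper.
    """
    return hashlib.sha256(counter.to_bytes(8, "little")).digest()

def prng_bytes(start_block: int, n: int) -> bytes:
    """Generate n pseudorandom bytes starting at a given block index."""
    out = bytearray()
    counter = start_block
    while len(out) < n:
        out += prng_block(counter)
        counter += 1
    return bytes(out[:n])

# Skip-ahead is trivial: block 7 is computable directly,
# without generating blocks 0..6 first.
sequential = prng_bytes(0, 10 * BLOCK)
assert sequential[7 * BLOCK : 8 * BLOCK] == prng_block(7)
```

The same property makes distribution easy: processor i can own blocks i, i+P, i+2P, ... (or its own counter range) with no shared state.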

I'm sort of surprised this could be a paper published in 2011. Isn't this pretty much the obvious thing to do? Counter hashing is a well known way to generate good random numbers, and it seems almost as obvious how you'd scale to multiple processors.

Yeah, considering what was described is essentially AES-CTR.

Forgive my ignorance in cryptography. If a series of pseudorandom numbers can be trivially replicated by figuring out the counter and hash algorithm (which seems simple with open source software), how can the pseudorandom numbers be said to be of high quality?

The key(s) used for the cipher would need to be randomly seeded in some way that an attacker could not easily guess (e.g. with a different CSPRNG).

Edit: And for hash functions, you can generate some random bytes that you append the counter to before putting it through the hash algorithm. The same as above, if those bytes are hard to guess then the construction should be safe.
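That construction, prepending hard-to-guess seed bytes to the counter before hashing, might look like the sketch below. SHA-256 again stands in for the cipher, and the stream-id field is my own illustrative addition (echoing the multi-stream idea from the paper's summary); none of these names come from the paper.

```python
import hashlib
import os

def make_stream(seed: bytes, stream_id: int):
    """Yield pseudorandom blocks from hash(seed || stream_id || counter).

    The seed plays the role of a secret key: if an attacker cannot guess
    it, knowing the counter and the hash algorithm is not enough to
    reproduce the output. Distinct stream_ids give independent streams,
    e.g. one per processor. Sketch only; SHA-256 stands in for AES.
    """
    counter = 0
    while True:
        msg = (seed
               + stream_id.to_bytes(4, "little")
               + counter.to_bytes(8, "little"))
        yield hashlib.sha256(msg).digest()
        counter += 1

seed = os.urandom(32)  # seeded from the OS CSPRNG, as suggested above
s0 = make_stream(seed, 0)
s1 = make_stream(seed, 1)
assert next(s0) != next(s1)  # different streams diverge immediately
```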

Is there a particular reason you would bet against it, or are you just saying that you wouldn't bet on any implementation without proven theoretical properties?

(As an aside, it's probably just my lack of familiarity, but considering how simple the algorithm is I find the syntax surprisingly hard to follow. Is this modern C++?)


Although this thing is able to fool the 200 or so tests from TestU01, it would take a short time for an expert to devise a test that would distinguish it from random, and even recover the key (read: seed). On the other hand, AES-CTR (conjecturally) fools _all_ statistical tests that take less than 2^128 effort, and on top of that it is still very very fast (0.63 cycles per byte on Haswell). So what really was gained? Cryptographic primitives have gotten so fast that the additional speed from these ad hoc generators seems like an unworthy risk. I concede that it can be very fun to try to make them, though.
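To make "statistical test" concrete, here is a sketch of one of the simplest, the NIST-style monobit (frequency) test; it is trivially weak next to TestU01's batteries, which is exactly the point: passing any fixed collection of tests, even 200 of them, is not the same as fooling all efficiently computable ones.

```python
import math
import random

def monobit_test(data: bytes) -> float:
    """Return the p-value for the hypothesis that ones and zeros
    are equally likely in the bit stream.

    A generator can pass this easy test (and many harder ones) while
    still failing a sharper test designed by an expert.
    """
    n = len(data) * 8
    ones = sum(bin(b).count("1") for b in data)
    s = abs(2 * ones - n) / math.sqrt(n)
    return math.erfc(s / math.sqrt(2))

rng = random.Random(12345)  # a seeded, non-cryptographic stdlib PRNG
sample = bytes(rng.getrandbits(8) for _ in range(4096))
print(f"monobit p-value: {monobit_test(sample):.3f}")

# A blatantly biased stream fails decisively:
assert monobit_test(b"\x00" * 4096) < 0.01
```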

Additionally, a generator that passes those 200 tests is not guaranteed to be high-quality for a particular usage, which might as well be considered nothing but another statistical test. There is the famous case of the r250 generator, an additive Fibonacci generator that passed all statistical tests of the time, which turned out to have characteristics that rendered costly real-world simulations wrong [1].

That is C++, yes, but I don't think there is much of anything modern about it. Trying to save on space probably contributed to its unreadability a bit.

[1] http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.69....


Here's a paper that includes an analysis of a similar approach that uses 5 rounds of AES under the name ARS-5: http://www.thesalmons.org/john/random123/papers/random123sc1...

I know this paper, and like this approach. Reduced-round primitives let you take advantage of existent analysis, and keep some guarantees while improving speed. In the case of AES, we know that the probability of any differential for 4 rounds is at most 2^-113; similar for linear probabilities [1]. I'm curious why ARS-4 fails Crush; I wonder if the custom key schedule is the culprit.

[1] https://eprint.iacr.org/2005/321

About the C: https://software.intel.com/sites/landingpage/IntrinsicsGuide...


Not quite that, but here's someone's approximation of this approach for an extremely fast hash:

  So this was a fun hash that I came up with when I was looking
  for a 128-bit hash for falkervisor's code coverage database.
  This hash passes all of smhasher's tests, faster than any other
  hash that does the same.

  I made this up by randomly testing things until it passed
  all of smhasher's tests. It is likely that this hash is heavily
  flawed and smhasher just doesn't stress its flaws.


These are interesting links, but they have to do with treating individual ornamental ash trees that are infested with a particular insect. Fruit can only be grafted onto the same species (or a very closely related one) so this has no particular relevance. Information about standard treatments for ornamental plums or crabapples would be relevant, though.


Emerald ash borer is a noxious pest that is just devastating. It's going to kill something like 15% of all of our urban trees. Just staggering. http://www.courier-journal.com/story/tech/science/environmen...


Great advice even for native speakers!


No, he said that would have been the retail price. He bought them for much less. In other places, he mentioned that his total cost for the project was about $4,000.


There's a map in the linked Nature paper: http://www.nature.com/ncomms/2015/151027/ncomms9705/fig_tab/...

It's pretty direct: these particular eels appear to swim north along the shore to the outlet of the St. Lawrence River, then more or less straight south to the Sargasso Sea.


A critically endangered species. Market value of up to $2600 a pound in Japan since Fukushima. 50% of American eels and more than 80% of European eels wiped out in recent years.

Seriously... what were they thinking? And what will be next? The efficient rhino poacher's guide and roadmaps?



Maybe I'm wrong, but I don't think you are understanding the issue. These adult eels spend their lives in the rivers of Atlantic Canada. They come down the river once in their lives to breed thousands of miles away in the North Atlantic. The young eels somehow then make their way back to the rivers.

The high prices are for the "glass eels", the juvenile stage. They are caught with nets in the river as they return. This article is about the path taken by the adult eels from the river, to the ocean. Knowing the rough path they take in the ocean is unlikely to help anyone catch more eels.

If you want to catch the eels, you do it in the small river, rather than in the open ocean. Knowing more about their reproduction helps us understand and protect eels. Do you have reason to believe this is not the case?

Here's a parallel story about the European Eels: http://news.bbc.co.uk/earth/hi/earth_news/newsid_8273000/827...


The real money here is not in the glass eels.


And more recently here: https://news.ycombinator.com/item?id=10530149

But it's a great article that deserves some discussion, which it hasn't had yet.


Well, it might be a great article, but any website that shows me a popup when I land on it gets closed without any further clicks...


