Hacker News new | past | comments | ask | show | jobs | submit login
Show HN: SHA-256 Animation (github.com/in3rsha)
1053 points by inersha on May 13, 2020 | hide | past | favorite | 102 comments



I wanted to understand how SHA-256 works, so I made a terminal animation that shows the bitwise operations at each step. I wrote a text guide in the README.md to explain what's going on. I think my technical terminology is okay.

I'm new to hash functions though, so I don't yet know why SHA-256 has been designed in the way it has (e.g. why exact numbers were chosen in the bitwise rotations).

As far as I understand, the Sigma functions promote diffusion of bits to help prevent collisions, and Choice/Majority/Addition help to make it a one-way function, but I'm not entirely sure. I'd be interested in learning more about the design if anyone has experience in this field.


There's nothing particularly special about the constants. They needed a set of n constants to work with and the "cube root of first n primes, truncated" scheme is a "nothing up my sleeve"[0] construction. If they'd used magic numbers with no obvious generation scheme, you'd be left wondering if that was done as a way to put a back door in place.

[0] https://en.wikipedia.org/wiki/Nothing-up-my-sleeve_number


That's also the explanation the author gives in chapter 4.

I'm more interested in how they came up with the parameters for the sigma functions. I'm sure it's described somewhere.


> I wanted to understand how SHA-256 works, so I...

... taught everyone else how SHA-256 worked...

This is an awesome intro. And now I also want to know more about the things you mentioned wanted to learn more about.


Without any doubt I do not properly understand anything I can't explain to someone else.

An exercise I go through constantly is figuring out how to explain a thing I think I know to a curious person with no relevant training. Often in the process I discover I need to go do more research or actually test things because I did not understand them as well as I'd maybe thought.

Mostly this is just an exercise. But every so often I actually get to use this in anger. A non-technical friend who works in a Computer Science department asked me on Facebook to explain a joke she'd seen which involved localhost addressing, and I was very pleased to be able to provide a concise explanation using analogies that I know hold up to scrutiny. Obviously a joke isn't very funny if you need it explained, and I can't fix that, but I can avoid the discomfort of her not understanding a joke other people are laughing at in her place of work.


I realize it's not what you're saying, but I don't like the idea that if you can't teach something to others, you don't understand it yourself. Teaching is a skill and it's something that I am aware that I struggle with. I can explain something in great detail to a captive audience and understand it myself personally, but teaching is about getting others to engage with the ideas you're presenting and identifying and elaborating on parts that they don't understand.

Given the content of knowledge sharing sessions that I sit through and the convoluted nature of some of them, I wish that people recognized that presenting information is not all that is required to teach. You can understand something perfectly, but teach it horribly.


If you can't teach it, then you don't understand something perfectly. Perfect is a word par excellence. Teaching is indeed a skill but don't confuse it with presenting. A wise guy once said, "if you can’t explain something in simple terms, you don’t understand it". I'm sure OP stands by it and that itself is very admirable.


Feynman also said: "Hell, if I could explain it to the average person, it wouldn't have been worth the Nobel prize."

Ideas take time to digest and it's not accurate to say that if someone leaves a room not understanding what you've just shown them, you don't understand it yourself.


It's possible that's a distortion/misquote. I found where he said "I would simply say, ‘Listen, buddy, if I could tell you in a minute what I did, it wouldn’t be worth the Nobel Prize", which is about time, not intellectual ability.

[https://www.aip.org/history-programs/niels-bohr-library/oral...]


I guess previous posters "If you can't teach it, then you don't understand something perfectly" can be replaced with "Process of teaching someone will make you notice all the holes in your own understanding"

Or even without replacement. Teaching someone doesn't imply a successful result.


1) I loved this video.

2) I did have to take a sip of tea and think about my life when I realised I was watching a video by a Welshman about mining.

3) Although having said this, 'Welsh Bitcoin Miner' is going to fit seamlessly into my West Country themed cyberpunk adventure 'Cider Punk'.


You could call it CIDR Punk if you want to be a bit more on-the-nose ;)


1) Thank you.

2) It's certainly a combination I never anticipated.

3) I look forward to the book.


I feel like this is some sort of James Watt reference but I can't be sure.


This is a great idea of showing something that is at first very complex. Could be used in Discreet Mathematics to teach students!


Indiscreet mathematics?


Somewhere I had the idea concrete math was a combination of continuous and discrete, but I seem to be mistaken.

"When DEK taught Concrete Mathematics at Stanford for the first time, he explained the somewhat strange title by saying that it was his attempt to teach a math course that was hard instead of soft. He announced that, contrary to the expectations of some of his colleagues, he was not going to teach the Theory of Aggregates, nor Stone's Embedding Theorem, nor even the Stone-Cech compactification. (Several students from the civil engineering department got up and quietly left the room.)"


This is super impressive, thanks for building and sharing it!


Nice work, and that is a very cool approach to learning new things!


This is fantastic - and your YouTube video is also impressive. Thank you for doing this.


No problem at all, thank you.


Like it a lot. Not sure how useful it is but just looking at it is mesmerizing, well done!


Congratulations on the worlds slowest and useful SHA-256 implementation! :)



This is phenomenal. The animations at the top were way too quick for me to understand, but they pulled me in. The step-by-step explanations below (alongside the relevant parts of the animation) were very clear and interesting.

I had the big picture from the top of the page, and each section gave me a little more insight into what was going on. Great use of animation, and also very clearly written!

I also like the testimonials.


Thank you for being so kind, it means a lot.


I once heard about a meeting where someone did a presentation for the best part of an hour to a room full of Japanese people. They all nodded along, smiling as the presentation progressed. It wasn’t until the end that the presenter asked them what they thought. They all just kept smiling, but no answer. Someone at the front turned and talked to the room (in Japanese) then turned to the presented and said “they do not speak any English, but they are sure from this presentation that you are very, very intelligent”.

I feel like the Japanese people. I don’t understand what you’re doing here but I am convinced it is very clever.


Anecdotally, based on experience with shipping software with an annoying bug to Japanese consumers, the Japanese people are way more patient than Americans.

I worked at an American company that made utility software for DOS/Windows. A major Japanese distributor liked our stuff, and made a distribution deal with us. They also helped with localization, and suggested feature changes to better fit what Japanese consumers wanted. Later this expanded to where they would also suggest new products they wanted for Japan and we'd develop them.

One product under development was very rarely causing a system lockup during installation. By "lockup" I mean it appears completely dead. No mouse motion. No keys worked. Hitting "CAPS LOCK" would not even toggle the light on the keyboard. Hard reset seemed to be the only way out.

It was rare enough that we always saw the install attempt after that hard reset work.

We just could not figure out what was causing this. The Japanese distributor decided we should go ahead and release, and their tech support people would deal with any complaints.

So we released...and their tech support got a lot of calls. People were hitting it a lot more than we had seen in testing.

...except they were not calling to complain about their system locking up. No, they were calling to complain about a slow install.

Apparently, when it became completely unresponsive all you had to do was wait 20-25 hours and it would complete the install.

I cannot imagine any American consumer whose PC has become completely unresponsive going 20-25 hours without giving up and resetting the thing. Not only were there such people in Japan, there were a lot of them. We didn't get one complaint about a lockup--all of the many many people who called called about it taking around a day to finish.

(Oh, and the knowledge that it was not a complete lockup was enough of a clue to let us figure it out. It suggested it was some sort of timeout, not a lockup. Our product needed to know what optical drives were available, and it turned out that the way it scanned for them did not get alone well with one particular brand of optical drive controller card, which could trigger about an hour timeout per drive checked for. We switched to a much more cautious drive scan, and the problem went away).


Lore from an old company of mine says that we had the opposite of this happen. We had an opportunity to expand from the US into China. So a dedicated team spent months localizing our English-only software into Chinese.

Our sales people traveled to a key meeting with execs at the Asian division of a multinational corporation (the potential customer). They did a big demo to show the software working in Chinese, hoping to impress the execs with how the software was ready to go.

The execs awkwardly said essentially, "Hey, that is probably great, but even though we handle the Chinese market, we don't speak Chinese; it's only our customers that do. So we couldn't tell what your software does, and we wouldn't be able to use this. Do you have an English version?"


You need to watch his video, very cool, it really helps to understand how this works: https://www.youtube.com/watch?v=f9EbD6iY9zI


There's a few things in there that are factually incorrect -- in particular, the false notion that "every input has a unique output" can be quite dangerous in some cryptographic settings.

That said, the purpose of this talk is about the mechanics of the function, and not its properties or how to use it safely. So don't let that detract from what is, really, an awesome presentation.


I'm sorry, could you please elaborate? I was always under the assumption that hash functions have to be deterministic, and thus, that "every input has a unique output" was a correct statement.

AFAIK the contrary is invalid, so that "not every output is the result of one and only one input".


A function being deterministic means that any input will have a single output. But it is not unique for any hash function, SHA-256 included. The definition of a hash function is any function which takes an arbitrary length input and outputs an n-bit output for some fixed value of n. By virtue of the fact that you have infinite inputs and finite outputs, the outputs cannot be unique.

Generally when people make this claim, what they're actually referring to is what's called Collision Resistance (CR) and/or Weak Collision Resistance (WCR), which instead make claims on difficulty of finding such collisions (of which infinitely many exist).

WCR, necessary for almost any cryptographic use, states that for any given input it should be difficult to find a different input which hashes to the same value. CR, generally desirable for cryptographic hash functions, states that it should be difficult to find two different inputs such that their hashes are equal. CR implies WCR, but WCR does not imply CR -- for example, SHA-256 (currently) exhibits CR but SHA-1 only exhibits WCR.


There are 2^256 potential outputs for SHA-256, while the number of potential inputs is infinite. Therefore, the same output can be generated with different inputs, although finding such "collisions" by chance is extremely unlikely


The claim is not that every output has a unique input, which would not be correct, and seems to be what you are addressing.


at 1:08 in the video, that is exactly what he claims:

"So every piece of data in the world has its own unique hash digest."

This is false for the reasons apeescape describes: every piece of data in the world has its own hash digest, but these hash digests are not unique.


Yes that sentence is technically incorrect, but practically correct. We've never found a collision and though we expect it to be theoretically possible, even common if you consider "all possible inputs" and the pigeonhole principle, for practical purposes hash outputs are unique because nobody considers "all possible inputs" when evaluating probabilities.

I'm saying that for a layman explanation, it's reasonable to say that hash outputs are unique. Because following that with "technically, it's more 'practically' unique, theoretically there are collisions but you won't encounter them with probability > 2^-256" (or whatever it is) just confuses the topic to them more than just summarizing. You have to admit that most people won't go on a 200h adventure to learn about the state space of 256+ bits and how to conceptualize tiny statistical probabilities, so there must be a point where you have to cut the explanation to an approximation of the truth. This is true in every field.


I don't like to leave holes like this in people's comprehension. It's OK if people don't end up with an intuitive feeling for how relatively unlikely different things that don't actually happen are, but I want them to be aware of that category as distinct from things which can't happen because the type of argument needed is different.

The air molecules in the room you're in can't all gather in one corner because that's not possible, it's forbidden by conservation rules.

But they won't gather in two opposite corners only because that's so tremendously unlikely, it would be allowed by conservation but statistically it's ludicrous.

The same is true at the opposite end of the spectrum. Almost all real numbers are normal (in all bases) but the nature of "Almost all" in mathematics is different in an important way from "All" and I want people to grasp this difference when I'm discussing properties of numbers. It definitely is not true that all real numbers are normal, you probably rarely think about any normal numbers at all.


> I don't like to leave holes like this in people's comprehension.

I agree. I think this wording would be better than in my previous comment, what do you think?

    it's reasonable to say that hash outputs are *almost surely* unique


> I'm saying that for a layman explanation, it's reasonable to say that hash outputs are unique. [...] theoretically there are collisions but you won't encounter them

You could have said exactly the same thing about MD5 right up until you couldn't. Then you could have said "oh yeah well MD5 is broken, but it's safe to assume you'll never find one for SHA-1", right up until we did. So if you say "oh yeah well SHA-1 is broken, but it's safe to assume you'll never find one for SHA-256", I disagree.

It would be one thing if collisions in hash functions were found by just repeatedly hashing things until you find a collision. If that were the case, then yes, I'd agree with you on those 1-in-2^256 odds, at least for a while. But by and large, that's not what happens. Over time, weaknesses are found in algorithms which allow you shrink the search space, which significantly changes your odds.


Kind of agree w you, but still feel adding a few words by way of a disclaimer about collisions is much better than presenting as plain truth something that merely approaches it.


On the other hand, if we can count "every piece of data in the world" then we can estimate the probability of having a collision.


I see what you mean, but it sounds like the output is unique, and we probably agree that in this field you need to use sentences that cannot be easily misinterpreted.


That video is really, really awesome! And it won't leave you feeling "Japanese" either. (Which is a great people, btw. I'd really like to go there someday, mostly for the food and language and history. And Anime also, I'm forced to admit.)


Did you see the README.md? It explains each of the steps.


Thanks for pointing that out - I kinda missed it until seeing your post. I went through each step - it's still a bit convoluted, but definitely helps me appreciate hashing functions. It's probably very easy for a poorly coded hash function to accidentally wipe important data or accidentally and a bunch of data with zeroes, so it's pretty cool to see a lot of ROT and XOR usage which does more interesting things with the original data.

Had no idea about the prime roots and multiplication, that's pretty clever too.


> “they do not speak any English, but they are sure from this presentation that you are very, very intelligent”

I kind of hope there's some nuance to this story and it didn't quite happened in the way you described.

If they think it's "polite" to let a person talk for an hour without understanding a word of it (while nodding and smiling), and then have the nerve to call it "intelligent", that's an insult to both the presentor and the concept of intelligence.

Cultural differences notwithstanding, the idea of "completely wasting someone's time while pretending to pay attention" shouldn't be too difficult to empathise with.


It's probably mixed with avoiding confrontation and thinking that the other people understand it. I'd assume that calling it intelligent didn't have a belittling tone, but rather a 'still awesome, sad I didn't understand it' one


i just felt like i was in the matrix for a little bit and that was ok for me :)


You made my day


Very nice!

This also has potential for landing in many movies where a hacker at work is involved.


Indeed, I'm hoping that it ends up in movies so they finally have actually relevant visuals. Someone send this to Hollywood.


True, it has to be used as encryption cracking software! :D


Here's the author doing an in-depth explanation of how SHA-256 works using this code: https://www.youtube.com/watch?v=f9EbD6iY9zI

I'm halfway through, but looks very well done, thanks!


Wow, this was just what I asked for in a comment before seeing this comment. Thanks.


My pleasure, thank you.


This looks great, though it requires quite sometime to go through it (and figure out what could possibly be understood by someone with knowledge of programming and bit wise operators, and what can just be skipped because it’s something only cryptographers can understand).

If someone were to make a slower video explaining it with all the sections from this dissection, that’d be even more awesome.

Edit: Seems like the author created a video for this — https://www.youtube.com/watch?v=f9EbD6iY9zI (thanks to 1_player’s comment at https://news.ycombinator.com/item?id=23165906)

On a tangent, here’s an animation (by someone else) explaining AES encryption — https://m.youtube.com/watch?v=gP4PqVGudtg (thanks to harrigan’s comment at https://news.ycombinator.com/item?id=23165821)


I would love to build something like this for encryption algorithms that twiddle bits (DES and friends.) Even for the math-based algorithms, but I have a had time imagining how to make those interesting.


I have Excel spreadsheets showing every bit of the calculation of AES and DES: https://www.nayuki.io/page/aes-cipher-internals-in-excel ; https://www.nayuki.io/page/des-cipher-internals-in-excel

Also, some people have been sharing a video explanation of AES/Rijndael. It is actually rendered from someone's Shockwave Flash movie. https://web.archive.org/web/20051124061027/http://www.cs.bc....


https://www.youtube.com/watch?v=gP4PqVGudtg is a nice animation of AES. Of course, the inputs are fixed unlike the above.


Love the animations but they go by too fast. It would be nice if you could add a single step capability so the user could follow along at human pace.

Also, I wanted to see how you updated the bit strings in place and was surprised not to find anything special. It looks like you use nothing but puts statements, which, as I understand it, simply add a carriage return at the end of a string but how then does the cursor move back up to the first line? Can some explain the technique being used here?


If you add the third optional argument of "enter" when running the sha256.rb script you can use your keyboard to step through. I haven't set this up for the individual animations though.

ruby sha256.rb sizzzzlerz enter

I didn't do anything special for the terminal animation. I just work out the current state of the hash function at each step, clear the terminal, and print the entire state back to the screen. I know it looks like I'm directly manipulating each individual bit in your terminal, but really I'm just redrawing your screen.

I'd like to do something more practical in future, but for this I just did what I needed to do to get an animation working. Still, every bit you are seeing is correct.


Thanks! I detected a few flashes in some lines that weren't active but I didn't believe you could clear and reprint the entire screen that fast. Very cool!


I was surprised to see addition in there.

Hash functions use bit-wise operators. This is a world where "integers" are simply convenient fixed-sized blobs of bits and all bits are equally significant.

Addition is an arithmetic operation. Bits have significance in this world. The state of the input bits have more "influence" over the left hand bits of the result than the right hand bits.

Look at abcd+efgh=ijkl (Ignoring any overflow'd bits per SHA-256.)

To find the value of bit l, you only need to know bits d and h. You need to know all eight input bits to know the value of bit i.


This is so interesting to watch. Can you do it for other hash or encryption methods?


I have thought about it. Any particularly interesting ones?


Interesting candidates are blake2b, sha-3/keccak, and the much simpler siphash-2-4 that's used in many hashtable implementations.


md5/sha1 is useful because they're still used in UUID (rfc 4122) generation. obv not from a crypto standpoint but more from a unique (non-exploited) approach.


ecdsa would be good


I think this might end up being used in a hacking scene in a movie ... with bright green text of course.


IIRC, hackertyper has been used numerous times.


Shameless plug: https://jrvidal.github.io/aes-demo/

I like this one better though.


Looks nice, but an explanation of each step would be more helpful to understand it (along with a coverage of the overall algorithm). The animation was a little too quick for me to follow along.


Minor typo: "Rinjdael" -> Rijndael


I feel like there are lots of mathematical / programming concepts that could be explained through animation. Anyone got any good examples?


3blue1brown is great, e.g. his Fourier transform animation is super intuitive: https://youtube.com/watch?v=spUNpyF58BY

He has opensourced his animation engine "manim" used in his videos: https://github.com/3b1b/manim


Thank you for showing me that Fourier transform video. I've never understood how it worked because usually they just show the integral and call it good.


There are plenty of animations/videos about algorithms, all the way up to Hungarian folk dancing: https://youtube.com/user/AlgoRythmics/videos


I follow Tamás Görbe on Twitter and he regularly posts cool mathematical animations: https://twitter.com/TamasGorbe/status/1238448040521932801


This was very useful while learning stats: https://m.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw


I remember that in my first year of CS, the professor asked us to implement this algorithm. We had just learned our first computer language(C by the way). Result: nobody was able to do it, and everybody got a 10 after the professor realized it.


Hehehe, yea, no, there's a good five to ten steps necessary to grok all that.


Great animation! And I still have no idea what happened there :D


This is amazing! Really fun to watch the example in the readme and just shows how much happens behind the scenes nearly instantly.


Psh! I never trust a program with my hashing; pen & paper is your friend here, much more secure.


There are microprocessor out there that does most of this in single instruction, :O


Make a screensaver out of this!


I get it! Fantastic work.

I hope you get lots of requests to include it in teaching materials.


I forsee this being in a blockbuster movie featuring a "hacker"


Could be a in a movie


That's some brilliant work. (Y)


I subscribed to your YouTube channel.

https://youtu.be/f9EbD6iY9zI


Great work! Any idea where we may find an equivalent in Python?



awesome, thanks




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: