
Seems like a reminder to me...

Also, I wonder how the security of cryptosystems relates to a combined approach, i.e. if you cut the message up and hash the smaller parts with multiple algorithms (SHA-1+SHA-256), how much more infeasible you can make the "attack" here.

So you can have a global hash that is checked 3 times (not much of a hamper, but better than nothing if you theorize there are attacks 1000x faster than the birthday bound on each hash) to serve as a sort of "quick check", and a deeper set of hashes where you split a 10 GB file into, say, 10 MB chunks, and so gain back at least a 1000x factor in the complexity an attacker must pay.

For streams where the order of bits matters, I speculate this may be even more difficult to attack, since each attacked hash is now constrained by the data that fills the prior and next chunks, if not the whole set of chunks (so perhaps 1,000,000x or greater difficulty?)
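For concreteness, a minimal sketch in Python of the scheme I mean (the algorithm choices and the 10 MB chunk size are just the examples above, not a recommendation):

    import hashlib

    CHUNK = 10 * 1024 * 1024  # 10 MB, per the example above

    def manifest(path):
        # "quick check": the whole file hashed under several algorithms
        whole = [hashlib.sha1(), hashlib.sha256(), hashlib.sha3_256()]
        # "deeper" per-chunk hashes that a forgery must match simultaneously
        chunks = []
        with open(path, "rb") as f:
            while data := f.read(CHUNK):
                for h in whole:
                    h.update(data)
                chunks.append(hashlib.sha256(data).hexdigest())
        return [h.hexdigest() for h in whole], chunks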


I'm not sure if I understand what you are saying correctly, but generally, hashing the same data with multiple different Merkle–Damgård hash functions is not anywhere near as secure as you might naively expect.

I am indeed curious about this: consider a hash function built like

H(m) = H0( H1(m) + H2(m) + ... + Hn(m))

where + is string concatenation and H0,...,Hn are mostly-good hash functions.

I would expect H to be as good as the best of H0,...,Hn on almost every metric; at worst being limited by the block size of H0.

That is you would need to badly break all of them to break H.

Honestly, I would also guess that even if H_i = HMAC(md5, i, m) you would get a decent H (except for the small block size).

So maybe something even more nested like

    H(m) = H_00( H_01(m) + H_02(m) + ... + H_0p(m))
         + H_10( H_11(m) + H_12(m) + ... + H_1p(m))
         ...
         + H_q0( H_q1(m) + H_q2(m) + ... + H_qp(m))
where H_ij(m) = HMAC(md5,i*(p+1)+j,m).
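A rough Python sketch of this construction, assuming (arbitrarily) that the integer key is encoded as its decimal string, and reading both + signs as concatenation:

    import hashlib, hmac

    def H_ij(i, j, m, p):
        key = str(i * (p + 1) + j).encode()  # HMAC(md5, i*(p+1)+j, m)
        return hmac.new(key, m, hashlib.md5).digest()

    def H(m, p=4, q=4):
        rows = []
        for i in range(q + 1):
            inner = b"".join(H_ij(i, j, m, p) for j in range(1, p + 1))
            rows.append(H_ij(i, 0, inner, p))  # H_i0 over the concatenated row
        return b"".join(rows)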

You are making two assumptions here:

1. More hash functions equals more security (hash encapsulation)

2. The attacker wants to know the content of 'm'

H0(H1(m)) has the security of just H0. Hashes are not made to protect the content of m, but instead made to test the integrity of m. As such, a flaw in H0 will break the security guarantee, no matter how secure H1 is.

Practically, H0(H1(m)) is the same as H0(m), as m is just "some data," and the result of H1 can be seen as "some data".

If your construction is H0(m0) + H1(m1), where m0 and m1 are the two halves of m, then the overall security is reduced to that of the weakest hash function. For example, a length extension attack on the weaker hash function breaks the overall integrity of the construction.
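To make the structure concrete, a minimal sketch (md5 standing in for a weak H0 and sha256 for a strong H1; this just illustrates the composition, not an attack):

    import hashlib

    def composed(m: bytes) -> bytes:
        inner = hashlib.sha256(m).digest()  # H1(m): just "some data" to H0
        return hashlib.md5(inner).digest()  # H0 only ever sees a 32-byte digest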


> H0(H1(m)) has the security of just H0. Hashes are not made to protect the content of m, but instead made to test the integrity of m. As such, a flaw in H0 will break the security guarantee, no matter how secure H1 is.

But this isn't true for all flaws. For example, even with the collision attacks against SHA-1, I don't think they're even remotely close to enabling a collision for SHA-1(some_other_hash(M)).

Similarly, HMAC-SHA-1 is still considered secure, as it's effectively SHA-1-X(SHA-1-Y(M)), where SHA-1-X and SHA-1-Y are just SHA-1 with different starting states [1].

So there's some value to be found in nesting hashes.
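For reference, the definition in [1] is small enough to sketch and self-check in Python:

    import hashlib, hmac

    def hmac_sha1(key: bytes, msg: bytes) -> bytes:
        block = 64  # SHA-1 block size in bytes
        if len(key) > block:
            key = hashlib.sha1(key).digest()
        key = key.ljust(block, b"\x00")
        ipad = bytes(b ^ 0x36 for b in key)
        opad = bytes(b ^ 0x5C for b in key)
        inner = hashlib.sha1(ipad + msg).digest()   # "SHA-1-Y": keyed inner hash
        return hashlib.sha1(opad + inner).digest()  # "SHA-1-X": keyed outer hash

    assert hmac_sha1(b"k", b"msg") == hmac.new(b"k", b"msg", hashlib.sha1).digest()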

[1]: https://en.wikipedia.org/wiki/HMAC#Definition


We are saying the same thing. H0 is SHA-1 in your example.

The strength of an HMAC depends on the strength of the hash function; however, since it uses two derived keys, the outer hash protects the inner hash (using the same hash function), which in turn provides protection against length extension attacks.

The case I was making is that weakhash(stronghash(m)) has the security of weakhash, no matter how strong stronghash is.


> The case I was making is that weakhash(stronghash(m)) has the security of weakhash, no matter how strong stronghash is.

I'll have to disagree. There are no known collision attacks against SHA-1(SHA-3(M)), so in the applied case, a combination can be more secure for some properties, even if it isn't in the theoretical case.


There is only SHA-1 with a fixed starting state!

Once you change the IV, the hash becomes entirely insecure and can be broken in seconds: you just need to overwrite the first IV word with 0, and it's broken. It's a very weak and fragile hash function. The attacks were demonstrated with the internal constants K1-K4, but the IV is external, and may be abused as a random seed.


The properties I am thinking of are strong and weak collision resistance; there are other relevant properties of hash functions (like every output bit being roughly independent of every other bit), but I care less about those.

> If your construction is H0(m0) + H1(m1)

Here, if H0 has a weak-collision attack and H1 has a strong-collision attack, and + is xor or addition, then I see how H0(m0) + H1(m1) can be vulnerable.

> H0(H1(m)) has the security of just H0

I believe it has the security of just H1, but my construction was very different; it was H0(H1(m) || H2(m)). (I used + as concatenation, I forgot that it is usually written as ||)

Here you would need strong collision attacks on all three hash functions (including an attack on H0 that is limited to very short messages of a fixed size).


I think I was misunderstood.

I do not mean H0(H1(m0)+H1(m1)) nor H0(m0)+H1(m1), but Riemann(x={0,1000})(H0(m(x)))

Where there are 1000 hashes, so H0 must be broken one thousand times; then it does not matter that some attack exists to reduce the security 1000 times, because the attack must be performed 1000 times. You could easily nest these, so that F(y)=Riemann(x={0,y})(H0(m(x))) and take G(z)=Riemann(y={1,z})(F(y))

So that G(3), e.g., would produce 6 hashes of strength H0. No hash takes another hash as input; the message itself here provides the security - you must not just find a duplicate for one hash, but for all 6 simultaneously. I wonder if the increased complexity might easily defeat most attacks on H0.
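A Python sketch of what I mean (md5 standing in for H0; the chunking rule is arbitrary):

    import hashlib

    def F(m: bytes, y: int) -> list:
        # split m into ~y chunks and hash each one independently
        size = max(1, -(-len(m) // y))  # ceil(len/y)
        return [hashlib.md5(m[k:k + size]).digest()
                for k in range(0, len(m), size)]

    def G(m: bytes, z: int) -> list:
        # G(3) yields 1 + 2 + 3 = 6 hashes; a forgery must match all of them
        return [h for y in range(1, z + 1) for h in F(m, y)]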


> I don't understand how anyone is shocked by businesses

I think it's that normally biz leaders manage to seem reasonable (maybe they aren't), so they come across as level-headed, "we will explore this option" type of people. Often they don't really understand the technology, because they have MBAs, not degrees in the field.

Most people who work there would tell you the primary value they have at the company is that they hold the purse strings. Nothing else.

But now, possibly because they took a prompt from some AI, they exuberantly shout about how they "love AI" and you get that bad feeling when you see someone and you know they were taken in by a snake-oil salesman selling a pyramid scheme...

So the facade has dropped and everyone's just like "oh wow, I guess <level-headed company> is run by idiots"...


I think the bigger issue is not whether it has some efficacy, even if limited (which could be interpreted as a good thing); it's the potential for the bacterium to create a monoculture, which would be objectively bad (remember, diversity is critical in populations, and for our oral health perhaps even necessary).

So it's not so much perfect being the enemy of the good: fluoride isn't perfect either, but it is good. This is potentially extremely harmful, and possibly even a source of something like a new AIDS pandemic...


I'm sorry, but that last claim seems extremely exaggerated.


The single major flaw with systemd...

Linux was (is?) the system. Shell scripts could configure the system reasonably well. Systemd meant it was no longer possible to simply fix a Linux distro that had it... a massive recompile and an intricate understanding of many low-level systems was needed.

No, you shouldn't need to know about PDP-11s to fix your buggy wifi; neither should you need to know a massive binary subsystem to configure your wifi.

So as far as doing its "one job" goes, it did/does it fairly poorly, even today. It is a lot more standard, and it gets points from me in that respect, but it fails completely at being a configurable system.


I've fixed a lot of problems on a lot of systems and never needed to recompile systemd. The majority of them I've fixed by editing text files.

What kinds of problems are you hitting where you need to recompile systemd?


I could simply trace shell scripts with some file IO or echo statements. Now there are systems using Python; even better, I can set breakpoints and modify things, with no need to recompile.

Configuring the system with text files only works if the system anticipated all your needs, and anticipating all needs is not only complex but error prone.


Systemd unit files can exec shell scripts…

There have been times when I needed to do something that wasn't available in the unit file syntax, so I just do it in a shell script and ExecStart that, as sketched below.
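Something like this (the script path is made up):

    [Unit]
    Description=Do the thing the unit syntax could not express

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/do-the-thing.sh

    [Install]
    WantedBy=multi-user.target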


I suppose, can you arbitrarily replace portions of systemd with scripts, though?

If so, perhaps it is less of a problem than it used to be.

It still doesn't fit every need but it's better if it can get out of the way when told to.


I don't think systemd is meant to be fixed or modified, unless you are actually contributing to the upstream systemd project and have accepted that it will be a time consuming project.

It seems to be made for, and works well for, a modern ephemeral infrastructure workflow.

Build stuff to work with an unmodified distro, wipe it and reinstall if it goes wrong.


> Many people here seem bitter or have an idealistic point of view

It is the opposite of idealism to see the world as it is. Pragmatism is rooted in acknowledging both the good and bad.

Idealism is ignoring the bad in the name of "pragmatism". Maybe you have to ignore it for your Public Relations metrics, but not for your executive or engineering perspective(s).


> But it feels like the “sour grapes” cohort is the fastest growing one, and increasingly is tilting all discussions that direction.

> Like a bunch of grumpy old men, we don’t like new things here, the 90s were the peak of the internet and computing apparently.

I invite you to consider, based on your own wording, that you are doing more feeling than reasoning. It would take some work, and is perhaps not completely possible, to do a comprehensive and correct meta-analysis gauging the state of rational vs. non-rational commentary on HN.

> bitter about AI, bitter about social media, bitter about the Saas model, bitter about Crypto, bitter about ads, bitter about privacy, bitter about capitalism, bitter about Elon Musk, bitter about every damn thing imaginable

The fact that the world is imperfect is not a reason to ignore that the world is imperfect. One must of course satisfy one's ego and make some peace with the surrounding world, accepting that it is in some sense "good"; but the act of a rational mind, after it is done indulging the (necessary?) behaviors of the animal in which it resides, is to relentlessly nitpick, criticize, and deconstruct the world around it, as far as is possible, without feeling.

Yes, all those things suck, or have things that suck about them. If one of them is the field in which you work, you may even resent the criticism. And yet, it is only by acknowledging what is wrong that we can build and do what is (more) right.

Perhaps what I will say is this: if HN is supposed to be a place of technical innovation, it is undeniably true that it is no longer possible to easily innovate anymore. And if that is true, then there should be some discussion of all the ways that what has been built now constrains, or no longer makes possible, the alternatives. That is not something you can change with a "happy go lucky" attitude or by renouncing a cynical one. In fact, one can argue that a "can do no harm" attitude is what brought us here. Perhaps a slower, more considered approach would have resulted in a better outcome.


> Yes, all those things suck, or have things that suck about them.

I'm a long time reader, but only recently registered to post. I think this statement is quite illuminating to illustrate the point of the person you're responding to.

I actually didn't know HN existed until a colleague told me about it as a place to find a bit more optimism about technology than has become the norm on places like reddit. The overwhelming vibe on reddit is that capitalism bad, big tech bad, AI bad, etc. And I have definitely noticed this a lot more on HN in the last few years than when I first started reading.

I don't know why, and obviously it is just my anecdotal opinion, but it is how I feel, and I have seen many posters who feel the same.

Obviously we should all be open to different views, but sometimes I just want a little haven where I can read about technology and cool stuff alongside people who are mostly optimistic about that stuff, without having to be swamped by "end state capitalism" sentiment, like everywhere else. That's just what I want, I'm not making any moral judgement on what others want.


And some of us would still disagree: HN has, for a long time, been exactly the union between the overly optimistic technologist (tech founder) and a very critical engineer.

I mean, this is evident in posts by one of the "model" founders, Paul Graham. Many of his posts are about how most people are doing things wrong, only framed in a positive way (for success, do this instead of the usual things you've been doing).

So perhaps you came in attracted by one side, but stuck around for the arguments, even if unconsciously ;)


> And some of us would still disagree: HN has, for a long time, been exactly the union between the overly optimistic technologist (tech founder) and a very critical engineer.

Others would know more than me; I'm just an anecdote.

All I can say is that I find many responses to be Pavlovian, not well thought out, overly negative or cynical, and in my humble opinion just part of a low effort zeitgeist against capitalism.


Nah, you have the wrong idea completely: non-shady businesses provide a good or service that you can't (easily) obtain yourself. Period, full stop. (Ignoring for a minute monopolies, which can actually be quite fair, but usually aren't unless heavily regulated.)

If you have infinite time, resources, patience, sure, you might be able to provide the same thing for yourself, but not everyone has a farm, not everyone has the time or ability to visit Grandma...

Shady businesses try to sell you gum while you're on your call to Grandma (nothing to do with calling Grandma, just a "value-added" thing).

I'd also posit that if you don't know the difference, you're probably being exploited yourself. I'd take a hard look at who you are giving money to, and try to re-evaluate whether they are providing you value or not (with respect to your circumstances).


Exactly. There's some need, perhaps, to keep these tools "up to date" because someone in a non-free country is going to use them in a horrendous manner and we should maybe know more about them (maybe).

However, there is no good reason in a free society that this stuff should be widely accessible. Really, it should be illegal without a clearance, or need-to-know. We don't let just anyone handle the nukes...


> an anti-union consulting firm

Hmm. Maybe an anti-anti-union consulting firm is a business opportunity?


Or, we can use what worked in the past without involving for-profit enterprises: grassroots movements.

Easier to align people when you remove the whole troublesome "money" part. The question is how to motivate Americans to work together, if not for money?


Who would be purchasing its services?

The only obvious customer would be a union, and they already provide that service themselves.


Unions sometimes hire third-party organizations to help them organize. I don't think those organizations specialize specifically in countering anti-union consulting firms, but I bet that's part of it.


> Conflicts are first-class items

I have less experience working in large, heavily active repos (e.g. 1k+ committers); however, we try to keep conflicts to a minimum. Of course you can't get rid of them completely, but on a normal merge/rebase I have close to zero conflicts, and those I do have are merged automatically (non-AI) and correctly.

I suppose I find the idea of conflicts as first-class entities somewhat intriguing; do they function as a sort of super-feature? If, e.g., you have 10 features all touching the same area of code, do you then get maybe 3 combined conflict(ing) features?

At any rate, I think that while it might be a useful metric, the aim should be to keep conflicts to an absolute minimum....


One thing that "conflicts as first class items" implies is that commits can exist in a conflicted state. This means you don't have to deal with conflicts right away. Let me explain.

jj's rebase works just like git's rebase conceptually: take these commits and move them on top of some different commits. But unlike git, `jj rebase` will always succeed, and succeed immediately. The resulting rebased commits/changes will just be in a conflicted state if a conflict happens. This decouples "do the rebase" from "fix the conflicts" in time, meaning I can choose whether to handle them right away or later.

This ends up being very powerful in a few ways: for one, because it will always succeed, and succeed quickly, we can start automatically rebasing. Let's say you have a branch with three changes on it. You go back and edit the first change, because you forgot to do something. jj will then automatically rebase the two children, and you immediately see if you've introduced a conflict. You can then fix it right now, or you can continue working on change #1, until you're ready to address the conflict. Let's say you decide to do that, and the conflict originates in change #2: when you fix that conflict, #3 gets rebased again, and in this hypothetical story, the conflict goes away.
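In command form, that story looks roughly like this (change names are illustrative; this is a sketch, not exact output):

    $ jj edit X    # go back to the first change to fix what you forgot
    ...hack...     # descendants Y and Z are rebased automatically as you work
    $ jj log       # any change now in conflict is immediately marked as such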

It's also nice if you have a bunch of outstanding work, and you want to rebase it all on top of your latest trunk/main/master branch: run the command to do so, and it'll all get rebased right away, and you can immediately see what has a conflict and what doesn't, and fix them when you feel like it, rather than doing it right now just so that the command completes.

There are secondary advantages, like the fact that this means rebases can happen in-memory, rather than through materializing files. This means it's super fast, and doesn't interrupt what you're doing. But I think those kinds of things are harder to grok than the workflow improvements.


That sounds like (or at least sounds like) a breath of fresh air, to be honest. Conflicts are very important to deal with, and yet they are a second-class concept in Git. And if you primarily use rebase, you have to rely on git-rerere. And if you screw up a rebase? I guess you just `git rerere forget` one of those stored files? But what if you forget that it was slightly messed up? Well, I guess it might just lie around as an opaque "conflict cache" resolution.

At least I can use `git show --remerge-diff` on a merge commit that had a conflict. That gives some after-the-fact insight.
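For reference, the relevant commands (all stock git):

    git config rerere.enabled true        # start recording conflict resolutions
    git rerere forget path/to/file        # drop a recorded resolution you botched
    git show --remerge-diff <merge-sha>   # replay a merge to see what was resolved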


It doesn't work like that. I should probably rewrite that "first class" part in the docs and use a real example (I wrote it originally), but basically it comes down to this:

When a commit is in a conflicted state, the fact it is conflicted is recorded inside the commit object.

Let's say I have a base B which is the main branch, three commits X Y Z that do 3 different things, and my set of changes that aren't committed, called "@".

    B (main) ---> X ---> Y ---> Z ---> @
Now someone pushes to main, so the full graph now actually looks like this with the new main called B'

    B ---> B' (main)
    \
     \---> X ---> Y ---> Z ---> @
Let's say that B' has a change that will cause a conflict in Y. What does that mean? It means that if we were to change the parent of X to B', then the state of Y would be invalid and thus in conflict, because the changes are not compatible with the state of the system.

`jj rebase -d main -s X` will give you a graph just like this, with such a conflict:

    B ---> B' ---> X ---> Y ---> Z ---> @
                          C      C      C
The marker 'C' means "This commit is conflicted." Note that all descendants of a conflicted commit are conflicted too, unless they solve the conflict. (This sentence is phrased very carefully because I'm about to show you a magic trick.)

Okay... So now in my filesystem, I can go see the conflict and resolve it. Maybe Y renamed a variable that B' got rid of in file foo.c, or something.

So now I can solve this conflict. But how do I resolve it? This is too much of a topic to discuss in general but here is the magic trick:

- Solve the conflict in your working copy

- Move the conflict resolution into the first conflicted commit, Y

- The conflict resolution will be propagated to all descendants, just the same way the conflict itself was propagated.

Step 1: solve the conflict in your working copy. Now my history looks like this.

    B ---> B' ---> X ---> Y ---> Z ---> @
                          C      C
Note: @ is no longer conflicted! We solved the conflict there, so it is OK. Now how do we resolve the conflict in Y and Z?

Step 2: `jj squash --from @ --into Y --interactive` will move any changes you select in a diff editor and then move that diff into the other commit.

Now the graph looks like this:

    B ---> B' ---> X ---> Y ---> Z ---> @
I moved the resolution of the conflict into Y. And so the resolution of the conflict is propagated to Z.

Step 3: There is no step 3. You are done.

So the secret is that Jujutsu tracks conflicts, and the relationships between conflicts, in the commit graph, just like commits. This is why they are "first class." Git basically doesn't do any of this. A commit with a conflict and a commit without one are indistinguishable in Git, unless you look at the actual file contents and see conflict markers. A conflicted hunk in a modified file in Git is no different from any other hunk.

This is already too long but as an addendum what I used to do is describe the conflict support as "git rebase --update-refs, combined with git rerere, on 1000x steroids." Except that's actually a shit way to describe this functionality, because only like 5 people on Planet Earth know about --update-refs or rerere. So you really need to experience it yourself or see a step-by-step, I'm afraid.


That all makes a lot of sense! I didn't know about git rerere - maybe I'll make a point to try jj out if it has optimized the diff resolution...

It also closely matches how I actually work, but with the twist that you get to pick when to perform the resolution. If you frequently integrate, your changes are considered newer than whatever was downstream, and so you don't have to re-perform any of your changes... buuut if you have a stack of commits, that can take a long while. Plus, you might upset your IDE with frequent file changes, so I avoid integrating too often...

It's a nice concept. My one question - can you track history across branches? I frequently "save old state" just in case I need something from before an integration, or my IDE is dumb and something's locked/gets wiped/AV freaks out and deletes my repo files... It sounds like you could sort of easily build that sort of thing on top...

