Hacker News | nertzy's comments

Isn’t it because you can generate the same content two different times and hash it and come to the same ETag value?

Using a UUID wouldn’t help here because you don’t want different identifiers for the same content. Time-based UUID versions would negate the point of ETag, and otherwise if you use UUIDv8 and simply put a hash value in there, all you’re doing is reducing the bit depth of the hash and changing its formatting, for limited benefit.
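To make the "reduced bit depth" point concrete, here is a rough sketch that packs a SHA-256 digest into a UUIDv8. The function names are made up for illustration, and the packing (take the first 128 bits, then overwrite the version and variant fields) is just one plausible way to do it, not a standard scheme. Only 122 of the hash's 256 bits survive.

```python
import hashlib
import uuid


def etag_from_hash(content: bytes) -> str:
    """Full SHA-256 hex digest: 256 bits of content-derived identity."""
    return hashlib.sha256(content).hexdigest()


def uuid8_from_hash(content: bytes) -> uuid.UUID:
    """Hypothetical helper: squeeze a SHA-256 digest into a UUIDv8.

    Only 128 bits fit, and 6 of those are overwritten by the version
    and variant fields, so 122 bits of the hash remain.
    """
    digest = hashlib.sha256(content).digest()
    value = int.from_bytes(digest[:16], "big")  # keep the first 128 bits
    value &= ~(0xF << 76)                       # clear the 4 version bits
    value |= 0x8 << 76                          # version 8 (custom layout)
    value &= ~(0x3 << 62)                       # clear the 2 variant bits
    value |= 0x2 << 62                          # RFC variant (binary 10)
    return uuid.UUID(int=value)


body = b"body { color: red; }"
print(etag_from_hash(body))   # 64 hex characters
print(uuid8_from_hash(body))  # 32 hex characters plus dashes, same content-derived idea
```

Both values are deterministic for the same content, so the UUID form still works as an ETag. It just carries less of the hash in a different format.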


I would assume that you would only create a new UUID if the content of the tagged file changed serverside.

Benefits are readability and a reduced amount of data to be transferred. A UUID is reasonably safe to be unique for the ETag use case (I think 64 bits would actually be enough).


The point of the content hash is to make it trivial to verify that the content hasn’t changed since its hash was made. If you just make a UUID that has nothing to do with the file’s contents, you could easily forget to update the UUID when you do change the content, leading to stale caches (or generate a new UUID even though the content hasn’t changed, leading to wasteful invalidation).

Having the filename be a simple hash of the content guarantees that you don’t make the mistakes above, and makes it trivial to verify.

For example, if my CSS files are compiled from a build script, and a caching proxy sits in front of my web server, I can set content-hashed files to infinite lifetime on the caching proxy and not worry about invalidating anything. Even if I clean my build output and rebuild, if the resulting CSS file is identical, it will get the same hash again, automatically. If I used UUIDs and blew away my output folder and rebuilt, suddenly all files would have new UUIDs even though their contents are identical, which is wasteful.
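A minimal sketch of the build-script idea, assuming a naming scheme of `name.<hash>.ext` and a 16-character truncated digest. Both are arbitrary choices for illustration, and `hashed_name` is a hypothetical helper, not part of any particular build tool.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 16) -> str:
    """Return a content-addressed filename such as 'site.<hash>.css'.

    The name depends only on the original filename and the bytes of the
    file, so a clean rebuild that produces identical output produces an
    identical name.
    """
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{path.stem}.{digest}{path.suffix}"


css = b"body { margin: 0; }"
first_build = hashed_name("site.css", css)
rebuild = hashed_name("site.css", css)              # output folder wiped, same bytes
changed = hashed_name("site.css", css + b"\n/* tweak */")

print(first_build == rebuild)  # True: the cached copy stays valid
print(first_build == changed)  # False: new content gets a new URL
```

Because the name changes only when the bytes change, the proxy can cache each name indefinitely, and nothing has to remember which identifier was assigned to which file.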


SHA256 has the benefit that you can generate the ETag deterministically without needing to maintain a database (i.e. content-based hashing). That way you also don’t need to track whether the content changes, which reduces bugs that might creep in with UUIDs. Also, if you typically only update a subset of all files, then aside from not needing to keep track of assigned UUIDs per file, you can do a partial update. The reasons to do content-based hashing are not invalidated by a new UUID format.
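As a rough illustration of the "no database" point, the sketch below derives the ETag from the response body on every request and compares it with the client's `If-None-Match` value. `make_etag` and `respond` are hypothetical names, and the comparison is deliberately simplified: it handles a single tag only and ignores `*`, comma-separated lists, and weak validators.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Strong ETag derived purely from the content (quoted, as HTTP expects)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a simplified conditional GET.

    No stored state is consulted: the tag is recomputed from the body,
    so it cannot drift out of sync with the content.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and if_none_match.strip() == etag:
        return 304, headers, b""  # client copy is still current
    return 200, headers, body


page = b"<h1>hello</h1>"
status, headers, _ = respond(page, None)
print(status)                                               # 200
print(respond(page, headers["ETag"])[0])                    # 304
print(respond(b"<h1>hello again</h1>", headers["ETag"])[0]) # 200
```

A real server would probably cache the digest alongside the file's modification time rather than rehash on every request, but that is an optimization and the tag still comes from the content.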

This is the only correct answer. I interviewed dozens of people this way over more than a decade. Hiring was never difficult to get right.

Bonus: we self-selected for people who don’t like to pair. We paired 100% of the time.


And another to experts-exchange.com of course.


My favorite was when I was in college. I spent a long time trying to figure out how to get my WiFi card for my school-issued laptop working in Linux. Someone else posted a fix (with a full explanation of what to do!) and I followed it and it worked!

Then I look at the username and it’s my classmate from down the hall in the same dorm.

And I’m pretty sure I actually did end up in a beach house with them at some point.


It’s pretty funny that anyone would complain about a Linux distribution being named after a developer’s given name.

I mean, it’s Linux.


Debian also did this.


Irony being they (Ian and Debra) split up, Ian quit Debian and worked for Sun (arguably a competitor back then), Docker, ..., and unfortunately committed suicide.

The good news? Debian's still going strong.


This sounds like a Halt and Catch Fire spinoff.


I feel like this is a fact that's much less well known :)

(Debian is actually a portmanteau of Deb(ra) + Ian...)


Maybe find a law office willing to take on the case for the chance at a cut of the penalty?


So perform hours and hours of unpaid labor on the hope that years from now you can make some lawyers a good chunk of money.


There's a certain personality type that doesn't necessarily want to win, but just wants their opponent to lose.

And is often willing to put in a lot of effort to make sure this happens.


[flagged]


We can use your ex-wife for good. We can use her to take down the RIAA.


I read it as blogspot.in having been registered via MarkMonitor, and MM having made a big mistake here.


I used to work on a software development team with James Somers and I can attest that he is both a great writer and able to handle criticism, valid or otherwise. I think he would appreciate the debates he is generating.


Yes, this is why developers should use URI-building libraries instead of direct string manipulation to modify URIs.

If I visit an HTML page with a link to “.evil.com/people/123” and click on it, the user agent won’t append “.evil.com” to the hostname. You’d instead get something like “https://api.hotstartup.com/.evil.com/people/123”, which would be safe (if broken).
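Here is a small sketch of the difference, using Python's standard `urllib.parse`. The base URL and path come from the example above. The naive concatenation is an assumption about what the vulnerable code might look like.

```python
from urllib.parse import urljoin, urlsplit

base = "https://api.hotstartup.com"
untrusted = ".evil.com/people/123"

# Naive string concatenation: the untrusted value bleeds into the hostname.
naive = base + untrusted
print(urlsplit(naive).hostname)   # api.hotstartup.com.evil.com

# Resolving it as a relative reference, the way a user agent would:
joined = urljoin(base + "/", untrusted)
print(joined)                     # https://api.hotstartup.com/.evil.com/people/123
print(urlsplit(joined).hostname)  # api.hotstartup.com
```

The joined URL probably 404s, but the request still goes to the intended host.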


> "The lingering uncertainty in the social control of risky technology is how to control the institutional forces that generate competition and scarcity and the powerful leaders who, in response, establish goals and allocate resources, using and abusing high-risk technical systems".

Seems to me that the relationship between Boeing and its various 737 MAX customers could be seen in a very similar light. Abusing high-risk technical systems as a tool to extract resources to cover for artificial scarcity in budgets or market expectations.

