eclectic29's comments

One man’s extremely horrible mistake and look how the whole industry has been suffering for decades - countless articles debating the existence or absence of comments, new formats, new parsers, new ways of adding comments, post-processing, etc. What a pity!


There's something to be said for the hard limitations of json that made it successful.

Because of that it didn't become XML or YAML or markdown or protobufs or *RPC or the Microsoft config file format or all the rest. All those formats have "reasons", but they are harder to understand, harder to parse, or not as portable, and on and on...

(that said, I wouldn't mind comments. JUST comments).


Most reasonable parsers have an allow-comments flag, and even in JS you could do it with a simple regexp replace-all:

const obj = JSON.parse(jsonText.replaceAll(/("(?:\\"|\\.|[^"])*"|[^\/])|\/\/.*|(\/)/g,"$1$2"));

The above should be safe for multiline JSON and gives you single-line comments that run to end of line. It first matches strings (handling escaped double quotes and other escapes) or any other character that isn't a /, and passes those straight through via group 1. A single / outside a string passes through via group 2, even though that's outside the JSON spec. If a // sequence is found outside a string, it isn't passed through and disappears.

Yes, regexps can be abused. This one should be fine, but only use it for config files you control :). Use it as CC0, and keep a keen eye out if translating to another language, since escapes will differ if the regexp goes into a string instead of a regexp literal like in JS.
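
To make the behaviour concrete, here is a small sketch of that replace applied to a sample config. It assumes the `*` quantifiers after the string group and after `\/\/.` were intended (they are easy to lose when pasting into a comment box); `stripLineComments` and the sample keys are made-up names for illustration only.

```javascript
// Assumption: the "*" quantifiers are part of the intended regexp.
// stripLineComments is a hypothetical helper name, not from any library.
const stripLineComments = (text) =>
  text.replaceAll(/("(?:\\"|\\.|[^"])*"|[^\/])|\/\/.*|(\/)/g, "$1$2");

// Sample config with // comments, including slashes inside strings.
const jsonText = `{
  // port the dev server listens on
  "port": 8080,
  "url": "http://example.com/a//b", // slashes inside strings survive
  "quote": "she said \\"hi\\" // still a string"
}`;

// Strings are matched first (group 1), so "//" inside them is kept.
const obj = JSON.parse(stripLineComments(jsonText));
console.log(obj);
```

Only `//` comments are handled here; `/* ... */` would still make `JSON.parse` throw.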


It’s better than having to deal with the ad hoc out-of-band data formats that people would inevitably have stuffed into comments if given the opportunity.

Remember how JavaScript used to reside inside an HTML comment?


> It’s better than having to deal with the ad hoc out-of-band data formats that people would inevitably have stuffed into comments if given the opportunity.

As has been pointed out multiple times in the thread so far, a random string that is ignored by the receiver is semantically equivalent to a comment, and people totally stuff random crap into strings to extend the data format in sometimes-horrifying ways. Banning comments doesn't stop us from abusing strings, it just means we don't have a good way to indicate that a part of the file should be ignored by the parser because it's not actually data.


But the problem isn't with abusing strings, but abusing something that isn't data.

Abusing the string is the correct place to put poorly planned data because it's just data and it has a stable position in the data (and syntax) tree.

But consider this example:

    {
      "a": /* directive */ 42,
      "b": 42, // directive
      // directive
      "d": 42,
      /* directive */ "d": 42
    }
What does the code look like that has to reconcile the directive with the data it modifies?

That's a completely different and worse problem than someone "abusing a string" with something like `{ "a": "42 directive" }`.

I think JSON made the right call. People just confuse that with it being somehow forbidden to use comments in JSON config, but they're wrong. Just look at VSCode's config. It was possible all along despite decades of people whining in blogs.

We didn't need comments in JSON over the wire. We didn't need a new format. We didn't need another blog post about it. Just write your tool to accept comments in its config.


I don't understand what the problem is.

You can just make a parser that ignores comments, while still allowing them in the syntax. You don't need to store the comments in the AST, or to actually parse them. You just need to skip them.
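
A rough sketch of what "just skip them" could look like, assuming a pre-pass in front of the standard `JSON.parse` rather than a full parser. `skipComments` is a hypothetical helper written for this example; it drops `//` and `/* */` comments and copies string literals through untouched, so nothing about the comments ever reaches the parsed result.

```javascript
// Hypothetical helper: returns `text` with // and /* */ comments removed,
// leaving string literals untouched.
function skipComments(text) {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === '"') {
      // Copy a string literal verbatim, honoring backslash escapes.
      let j = i + 1;
      while (j < text.length && text[j] !== '"') {
        j += text[j] === "\\" ? 2 : 1;
      }
      out += text.slice(i, j + 1);
      i = j + 1;
    } else if (ch === "/" && text[i + 1] === "/") {
      // Line comment: skip up to (but keep) the newline.
      while (i < text.length && text[i] !== "\n") i++;
    } else if (ch === "/" && text[i + 1] === "*") {
      // Block comment: skip past the closing */ (or to end of input).
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const config = skipComments(`{
  "a": /* directive */ 42,
  "b": 42, // directive
  // directive
  "c": "not // a comment"
}`);
const parsed = JSON.parse(config);
console.log(parsed);
```

The comments are discarded before parsing, so any "directive" placed in them is invisible to the consumer, which is the behaviour being argued for here.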


That only works if you control the data that gets parsed.

Imagine you’re an HTML parser library author in 1998. You’ve been happily skipping comments. Now you get complaints that your parser doesn’t see JavaScript. Turns out that everyone has agreed that these new script tags should be embedded inside comments.

Should you keep skipping comments and tell your users you’ll never support this HTML that everybody else now considers valid?


Should we not allow comments in HTML, then?

There are parsers which allow for comments if one wants, so if someone wants to engineer an insane system no one is stopping them, provided they can ensure the parser on the other end of it.

This is a data format meant to be readable by humans. As such, it's natural to want to support things like configuration via end-user editing values in the text as a use case. Data is occasionally going to need comments to explain valid options or add context for people to edit. This is a reasonable thing to have.


Why would a comment be required for this when they can just use a string property? YAML has comments, and they are not used by any DSL abomination that I'm aware of.


Conda package recipes have a preprocessor using "selectors" in comments to conditionally exclude certain lines in a YAML file. Not that this is particularly more ugly than other YAML DSLs, but it has been used.


Definitely. It's important that every ad-hoc out-of-band data format look

   "like this"
never

   // Like this
Bad Things will surely befall us if we break the taboo.


Yes, but you expect much better from a seasoned engineer.


We should’ve kept XML.


I keep repeating this.

We already have XML. Stop trying to make JSON another XML.

I'm now knee-deep in a gig that uses all sorts of w3c standards, which enforce json-ld, json-schema, json-whatnots and so on. It's a terrible mess of unreadable complexity that has LONG been solved in XML. Decades ago!

I keep thinking that if you need one of those "fancy" features, why not just use XML? Why bolt them onto JSON, a format that was deliberately (?) set up to not have all these features?

I think that if you don't need these features, by all means use JSON. It's simpler, cleaner, easier. But it's only simpler, cleaner, and easier because it's not XML. All these attempts at making it XML end up with a JSON that's harder, more complex, and often messier than the XML we had for decades and that JSON tried to "solve" by being simpler. So it's one step forward, two steps back.


Right? People forget history faster than ever.


We should’ve kept S-expressions!


and also life of LLMs would be so much easier, all the JSON content, so much missing context!


I just installed AdGuard and the ads didn’t go away. What should I be using? I have an iPhone.


Brave browser for iPhone has good default adblocking.


AdGuard is blocking all ads on the page for me. Maybe your configuration.


Wipr and 1Blocker, for me. Start with the former; it’s cheaper.


If Martin Kleppmann is the author I know this stuff will be worth watching out for.


I mean if it doesn't fit the team's or the project's stage, don't do them daily. Simple! I just solved it for you. Not sure why we need tons of articles on scrum to state the obvious :-). Are the engineering teams so dumb that they're following scrum to the letter? Come on!


> Are the engineering teams so dumb that they're following scrum to the letter?

No, their managers are.


Slightly off topic: Learnt a new word today 'vulgarization' which seems to have a completely different meaning from the obvious. Thanks.


Note that, in the abstract, “vulgar” means “common” (as in “vulgar latin”). Indeed, its negative connotations come from that same sense: “common” people are unrefined.


The association between vulgarity and propriety (and class distinctions) sort of ruins that word, particularly in the English-speaking West.

I wonder if that's as big of a problem in the Romance languages (which all treat left/right the same way: left = bad, right = good).


Yes, in Spanish "vulgar" is used to mean inappropriate. We have "el vulgo" (el pueblo, the people), which kinda teaches you the correct meaning: popular, unrefined. But "vulgo" is seldom used.


Indeed: are you sinister or dexterous?


As a left-handed contrarian, I’ve always enjoyed that sinister and left handed go hand in hand.


In French, the word for “right” covers the same notions as it does in English:

- direction

- straight ahead

- civics

- propriety


This goes pretty deep in English. I'd argue that the semantic intention behind the colloquial usage of "vulgar" is nearly inseparable from the "class distinction" baggage it carries. Consider these common synonyms and their etymologies:

- Rude: "coarse, rough, unfinished, unlearned" (https://www.etymonline.com/word/rude#etymonline_v_16610)

- Mean: "shared by all, common, general" (https://www.etymonline.com/word/mean#etymonline_v_12495)

And even synonyms like obscene, indecent, or disgusting, which don't evoke this distinction directly, still almost always ultimately rely on separating things based on what is "good" and "clean" according to class distinctions.


Hahahaha, what a joke Google has become! Another half baked name change. Looks like no one really cares anymore at Google.


This is excellent. Thanks for sharing. It's always good to go back to the fundamentals. There's another resource that is also quite good: https://jaykmody.com/blog/gpt-from-scratch/


Not true.

Your resource is really bad.

"We'll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text."


> Your resource is really bad.

What a bad take. That resource is awesome. Sure, it is about inference, not training, but why is that a bad thing?


This is not “building from the ground up”


Neither the author of the GPT from scratch post nor eclectic29, who recommended it above, ever promised that the post is about building LLMs from the ground up. That was the original post.

The GPT from scratch post explains, from the ground up, ground being numpy, what calculations take place inside a GPT model.


Inference is nothing without training.


Why is that bad?


> These days, the roles I consider are in leadership so if we lack vision and a clear understanding of our value I’m usually empowered to fix that. If you’re interviewing for a more IC role, your hiring manager and teammates being unable to communicate expectations and success criteria is obviously a bigger concern.

So, the author doesn't seem to consider IC role a leadership role. I see, ok.


I think you're being unfair. It's clear what the author's intention is. They aren't speaking "down" regarding ICs, they're just making their intention clear in terms of the role they're in search of. Yes, ICs can absolutely be leaders in an organization (and _should_ be) but that doesn't change what the role of a leader is or what the author wants.


By definition they aren't. That's what the I stands for. Individual as in you aren't leading anyone else.


that isn’t mutually exclusive with leadership. ICs lead by example and mentorship. the fact that they’re not managing people doesn’t preclude being a leader at a company. the way i’ve heard it put for staff+ engs that i like is that they’re managers without reports.

but to speak to GP’s point, they are being a bit overly sensitive. “leadership” in the context of the article likely means C-suite or director level, which usually IS mutually exclusive with an IC role. (and maybe that’s what you meant, sorry for ‘splaining if so)


A whole article about getting a job in tech and no mention of the abominable leetcode. Surprise! From junior to principal Leetcode is here to stay. Good luck solving 2 problems in 35 mins.


I explicitly said no leetcode when I was an employee and getting hired and reached close to half a million dollar salary. So at least another path is present if devs don't give in.


Meta is less leetcode oriented? I'm shocked to read this. Meta is the poster child for leetcode style interviews. Meta requires you to solve 2 leetcode style questions in 35 mins (out of 45 mins - first 5 mins for initial pleasantries, last 5 mins for asking questions). For each question, you're required to (based on the signals they look for) ask clarifying questions, present a solution to the interviewer, get buy-in, code, verify with test cases - all this in 17.5 mins/question. Go figure! :-)

