Hacker News | electroly's comments

UTF-16 is the internal format of the ICU library (International Components for Unicode, the support library from the Unicode standards people) which is a common way to add "full fat" Unicode support to a programming language. This has knock-on effects everywhere. If you're using ICU, you either use UTF-16, too, or you constantly convert back and forth every time you interact with ICU. You're often best off using UTF-16 in memory and only converting to UTF-8 when you write files or transmit over the network.
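As a rough illustration of the "UTF-16 in memory, UTF-8 at the boundary" pattern, here is a minimal Python sketch. It does not use ICU. Python's own `str` is not UTF-16 internally, so a list of 16-bit code units stands in for the in-memory buffer, and the helper names `to_utf16_units`, `from_utf16_units`, and `write_utf8` are hypothetical.

```python
import io
import struct


def to_utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units for text (hypothetical helper)."""
    raw = text.encode("utf-16-le")
    # Each code unit is two bytes, little-endian.
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


def from_utf16_units(units: list[int]) -> str:
    """Rebuild a str from a list of UTF-16 code units."""
    raw = struct.pack(f"<{len(units)}H", *units)
    return raw.decode("utf-16-le")


def write_utf8(units: list[int], sink: io.BufferedIOBase) -> int:
    """Convert to UTF-8 only at the I/O boundary; return bytes written."""
    return sink.write(from_utf16_units(units).encode("utf-8"))


# U+00EF is one code unit; U+1F600 needs a surrogate pair in UTF-16.
units = to_utf16_units("na\u00efve \U0001F600")
buffer = io.BytesIO()
written = write_utf8(units, buffer)
print(len(units), written)  # 8 code units in memory, 11 bytes on the wire
```

The in-memory side keeps working in code units the whole time; the single `encode("utf-8")` call sits at the point where data leaves the process.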

> Amazon was AWS first customer

It wasn't. The retail business took years to move to AWS. They could not even be described as early adopters of AWS.


Microsoft wrote their own replacement and it's better-integrated into Office than the old editor was. Search for "insert equation" in the Office search box. The new one supports both the old editor's style of input and actual LaTeX inputs. Heed the dates on articles talking about it: TFA is from 2018 before they wrote the new editor.

There was an interesting paper on how the newer equation editor handles math using just Unicode math symbols: https://www.unicode.org/notes/tn28/UTN28-PlainTextMath-v3.1....

30 years ago, in C89 and pre-standard C++, `int foo()` in C declared a function taking an unspecified number of parameters, while in C++ it declared a function taking no parameters. In C89 you had to write `int foo(void)` if you wanted no parameters. This counterexample to C++ being a superset of C was well-known even back then.

Another well-known counterexample is implicit conversion from void*. In C89 you can do `int* foo = malloc(100);` but in C++ it requires an explicit cast from void* to int*.

I don't believe there was ever a time, even pre-standardization, when C++ was a strict superset of C; it always had little incompatibilities here and there.


Perhaps in C with Classes (Cpre)? To the extent that its output could be considered C.

It looks like you're right and the answer to when was C++ a superset of C may well be "never".

From the description, Cfront had always been a full-fledged parser that only happened to output C since the very beginning.


> a full-fledged parser

perhaps more accurately a fully fledged compiler (that emitted C)


> I had forgotten than you can build a perfectly legal enum from an integer out of the bounds of the enum's range. And a switch statement is non-exhaustive

These are solved by the new feature described in the article that we're commenting on right now. They're giving us unions and exhaustive switch. Ctrl+F "canonical way to work with unions" in the article to see an example. One of the best parts about C# is they never stop bringing useful features from other languages back home to us in C#. It makes for a large language with a lot of features, but if we really want something, we'll eventually get it in C#.


And one of the worst is how long it takes them to implement even simple things. There are parts of the language (Expression) that are 20 years behind the rest and they don’t see the problem.

I helped out on this image-to-bits transcription, doing manual verification of the automated work. I did the whole thing by hand: I sliced the ROM images into strips that excluded parts of the image that don't encode bits, used my tablet and stylus to manually place a black dot on every 1 bit, then wrote a trivial program that detected the presence or absence of the black dot in each cell. From my perspective, the ROM is organized like a series of "ladders" where the 1 bits are missing legs of the ladder, and I was placing dots on the missing legs. I compared my results with the ML output and manually re-checked each bit where we disagreed.

http://brianluft.com/images/2026/05/386_microcode_bits.jpg -- my fully annotated result. I was working from a higher-quality PNG; this is highly compressed because it's a big image.
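The dot-detection step could look something like the minimal sketch below. The actual program is not shown in the comment, so everything here is an assumption: a strip is modeled as a grayscale grid (0 = black, 255 = white) divided into fixed-size cells, and the names `read_bits`, `diff_bits`, `cell_w`, `cell_h`, and both thresholds are hypothetical.

```python
def read_bits(pixels, cell_w, cell_h, dark=64, min_dark_pixels=4):
    """Return one row of bits per row of cells: 1 where a black dot was drawn.

    pixels is a list of rows of grayscale values (0 = black, 255 = white).
    A cell counts as a 1 bit when at least min_dark_pixels of its pixels are
    darker than `dark`. Both thresholds are illustrative assumptions.
    """
    rows = len(pixels) // cell_h
    cols = len(pixels[0]) // cell_w if pixels else 0
    bits = []
    for r in range(rows):
        row_bits = []
        for c in range(cols):
            dark_count = sum(
                1
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
                if pixels[y][x] < dark
            )
            row_bits.append(1 if dark_count >= min_dark_pixels else 0)
        bits.append(row_bits)
    return bits


def diff_bits(a, b):
    """List (row, col) positions where two transcriptions disagree."""
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(a, b))
        for c, (bit_a, bit_b) in enumerate(zip(row_a, row_b))
        if bit_a != bit_b
    ]


# Tiny synthetic strip: two 4x4 cells side by side, with a dot in the second.
strip = [[255] * 8 for _ in range(4)]
for y in range(1, 3):
    for x in range(5, 7):
        strip[y][x] = 0

manual = read_bits(strip, cell_w=4, cell_h=4)
automated = [[1, 1]]  # made-up second transcription that disagrees on cell 0
print(manual, diff_bits(manual, automated))  # [[0, 1]] [(0, 0)]
```

The positions returned by `diff_bits` are the cells that would go back for a manual re-check.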


Thank you so much for your work. Thank you!

I wanted to give HN a perspective on this kind of work. Working on these micrographs is like looking for a penny on 4 football fields. I tried to see how long it would take me to search that physical area for coins: it took 4 1/2 hours and I did not find a penny, but I found two dimes.

This is maddening work, and again, thanks.


This OP hasn't done any of those things. They are here discussing the project, and it's clear all of their replies are human-written. The AI use is stated up front in the readme. They posted a 12 minute YouTube video demonstrating that the project works, with narration that indicates English is not their first language. The git commit messages are all classic short human messages. It's a genuinely neat project that obviously has no commercial motivation. Their crime appears to be using AI to clean up their non-native English in the README and then reusing some of that README text in the top-level descriptive comment on their Show HN post. Indeed, they should not have done that for their comment, but the rest of these accusations are just soapboxing about AI. You could have written this comment anywhere; it has nothing to do with this post.


> and it's clear all of their replies are human-written. The AI use is stated up front in the readme. The

Very much not the case with the comment I responded to.

There is a stark contrast between the AI-written first comment and some of their other comments.

I know many here don’t like any accusations of AI writing because they aren’t as attuned to picking it up, but the comment I responded to was as blatant as it gets.

I tried to give a more friendly encouragement to share self-written comments.


Yes, I'm obviously aware of that. We're all capable of seeing em dashes and staccato sentences. My reply mentions, explicitly, that their top-level comment was AI written (reusing portions of their AI-written readme) and that their replies are human written. I chose my words carefully; HN itself uses the terminology "comment" for top-level messages and "reply" for sub-level messages, and I used the phrase "top-level" to further disambiguate it. I apologize if that was confusing but what I said was accurate and carefully considered. I further agreed that they should not have done that. That one comment seems to be their only crime here. You then took the opportunity to soapbox about a bunch of things that OP did not do, in the message that I replied to.

I don't have anything to add. It just seems like you misunderstood my message.


> HN itself uses the terminology "comment" for top-level messages and "reply" for sub-level messages

Does it? I can't see any distinction in https://news.ycombinator.com/newsguidelines.html or https://news.ycombinator.com/newsfaq.html

Rereading your comment now with this in mind, I can make out the distinction, but I don't think you made it clearly.


The submit button in the post box says "add comment" vs. "reply" depending on what kind of message you're posting, and the link under comments says "reply" while articles don't have that link. I called it the "top-level descriptive comment on their Show HN post" because I agree just "comment" vs. "reply" alone could be confusing. What's a better way to describe the comment that an author posts with their Show HN to begin the discussion vs. replies that are made in response to specific comments? I genuinely don't know how many more terms I could have loaded onto that phrase to make it clear which post I was talking about. That wasn't supposed to be a confusing part of my post.


I'm not disagreeing, but I will point out that COM reference counting is an atomic integer operation, and that's expensive. boost::local_shared_ptr exists because std::shared_ptr does sometimes cause performance problems, so std::shared_ptr must be used sparingly. It's unlikely to matter in a UI scenario with long-lived objects, because that kind of code does, indeed, touch the reference count sparingly.


I don't think AWS says so explicitly, but it's commonly known that T2/T3/T4 use live migration to balance workloads across hosts. This is necessary when oversubscribing hardware (which is the explicit purpose of these families) to avoid hot spots. This may be the "runtime interruption behavior" that the author sees. Use an instance family with dedicated CPU capacity if this matters to you. With T2/T3/T4 you are explicitly asking for variable performance to save money.


For sure — the interesting part to me isn’t that burstable infra varies. It’s that the variance barely showed up in the normal benchmark layer at all while the sustained workload behavior started separating underneath.

I'm testing different scenarios and building a dataset that shows performance both deeply and broadly, to help with decision-making. The goal is to dial in the true bang for the buck.


"Short of..." indeed. You already know the answer, although it doesn't need to be general; it only needs to work on a single codebase.

A recent and highly relevant example is the migration of the TypeScript compiler to Go. They did not use an LLM to translate the code. Instead, they used LLM assistance to write a deterministic TypeScript-to-Go translator and then used that to translate the code. I have far more confidence in this approach than in letting the LLMs rip on the translation itself.


I think TypeScript to Go is far easier to translate than something to Rust though.


Is it? I wouldn't assume that. Go is a smaller and less flexible language than Java/TypeScript (I say that as a compliment) so it's not clear to me that all TypeScript idioms have an obvious Go equivalent.

Leaving aside ownership, Rust is a big, complex, expressive language. I'm not that familiar with Zig, but I think it tries to be a "better, modern C" so it seems like it should be easily possible to mechanically translate Zig into direct Rust equivalents. You probably won't get "good" idiomatic Rust at the end, but you should get working code that does the same thing.

