alexlur · 2025-01-29T17:20:17 1738171217

Note: Author is the CEO of Anthropic.

alexlur · 2024-11-06T18:37:51 1730918271

Not to mention that temporary seasonal agricultural workers have never been called “expats” either.

alexlur · 2024-06-09T00:06:12 1717891572

Is it just me, or are the “Read in other languages” links in the article just plain text that doesn’t link anywhere?

layer8 · 2024-06-09T10:38:56 1717929536

Probably your browser. Japanese, Spanish and Russian work, the others are grayed out.

alexlur · 2024-06-02T04:39:59 1717303199

They are not comparable. The Chinese script was tailor-made for Chinese languages, while the Koreans simply adopted it, which arguably was a bad fit because Korean is 1) agglutinative and 2) not even a Sino-Tibetan language. Even then, hanja is still part of the national education curriculum today (look up 한문 교육용 기초 한자).

alexlur · 2024-06-02T01:25:02 1717291502

It’s really bizarre to see someone claim kana has anything to do with “modernization”. The Japanese modernization and industrialization period is famously associated with translating Western concepts and terminologies into Sinitic words that later spread to China, Korea and Vietnam.

rjh29 · 2024-06-02T04:23:48 1717302228

That was true like 100 years ago, but nowadays katakana words are extremely popular and increasingly used over their Sinitic counterparts, so I feel it's a valid argument.

Also it's not uncommon for words like ろ過（濾過）to be written in part kanji especially in news... if that trend continues beyond the 常用 kanji we might end up with a Japanese that is closer to Korean.

alexlur · 2024-06-02T04:57:12 1717304232

The modernization argument only makes sense if your society is economically or militarily inferior to the society you want to emulate. It was the case 100 years ago, but not today.

The Japanese economy has been stagnant for over 30 years with no end in sight. Following the same logic, Japan should perhaps “modernize” their language by following China, which is a ridiculous conclusion as you can tell.

alexlur · 2024-06-01T22:22:12 1717280532

This is an extremely mainland-centric view. Cangjie is the dominant IME in Hong Kong.

asdasdsddd · 2024-06-01T22:24:46 1717280686

That's why I said almost all

alexlur · 2024-06-01T22:34:57 1717281297

It’s only almost all if you only interact with the millennials or younger. Pinyin is an IME for Mandarin. If you aren’t fluent in Mandarin, chances are you use voice input or stroke typing.

causality0 · 2024-06-02T01:08:19 1717290499

Why shouldn't it be mainland-centric? Mainland China is 99.5 percent of the population of China. That's like refuting a claim about Americans by calling it "a very non-Pennsylvanian view".

alexlur · 2024-06-02T01:34:56 1717292096

Because China is not the only place where Chinese languages are spoken. There’s more than 10 million ethnic Chinese in Southeast Asia alone. And it’s not only a mainland-centric view: it’s a mainland–Mandarin speaking centric view.

alexlur · 2024-06-01T22:03:27 1717279407

> how inefficient Chinese characters are in general (but especially evident in computing)

We are not in the 90s anymore. UTF-8 has been around for 32 years now. If you’re working on a system that has no UTF-8 support, you have a much bigger problem to worry about.
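To put rough numbers on the encoding point, here is a minimal Python sketch that counts how many bytes a string takes in UTF-8 and UTF-16. The helper name `encoded_sizes` and the sample strings are only illustrative assumptions, not anything from the linked article.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the code point count and the byte length in two encodings."""
    return {
        "code_points": len(text),
        "utf-8": len(text.encode("utf-8")),
        # Little-endian variant so no byte order mark is prepended.
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    # Illustrative samples: an English word, a traditional and a simplified
    # Chinese character, and a Korean word.
    for sample in ("dragon", "龍", "龙", "한글"):
        print(sample, encoded_sizes(sample))
```

A single common Chinese character takes three bytes in UTF-8 and two in UTF-16, while the English word for the same concept takes six bytes in UTF-8, so the per-word storage cost is not obviously worse.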

> characters have no direct relation to phonetics

Most characters are phono-semantic where one part of the character is a phonetic hint and the other is a semantic hint.
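As a rough illustration of what "phono-semantic" means, the Python sketch below models a few well-known characters as a semantic part plus a phonetic part and groups them by the shared phonetic part. The `PhonoSemantic` class and `group_by_phonetic` helper are hypothetical names made up for this example, and the readings are Mandarin only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhonoSemantic:
    char: str      # the full character
    semantic: str  # component hinting at the meaning
    phonetic: str  # component hinting at the sound
    pinyin: str    # Mandarin reading of the full character


# A handful of textbook examples; 馬 itself is read "mǎ" and 可 is read "kě".
EXAMPLES = [
    PhonoSemantic("媽", "女", "馬", "mā"),  # woman + ma -> mother
    PhonoSemantic("嗎", "口", "馬", "ma"),  # mouth + ma -> question particle
    PhonoSemantic("碼", "石", "馬", "mǎ"),  # stone + ma -> number, code
    PhonoSemantic("河", "氵", "可", "hé"),  # water + ke -> river (loose hint)
]


def group_by_phonetic(entries):
    """Map each phonetic component to the characters that share it."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.phonetic].append(entry.char)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_phonetic(EXAMPLES))
```

The 河 entry shows the hint is only approximate: the phonetic part is read "kě" while the character is read "hé".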

> modernize it similar to Hiragana

Hiragana isn’t and wasn’t intended to replace kanji (unless you are from the fringe Kanamozikai). It serves a different grammatical purpose and is complementary to the other two. Kana is useful for an agglutinating language like Japanese, but not Chinese languages.

numpad0 · 2024-06-02T01:19:29 1717291169

I think one statement about CJK languages that should be made more often is that each of them has numerous dialects of its own with dubious mutual intelligibility, e.g. the Tsugaru and Kagoshima dialects versus standard Japanese.

The phrase "a language is a dialect with an army" often comes up when Asian languages are discussed, and it causes friction between non-speakers of CJK languages, who wonder about compatibility between the three, and speakers, who respond to those questions with near-vile dissent. While I understand both sides of these sentiments, the situation is not ideal for either side.

IMO, it might be weird to refer to these languages as "Beijing Tokyo Seoul" languages, but doing so occasionally (just occasionally) could give a more tangible feel for why these three seem to exist side by side while being so utterly disconnected from each other.

shiomiru · 2024-06-01T22:16:53 1717280213

> Kana is useful for an agglutinating language like Japanese, but not Chinese languages.

FWIW, the Japanese did develop a kana-based system for Taiwanese during the occupation, but it was an abomination.[1]

[1]: https://en.wikipedia.org/wiki/Taiwanese_kana

alexlur · 2024-06-01T21:44:16 1717278256

Thank God it didn’t happen.

z2 · 2024-06-02T03:42:58 1717299778

Much of the simplification adopted shorthand already in common use, which is why Japanese shinjitai simplification independently arrived at many similar characters and patterns. The second simplification round was an abysmal newspeak-esque failure, and thank goodness _that_ wasn't adopted either.

asdasdsddd · 2024-06-01T22:11:47 1717279907

pinyin is the best thing that happened to the language after simplification.

Not only did it propel literacy rates to basically 100%, but it added a phonetic component to the language.

alexlur · 2024-06-01T23:06:35 1717283195

Again, this is a very mainland-centric view. Hong Kong has never simplified its writing system or even developed a proper romanization, and yet has consistently had one of the highest literacy rates in the world. Guess what helped literacy? Post-war socioeconomic development like poverty reduction, mass education and industrialization.

> it added a phonetic component to the language

Fanqie has been a thing since the 2nd century. Zhuyin was invented in 1913.

charlieyu1 · 2024-06-01T23:47:54 1717285674

Agreed. I have seen kids from mainland China spending lots of time learning pinyin while kids from Hong Kong at the same age can already write some characters and pronounce the words accurately

numpad0 · 2024-06-02T03:08:43 1717297723

Simplification is just bad. It removes so much that it breaks the ability of non-speakers to infer meanings. Complexity of letter shapes is irrelevant to ease of use on computers, so it's just a massive loss.

fjdjshsh · 2024-06-02T20:59:04 1717361944

>it breaks ability for non-speakers to infer meanings

Not sure what you mean by this. Do you mean that it's less convenient for people that don't speak / read Chinese? Why would that be a relevant metric?

You may be missing that character standards have changed over time and that different writing styles (草书，行书) are implicitly simplifications. You can think of Latin or Russian cursive as a simplification of the printed letters.

In practice, the phonetic component has been mangled / evolved over time, so simplification doesn't make things more or less difficult for students (be it 5 year old native speakers or 50 year old non native speakers).

rurban · 2024-06-02T13:12:40 1717333960

Worked out excellent for Korean (Hangul) though. Also English.

Both massive wins

numpad0 · 2024-06-02T15:50:16 1717343416

I don't think it did for Korean, though I need input from speakers to be sure. From my experience, Korean MT routinely stops halfway through inputs and dumps nonsensical phonetic transcripts, likely from failing to identify words. I suspect they were just being complaisant to American influence in postwar years. Computers failing to even isolate and match words in this day and age is not a sign of an excellent working script.

yorwba · 2024-06-02T18:15:50 1717352150

Translation needs phonetic transcription to handle proper names. If there are words that may or may not be proper names depending on context, machine translation will guess the context wrong at least some of the time and phonetically transcribe what should be translated, or translate names that should be transcribed.

The problem can also happen when translating from English, if you think about all the surnames that are occupations, or names like "bill" or "lily." Capitalization usually helps disambiguate, but there's title case and all caps and people who never capitalize anything...

numpad0 · 2024-06-03T08:36:48 1717403808

It's not just proper nouns. Korean MT seems to routinely "de-synchronize" into wonbonhangugeotegseuteububun mid-sentence; sometimes it comes back in sync, sometimes it stays out of sync until the end of the sentence. It happens way more often than average with the Korean language.

yorwba · 2024-06-03T15:01:17 1717426877

Do you have an example input where that happens?

DiogenesKynikos · 2024-06-02T19:42:05 1717357325

Simplified Chinese characters are already difficult enough for foreigners to learn. Making them learn traditional characters would just be sadistic.

numpad0 · 2024-06-03T14:33:28 1717425208

Traditional characters are built on common parts for pronunciation and meaning cues. Simplified removed that, so IMO it compresses worse and is therefore harder. It's visually less dense, but, so what.

DiogenesKynikos · 2024-06-03T14:54:12 1717426452

Those cues are there in exactly the same way in most simplified characters.

The cases where simplification has removed those cues are rare enough that the extra complexity of traditional characters is really not worth it.

I've never heard anyone claim that simplified characters are more difficult to learn, and it just seems false to me.

ogurechny · 2024-06-02T03:57:34 1717300654

“Literacy rate” is just a bureaucratic index. It was increased in most countries with mostly the same measures, no matter what their writing system was. If you look closely, “literacy” meant “making the mass of workers and soldiers capable of following basic instructions”, and there often was not much for them to read except for parroted propaganda (obviously, I'm not talking about China specifically, as it has been the same everywhere).

tengwar2 · 2024-06-03T22:42:57 1717454577

Phonetics can be counterproductive to comprehension, or converting meaning to text. Take an example much closer to English: Scottish Gaelic, which is written with the Latin alphabet. It's considerably older than English, has more distinct consonants and vowels, and it is really difficult to guess the pronunciation from a written word if you only speak English (unlike Welsh, which has nice orthography and is easier than English in that respect).

Because of these difficulties, there is a long tradition of anglicising names of settlements to meaningless collections of letters which when read by an English speaker approximate vaguely to the original Gaelic name. Unfortunately this is not a reversible process - you can't look at a modern anglicised name and guess what the Gaelic is, in general.

Now while Gaelic has a tiny population of native speakers, there are millions of people who know some "map Gaelic" - that is, we can look at a map with Gaelic place names, and understand the elements. It doesn't work for towns and villages, but generally in the north, no-one bothered to anglicise the names of natural features, just the settlements - and walking is the most popular outdoor recreation in the UK, so we learn this when we read maps.

When the first SNP government of Scotland came in, they introduced bi-lingual road signs, even in areas where Gaelic is no longer spoken. There was and is complaint over this, but I found that things became much clearer. I could look at a placename like Machrihanish, and see that it is Machaire Shanais. I still don't know what Shanais means, but Machaire is a type of landscape that I know, so I immediately know that this is low-lying and grassy, and fairly level. I can do this for thousands of place names without being able to reliably tell how to pronounce the words - similar to the way that the pronunciation of a word indicated by a Chinese character can vary widely with the part of China, so that the pronunciation becomes quite secondary to communication.

vunderba · 2024-06-02T01:18:18 1717291098

Uh... no. Bopomofo which is used in Taiwan is a phonetic script that is used as a popular IME.

And simplification's only "arguable merit" is that it saves a fortune in ink at the expense of losing its historical roots. But guess what? We mostly use computers now. So great job Mao, now we have two competing standards. (Nod to XKCD).

Unrelated but to those of us who started with 繁體字, simplified just looks ugly. (龙 vs 龍)

iforgotpassword · 2024-06-02T19:10:02 1717355402

Sure traditional looks nicer, but holy fuck is writing it (by hand) ever annoying. When I asked friends who grew up with the traditional characters about it they said a lot of people use some form of simplification when taking notes or leaving messages for friends/family. People from mainland seem to only shorten words by omitting characters of longer words, if at all.

And about losing the historical roots, I guess if you're interested in it, the characters will always be there and accessible for you to study. I'd be interested how much the average Joe from Taiwan really remembers about random characters' roots, composition and meaning. I know many more people from the mainland, and among them are people who don't give a shit, and those who can also write a lot of traditional characters and give lectures about the origin and meaning of some character and whatnot.

Also, since this is about computers after all, I saw a study a while ago from the mainland where they tested how many mistakes people make writing less common characters. There was a bar chart that went down between 10 and 20ish, then went up a bit and started to go down again at around 30. It was speculated that people in school still have to write a lot by hand, and during/after college that stops and everything has been digital for a decade now so people just forget again, but folks old enough to have used pen and paper for a couple decades just had enough practice. I wonder if this effect would be more or less pronounced with traditional characters.

rjh29 · 2024-06-02T04:15:17 1717301717

I feel like Japanese strikes the right balance, no ugly oversimplified characters but making common kanji easier to write for children (國→国、櫻→桜)

For example 竜 is a fairly common simplification of 龍 and imo not nearly as ugly

z2 · 2024-06-03T04:23:56 1717388636

There are some strange-looking ones too (圖-図、圓-円), but agree that overall it was a lighter touch. I think all simplification projects have an inherent awkwardness in taking handwriting shorthand or cursive and trying to reformalize it back to print. In any case it's a shame that there was no coordination due to obvious geopolitical conflicts, and that we're now left with 3 sets. It was easier last time, 2.2k years ago when some dude took over all places that wrote Chinese characters and forced a single way of writing :)

Vt71fcAqt7 · 2024-06-02T21:41:16 1717364476

Yeah except hiragana and especially katakana both look ugly though.

otabdeveloper4 · 2024-06-03T10:59:57 1717412397

> simplified just looks ugly

I prefer simplified for the aesthetics alone. Traditional is cringe and ugly in typed form.

wolfgangbabad · 2024-06-01T21:46:01 1717278361

Vietnamese is relatively OK.

alexlur · 2024-06-01T21:56:22 1717278982

Chữ Nôm is a borrowed writing system and not native to Vietnamese, which isn’t even a Sino-Tibetan language to begin with.

gumby · 2024-06-02T02:37:05 1717295825

Latin is a borrowed writing system not native to English, German, Polish and many others which aren’t even Romance languages to begin with and must resort to di- and trigraphs plus non-Latin characters like J, V, ß, ł or å, among others (not to mention diacritics).

DiogenesKynikos · 2024-06-02T20:00:01 1717358401

Alphabets are much more flexible than the Chinese characters.

An alphabet can be adapted to basically any language. You just have to map the letters to the sounds, and you're pretty much done.

By contrast, the Chinese writing system is adapted very specifically to the properties of Chinese language. Every syllable in Chinese has a meaning (or set of meanings), so every character represents one meaning (or a few). English does not have that structure: words can have very arbitrary syllables that don't have any meaning on their own. Chinese characters encode a meaning plus a sound, which is often reflected in how they're composed (i.e., a character will often be composed of two simpler characters, one of which has the correct meaning and one of which has the correct sound). Chinese words do not change form: there's no conjugation, no plural form, etc. As a consequence, the writing system has no way to deal with things like conjugation.

I have no idea how one would even begin trying to adapt Chinese characters to write English. On the other hand, it's relatively easy to come up with a way to write Chinese in any alphabet.

int_19h · 2024-06-02T20:02:02 1717358522

"Å" is just "O" stacked on top of "A" though. And "V" is in fact the OG Latin form ("U" is the newly introduced one).

But yeah, the whole notion is kinda silly. Most writing systems in the world are developed from very few originals. E.g. for most of Eurasia, the source is either Egyptian hieroglyphs or the Shang Oracle bone script.

acwan93 · 2024-06-01T21:56:19 1717278979

Relatively. The amount of diacritics on Vietnamese surpasses European languages so text rendering becomes a challenge if a naive developer doesn't test with Vietnamese.
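One concrete way this bites can be sketched in Python: a Vietnamese letter can stack two diacritics, and the same visible text can arrive either precomposed (NFC) or as a base letter plus combining marks (NFD). The helper names `forms` and `describe` are illustrative assumptions; only the standard library `unicodedata` module is used.

```python
import unicodedata


def forms(text: str) -> tuple[str, str]:
    """Return the NFC (composed) and NFD (decomposed) forms of `text`."""
    return (
        unicodedata.normalize("NFC", text),
        unicodedata.normalize("NFD", text),
    )


def describe(text: str) -> list[str]:
    """Return the Unicode name of every code point in `text`."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    nfc, nfd = forms("Việt")
    # Same rendered word, different number of code points.
    print(len(nfc), describe(nfc))
    print(len(nfd), describe(nfd))
    # A naive equality check treats the two forms as different strings.
    print(nfc == nfd)
```

Code that measures width, truncates, or compares strings by code point without normalizing first can mishandle such text, which is presumably the kind of thing a developer would only catch by testing with Vietnamese input.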

numpad0 · 2024-06-02T00:59:56 1717289996

Is bringing back Chu Nom script going to simplify Vietnamese support on computers by a lot? It's unintelligible to CJK users, but as far as text rendering goes, it seems just simple Kanji/Hanzi.

publicola1990 · 2024-06-02T18:53:25 1717354405

The Vietnamese romanized their writing, they seem to be doing fine.

alexlur · 2024-06-03T02:27:54 1717381674

This isn’t factually correct. The French colonial administration romanized their writing and enforced chữ Quốc ngữ.

alexlur · 2024-05-22T02:13:57 1716344037

Old thread: https://news.ycombinator.com/item?id=32264372

> "Powered by LibertOS, our proprietary privacy OS"

> James O’Keefe quote on the front page

> Gettr link in the sidebar

This is not a phone for privacy-conscious people. This is a phone marketed to Republicans.

alexlur · on Nov 24, 2023

Worked for Switzerland.