Ithkuil: A Philosophical Design for a Hypothetical Language (ithkuil.net)
84 points by setra 231 days ago | 39 comments

I found this New Yorker article helpful in providing context: http://www.newyorker.com/magazine/2012/12/24/utopian-for-beg...

This is a fascinating article for anyone who is considering reading it. Long but absolutely worth it.

Today I learned that George Soros learned Esperanto as his first language.

I was skeptical of that - and still am. His Wikipedia page says his father was an "Esperantist" and that he was taught Esperanto, as opposed to learning it as his first language. I find that more likely.

There is a series of progressive rock songs in Ithkuil written by John Quijada, the language's creator, and sung by David Peterson, language creator for _Game of Thrones_ and several other popular TV shows, available on YouTube. The latest installment is here: https://www.youtube.com/watch?v=AAJlr5C8fPA

Pretty good musically, too.

It was a painfully long wait for the lyrics to start, though. Half of me wanted to listen to the awesome intro, and the other half just wanted to hear some damned Ithkuil. The former half won by a small margin.

So, I'm not sure if it was the earlier article about this, or another documentary I saw, but I think it's been established that even as different languages have different "speeds" in terms of syllables uttered, the rate of communicated meaning is roughly the same across modern languages.

I wonder if that would hold up with this language as well - if it is so concise and precise, would it just mean that people take a while to speak it, and take a while to understand it after hearing it, before responding? Or is it possible that it's more Neo-in-the-matrix like where we'd immediately understand some complex idea right when it is uttered? I believe the former would be more likely but have no idea.

I would guess it's more likely that, rather than taking longer to speak or understand it, there'd be a high rate of misunderstanding due to the large phonemic inventory. I suppose the idea is that it's possible to express ideas more rapidly due to a higher entropy signal, but it seems unlikely that it's possible for listeners to hear 13 vowels + 12 diphthongs completely distinctly. I know several non-native but fluent English speakers who find it hard to follow all distinctions of English vowels (between 13-17ish, depending on dialect and including diphthongs), which is significantly fewer than Ithkuil proposes.

That assumes that anyone actually could come sufficiently close to speaking it, of course; it seems similarly unlikely that speakers could express so many distinct phonemes sufficiently clearly when the total complexity is so far beyond current languages.

Natural languages that have a large consonant inventory tend to have a simple set of vowels, and vice versa. So it seems that humans can only handle phonetic complexity along one axis.

I am not sure. In Turkish, for instance:


It's a single word. It means: Are you one of those people who we tried to convert to take care of themselves better but failed to do so?

The proposed language reminds me in some ways of Turkish :)

I've been curious about the extent to which people commonly use some of these long theoretically possible Turkish words. I was curious about that for German, too, before learning it; my impression since then is that German compound nouns with more than three or so base nouns are rare in practice (maybe outside of a few technical contexts, like names of legislation). Germans love to give examples about a Danube steamship company captain's hat or a rhubarb-loving barbarian's bar and so on, but they probably wouldn't actually use such words spontaneously. So, how does word length work out in day-to-day use of Turkish?

Anecdotally, as a second-language speaker of Turkish, day-to-day Turkish doesn't use words that long. That's mostly just one of several very similar phrases used to point out how long and creative Turkish suffixes can get. However, many words one uses in day-to-day conversation have two or three suffixes tacked on. I'd say most words in a sentence have at least one suffix. Formal written Turkish, such as you might find in a newspaper article discussing the latest political developments, uses words with even more abstract suffixes. Such formal writing also uses long-winded sentences which would be considered run-on in English and could occasionally be translated into a whole paragraph. It's a fascinating language!

You're right, such monstrosities are quite rare in regular language. Also, technically, the "mısınız" part at the end should be written separately so even this example is two words.

Counting the syllables I get something like

Turkish: 16 English: 26

Given the inherent loss in efficiency in trying to accurately translate very specific phrases out of context, that doesn't seem like such a big difference. (And I'm sure you could further condense the translation maybe like "Are you one of them we failed to get to take better care of themselves?" at 18.)

I think the OP's assertion is generally assumed to hold for natural languages.

> I wonder if that would hold up with this language as well - if it is so concise and precise, would it just mean that people take a while to speak it, and take a while to understand it after hearing it, before responding?

This sounds to me like formal scientific discourse: extremely concise and precise, but it takes a while to package one's thoughts in that form, and to unpack someone else's. That is to say, I think that your question is great, and that this possibility is probably what would happen (… or happens, I guess?).

I'm surprised no one yet has compared Ithkuil to Babel-17:


In the story, it's described as a language which forces clear thought, while secretly also forcing a change of beliefs.

At first I thought it seemed like the kind of language machines will invent to communicate with each other.

But then I realized its main features aren't clarity and precision. Instead, it's just another way of mapping human mental models to encoders and decoders: speech and script onto ears and mouths and eyes.

Machines will have fundamentally different mental models of meaning and will also be able to create any encoder/decoder pair to suit any range of sensors and actuators.

Machines could have a language in light, 3d printed objects could be scripts. A sculpture could literally be art and treatise at the same time.

'Concise and precise' seems to be the goal of this language. Like assembly language. It doesn't seem useful for mere humans. I would prefer playfulness and ambiguity.

Precision is totally compatible with deliberate ambiguity and playfulness.

As examples: Esperanto tries very hard to be precise, but there are playful neologisms, like "gedormi", which conjugates "dormi", "to sleep", in the mixed-gender plural, to form "to sleep with (heterosexually)", with the same innuendo as in English.

In Lojban, there is an abundance of poetry in which pieces of the phrases are intentionally and visibly omitted, which allows for flexible interpretation of created ambiguity.

Assembly isn't concise, though it is precise. An expression (program) to do anything substantial is actually pretty verbose in assembly; it's the programming equivalent of "show your work" in algebra class.

Especially the way modifiers are used, this reminds me more of the APL/J family of languages. Start off with words (verbs and such in those languages) and allow for modifiers (adverbs) that alter their behavior and meaning (sometimes significantly, while still retaining the core concept of the root verb).
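To make the verb-plus-adverb idea concrete, here is a rough Python sketch. It only imitates the flavor of J's `/` adverb (which turns a two-argument verb into a reduction, so `+/` sums a list). The names `insert` and `each` are hypothetical helpers for illustration, not part of any APL/J implementation, and none of this claims to model Ithkuil's actual morphology.

```python
from functools import reduce
from operator import add, mul


def insert(verb):
    """Rough analogue of J's `/` adverb: fold a two-argument verb over a list."""
    return lambda xs: reduce(verb, xs)


def each(verb):
    """Hypothetical 'each'-style adverb: apply a one-argument verb to every item."""
    return lambda xs: [verb(x) for x in xs]


# The root verb keeps its core meaning; the adverb changes how it is applied.
total = insert(add)        # comparable to +/ in J
product = insert(mul)      # comparable to */ in J
negate_all = each(lambda x: -x)

print(total([1, 2, 3, 4]))
print(product([1, 2, 3, 4]))
print(negate_all([1, 2, 3]))
```

The point is just that the modifier is a separate, reusable piece: the same `insert` works with any two-argument verb, much like an affix that can attach to many roots.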

I think precision in language is like scientific knowledge. No matter how far we extend it, the limit will always be there when we want to enjoy it. Not way out there but right here, like you never have to reach far from a set of rational numbers to find a real number outside the set.

I think precision is necessary if you expect to use the language to invoke conclusions or as a basis for inference.

Ambiguity in this context means you can derive some information but not as much as if it were precise.

"The weather is stormy" doesn't give you information about whether you need snowshoes or rainboots.

An interesting problem is that constructed languages that avoid syntactic ambiguity by having a formal grammar with no ambiguous parses (like Lojban and, I think, Ithkuil) can still have semantic ambiguity. In Lojban there is never a syntactic ambiguity about what the asserted relationship between concepts is, but the language community readily admits that the underlying concepts themselves still contain cultural and other ambiguities, in terms of whether given language users would agree to apply those concepts to particular things, situations, or people.

It's very possible that conlangs like Lojban and Ithkuil still contain concepts about storms that don't require a speaker to indicate what kind of precipitation the storm produced.

I'm by no means a Lojban expert, but some fiddling with http://jbovlaste.lojban.org leads me to "vilti'a" for "storm", a combination of "tcima" (x1 is weather/a meteorological phenomenon [at place x2]) and "vlile" (x1 is violent/in a state of violence). You can then further refine this, e.g. by adding "lindi" (x1 is lightning/an electrical arc) to get "lidvilti'a" for a thunderstorm [at place x2].
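As a toy illustration of how those compounds are assembled, here is a small Python sketch. The short combining forms in `RAFSI` are an assumption read off the two compounds quoted above, and `naive_lujvo` is a hypothetical helper: it only concatenates forms, modifiers first and head word last, and skips the hyphenation and form-selection rules a real compound builder would need.

```python
# Short combining forms for the three root words mentioned above.
# Assumption: these are the pieces visible in "vilti'a" and "lidvilti'a".
RAFSI = {
    "vlile": "vil",    # violent
    "tcima": "ti'a",   # weather
    "lindi": "lid",    # lightning
}


def naive_lujvo(*roots):
    """Join the combining forms in order: modifiers first, head word last."""
    return "".join(RAFSI[root] for root in roots)


storm = naive_lujvo("vlile", "tcima")
thunderstorm = naive_lujvo("lindi", "vlile", "tcima")

print(storm)
print(thunderstorm)
```

Refining a concept is then just prepending another modifier, which is why the thunderstorm word is the storm word with one more piece in front.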

Lojban focuses on relations, in the logical sense, between objects. Some relations are very concrete and familiar to English speakers and some are not. Here are some examples of Lojban relations:

mlatu = x1 (entity) is a cat of species x2 (taxon) As a noun: lo mlatu — cat.

carvi = x1 (entity) rains or showers to x2 (entity) from x3 (entity) As a noun: lo carvi — rain. lo te carvi — rain cloud.

simxu = x1 (entity group) mutually do x2 (relation between members of x1, contains two places for [ce'u]) As a noun: lo se simxu — done mutually.

cnemu = x1 (entity) rewards x2 (entity) for atypical x3 (property of x2) with x4 (entity, property of x1) As a noun: lo cnemu — rewarder. lo se cnemu — rewardee. lo te cnemu — reason for a reward. lo ve cnemu — reward.
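For the programmers here, those place structures behave a lot like positional arguments, and se/te/ve simply pick which argument the noun refers to. A minimal Python sketch of that idea follows. The English labels come from the glosses above, except "place rained on", which is my own wording for x2 of carvi, and `describe` is a hypothetical helper, not anything from a Lojban toolkit.

```python
# Each relation lists a label for its places x1, x2, x3, ... in order.
PLACES = {
    "mlatu": ["cat", "species"],
    "carvi": ["rain", "place rained on", "rain cloud"],  # x2 label is my own gloss
    "cnemu": ["rewarder", "rewardee", "reason for a reward", "reward"],
}

# No converter selects x1; se, te and ve select x2, x3 and x4.
CONVERTERS = {None: 0, "se": 1, "te": 2, "ve": 3}


def describe(relation, converter=None):
    """Return the noun phrase and the label of the place it refers to."""
    index = CONVERTERS[converter]
    phrase = f"lo {converter} {relation}" if converter else f"lo {relation}"
    return f"{phrase}: {PLACES[relation][index]}"


print(describe("cnemu"))
print(describe("cnemu", "se"))
print(describe("cnemu", "ve"))
print(describe("carvi", "te"))
```

Asking for a place the relation doesn't have (say, `describe("mlatu", "ve")`) raises an `IndexError`, which roughly mirrors the fact that not every relation has four places.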

Is assembly language not useful for mere humans? Certainly, every human-friendly programming language abstraction that we use today is built off of something like assembly language.

Does playfulness necessarily exclude preciseness?

To the extent it involves benign misdirection, yes

I think that playful mathematics certainly exists.

I like Toki Pona more.


Toki Pona is another constructed language, but it is focused on minimalism and on being easy to learn. Despite the minimalism, you can express many things with it.

I would not use it for technological things, though, since technological concepts require you to get creative with neologisms that can be hard to grasp.

Very cool idea, and impressively well thought out. I have long toyed with the idea of a universal 'concept language' wherein all concepts can be expressed as one of a set of "root" concepts that has had a chain of generic "transformations" applied to it.

Off topic: I used to know a guy named Ithkuil. It's not a common name in North America, but it is definitely a personal name. I know how strange it is when you find a random software project named after you. There are at least a couple named "Igor" out there.

Are you sure they weren't actually named after the language? I have been using it for gmail for many years (just because I am a fan, not connected to the language other than that).

They were a post doc in 2004 so I doubt it.

I used to be the top hit on google, then someone made Gerrit. Damn them.

That's odd indeed

Interestingly, the creation of a new ultimate expressive conlang is its very deprecation.

These languages begin with the notion that if the perfect language does not exist, we must start over entirely. The very act of starting over prevents any previous works from catching on.

I wonder what study has been done so far about specific linguistic paradigms that would be comparable to study about programming language paradigms like procedural vs functional, etc.

Is anyone here familiar enough with conlangs to compare this to others (like Lojban)?

Which other ones? There're thousands. Even if you just consider "popular" ones, there're a few hundred, for reasonable definitions of "popular".

Very briefly, Lojban was conceived as a scientific experiment and tries to rebuild the structure of language from the ground up, based on logical foundations with little or no connection to natural languages; the lexicon, however, is derived algorithmically from natural languages, and there is no particular emphasis on concision. Ithkuil, on the other hand, is much more of an artistic project, which makes use of the same basic mechanisms exhibited by natural languages but takes them to extremes in the pursuit of both precision and concision, and has an a priori vocabulary.

Lojban is generally classified as a loglang and an engelang, while Ithkuil is generally classified as an engelang and an artlang.

The next most similar conlang to Ithkuil that I know of (and purely in my opinion) would be Latejami by Rick Morneau. Latejami is designed as an interlanguage for translation, and thus doesn't care much about concision (partly because it's mostly supposed to be used by machines, not humans) but does aim to be able to accurately represent any semantic structure that exists in any natlang, and goes to great lengths to systematize its semantics in ways similar to Ithkuil.
