English, Swedish, German and Finnish Decline “Dog” (2013) (linustechtips.com)
156 points by BerislavLopac 19 days ago | 102 comments

The problem is that what constitutes a word is not uniformly defined across all languages - every language has its own set of rules that is partly justified by its grammar, and sometimes by historical reasons.

Let me give you an example in English: a dog, with dog, to dog, behind dog, inside dog, ...

I can continue this list forever. But you would tell me that's not interesting, because I'm just using prepositions. Well, then we'll have to ask what a preposition is. In English you write a preposition as a separate word because you can insert something in between: a red dog, with blue dog, to tall dog, behind quick dog, inside my dog, ...

But what if in Finnish you cannot squeeze anything between preposition and noun? Oh, I rather meant postposition, but same difference. I don't speak a word of Finnish, but I suspect that's the case, and you are supposed to put your adjectives either before or after that word concatenated with all its postpositions. Maybe that's why the word is written together with its postposition.

Some languages add markers to the words, such as gender markers, tense markers, evidence markers (dog that I see vs. dog that I've heard of) in a similar way. Now just compute a Cartesian product of all the possible prepositions times 3 gender markers times 3 tense markers times a few evidence markers and times 2 plurality markers. You get a scary number in the end, especially if you have to learn this language.
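To make that arithmetic concrete, here is a minimal Python sketch of the Cartesian product. The marker inventories are made up for illustration and do not describe any real language; only the counts of 3 genders, 3 tenses and 2 plurality markers come from the comment above, while the 4 prepositions and 2 evidence markers are assumptions.

```python
from itertools import product

# Hypothetical marker inventories, for illustration only.
prepositions = ["with", "to", "behind", "inside"]  # assumed: 4 of them
genders = ["masc", "fem", "neut"]                   # 3 gender markers
tenses = ["past", "present", "future"]              # 3 tense markers
evidence = ["seen", "heard"]                        # assumed: "a few" = 2
numbers = ["sg", "pl"]                              # 2 plurality markers

# One surface form per combination of markers.
forms = [
    "dog-" + "-".join(markers)
    for markers in product(prepositions, genders, tenses, evidence, numbers)
]

print(len(forms))  # 4 * 3 * 3 * 2 * 2 = 144
print(forms[0])    # dog-with-masc-past-seen-sg
```

Even with these small inventories a single stem yields 144 forms, yet a learner only has to memorize 4 + 3 + 3 + 2 + 2 = 14 markers plus the order they attach in.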

As for Finnish, I have a friend who learnt it as a second language, and he told me that it's not as scary as it seems, because despite the sheer number of declension forms, it all follows patterns with relatively few exceptions. That is not to say that Finnish is a simple language, but it's not as complicated as you might think after reading this story.

P.S. I'm an amateur in the field of linguistics with no knowledge of Finnish, so please correct me if I'm wrong. My example about cartesian product wasn't about Finnish in particular, but rather how gigantic numbers of declension forms may arise in other languages.

Native Finnish speaker here. Those are not postpositions but rather declension suffixes to the same stem word (koira, dog). It's still one word, and you discern it as such in speech. Though in speech, if the combination of suffixes you are using is particularly complex or unusual, you might need to make extra effort to speak particularly clearly to ensure intelligibility.

In English, you say "together with my dogs", in Finnish you could say "koirieni kanssa" (with-my-dogs together), or cram it all into "koirineni" (with-my-together-dogs), but you would pronounce the latter word clearly to ensure understanding as the -n- suffix in this context is uncommon in modern spoken Finnish.

Using a similar suffix method as Finnish uses to build declension, we could in English build from stem dog -> dogish (i.e. dog-like) -> dogishness (i.e. quality of being dog-like) -> dogishnessism ("ideology advocating the qualities of being dog-like"). It's clearly one word, but it might be hard to imagine a real everyday context to use it in.

Finnish is hard but not impossible. One great thing is the orthography is practically perfectly regular - something which English gets horribly wrong.

I'll add that Turkish is also like this.

"with my dogs" is köpeklerimle, dog-s-my-with.

You can make a large variety of "words" based on köpek; dogishness would be "köpeksilik", which would win you some strange looks if you used it, but Google turns up what appear to be a few thousand legitimate usages.

It's not quite fair to call köpeksilik or köpeklerimle words though, it's not like you have to memorize them to learn the language. Just as you can build the phrase "with my dogs" out of words you already know, you can quickly construct or understand köpeklerimle from the suffixes it's built out of.
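A minimal Python sketch of that kind of composition, for the curious. The suffix spellings are hardcoded as they appear after köpek; real Turkish chooses the variant of each suffix by vowel harmony and other sound rules, which this sketch ignores, and `build` is just a made-up helper name.

```python
# Suffixes as they appear after "köpek" (front-vowel variants only).
SUFFIXES = {
    "plural": "ler",  # dog-s
    "my": "im",       # dog-s-my
    "with": "le",     # dog-s-my-with
}

def build(stem, *glosses):
    """Append the suffix for each gloss to the stem, in the order given."""
    return stem + "".join(SUFFIXES[g] for g in glosses)

print(build("köpek", "plural"))                # köpekler
print(build("köpek", "plural", "my", "with"))  # köpeklerimle
```

Three table entries and one concatenation rule are enough to produce the "word"; nothing about köpeklerimle itself has to be memorized.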

And just like the end of this list reaches forms which a couple Finnish-speakers have said aren't very meaningful, Turkish also allows for some extreme "words".

Here's the longest Turkish word: muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine

It means something like "As if you are those whom we may not be able to easily make into people who make others unsuccessful". It usually takes a native speaker a while to actually understand this word.

Yes, this property of a language is called agglutination, and it is not at all uncommon even in completely unrelated languages.

It is possible that declension suffixes once upon a time were postpositions. At least I remember this was a theory behind Indo-European declension suffixes. If a postposition is used often enough after a noun, people quickly learn the pattern, and think of the pair as a single word. Although this must have happened before writing was invented, so probably people didn't think of what constitutes a word. Now every now and then it happens that a word dies in a language. A postposition could have died as a standalone word but was preserved as a declension suffix. And now we might only think of it as a suffix.

As far as I know, all languages in the Finno-Ugric family use a similar system of declension suffixes. Usually, the different declension cases are similar in number and function, but the actual suffixes used may vary from language to language. Proto-Indo-European is indeed believed to have had a complex suffix-based inflection system, which the modern Germanic prepositions have either replaced or developed from.

English orthography is great for Middle English. Not so hot for the various modern Englishes.

Reminds me of elementary when someone pulled out the word Disestablishmentarianism.

They missed out on antidisestablishmentarianism?

I think you're mostly right. Finnish has prepositions like "below", "over", "above", but a lot of prepositions just happen by conjugating the word. Also prepositions appear in a few different ways based on the sound of the word, so "(to) dog" is "koira(lle)" but "(to) home" is "koti(in)".

It gets more complicated when you start mixing other attributes like ownership, so "(our) dogs" becomes "koira(mme)", then you can combine both of them saying "to our dogs", so it becomes "koir(illemme)". Then you can add in time references, conditionals, a question and you get a lot of those other forms, like "koirammeko" which translates to "our dogs?"

> Also prepositions appear in a few different ways based on the sound of the word, so "(to) dog" is "koira(lle)" but "(to) home" is "koti(in)".

There's a subtle difference between those. Perhaps the latter would be better off translated as "(in to) home", or perhaps the former as "(for) dog".

I was going to suggest that English has a lot of postpositions if you count some of the suffixes such as -side and -wise, e.g. fireside or lengthwise.

After attempting to double check this, it looks like the suffix is not considered a postposition and the suffixed words are simply prepositions themselves, e.g. fireside chat.

The usual thing that would make you call a thing a declension, is that it actually mutates the root word somehow, instead of just agglomerating onto it.

> Let me give you an example in English: a dog, with dog, to dog, behind dog, inside dog, ...


What’s updog?

Nothing much. What's up with you?

This is HN not Reddit but it seems okay on this thread, so I’ll allow it, especially since it made me laugh on a Friday evening.

If you allow it, just allow it, don't kill the mood. Otherwise state clearly that you allow it but want it to stop.


Google translate of the Finnish: your dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog, dog

Your list seems long and repetitive, until you compare it to the set of dogs that currently exist.

Where is the line drawn between a declension and a phrase? Even though they're a single word, a lot of those Finnish variants appear to serve the purpose that a multi-word phrase would in most languages.

Finn here. Most of the technically valid declensions of "dog" would never see any actual use, so it's not quite as bad as it seems, but I think you still need to count them as individual words because they can't form a valid sentence by themselves.

These noun declensions are not where it ends, though. There are many ways to derive adjectives from nouns and nouns from verbs and verbs from adjectives, so you can generate technically valid new words for quite a long time given just one root.

> These noun declensions are not where it ends, though. There are many ways to derive adjectives from nouns and nouns from verbs and verbs from adjectives, so you can generate technically valid new words for quite a long time given just one root.

You have verbs like "dogged" and adverbs "doggedly" in English too.

Generating adverbs ("doggedly") from adjectives ("doggéd", two syllables) is, as far as I know, absolutely universal among inflectional languages. Adverbs and adjectives are very closely related things.

Generating verbs ("dogged", one syllable) from nouns ("dog") is absolutely universal among ALL languages, inflected or no, but is a much less regular process than the adjective-to-adverb derivation, as the relationship between an arbitrary noun and an arbitrary related verb is much less regular than that between an adverb and its matching adjective. English is pretty free in deriving verbs. Latin very obviously has methods of deriving verbs and adjectives from other parts of speech, but those are generally not given any formal grammatical treatment when you study the language -- it's expected that you'll just pick them up by feel.

”English is pretty free in deriving verbs”

Or, as Calvin said: “Verbing weirds language”


Huh, I don't think Turkish lets you verb nouns. They've got plenty of ways to turn verbs into nouns but every time I try to make the other way work it just sounds weird.

Any chance you know the term in linguistics for verbing things? I want to look it up because you're right that this feels like a natural thing to do, it's surprising that Turkish doesn't allow it.

Derivation is the term for generating one word from another word. I don't think there's a term for specifically deriving verbs.

I did find a paper on Google Scholar purporting to list "derivations producing verbs from nouns" in Turkish, "An Outline of Turkish Morphology" by Oflazer, Göçmen, and Bozşahin.

English also has diminutives: piglet, doggy, etc.

English diminutives are very rudimentary and irregular compared to inflected languages.

In some languages you can form several levels of diminutives from any noun. Even stuff that makes no sense, like "cute taxes", "cute death", "cute abstraction", "cute declension".

And you can form the opposite - take a noun and create a less cute/uglier/larger version of it.

And these additional forms still obey the other rules, including declension. So there's a combinatorial explosion of versions of each word.

> Even stuff that makes no sense, like "cute taxes", "cute death", "cute abstraction"

A cute abstraction would definitely be my thing. As we would say in Italian, an astrazioncella, or astrazionuccia, or astrazioncelluccia, or astrazioncinuccia, or astrazioncinelluccia. :)

When learning some Italian I was both dismayed and fascinated to discover that it was necessary to avoid every diminutive ("cute") and augmentative I could think of, because they all carry sexual innuendo.

> they all carry sexual innuendo

They don't. Your teacher was a pervert. :)

Abstrakcja, abstrakcyjka, abstrakcyjeczka, abstrakcyjunia, abstrakcyjeczunia, ... :)

The Russian word vodka is a diminutive form of voda (water) — in essence, vodka is “cute water”.

yep, same in Polish.

Woda, wódka, wódeczka, wódzia, wódziunia.

and then if you don't like drinking you can go the other way:

wódka, wóda, wódzicha.


"Finnish is a member of the Finnic language family and is typologically between fusional and agglutinative languages. It modifies and inflects nouns, adjectives, pronouns, numerals and verbs, depending on their roles in the sentence."

"Fusional languages or inflected languages are a type of synthetic language, distinguished from agglutinative languages by their tendency to use a single inflectional morpheme to denote multiple grammatical, syntactic, or semantic features."

"An agglutinative language is a type of synthetic language with morphology that primarily uses agglutination. Words may contain different morphemes to determine their meanings, but all of these morphemes (including stems and affixes) remain, in every aspect, unchanged after their unions."

"A synthetic language uses inflection or agglutination to express syntactic relationships within a sentence. Inflection is the addition of morphemes to a root word that assigns grammatical property to that word, while agglutination is the combination of two or more morphemes into one word."

In a hardcore synthetic language, the difference between a word and a phrase is just about meaningless. The "words" can get full on crazy-pants, too.

How to know where one word ends and the next begins: https://en.wikipedia.org/wiki/Word#Word_boundaries

Examples of how the system works in Finnish and in other languages on the continuum: https://en.wikipedia.org/wiki/Agglutination

I think the distinction you are looking for is the articles.

In English you have dog and dogs, but can add articles and say 'these dogs', 'those dogs', 'the dogs' and so on. In languages like Finnish, the articles are included in the word, thus more variations when you take case and word genders into consideration.

No word genders in Finnish, sorry! :-)

Also, pronouns like "these" and "those" can't be included in the word in Finnish -- however, those that indicate ownership can (i.e. my dog/s, or the genitive case* ). There's no direct equivalent of "the dog" in Finnish either, the closest being "these/those dogs" where applicable.

* For language nerds: considering the subject (that emphasizes the many declensions in Finnish), it is somewhat ironic that both "my dog" and "my dogs" are the same single word in formal Finnish. Unlike many other cases, plain genitive doesn't allow distinction between singular and plural.

"No word genders in Finnish, sorry!"

No direct word for "please" either, unless you want to count kiitos, which I understand as thank you.

> but can add articles and say 'these dogs'

FYI those aren't articles, but demonstrative adjectives, which are part of the language class of determiners, to which articles also belong.

That's right, I would say that it is a bit misleading. Anyway 15 cases are already surprising, and if we count the plural form then we have 30 forms for a noun. I tried to learn Finnish when I was living there, I think it is a really enjoyable and logical language. Certainly one of the most special languages that we have in Europe.

Well, yes, it's quite special; it's a Finno-Ugric language, one of the few languages in Europe that isn't Indo-European.

The line is drawn around having single words for what other languages have phrases for. koirankaan is a single word just as houses is not a phrase (house es) but the plural of house.

Languages use different concepts to express the same ideas. The study of languages isn't built around the ideas but around the structure of the language itself.

Lest we be Russian here...

Собака, Собаки, Собаке, Собаку, Собакой, Собаке.

Собаки, Собак, Собакам, Собак, Собаками, Собаках.

That's, of course, the declension for female dog, which is distinct from male dog (and bitch).

Since we're doing Slavic languages, let's go with Serbian.

- dog = pas (we don't distinguish between "a dog" and "the dog")

- declensions (singular): pas, psa, psu, psa, psu, psom, psu

- declensions (plural): psi, pasa, psima, pse, psi, psima, psima

- but that's not all, the number can change the declension: 1 pas; 2, 3, 4 psa; 5+ pasa

It's my language and I love it, but the grammar is a nightmare. I think it might be one of the easiest languages to learn to read and write, but good luck learning what it all means ;)
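The number rule from the list above fits in a few lines of Python. This is only a sketch of the three-way split exactly as stated there; the function name is made up, and the cut-off at 20 is my own assumption, because the rule as given does not say what happens with compound numbers such as 21 or 22.

```python
def pas_after_number(n):
    """Form of "pas" (dog) after the count n, per the rule 1 / 2-4 / 5+."""
    # Assumption: only 1..20 is handled; larger numbers are not covered
    # by the rule as stated in the comment.
    if not 1 <= n <= 20:
        raise ValueError("only counts from 1 to 20 are covered")
    if n == 1:
        return "pas"
    if 2 <= n <= 4:
        return "psa"
    return "pasa"

for n in (1, 2, 4, 5, 10):
    print(n, pas_after_number(n))
```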

Many educators of Eastern European languages actually categorized Serbo-Croatian (and presumably still categorize BCMS today) as one of the easiest Slavic languages for foreigners to learn, because it has reduced the number of cases compared to what the situation was in Proto-Slavic or how it is in Russian or Polish today.

I took several years of Russian in college and am now trying to pick up a little German. Word of mouth had taught me to fear the German case system but... after Russian? Hard to see what all the fuss is about.

Yeah, Russian is far worse. I learned Polish for a while, which I understand has a lot of similarities to Russian, and it was far more difficult than German. I remember having to decline words differently depending on which number came before them.

I find German and Russian weirdly close to each other. OK, so there are the obvious words like кино, but also other things like the fact that in German it's "jemandem helfen" and in Russian "кому-то помогать/помочь", both taking the dative. There are loads of other similarities. Maybe it's the fact that they're both Indo-European; I'm sure someone more knowledgeable could explain.

Proto-Indo-European was fusional and pretty crazy on declensions, with plenty of cases for nouns and pronouns, verb aspects and moods, grammatical numbers (it had dual) etc. When you see the same in a European language, that's usually where it comes from, so it's not a coincidence. It seems that languages tend to shed that complexity over time, some (e.g. English) faster than others.

Hey, it's simple: nominative, genitive, dative, accusative, instrumental and prepositional. For singular and plural.

For the curious ones, that (roughly) means: A dog, Of a dog, To the dog, a dog, By the dog, About the dog.

And all of the same in plural.

As with all agglutinative languages, you learn a set of morphemes that are applicable to all words, and are crazily composable.

It’s a bit disingenuous to show the resulting words as examples of how hard these languages are. I’d say they are easier (but very different) than English.

I know this from experience, as I speak Turkish with its Çekoslovakyalılaştıramadıklarımızdanmışsınız [1]

[1] https://en.m.wikipedia.org/wiki/Longest_word_in_Turkish

Right, isn't this whole thing a crazy artifact of

"some languages put in a space between a noun and its modifiers, some don't".

If it were a practice to always join modifiers with the noun, you'd see the same for English, right?

Not exactly... I wouldn't say. Unless porkinspectorsofficepenmakerdeclensionsufferer would be a normal composition in English. Just with spaces in between them.

I do see native speakers string together noun phrases like that, with the spaces, e.g. "Fast downvote suppression throttler".

It's shorter than the unpacked equivalent "throttler for the suppression of fast downvotes". And more importantly, it's less ambiguous when you're writing long sentences.

I wasn’t disputing its usefulness. I was saying that it’s just like these agglutinative languages, except it has the convention of writing them with spaces.

The difference is that the morphemes you add at the end are the same for all words. It’s not just stringing several words together.

But those morpheme suffixes are functionally the same as modifier words (like in English) but with the convention of not writing spaces.

It’s not the same as writing “table tennis table”, for example. It’s having all of the prepositions (of, at, in, on...) and part modifiers (such as -ness, -ly, -ing...) nearly infinitely composable by adding them to the end of the word.

For agglutinative languages, would it be like ...

throttler forthesuppression offastdownvotes

... or ...


... or ...



鳴く、鳴きます、鳴いた、鳴きました、鳴いて、鳴ける、鳴かれる、鳴かせる、鳴かせられる、鳴け breathes 鳴かない、鳴きません、鳴かなかった、鳴きませんでした、鳴かなくて、鳴けない、鳴かれない、鳴かせない、鳴かせられない、鳴くな

Your comment would be more constructive if it included an explanation in English of what you wrote, like a few of the other top level comments have done.

Putting this through Google Translate produces the following romaji: "Naku, nakimasu, naita, nakimashita, naite, nakeru, naka reru, naka seru, nakase rareru, nake breathes nakanai, nakimasen, nakanakatta, nakimasendeshita, nakanakute, nakenai, naka renai, nakasenai, nakase rarenai, naku na".

Converting it to English produces the following: "Crying, squealing, singing, crying, crying, singing, sounding, sounding, sounding, barking breathes not singing, not singing, not singing, not singing, not singing, not singing, not singing, not singing No, I will not be singing, I will not sing, I can not cry, do not crow"

... and that's just declining based on the "familiar" social relationship. You'd never use any of these forms with the emperor.

Well, at least there isn't gender or number to worry about

It includes "polite" as well. Of course, this doesn't include the "respectful"/"deferential" styles, and also omits archaic and dialect forms.

However, if you have Japanese customers, worry about future year numbering in your program.

Seeing all the different languages represented here, I'm struck by how successful the dog has been at ingratiating itself to humanity. Does anyone know a language that has no word for dog?

That sounds utterly impossible unless there is somewhere on this planet an isolated tribe on some island who would have lost all their dogs eons ago.

I can’t help but share a slide from my talk at EuroClojure 2016 that’s gone viral:


Hope that I am not the only one confused about this: what does "decline" mean in the title?

It's a confusing title. It would be more clear as "The declension of the word "dog" in various languages".

The correct verb is "declinate". The title is wrong. Declination (not declension) is like conjugation but for nouns.

Important note: I don't care about the fact that none of the English dictionaries agree with me in this respect :)

I'd like to point out that in German, "der Hunden" doesn't exist.

Finnish is a synthetic language that relies heavily on agglutinative morphology to express syntactic relationships.

All those examples are grammatically correct, but most of them are never used in written or spoken language. Instead of having a few cases that you learn and actually use, there is a set of rules you can use to build new words. Real spoken or written language does not use everything that is grammatically available.

Can't talk about crazy Finnish declensions without inviting Estonians!

Koer -- Eesti keele süntesaator (Estonian language synthesizer)

Koer // sg n, // koer koer

Koer // pl n, // koerad koerad

Koer // sg g, // koera koera

Koer // pl g, // koerte koerte koerade koerade

Koer // sg p, // koera koera

Koer // pl p, // koeri koeri koerasid koerasid

Koer // sg ill, adt, // koera koera koerasse koerasse

Koer // pl ill, // koertesse koertesse koeradesse koeradesse

Koer // sg in, mas, // koeras koeras

Koer // pl in, // koertes koertes koerades koerades

Koer // sg el, mast, // koerast koerast

Koer // pl el, // koertest koertest koeradest koeradest

Koer // sg all, // koerale koerale

Koer // pl all, // koertele koertele koeradele koeradele

Koer // sg ad, // koeral koeral

Koer // pl ad, // koertel koertel koeradel koeradel

Koer // sg abl, // koeralt koeralt

Koer // pl abl, // koertelt koertelt koeradelt koeradelt

Koer // sg tr, maks, // koeraks koeraks

Koer // pl tr, // koerteks koerteks koeradeks koeradeks

Koer // sg ter, // koerani koerani

Koer // pl ter, // koerteni koerteni koeradeni koeradeni

Koer // sg es, // koerana koerana

Koer // pl es, // koertena koertena koeradena koeradena

Koer // sg ab, mata, // koerata koerata

Koer // pl ab, // koerteta koerteta koeradeta koeradeta

Koer // sg kom, // koeraga koeraga

Koer // pl kom, // koertega koertega koeradega koeradega

Source: https://www.filosoft.ee/gene_et/

Edit: typo

Let's see some Hungarian ones:

kutya, kutyát, kutyának, kutyán, kutyába, kutyában, kutyához, kutyának a (dog's), kutyára, kutyánál, kutyailag, kutyább, legkutyább, legeslegkutyább, kutyai, kutyáé

kutyám, kutyámat, kutyámnak, etc. (my dog)

kutyáim, kutyáimat, kutyáimnak, etc. (my dogs)

kutyájuk, kutyájukat, kutyájuknak, etc. (their dog)

kutyáik, kutyáikat, kutyáiknak (their dogs)

kutyák, kutyákat, kutyáknak, etc. (plural)

etc., etc.

My beloved teacher (she was a professor of ancient history, India) said that the Hungarian language is almost as difficult as Sanskrit.

I know more people that tried learning Hungarian and failed than those who tried and succeeded, with a ratio of 10:1 if not more. The best part is that I heard from more than one Hungarian that it's not worth learning their language because it's too complex and there's only ~10 mil people that speak it. It's just not worth the effort :)

And all of these are words that are used depending on the context (some are interchangeable though).


No one did Arabic? ok I'll write it in Latin characters for y'all:

One Dog : Kalb, Two Dogs : Kalban, Three to 100 dogs : Thalathat Kelab, Dogs : Kelab .. 100 and more it reverts to singular so it's <Number> Kalb.

I guess in Hebrew it'd be very similar as well but I don't speak it.

Small correction:

One Dog - Kalb

Two Dogs - Kalban

3 - 10 - Kelab

from 11 - 99 would be: Kalban.

100 - Kalb

101 - 110 - Kelab

111 - 199 - Kalban

200 - Kalb
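The corrected table above can be written down as a small range lookup. A Python sketch, encoding the ranges and romanized forms exactly as listed, without checking them against an Arabic grammar; `dog_form` is a made-up name, and anything outside 1 to 200 raises an error because the table does not cover it.

```python
# (low, high, form) rows copied from the correction above.
RANGES = [
    (1, 1, "Kalb"),
    (2, 2, "Kalban"),
    (3, 10, "Kelab"),
    (11, 99, "Kalban"),
    (100, 100, "Kalb"),
    (101, 110, "Kelab"),
    (111, 199, "Kalban"),
    (200, 200, "Kalb"),
]

def dog_form(n):
    """Return the listed form of "dog" for the count n."""
    for low, high, form in RANGES:
        if low <= n <= high:
            return form
    raise ValueError(f"no form listed for {n}")

for n in (1, 2, 7, 50, 100, 105, 150, 200):
    print(n, dog_form(n))
```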


In a way English is the most interesting of these, because it descended from a language as highly inflected as Finnish, and in addition with fusional morphology, but has become almost completely analytical over time, like Chinese or Vietnamese.

It's amusing, but it's likely a cardinality-like issue: they have n cases * n tenses * n genders. So I'm sure they follow a pattern that reduces what you need to learn at least by half, and everyday usage is a fraction of that.

I think this may be misleading, because if there are clear rules, and all words follow them, then a lot of things become easier. English has of/for/with etc in the place of these.

Here's Armenian (western).

Singular: (շուն: shoun)

շուն շունս շունդ շունը շունի շունիս շունիդ շունին շունէ շունէս շունէդ շունէն շունով շունովս շունովդ շունովը

Plural: (շուներ: shouner)

շուներ շուներս շուներդ շուները շուներու շուներուս շուներուդ շուներուն շուներէ շուներէս շուներէդ շուներէն շուներով շուներովս շուներովդ շուներովը

Technically, there are 16 more, but they have the same form as the first eight of each.

I was introduced to this through the polandball comic of this post: https://imgur.com/QFm6SCE

Maybe Finnish has more declensions, but German has a nearly unlimited number of words with "dog" too...

Hundstage, Hundehotel, Hundekot, Hundeasyl, Hundepension, Hundeweg, Hundemüde, Hundefell, Hundebox, Hundebett, Hundefloh, Hundebiss, Hundewiese, Hundekeks, Hundebaby, Hundedreck, Hunderasse, Hundekette, Hundehalsband, Hunde... and so on and so on

The Finnish seems related to the idea that code is the most compressed form of information.

I was so sure this was about some naming-convention debate in a commit message, or a column/variable name somewhere that would read, in Swedish, as "died" (= dog), with no fluffy ears involved: "DogUsers".

I could have sworn I saw this exchange as a Polandball comic but I can't find it.

This would make a great Polandball

edit: Did not expect downvotes on this. Curious why

Not much to see here, in Polish all the forms (for plural and singular) are just:

pies, psa, psu, psem, psie, psy, psów, psom, psami, psach

All Uralic-Altaic languages have this property. Also there is vowel harmony in those languages. This makes it very hard for computers to understand them.
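For readers who haven't met vowel harmony: a minimal Python sketch of the simplest case, the Turkish plural, which is -ler after a front vowel and -lar after a back vowel. The function name is made up, input is assumed to be lowercase, and loanword exceptions are ignored, so treat it as an illustration of the basic rule rather than a working morphological analyzer.

```python
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")

def turkish_plural(noun):
    """Pick -ler or -lar by the last vowel of a lowercase noun."""
    for ch in reversed(noun):
        if ch in FRONT_VOWELS:
            return noun + "ler"
        if ch in BACK_VOWELS:
            return noun + "lar"
    raise ValueError("no vowel found in " + repr(noun))

print(turkish_plural("köpek"))  # köpekler (dogs)
print(turkish_plural("kitap"))  # kitaplar (books)
```

The rule itself is small; the difficulty for software comes from stacking many such suffixes, each with its own variants, and then having to undo all of that when analysing a word.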

Okay, now we are ready to start with the dog breeds, from the afghan hound to the yorkshire terrier

It's funny how people in this thread start to debate what is a word..

Maybe we should just laugh and move on :)

They forgot the possessive/genitive of dog: dog's and dogs's [dogs']
