The problem is that what constitutes a word is not uniformly defined across all languages - every language has its own set of rules, justified partly by its grammar and sometimes by historical reasons.
Let me give you an example in English:
a dog, with dog, to dog, behind dog, inside dog, ...
I can continue this list forever. But you would tell me that's not interesting, because I'm just using prepositions. Well, then we'll have to ask what a preposition is. In English you write a preposition as a separate word because you can insert something in between:
a red dog, with blue dog, to tall dog, behind quick dog, inside my dog, ...
But what if in Finnish you cannot squeeze anything between preposition and noun? Oh, I rather meant postposition, but same difference. I don't speak a word of Finnish, but I suspect that's the case, and you are supposed to put your adjectives either before or after that word concatenated with all its postpositions. Maybe that's why the word is written together with its postposition.
Some languages add markers to the words, such as gender markers, tense markers, evidence markers (dog that I see vs. dog that I've heard of) in a similar way. Now just compute a Cartesian product of all the possible prepositions times 3 gender markers times 3 tense markers times a few evidence markers and times 2 plurality markers. You get a scary number in the end, especially if you have to learn this language.
As for Finnish, I have a friend who learnt it as a second language, and he told me that it's not as scary as it seems, because despite the sheer number of declension forms, it all follows patterns with relatively few exceptions. That is not to say that Finnish is a simple language, but it's not as complicated as you might think after reading this story.
P.S. I'm an amateur in the field of linguistics with no knowledge of Finnish, so please correct me if I'm wrong. My example about the Cartesian product wasn't about Finnish in particular, but rather about how gigantic numbers of declension forms may arise in other languages.
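To put a rough number on the Cartesian product described above, here is a minimal Python sketch. The marker inventories are made-up placeholders, not taken from any real language. The comment only fixes 3 gender, 3 tense and 2 plurality markers, so the 10 adpositions and the 3 evidence markers are assumptions chosen purely for illustration.

```python
from itertools import product

# Hypothetical marker inventories; only their sizes matter here.
adpositions = [f"adp{i}" for i in range(10)]  # assumed: 10 pre/postpositions
genders = ["g1", "g2", "g3"]                  # 3 gender markers (from the comment)
tenses = ["t1", "t2", "t3"]                   # 3 tense markers (from the comment)
evidence = ["seen", "heard", "inferred"]      # assumed: "a few" means 3
number = ["sg", "pl"]                         # 2 plurality markers (from the comment)

# Every combination of one marker from each category is one potential form.
forms = list(product(adpositions, genders, tenses, evidence, number))
print(len(forms))  # 10 * 3 * 3 * 3 * 2 = 540
```

Even with these modest made-up sizes a single stem yields hundreds of forms, although a learner only has to memorize the 21 markers and the rule for combining them.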
Native Finnish speaker here. Those are not postpositions but rather declension suffixes to the same stem word (koira, dog). It's still one word, and you discern it as such in speech. Though in speech, if the combination of suffixes you are using is particularly complex or unusual, you might need to make extra effort to speak particularly clearly to ensure intelligibility.
In English, you say "together with my dogs"; in Finnish you could say "koirieni kanssa" (with-my-dogs together), or cram it all into "koirineni" (with-my-together-dogs), but you would pronounce the latter word clearly to ensure understanding, as the -n- suffix in this context is uncommon in modern spoken Finnish.
Using a similar suffix method as Finnish uses to build declension, we could in English build from stem dog -> dogish (i.e. dog-like) -> dogishness (i.e. quality of being dog-like) -> dogishnessism ("ideology advocating the qualities of being dog-like"). It's clearly one word, but it might be hard to imagine a real everyday context to use it in.
Finnish is hard but not impossible. One great thing is the orthography is practically perfectly regular - something which English gets horribly wrong.
You can make a large variety of "words" based on köpek (Turkish for dog). Dogishness would be "köpeksilik", which would win you some strange looks if you used it, but Google turns up what appear to be a few thousand legitimate usages.
It's not quite fair to call köpeksilik or köpeklerimle words, though; it's not like you have to memorize them to learn the language. Just as you can build the phrase "with my dogs" out of words you already know, you can quickly construct or understand köpeklerimle from the suffixes it's built out of.
And just like the end of this list reaches forms which a couple Finnish-speakers have said aren't very meaningful, Turkish also allows for some extreme "words".
Here's the longest Turkish word: muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine
It means something like "As if you are those whom we may not be able to easily make into people who make others unsuccessful". It usually takes a native speaker a while to actually understand this word.
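As a toy illustration of how köpeklerimle falls out of suffixes you already know, here is a minimal Python sketch. The function names `build_word` and `gloss` are hypothetical and the English glosses are approximate. The suffix forms are hard-coded in the variants that fit köpek, so this does not attempt vowel harmony or any other sound changes.

```python
# Each suffix is paired with a rough English gloss (approximate, for illustration).
SUFFIXES = {
    "ler": "plural",
    "im": "my",
    "le": "with",
}

def build_word(stem, suffixes):
    """Concatenate suffixes onto a stem in order (no vowel harmony handled)."""
    return stem + "".join(suffixes)

def gloss(stem_gloss, suffixes):
    """Return a morpheme-by-morpheme gloss, e.g. 'dog-plural-my-with'."""
    return "-".join([stem_gloss] + [SUFFIXES[s] for s in suffixes])

word = build_word("köpek", ["ler", "im", "le"])
print(word)                               # köpeklerimle
print(gloss("dog", ["ler", "im", "le"]))  # dog-plural-my-with
```

The same three suffixes apply to any other noun, which is why the long forms don't need to be memorized one by one.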
It is possible that declension suffixes were once upon a time postpositions. At least I remember this was a theory behind Indo-European declension suffixes. If a postposition is used often enough after a noun, people quickly learn the pattern, and think of the pair as a single word. This must have happened before writing was invented, though, so people probably didn't think about what constitutes a word. Now every now and then it happens that a word dies in a language. A postposition could have died as a standalone word but been preserved as a declension suffix. And now we might only think of it as a suffix.
As far as I know, all languages in the Finno-Ugric family use a similar system of declension suffixes. Usually, the different declension cases are similar in number and function, but the actual suffixes used may vary from language to language. Proto-Indo-European is indeed believed to have had a complex suffix-based inflection system, which the modern Germanic prepositions have either replaced or developed from.
I think you're mostly right. The Finnish language has prepositions like "below", "over", "above", but a lot of prepositions just happen by conjugating the word. Also, prepositions appear in a few different ways based on the sound of the word, so "(to) dog" is "koira(lle)" but "(to) home" is "koti(in)".
It gets more complicated when you start mixing in other attributes like ownership, so "(our) dogs" becomes "koira(mme)"; then you can combine both of them to say "to our dogs", which becomes "koir(illemme)". Then you can add in time references, conditionals, a question, and you get a lot of those other forms, like "koirammeko", which translates to "our dogs?"
I was going to suggest that English has a lot of postpositions if you count some of the suffixes such as -side and -wise, e.g. fireside or lengthwise.
After attempting to double-check this, it looks like the suffix is not considered a postposition and the suffixed words are simply prepositions themselves, e.g. fireside chat.
The usual thing that would make you call a thing a declension is that it actually mutates the root word somehow, instead of just agglomerating onto it.
Where is the line drawn between a declension and a phrase? Even though they're a single word, a lot of those Finnish variants appear to serve the purpose that a multi-word phrase would in most languages.
Finn here. Most of the technically valid declensions of "dog" would never see any actual use, so it's not quite as bad as it seems, but I think you still need to count them as individual words because they can't form a valid sentence by themselves.
These noun declensions are not where it ends, though. There are many ways to derive adjectives from nouns and nouns from verbs and verbs from adjectives, so you can generate technically valid new words for quite a long time given just one root.
> These noun declensions are not where it ends, though. There are many ways to derive adjectives from nouns and nouns from verbs and verbs from adjectives, so you can generate technically valid new words for quite a long time given just one root.
You have verbs like "dogged" and adverbs "doggedly" in English too.
Generating adverbs ("doggedly") from adjectives ("doggéd", two syllables) is, as far as I know, absolutely universal among inflectional languages. Adverbs and adjectives are very closely related things.
Generating verbs ("dogged", one syllable) from nouns ("dog") is absolutely universal among ALL languages, inflected or no, but is a much less regular process than the adjective-to-adverb derivation, as the relationship between an arbitrary noun and an arbitrary related verb is much less regular than that between an adverb and its matching adjective. English is pretty free in deriving verbs. Latin very obviously has methods of deriving verbs and adjectives from other parts of speech, but those are generally not given any formal grammatical treatment when you study the language -- it's expected that you'll just pick them up by feel.
Huh, I don't think Turkish lets you verb nouns. They've got plenty of ways to turn verbs into nouns but every time I try to make the other way work it just sounds weird.
Any chance you know the term in linguistics for verbing things? I want to look it up because you're right that this feels like a natural thing to do, it's surprising that Turkish doesn't allow it.
Derivation is the term for generating one word from another word. I don't think there's a term for specifically deriving verbs.
I did find a paper on Google Scholar purporting to list "derivations producing verbs from nouns" in Turkish, "An Outline of Turkish Morphology" by Oflazer, Göçmen, and Bozşahin.
English diminutives are very rudimentary and irregular compared to inflected languages.
In some languages you can form several levels of diminutives from any noun. Even stuff that makes no sense, like "cute taxes", "cute death", "cute abstraction", "cute declension".
And you can form the opposite - take a noun and create a less cute/uglier/larger version of it.
And these additional forms still obey the other rules, including declension. So there's a combinatorial explosion of versions of each word.
> Even stuff that makes no sense, like "cute taxes", "cute death", "cute abstraction"
A cute abstraction would definitely be my thing. As we would say in Italian, an astrazioncella, or astrazionuccia, or astrazioncelluccia, or astrazioncinuccia, or astrazioncinelluccia. :)
When learning some Italian I was both dismayed and fascinated to discover that it was necessary to avoid every diminutive ("cute") and augmentative I could think of, because they all carry sexual innuendo.
"Finnish is a member of the Finnic language family and is typologically between fusional and agglutinative languages. It modifies and inflects nouns, adjectives, pronouns, numerals and verbs, depending on their roles in the sentence."
"Fusional languages or inflected languages are a type of synthetic language, distinguished from agglutinative languages by their tendency to use a single inflectional morpheme to denote multiple grammatical, syntactic, or semantic features."
"An agglutinative language is a type of synthetic language with morphology that primarily uses agglutination. Words may contain different morphemes to determine their meanings, but all of these morphemes (including stems and affixes) remain, in every aspect, unchanged after their unions."
"A synthetic language uses inflection or agglutination to express syntactic relationships within a sentence. Inflection is the addition of morphemes to a root word that assigns grammatical property to that word, while agglutination is the combination of two or more morphemes into one word."
In a hardcore synthetic language, the difference between a word and a phrase is just about meaningless. The "words" can get full on crazy-pants, too.
I think the distinction you are looking for is the articles.
In English you have dog and dogs, but can add articles and say 'these dogs', 'those dogs', 'the dogs' and so on. In languages like Finnish, the articles are included in the word, thus more variations when you take case and word genders into consideration.
Also, pronouns like "these" and "those" can't be included in the word in Finnish -- however, those that indicate ownership can (i.e. my dog/s, or the genitive case* ). There's no direct equivalent of "the dog" in Finnish either, the closest being "these/those dogs" where applicable.
* For language nerds: considering the subject (that emphasizes the many declensions in Finnish), it is somewhat ironic that both "my dog" and "my dogs" are the same single word in formal Finnish. Unlike many other cases, plain genitive doesn't allow distinction between singular and plural.
That's right, I would say that it is a bit misleading. Anyway 15 cases are already surprising, and if we count the plural form then we have 30 forms for a noun. I tried to learn Finnish when I was living there, I think it is a really enjoyable and logical language. Certainly one of the most special languages that we have in Europe.
The line is drawn around having single words for what other languages have phrases for. koirankaan is a single word just as houses is not a phrase (house es) but the plural of house.
Languages use different concepts to express the same ideas. The study of languages isn't built around the ideas but around the structure of the language itself.
- but that's not all, the number can change the declension: 1 pas; 2, 3, 4 psa; 5+ pasa
It's my language and I love it, but the grammar is a nightmare. I think it might be one of the easiest languages to learn to read and write, but good luck learning what it all means ;)
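The count-dependent forms mentioned above (1 pas; 2, 3, 4 psa; 5+ pasa) can be sketched as a small lookup. This is only a literal encoding of the three ranges given in the comment. The function name `pas_form` is hypothetical, and how compound numbers such as 21 or 22 behave is not covered by the comment, so the sketch makes no claim about them beyond the "5+" bucket.

```python
def pas_form(n):
    """Pick the form of 'pas' (dog) for a count n, per the rule quoted above."""
    if n < 1:
        raise ValueError("count must be at least 1")
    if n == 1:
        return "pas"
    if 2 <= n <= 4:
        return "psa"
    # Everything from 5 upward is lumped together, as in the comment.
    return "pasa"

for n in (1, 3, 5):
    print(n, pas_form(n))
```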
Many educators of Eastern European languages actually categorized Serbo-Croatian (and presumably still categorize BCMS today) as one of the easiest Slavic languages for foreigners to learn, because it has reduced the number of cases compared to what the situation was in Proto-Slavic or how it is in Russian or Polish today.
I took several years of Russian in college and am now trying to pick up a little German. Word of mouth had taught me to fear the German case system but... after Russian? Hard to see what all the fuss is about.
Yeah, Russian is far worse. I learned Polish for a while, which I understand has a lot of similarities to Russian, and it was far more difficult than German. I remember having to decline words differently depending on which number came before them.
I find German and Russian weirdly close to each other. OK, so there are the obvious words like кино, but also other things like the fact that in German it's "jemande/m/ helfen" and in Russian "кому-то помогать/помочь", both taking the dative. There are loads of other similarities. Maybe it's the fact they're both Indo-European; I'm sure someone more knowledgeable could explain.
Proto-Indo-European was fusional and pretty crazy on declensions, with plenty of cases for nouns and pronouns, verb aspects and moods, grammatical numbers (it had dual) etc. When you see the same in a European language, that's usually where it comes from, so it's not a coincidence. It seems that languages tend to shed that complexity over time, some (e.g. English) faster than others.
As with all agglutinative languages, you learn a set of morphemes that are applicable to all words, and are crazily composable.
It’s a bit disingenuous to show the resulting words as examples of how hard these languages are. I’d say they are easier (but very different) than English.
I know this from experience, as I speak Turkish with its Çekoslovakyalılaştıramadıklarımızdanmışsınız [1]
Not exactly... I wouldn't say. Unless porkinspectorsofficepenmakerdeclensionsufferer would be a normal composition in English. Just with spaces in between them.
It's shorter than the unpacked equivalent "throttler for the suppression of fast downvotes". And more importantly, it's less ambiguous when you're writing long sentences.
I wasn’t disputing its usefulness. I was saying that it’s just like these agglutinative languages, except it has the convention of writing them with spaces.
It’s not the same as writing “table tennis table”, for example. It’s having all the prepositions (of, at, in, on...) and part modifiers (such as -ness, -ly, -ing...) nearly infinitely composable by adding them to the end of the word.
Your comment would be more constructive if it included an explanation in English of what you wrote, like a few of the other top level comments have done.
Putting this through Google Translate produces the following romaji: "Naku, nakimasu, naita, nakimashita, naite, nakeru, naka reru, naka seru, nakase rareru, nake breathes nakanai, nakimasen, nakanakatta, nakimasendeshita, nakanakute, nakenai, naka renai, nakasenai, nakase rarenai, naku na".
Converting it to English produces the following: "Crying, squealing, singing, crying, crying, singing, sounding, sounding, sounding, barking breathes not singing, not singing, not singing, not singing, not singing, not singing, not singing, not singing No, I will not be singing, I will not sing, I can not cry, do not crow"
Seeing all the different languages represented here, I'm struck by how successful the dog has been at ingratiating itself to humanity. Does anyone know a language that has no word for dog?
Finnish is a synthetic language that relies heavily on agglutinative morphology to express syntactic relationships.
All those examples are grammatically correct, but most of them are never used in written or spoken language. Instead of having a few cases you learn and actually use, there is a set of rules you can use to build new words. Real spoken or written language does not use everything that is grammatically available.
Let's see some Hungarian ones:
kutya, kutyát, kutyának, kutyán, kutyába, kutyában, kutyához, kutyának a (dog's), kutyára, kutyánál, kutyailag, kutyább, legkutyább, legeslegkutyább, kutyai, kutyáé
kutyám, kutyámat, kutyámnak, etc. (my dog)
kutyáim, kutyáimat, kutyáimnak, etc. (my dogs)
kutyájuk, kutyájukat, kutyájuknak, etc. (their dog)
kutyáik, kutyáikat, kutyáiknak, etc. (their dogs)
kutyák, kutyákat, kutyáknak, etc. (plural)
etc, etc.
My beloved teacher (she was professor of ancient history, India) said that the Hungarian language is almost as difficult as Sanskrit.
I know more people that tried learning Hungarian and failed than those who tried and succeeded, with a ratio of 10:1 if not more. The best part is that I heard from more than one Hungarian that it's not worth learning their language because it's too complex and there's only ~10 mil people that speak it. It's just not worth the effort :)
In a way English is the most interesting of these, because it descended from a language as highly inflected as Finnish, and in addition with fusional morphology, but has become almost completely analytical over time, like Chinese or Vietnamese.
It's amusing, but it's likely a cardinality-like issue: they have n cases * n tenses * n genders. So I'm sure they follow a pattern that reduces what you need to learn by at least half, and everyday usage is a fraction of that.
I think this may be misleading, because if there are clear rules, and all words follow them, then a lot of things become easier. English has of/for/with etc in the place of these.
Maybe Finnish has more declensions, but German has a nearly unlimited number of words with "dog" too...
Hundstage, Hundehotel, Hundekot, Hundeasyl, Hundepension, Hundeweg, Hundemüde, Hundefell, Hundebox, Hundebett, Hundefloh, Hundebiss, Hundewiese, Hundekeks, Hundebaby, Hundedreck, Hunderasse, Hundekette, Hundehalsband, Hunde... and so on and so on
I was so sure this was about some naming convention debate in a commit message, or a column/variable name choice somewhere, where it would be translated, in Swedish, to "died" (=dog), without fluffy ears. "DogUsers"
All Uralic-Altaic languages have this property. Also, there is vowel harmony in those languages. This makes it very hard for computers to understand them.
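For anyone wondering what vowel harmony means in practice, here is a minimal Python sketch. Turkish and its plural suffix are my choice of example, not something the comment above specifies. The sketch assumes lowercase input and the simple two-way front/back rule, and it ignores loanword exceptions and every other suffix. The function name `turkish_plural` is hypothetical.

```python
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")

def turkish_plural(noun):
    """Attach -ler or -lar depending on the last vowel of the noun."""
    for ch in reversed(noun):
        if ch in FRONT_VOWELS:
            return noun + "ler"
        if ch in BACK_VOWELS:
            return noun + "lar"
    raise ValueError("no vowel found in " + repr(noun))

print(turkish_plural("köpek"))  # köpekler
print(turkish_plural("kitap"))  # kitaplar
```

The suffix is the same morpheme in both cases; only its vowel changes to match the stem, so a program that splits words into stems and suffixes has to treat -ler and -lar as one unit.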