Here's how I (Japanese native) would do it:
SENTENCE = S 。
S = BUN* verb | BUN* NP be-verb
BUN = NP pp | adv (pp = postposition, or 助詞)
NP = noun | adj NP | S NP
(S (BUN (adv 今日))
(BUN (NP (noun 築地)) (pp で))
(BUN (NP (noun 寿司)) (pp を))
(verb 食った)) 。
(S (BUN (NP (S (BUN (NP (noun 築地)) (pp で))
(NP (noun 寿司))) (pp は))
(NP (adj うまい)
(NP (noun 寿司)))
(be-verb だった)) 。
The be-verb is optional in S. "S = NP" is completely valid grammatically. Also "adj = BUN* verb | BUN* NP be-verb" if you allow "i-adjectives" to be classed as verb phrases (as they were historically) and "na-adjectives" to be "noun-phrase ni aru". So you would actually get:
SENTENCE = VP | NP
VP = BUN* verb | BUN* NP be-verb
BUN = NP pp | adv
NP = VP? noun
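A minimal sketch of a recognizer for the four rules above, assuming the input is already split into (tag, word) pairs using the tag names from the grammar (verb, be-verb, pp, adv, noun) and that the final 。 has been dropped. The function name `accepts` and the span-based memoized approach are my own choices for illustration, not something from the article. It only answers yes/no and does not build a tree.

```python
from functools import lru_cache

def accepts(tagged):
    """Return True if the (tag, word) sequence matches SENTENCE = VP | NP."""
    tags = tuple(tag for tag, _ in tagged)
    n = len(tags)

    @lru_cache(maxsize=None)
    def vp(i, j):
        # VP = BUN* verb | BUN* NP be-verb, covering tokens i..j-1
        if j - i < 1:
            return False
        if tags[j - 1] == "verb":
            return buns(i, j - 1)
        if tags[j - 1] == "be-verb":
            return any(buns(i, k) and np(k, j - 1) for k in range(i, j - 1))
        return False

    @lru_cache(maxsize=None)
    def buns(i, j):
        # BUN*: zero or more BUN chunks covering tokens i..j-1
        if i == j:
            return True
        return any(bun(i, k) and buns(k, j) for k in range(i + 1, j + 1))

    @lru_cache(maxsize=None)
    def bun(i, j):
        # BUN = NP pp | adv
        if j - i == 1:
            return tags[i] == "adv"
        return j - i >= 2 and tags[j - 1] == "pp" and np(i, j - 1)

    @lru_cache(maxsize=None)
    def np(i, j):
        # NP = VP? noun (an optional verb phrase modifying a noun)
        if j - i < 1 or tags[j - 1] != "noun":
            return False
        return j - i == 1 or vp(i, j - 1)

    return vp(0, n) or np(0, n)

# The second example sentence, with うまい tagged as a verb per the comment above.
sushi = [("noun", "築地"), ("pp", "で"), ("verb", "食った"), ("noun", "寿司"),
         ("pp", "は"), ("verb", "うまい"), ("noun", "寿司"), ("be-verb", "だった")]
print(accepts(sushi))  # True
```

Every recursive call works on a strictly shorter span, so the mutual recursion between NP and VP terminates even though the grammar lets a VP sit inside an NP inside a VP.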
The grandparent's point about recursive rules is valid, however, and their ease is one of my favorite things in Japanese grammar.
Anyway, the problem seems easy at first but it’s totally possible to waste hours perfecting a grammar that still doesn’t capture frequently occurring constructions (been there, done that).
With, of course, some exceptions. I'm confident there are exceptions in Japanese as well, given that exceptions are the rule in natural languages.
This sentence is accepted by the parser shown in the article and is conceptually the same as 「築地 で 食った 寿司 は うまい 寿司 だった 。」 (save for the use of past tense, which, for this purpose, is irrelevant).
(S (JAR (noun-no モンタナ) (LID に))
(JAR (M (verb-u いる)) (noun-no ひと) (LID は))
(SFF (M (verb-i うつくしい)) (noun-no ひと) (da だ)))
The parser just turns every sentence into a flat list. Imagine a parser for a programming language that parses every single line, but doesn't group lines in the same function, loop, conditional etc. together. Unless the language is assembly, you wouldn't be able to get away with that.
It all comes down to the precedence levels of the LID particles (に, は, で etc.), doesn't it? I'm not sure if those are well-defined, but I think it would feel more natural for a native speaker to evaluate に before は in nearly all cases. The order between で and は seems more elusive. I'll definitely bring this up in our conversation group next time, though I'm sure I'll get some funny looks for it.
M = Jar | verb-u | verb-i | noun-na na | noun-no no
Example text from the blog post:
Much like the rest of biological systems, natural language "laws" and formalisms are more or less guaranteed to have an "except when..." clause after them.
I tried to think of a recursive "except when" to the above, but couldn't come up with anything appropriately witty.
The content is good, but unfortunately if I came across this domain name in a search engine result page I would think it was yet more SEO spam --- over two decades of Internet use has made me wary of domains containing words like "best" in them. (For similar examples, think of domains like best-antivirus-2020.online --- sounds like a fake AV scam.) IMHO the proliferation of such TLDs has only made things easier for phishing and such.
I'm not aware of Japanese equivalents (probably due to my lack of Japanese knowledge), but the Gyeongsang dialect of Korean allows a sentence that only contains a single syllable 가 (ga) and thus can only be distinguished by intonation. "가가 가가가" is a commonly cited example and is pronounced (here denoted using X-SAMPA) and analyzed as follows:
가 ka: The kid (short for "그 아", itself short for "그 아이")
가 ga [topic marker]
가 ka The family name "Ga"
가 <R>ga A suffix for family (collective, unlike "씨" for individuals; e.g. "김가" for Kims)
가 <F>ga Copula, weak question
かか、 mother (お母さん in standard Japanese)
かー this (これ)
かー this way (こう)
かかー write (書く)
か ? (か)
> Ted and Ed edited it.
I'm vaguely aware of two meant for Japanese, though since I know no Japanese, there are almost certainly errors in my report here:
> "Let's cover Eastern Europe", to-o-o-o-o-o-o-o-o-o.
> "Two chickens in the front yard and two in the back yard", niwaniwaniwaniwaniwaniwatori.
If you look at the post history for the spammer's main account, they try to disguise their spamming by posting a few miscellaneous Wikipedia articles and the like between posting their site: https://news.ycombinator.com/submitted?id=sova
You can also see them experimenting with spamming their other sites as well; the poster is always the same individual.
すもももももももものうち or ははははははははのははははははとわらった
Please see MeCab (https://en.wikipedia.org/wiki/MeCab) segmentation output below:
In Swedish: Bar barbar-bar-barbar bar bar barbar-bar-barbar
In German: Weichen Weichen weichen Weichen, weichen Weichen weichen Weichen
Their main landing page doesn't inspire much confidence either: https://japanesecomplete.com/purchase
The combination of layout and font gives that sketchy feel. I'm sure the content itself is great, but I think it would benefit from a more carefully constructed site along with some sort of trial (à la Wanikani) since almost every resource claims to be the panacea you've been looking for.
So if you say "sakana da", it does mean "It is fish", but so does just "sakana". The copula is implied. The "da" is completely optional and is actually only added for emphasis -- the literal translation is kind of like "That it is fish exists". In literary Japanese you would say "sakana de aru", "de" being the particle that links a verb to the means with which the verb is executed. For example "basu de iku" means "will go by bus" -- bus is the means by which we will go. In "sakana de aru" or "sakana da" we are basically saying that "fish" is the means of its existence.
The "na" modifier is also interesting. It is really "ni aru" where "ni" is essentially the "direction" in which something exists. "Something like a fish" would be "sakana no you". If you want to say "It is a fragrance something like a fish" you could say "sakana no you na kaori". Although I'm not aware of any modern Japanese that would express it like this, this is equivalent to "sakana no you ni aru kaori" -- "It is a fragrance that exists in the direction like a fish". Hopefully you can understand.
The interesting part of this is that adding "ni aru" to the end of a noun phrase just turns it into a verb phrase. And the even more interesting bit is that the only thing that can modify a noun phrase is a verb phrase.
But, you may have heard of "i-adjectives" -- these are adjectives that end in i. In actuality, these are not adjectives! They are verb phrases! So the word "cute" is "kawaii". However, the actual word is "kawai" and the inflection is "i". That's why when you want to say "not cute" it becomes "kawaiku nai" -- the "i" turns into "ku" because you are inflecting a verb.
This in turn is why you modify nouns directly with "i-adjectives". "kawai sakana", or "cute fish". Other adjectives are actually noun phrases in Japanese. "yumei na sakana" or "famous fish". This is, again, exactly the same as "yumei ni aru sakana" -- "The fish exists in the direction of fame".
So the rules are even simpler than presented in this blog post.
By the way, for anyone trying to learn Japanese and who wants to go beyond phrase-book level: learn plain form first and polite form later (if ever). Japanese makes absolutely no sense if you learn polite form first. It's incredibly logical (even the polite form extensions) if you start with plain form.
Do you have any recommendations for material that teaches Japanese in this manner? I've found few resources that actually go into etymology like you have shown.
However, most of the stuff like the above that I learned actually comes from an NHK programme, "Kotoba man". It went into the history of vocabulary and grammar and after I watched as many episodes as I could find, the structure of the language really started to make sense to me.
I think part of the problem is that most prescriptive grammars for language are subtly incorrect. For example, what part of language is blue in "I am blue"? What part of the language is it when in "Blue is my favorite colour"? If they are different parts of language, what is the difference in meaning between the two? What if you said, "I am painting"? Or if you said "Painting is my favourite hobby"? What is the difference in meaning between the 2 uses (if any)? Is there a difference in the meaning of "am" between "I am blue" and "I am painting"? I think if you follow the prescriptive grammar of English, it will force you to answer the questions in different ways than your intuitive (internal) ideas. Or at least it did for me. Studying that sort of sentence in English helped me to study similar sentences in Japanese and puzzle out similar insights (or at least as imagined by me). YMMV ;-)
My google-fu isn't turning up anything with this title. Do you happen to have a link to the page?
Unfortunately that would leave me with the issue of now learning 2 languages.
Interestingly enough the best (and only) resource I've been able to find for content like GP is https://www.japanesewithanime.com/ which – far from what I originally imagined given the domain name – actually has in-depth explanations of both grammar and etymology, replete with references to research articles in JP linguistics.
> In literary Japanese you would say "sakana de aru", "de" being the particle that links a verb [with] the means with which the verb is executed.
"de" has a few more uses than just the instrumental one you explain in that sentence. The instrumental shows what instrument (noun) is used to do an action (verb). This "de" is often translated as "using", as in "I ate sushi using chopsticks". The somewhat broader interpretation of "de" you present kind of makes sense, but I'm not clear why "de aru" is more appropriate/meaningful than "wo aru" (other than "because that is how it is").
> However, the actual word is "kawai" and the inflection is "i".
I disagree. The word is kawaii. The word stem is kawai. But I think the logic holds either way.
> This in turn is why you modify nouns directly with "i-adjectives". "kawai[i] sakana", or "cute fish".
It is "kawaii sakana" with two I's.
> That's why when you want to say "not cute" it becomes "kawaiku nai" -- the "i" turns into "ku" because you are inflecting a verb.
But inflecting a verb to its negative form is turning a "u" into an "a" or simply removing the "ru" ending: "aruku -> arukanai", "taberu -> tabenai". Saying that the "i" turns into a "ku" because you are inflecting a verb has no more explanatory power than saying that the change is because you are inflecting an adjective.
I think the argument for calling i-adjectives verb phrases is that they have tense (the same tenses as regular verbs).
The argument against is that you can say "kawaii de aru", but you cannot add "de aru" to a verb phrase.
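To make the contrast concrete, here is a rough sketch of the three negation patterns quoted above, working on romaji strings. The function name `negate` and the `kind` labels are hypothetical, and the caller has to supply the word class, since it cannot be read reliably off the romaji. It covers only the plain patterns shown in the examples: verbs ending in "tsu" or in a vowel plus "u", and irregular verbs such as "aru", "suru" and "kuru", are not handled.

```python
def negate(word, kind):
    """Plain negative form for the three patterns discussed above (romaji only)."""
    if kind == "i-adj":
        # final "i" becomes "ku", then "nai": kawaii -> kawaiku nai
        return word[:-1] + "ku nai"
    if kind == "ichidan":
        # drop the "ru" ending: taberu -> tabenai
        return word[:-2] + "nai"
    if kind == "godan":
        # final "u" becomes "a": aruku -> arukanai
        return word[:-1] + "anai"
    raise ValueError(f"unknown word class: {kind}")

for word, kind in [("kawaii", "i-adj"), ("taberu", "ichidan"), ("aruku", "godan")]:
    print(word, "->", negate(word, kind))
```

All three end in "nai", but the stem change before it differs per class, which is the point being argued: the i-adjective pattern does not fall out of either verb pattern.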
I wish more learning material acknowledged this fact. This is why they can form entire sentences by themselves. I also agree with your comment about plain form first. That’s why for the Japanese method I’m writing for a relative, I start with those predicative adjectives, then explain how は changes the topic, etc.
Language methods should use the underlying logic and natural use of a language. I cringe each time I see something with lesson #1 being "watasi ha gakusei desu" or the like because this is teaching bad habits from day one.
A refresher on Forth: in Forth, you run a lexer over text input to get a stream of tokens; but you don't run a parser (or at least not much of one) over the tokens to get an AST. Instead, each lexeme—each "word" in Forth—self-describes as either a literal or a symbol representing a function call. All words the lexer encounters are immediately "run" using the runtime. Structured programming (like defining functions and then later calling them) is enabled by having the runtime itself be a finite-state machine, that can be put in different states by the execution of certain words, such that all words that are executed in the new state are executed to different effect (e.g. the word `[` will make all words up to a matching `]` have the execution semantics of pushing their symbolic representation onto the stack, sort of like a Lisp quote; the `]` itself then captures what's on the stack and builds a function, sort of like the `defmodule` macro does in Elixir.)
Analogously, in Japanese, most words are literals, that just push themselves onto the stack; while there are two types of "active" words: grammatical particles and verbs. Each particle is a construction function, which attempts to match and consume a certain shape of existing words on the stack, pushing back a tagged structure in its place; and then verbs consume a particular set of tagged structures from the stack (varying by verb), in any order, leaving on the stack anything they didn't expect, and pushing back a representation of the structured meaning of the sentence-as-a-whole up to that point.
I think this is well-represented by showing how Japanese does quotation:
To transliterate and rewrite this in a Forth-like grammar, with grammatical particles as lower-case keyword symbols, verbs as upper-case bare symbols, and everything else as quoted literal words:
"Boss" :subject [ "I" :subject "him" :referent DISLIKE NEGATION ] :cons SAID.
Note that, in this particular case, the brackets are sugar: the sentence would have the same semantics without them. (They're helpful to visually disambiguate where you should mentally backtrack to, but it's clear from the fact that there are two :subject-particles that the inner verb DISLIKE is only going to capture the last-pushed one, leaving the previously-pushed one on the stack for SAID to later consume.)
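A small sketch of that stack discipline, run on the transliterated sentence above. Everything here is an assumption made for illustration: the function name `run`, the `VERB_ROLES` table saying which tagged arguments each verb consumes, and the tuple encoding of stack items. Brackets are skipped, matching the claim that they are sugar.

```python
# Hypothetical table: which role-tagged items each verb takes off the stack.
VERB_ROLES = {"DISLIKE": ("subject", "referent"), "SAID": ("subject", "cons")}

def run(tokens):
    """Evaluate a Forth-like token stream; returns the final stack."""
    stack = []
    for tok in tokens:
        if tok in ("[", "]"):
            continue  # brackets are visual sugar only
        if tok.startswith(":"):
            # Particle: consume the top item and push it back tagged with a role.
            stack.append((tok[1:], stack.pop()))
        elif tok in VERB_ROLES:
            # Verb: for each role it expects, take the most recently pushed match
            # and leave everything else on the stack.
            args = {}
            for role in VERB_ROLES[tok]:
                for idx in range(len(stack) - 1, -1, -1):
                    if stack[idx][0] == role:
                        args[role] = stack.pop(idx)[1]
                        break
            stack.append(("clause", (tok, args)))
        elif tok == "NEGATION":
            # Wrap whatever clause is on top of the stack.
            stack.append(("clause", ("NOT", stack.pop())))
        else:
            stack.append(("word", tok))  # literal: pushes itself
    return stack

sentence = ["Boss", ":subject", "[", "I", ":subject", "him", ":referent",
            "DISLIKE", "NEGATION", "]", ":cons", "SAID"]
print(run(sentence))
```

DISLIKE picks up the "I" subject because it searches from the top of the stack, so the "Boss" subject stays put until SAID runs, and the final stack holds a single SAID clause whose quoted argument is the negated DISLIKE clause.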
One could also describe this as what a shift-reduce parser does internally, but with the reduction edges triggered as explicit command-words in the input lexeme stream, rather than being triggered implicitly by non-conflicting pattern-matches.
And this is also, as it happens, the core of any pure-functional abstract-machine bytecode ISA (e.g. Erlang's BEAM bytecode ISA.) You've got ops that push literals; ops that take patterns of stuff off the stack and push back new product-types containing those same things; and ops for calls to (maybe-built-in) functions named by a pushed symbol term.
I'll just point out two things odd with your sentence example though:
- は vs. が is a difficult topic, but I think 上司が would be more appropriate
- you're supposed to use keigo when talking about a superior, not only when talking to them. So it should be 仰った(おっしゃった) rather than 言った.