
What part of speech is “the”? (2006) - synesso
http://itre.cis.upenn.edu/myl/languagelog/archives/002974.html
======
tedd4u
I'm surprised the author has run into so many people who don't seem to have learned
about articles. Isn't this one of the most basic topics in middle-school
grammar? I'm pretty sure "the" is one of the most common words in the language,
after all.

I was taught in 6th or 7th grade that "the" is the definite article, as
opposed to "a", which is an indefinite article.

For those who are fuzzy on the concept, this is the best online reference I
found: [https://learnenglish.britishcouncil.org/en/english-
grammar/d...](https://learnenglish.britishcouncil.org/en/english-
grammar/determiners-and-quantifiers/definite-article)

~~~
SilasX
As the article points out, it's not really "wrong" to say "the" is an
adjective: it functions exactly like one! It's only wrong in the sense of
violating one of two equally useful classification systems.

And this error doesn't even bubble up to the level of miscommunication, the
only real sense in which you can say someone's usage is incorrect! Remember,
in everyday use, you don't directly observe word classification. So unless
someone goes Archeresque and says "I saw the movie -- in which I of course
mean 'the' to be an adjective", there's no sense in which that mistake is
common either.

With that said, I think there is a substantive, common misuse of "the", but
it's not what the article describes. Rather, it's when people say, e.g., "I went
to the store", when the speaker and listener don't have a common reference
point that would justify "definiteness". I've always felt these should be "a
store", since there are so many they could be referring to. (It's fine when
there really is only one store they could be referring to.)

~~~
escherplex
Other modern languages such as Russian, Japanese, Chinese, and others get by
perfectly well without the qualifiers 'the' and 'a', which according to
Wikipedia are the first and sixth most frequently used words in English
respectively, so they don't seem cognitively compulsory for understanding. And
if you juxtapose British English with US English, 'the' has throw-away status in
assertions such as 'I am going to university' (Br) vs. 'I am going to the
university' (US), but not in 'I am going to the store', so you're also dealing
with local conventions. Plus, in common usage 'the' and 'a' can be used in
their technical sense, e.g., 'not a [any old] car but THE car', with suitable
prosody added for effect. With all the spins that can be imposed on the articles
'the' and 'a', trying to formalize their use with precise definitions would
seem like a lost cause. (And of course definitions use words which themselves
remain undefined, but please don't go there.)

~~~
cjhveal
It should be noted, however, that while many languages get by without
explicitly marking definiteness, there are other linguistic techniques that
can provide similar contextual clues that are required for understanding the
pragmatic meaning of an utterance. Topic-comment structures[1] are often used
to convey a sense of what is assumed as new versus known information. For
instance, Russian's relatively free word order allows it to rearrange the
constituents of the sentence so that the topic (what is being talked about,
usually familiar to the listener) precedes the comment (what is being said
about the topic). Japanese has this baked-in with particles that mark the
topic (see [2]).

[1]
[https://en.wikipedia.org/wiki/Topic_and_comment](https://en.wikipedia.org/wiki/Topic_and_comment)
[2]
[https://en.wikipedia.org/wiki/Japanese_grammar#Topic.2C_them...](https://en.wikipedia.org/wiki/Japanese_grammar#Topic.2C_theme.2C_and_subject:_.E3.81.AF_wa_and_.E3.81.8C_ga)

~~~
escherplex
Good observations. It's also interesting to look back 2K years at the kingpin
Eurasian languages for evidence of definite/indefinite articles. From right to
left, ancient Chinese (probably) lacked the qualifier but was exposed to its
existence via the Silk Road (which used the lingua franca Aramaic, which had
them) and also through exposure to the Greek Buddhist colonies in North India
(ancient Greek had a definite article). Same holds for Sanskrit in ancient
India, which lacks any definite article. Same for old Latin. And Greece was
subordinate to Rome, which only tolerated Greek as a language of the
intelligentsia. Curious how an appendage in ancient secondary tongues became
so mainstream in so many dominant contemporary languages.

~~~
cafard
In Homeric Greek, the definite article is more of a demonstrative adjective,
isn't it, more "that man" rather than "the man"? And Greek hung on as the
language of administration in the eastern part of the empire.

~~~
escherplex
True. The Wikipedia page on 'Article (grammar)' brings up the same points.
Meanwhile, back in Greece, the European Union, and the OXIclean vote (a stain
remover product in the US :) ...

------
alricb
These days, in Quebec, French (as a first language) is taught using a
more-or-less linguistically correct approach, with articles called determiners,
sentences divided into groups, and transformations used to understand
interrogative sentences. In my day we used a more traditional system, and the
time we spent doing grammar analysis was very tedious; we only approached the
subject in secondary school. These days they are taught to analyse sentences
in 3rd-4th grade and up.

------
bane
Part of the problem, of course, is the naive notion that you can fit words into
a single grammatical category, like species in a taxonomy, while word usage
ends up defying this simplification.

The real problem is that taxonomic classification is rarely correct in nature.

~~~
hyperpape
Reading this, I have no idea if you're saying it because you know how modern
syntactic theories work and how they're developed or if you're just saying it
because you thought it sounded good.

Don't leave us hanging!

~~~
bane
I'll be perfectly honest, I'm half-serious, half-drunk-posting.

But on simple inspection, the idea that a word can fit into a single category
is obviously wrong. So to generalize the idea, you have to assume that a word
can fit into multiple categories.

Biology is undergoing a similar transformation from simple "this animal looks
kinda like this one" to genomic classification, which is revolutionizing the
tree biologists use to organize living things.

There's not really any reason to think that words, a biological artifact, have been
correctly classified all along. Hell! A copy-editor neighbor and I have minor
quarrels over the capitalization of college subject names.

Trees are a specialization of graphs, and they're usually wrong. Just assume
graphs to start with and your data will organize much better, even if the
properties of the organizational metaphor aren't quite as nice.

~~~
hyperpape
Well, I share some of the instinctive scepticism about tidy theories of
syntax/semantics.

But I don't have any real reason to claim that they don't work. Syntactic
theories aim to explain why particular sentences are grammatical or
ungrammatical (in the judgments of native speakers). If you can accurately do
that, then yes, you've put each word into one category (as far as making up
grammatical sentences). So to say the kind of thing you're saying, you'd have
to explain why the current research programs in syntax are misguided.

------
kmicklas
Linguistics is the only field I can think of where the general public does not
take seriously the knowledge of the experts (linguists).

~~~
umanwizard
Climate science, evolutionary biology, ...

~~~
Cushman
I actually think this comparison is quite apt. Like those others, the science
of linguistics has become heavily politicized in the US, most notably
regarding the validity of AAVE as a first language.

Though I'd still argue linguistics has it worse. You don't really expect
people in educated society to get into heated arguments with the 10-day
weather forecast, or jump in to correct your "biology mistakes" while you're
eating...

~~~
umanwizard
Spot on. Disdain for Ebonics is one of the last remaining ways in which it's
considered acceptable in polite society to be blatantly racist.

------
sethjgore
"The deeper problem is the school tradition itself. It's a TRADITION, after
all"

Interesting point and I agree. Why are we identifying metalinguistics with
words anyway? Why not just use symbols so we can actually compare them across
languages?

Is the tradition so inherently rigid and stuck to the written/spoken languages,
rather than moving beyond words and using a system of symbols?

The tradition obfuscates the universality of grammar and semantics by adding
words atop words rather than simple symbolism.

Doesn't this make internalization harder? You don't teach math just by numbers
but also with symbols and real-world references.

As for internalization and understanding what 'the' is and where it belongs... There
has been astounding success in identification and understanding when
metalinguistics is entirely removed and replaced with simple symbols like
_grmmr_ ([http://green-bridge.org](http://green-bridge.org)). I've seen
results first-hand and it seems a non-word approach eases and solidifies
linguistic understanding far more profoundly than metalinguistics ever
can. Teachers and students alike have said this system has taken away the
cloud of uncertainty in language grammar and semantics.

If you are a linguist and you are interested in helping grmmr become a
standard in the studies of metalinguistics, please do let me know! I'm the
design partner of the company. We are eager to open source this knowledge into
a simple but reliable notation.

This system has been in use for over 20 years in classrooms, colleges, and
adult education programs. It has been redesigned from the bottom up in order to be
as accessible and consistent as you would expect from a grammar categorizing
system.

------
kazinator
Just because you can point to "the" and identify it as a complement or article
doesn't mean you understand it and you're doing linguistics!

See this recent submission:
[https://news.ycombinator.com/item?id=9836769](https://news.ycombinator.com/item?id=9836769)
[Richard Feynman on the difference between being able to name something, and
understanding.]

Native-English-speaking children already know how to correctly use articles,
without requiring a theory of what they are.

Teaching English learners that "the" is an article will not by itself result
in correct use; they will be able to name it as an article, yet continue to
use it in ways perceived as incorrect by native speakers, and do so for
decades to come, possibly.

It doesn't matter whether we call "the" a _determiner_, or whether we call it a
_blunx_. If we don't know where in a sentence a particular English _blunx_
must be used or else must be omitted, then we don't actually know what a
_blunx_ is! And if we don't know that, we probably don't even understand why
certain words are classified together as blunxes and others are excluded from
that category; we are just demonstrating rote memorization.

 _Without using the word "determiner" or "article", tell me where to use
'the', where to use 'a', and when not to use them._

------
Animats
This was resolved years ago in the linguistics community. There are open-class
words, which in English are nouns, verbs, adjectives, and adverbs. New open-
class words can easily be added to the language. There are closed-class words,
about a hundred of them, although opinions differ on this.[1] Some lists have
"the" as a closed-class word; others have it as an adverb.

Closed-class words are treated like keywords in programming languages. The
list seldom changes, and it's necessary to know the closed-class words, but
not the open-class words, to parse a sentence. (At least in formal English.)
While there are taxonomies of the closed-class words, modern thinking is that
they're all handled as special cases.

[1]
[https://en.wikiversity.org/wiki/English_vocabulary_list](https://en.wikiversity.org/wiki/English_vocabulary_list)
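
A rough Python sketch of the keyword analogy above. The word list, the category labels, and the `tag` helper are all made up for illustration and are far smaller than any real closed-class list; this is not how an actual parser works, only the "fixed lookup table vs. everything else" idea:

```python
# Hypothetical, deliberately tiny closed-class table (assumption: real lists
# run to roughly a hundred words and disagree on the category labels).
CLOSED_CLASS = {
    "the": "determiner",
    "a": "determiner",
    "of": "preposition",
    "on": "preposition",
    "and": "conjunction",
    "is": "auxiliary",
}


def tag(sentence):
    """Label closed-class words from the fixed table; mark the rest as OPEN."""
    tagged = []
    for token in sentence.lower().split():
        word = token.strip(".,!?")  # drop trailing punctuation
        tagged.append((word, CLOSED_CLASS.get(word, "OPEN")))
    return tagged


# Nonsense nouns still come out as OPEN, much as unknown identifiers
# are still recognised as identifiers around a language's keywords.
print(tag("The wug is on a zib."))
```

The point of the nonsense words is that the sentence skeleton is still recoverable from the closed-class words alone.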

~~~
philsnow
I would argue that "the" is a determiner, and the class of determiners is
mostly closed. A determiner is the root of a determiner phrase. A determiner
phrase can consist of only the root, like "the" or "a", but it can also be
more complex like "every other", "uncomfortably few", or "90% of".

(This is all from Chomsky's X-bar theory [0], which dates back to around 1970. I'm
not aware of another model of syntax that better captures reality, but I'd
love to hear about such.)

So the class of determiners is probably closed, but the class of determiner
phrases is not (because phrases like "fewer than five of", "fewer than six
of", etc., are "generative", i.e., you can generate as many as you like).
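
To make "generate as many as you like" concrete, here is a minimal Python sketch. The `determiner_phrases` generator is a hypothetical name, and it uses numerals rather than spelled-out numbers to keep things short; it only illustrates that one fixed rule yields an unbounded set of phrases:

```python
from itertools import count, islice


def determiner_phrases():
    """Yield an unbounded stream of phrases of the form 'fewer than N of'."""
    for n in count(2):  # 2, 3, 4, ... with no upper limit
        yield f"fewer than {n} of"


# Take only the first three; the generator itself never runs out.
first_three = list(islice(determiner_phrases(), 3))
print(first_three)
```

The rule is finite and fixed (closed), while the set of phrases it produces is open-ended.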

I could absolutely conceive of a situation where a certain determiner phrase
gets used so much in a community that an abbreviation becomes accepted as a
new determiner. Perhaps parliamentary procedure buffs, who (say) care about
a 2/3 majority just as much as a simple majority ("most"), come up with "tooths"
meaning "at least two thirds of", and it becomes accepted as jargon within
that community. Could totally see it happening. In fact, if anybody knows of
any such instances I'd love to hear about them.

[0]
[https://en.wikipedia.org/wiki/X-bar_theory](https://en.wikipedia.org/wiki/X-bar_theory)

~~~
unhammer
Well, you've got
[https://en.wikipedia.org/wiki/Lexical_functional_grammar](https://en.wikipedia.org/wiki/Lexical_functional_grammar)
and [https://en.wikipedia.org/wiki/Head-
driven_phrase_structure_g...](https://en.wikipedia.org/wiki/Head-
driven_phrase_structure_grammar) which are both model-theoretic grammars (see
[http://www.researchgate.net/publication/220718697_On_the_Dis...](http://www.researchgate.net/publication/220718697_On_the_Distinction_between_Model-
Theoretic_and_Generative-Enumerative_Syntactic_Frameworks) ). In short,
grammar rules are constraints that have to unify in order to describe a
grammatical utterance, instead of phrase-structure rules that generate
sentences.

And there are newer theories that are much more compatible with a
probabilistic / data-oriented outlook, even combining this with the model-
theoretic grammar: [http://www.nclt.dcu.ie/lfg-
dop/](http://www.nclt.dcu.ie/lfg-dop/)

People into semantics seem to like
[https://en.wikipedia.org/wiki/Construction_grammar](https://en.wikipedia.org/wiki/Construction_grammar)
which I never learnt much about. Oh and there's also the Meaning-Text theory,
Relational Grammar and various dependency-based frameworks, Optimality Theory
applied to syntax, … it goes on and on :)

------
tokenadult
The author of the submitted, um, article (blog post) didn't cover all the
ground that could be covered on this issue, because it was an off-hand post.
The previous comments here on Hacker News prompt me to bring up one more
issue: even if we follow tradition and call "the" an "article" (as I was
taught at some point in my schooling), we have the interesting situation that
some languages, even in the Indo-European language family, have no expressed
definite article at all. Latin didn't have one, and Russian doesn't have one.
Definite reference in Latin, in Russian, and in many non-Indo-European
languages (all the various Sinitic languages that are jointly called "Chinese"
immediately come to mind) is indicated by means other than a dedicated word
such as "the." Because languages can do perfectly well without words like
"the" and "a" as those words are used as articles in English, perhaps it is
not so shocking that modern grammarians prefer different category names for
those words.

My eighth grade English class was innovative in that it used a textbook based
on phrase-structure transformational grammar to teach me a lot of my English
grammar. I would be glad to see books like that (modernized based on further
linguistic research since the 1960s when the book was published) used in
classrooms today. The "traditional" grammar poorly taught in the United States
is based on an Indo-European grammatical tradition that is not completely
lousy for teaching native speakers of Latin how to read and write Greek, but
it has never been well suited for teaching analysis of English to native
speakers or foreign-language learners of English. English has many grammatical
features that are poorly described by the grammatical tradition of school
lessons in English-speaking countries.

For further reading on this point, see Steven Pinker's excellent new book _The
Sense of Style: The Thinking Person's Guide to Writing in the 21st
Century_.[1] For a better than average treatment of this point on Wikipedia,
see the article "English language,"[2] which was updated to "good article"
status during the most recent Wikipedia Core Contest, and is actually pretty
decent for a Wikipedia article, with lots of references to good-quality
reference books about the English language.

[1] [http://www.amazon.com/The-Sense-Style-Thinking-
Persons/dp/06...](http://www.amazon.com/The-Sense-Style-Thinking-
Persons/dp/0670025852)

[2]
[https://en.wikipedia.org/wiki/English_language](https://en.wikipedia.org/wiki/English_language)

------
Anderkent
>(If you have pronoun as a part of speech, that would be a very clever answer,
but you're going to have a lot of trouble convincing non-linguists of that.)

Can someone elaborate on that?

------
iGoPro_HD
I thought that I was the only one who was never taught this.

__Go public school system!__

~~~
emidln
You never covered articles?

~~~
iGoPro_HD
Not in the depth that it should have been.

~~~
ilitirit
I don't think there's much you can say about them at high school level. All we
were taught was that "the" is the definite article and that "a" is the
indefinite article.

------
marze
The purpose of "the" is to reset the vocal tract to a neutral configuration,
allowing the subsequent (and more arbitrary) word to be resolved more readily.

~~~
RaptorJ
Are there words that function the same way in Chinese, Latin or Greek?

