Numbering should start at zero (1982) (utexas.edu)
107 points by checkyoursudo 33 days ago | 295 comments



  Numbering should start at π (2025) (umars.edu)

Seriously, it all depends on whether u're counting the items themselves (1-based) or the spaces btwn them (0-based). The former uses natural numbers, while the latter uses non-negative integers

For instance, when dealing with memory words, do u address the word itself or its starting location (the first byte)?

The same consideration applies to coordinate systems: r u positioning urself at the center of a pixel or at the pixel's origin?
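
To make the items-vs-gaps picture concrete, here's a throwaway Python sketch (names are made up, purely for illustration): with 0-based labels on the gaps/boundaries, item i occupies the half-open span [i, i+1), and "how many items" is just a difference of boundary labels.

  # boundaries are 0-based; item i sits between boundary i and boundary i+1
  def item_span(i):
      return (i, i + 1)

  lo, hi = 2, 5                  # the items 2, 3 and 4
  assert hi - lo == 3            # count of items = difference of boundaries, no +1/-1
  assert item_span(0) == (0, 1)  # the first item starts right at the origin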

Almost 2 decades ago I designed a hierarchical ID system (similar to OIDs[1]). First I made it 0-based for each index in the path. After a while I understood that I also needed to be able to represent invalid/dummy IDs, so I used -1 for that. It was ugly - so I made it 1-based, & any 0 index would make the entire ID invalid

---

1. https://en.wikipedia.org/wiki/Object_identifier


Why do you write “you” as “u” here? I don’t want to be rude, but it’s very jarring to see juvenile txt-speak in a serious discussion.


I want to be rude.

The mix of "The same consideration applies to coordinate systems:" and "r u positioning urself..." in the same message is ridiculous to me and I can't take anyone seriously who speaks like this.

It's not illegal or immoral, but it is at the very least, demonstrably distracting from their own actual substantive point. Here we all are talking about that when they otherwise had a perfectly good observation to talk about.

Everyone is jumping on your use of the word "serious" as though you meant serious like cancer instead of just not texting ya bro bout where u at tonite, as though there are only two levels of formal and informal, the two extremes.

If this is so casual and we are txting, then why did they use punctuation and no emojis? Ending sentences with periods is verbal abuse now, so, I guess we're not txting after all? Pick a lane!


People taking such offense to something so absolutely inconsequential, on an internet forum no less, is ridiculous to me and I can't take anyone seriously who gets worked up about it.

You, and parent poster, understood them fine. You, and the parent poster, are the ones who are steering the conversation in the direction of how they typed, not what they typed.

They had a "perfectly good observation to talk about", yet u decided not to talk about it.


It's at least as valid, in fact more so, to say that I was distracted by something they decided to say.

There is no objective way to assign all blame for the tangent to just us or just them; however, the closest you can come is to say that whoever speaks first is more responsible for their unprompted speech than responders are for their reactions. They chose their reactions, but those are not reactions in a vacuum: they are reactions to something that came from someone else.


Both reactors and reactees have equal opportunity not to speak.

A reactor who reacts simply to nitpick provides much less to a conversation than a reactee with interesting thoughts expressed unusually.


ur both rite


I do not know about native speakers, but for me as a non-native speaker, such informal language is much harder to read.


I'm a native English speaker and whether it's harder to read depends on what you mean by "harder". It's immediately obvious what it means (so it's not "harder" in that sense), but on the other hand, it's jarring and distracts me from the point being made.


Stuff like this could be an innocent attempt to obfuscate one's stylometric fingerprints. I just read around it like mistakes from ESL commenters.


agreed and good to remember, thanks


>a serious discussion

hardly. this is an informal internet forum of mostly anonymous people chit chatting about random stuff instead of working.


“serious discussion” notwithstanding, I thought the “u’re” was particularly interesting. Why not “ur”? “u” is obviously easier to write, so why worry about properly contracting it? Just interesting imo


Abbreviations can only be useful when they're unambiguous within their context. Maybe “u’re” is unambiguous here, but “re” might have been a step too far to still know what word the writer was abbreviating.


> Maybe “u’re” is unambiguous here, but “re” might have been a step too far to still know what word the writer was abbreviating.

The question was about contracting “u’re” to “ur”, as is commonly done in text-ese, not to “re”.


It wasn’t clear to me what “u’re” was and I wouldn’t have understood without the comment. I’m a native English speaker.


Wasn’t the guy who wrote “all correct” as “oll korrekt”[1] a native [American] English speaker also?

—-

1. https://en.m.wikipedia.org/wiki/OK


Not to contradict you, but fascinating coincidence: my favorite alleged originator of OK — "Old Kinderhook," Martin Van Buren — is to date the one and only U.S. president who did not speak English as his first language.

(But "oll korrect" is apparently attributed to Andrew Jackson, who was a native speaker, yes.)


Not sure if it's the case here but over the decades I've observed there's a cultural aspect here. When I've had colleagues from the Middle East and/or South Asia I've found they're much more likely to use `u` and `r` colloquially than western counterparts.

This could be what you're observing. Or perhaps they just like the aesthetics.


I fully agree, youmanwizard.


touché ;)


it’s not that serious


U are u man wizard. It's probably slightly jarring to see u r instead of you're, but I did not even notice it at first glance.

Like the sibling comment says, we're all semi anonymous internet folks talking about mostly not serious things.


maybe on a mobile?


I've never understood the universal acceptance of poor writing on mobile; everyone immediately throws their hands up and goes ahhh okay that makes sense then.

If only smartphones had some means of seeing the typed output... a screen perhaps? Icing on the cake would be some kind of backspace button, which together would enable proofreading! You know, like in other forms of written communication.

Basically everyone will insist it's entirely to be blamed on the phone, and we're expected to believe that no, really, the moment they sit at a physical keyboard they reliably distinguish your from you're etc.


u must be fun at parties


I dare u to correct sama’s tweets ;)


It’s stupid when he does it too.


Who?


Sam Altman: https://x.com/sama


I knew who it was, sorry, I was just trolling. Both twitter and AI-bro culture are extremely distasteful to me so I thought citing an AI bro’s twitter account as an example of how it’s acceptable to behave online was absurd.


Ah. I took it to be less "sama did it, so it's OK," and more "you wouldn't gatekeep sama's English, so why gatekeep the English of a random HN commenter?" But maybe that was the wrong interpretation.


I have started using “u” in my posts as there are a surprising number of sites that block or delay for approval posts that use “you”. They seem to feel that posts with “you” are too often inflammatory. It is very frustrating when you use grammatical “you” as I just did and suddenly your post is stuck in limbo.


What sites? I have never heard of anything like that.


Reddit does it


[citation needed] I've never seen that claim before. I can imagine that some subreddits do something silly like this but Reddit itself?


Sorry you’re right. Some subreddits do, they display a little prompt explaining it before you reply.


It definitely does not


> Seriously, it all depends on whether u're counting the items themselves (1-based) or the spaces btwn them

I've never heard the distinction stated this way. It's clarifying.


Zero is a natural number. It is in the axioms of Peano arithmetic, and any other definition is just teachers choosing a taxonomy that best fits their lesson.


> Zero is a natural number. It is in the axioms of Peano arithmetic, and any other definition is just teachers choosing a taxonomy that best fits their lesson.

It is, but it need not be. In the category of pointed sets with endofunctor, (Z_{\ge 1}, 1, ++) and (Z_{\ge 0}, 0, ++) are isomorphic (to each other, to (Z_{\ge 937}, 937, ++), and to any number of other absurd models), so either would do equally well as a model of Peano arithmetic.
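
(A concrete, if rather trivial, Python sketch of that isomorphism, with made-up names: the shift map phi sends base point to base point and commutes with the successor, so nothing expressed purely in terms of the base point and successor can tell the two models apart.)

  succ = lambda n: n + 1      # the successor, the same map on both carriers
  phi = lambda n: n + 1       # the isomorphism Z_{>=0} -> Z_{>=1}

  assert phi(0) == 1          # base point goes to base point
  assert all(phi(succ(n)) == succ(phi(n)) for n in range(1000))   # phi commutes with succ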


I may be misunderstanding your argument, but if it's that of a simple offset, then only the one starting from 0 forms a monoid (a group without an inverse to each element). Though, of course, you could redefine the + operation...


> I may be misunderstanding your argument, but if it's that of a simple offset, then only the one starting from 0 forms a monoid (a group without an inverse to each element). Though, of course, you could redefine the + operation...

Yes, agreed, there is other algebraic structure that can tell the difference, but Peano arithmetic by itself cannot.


I think I’m missing something here. PA defines x * 0 = 0 for all x. So while we could take (Z+, 1, ++) as a model of it, we would be imposing a completely different definition of multiplication than the usual. Would this not be simply choosing to label 1 as 0 and work from there?


> I think I’m missing something here. PA defines x * 0 = 0 for all x. So while we could take (Z+, 1, ++) as a model of it, we would be imposing a completely different definition of multiplication than the usual. Would this not be simply choosing to label 1 as 0 and work from there?

Despite the name, in the usual mathematical meaning of the term, Peano arithmetic does not define arithmetic at all, only the successor operation, and everything else is built from there. Once we have those, for the model (Z_{\ge 0}, 0, ++), we certainly usually do define x*0 = 0 for all x; and, you're right, if for the model (Z_{\ge 1}, 1, ++) we defined x*1 = 1 for all x (as no-one could stop us from doing), then we'd just be dealing with "0 by another name." But it might be equally sensible, if our model of Peano arithmetic is (Z_{\ge 1}, 1, ++), to define x*1 = x for all x, in which case we recover the expected arithmetic.
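
(To spell that last option out with a throwaway Python sketch, function names mine: take the carrier {1, 2, 3, ...} with 1 as the distinguished point, define + and * by recursion on the successor with x*1 = x as the base case, and the expected arithmetic falls out. The "y - 1" below is just the predecessor on this particular carrier.)

  def add(x, y):                        # x + 1 = succ(x); x + succ(y) = succ(x + y)
      return x + 1 if y == 1 else add(x, y - 1) + 1

  def mul(x, y):                        # x * 1 = x; x * succ(y) = (x * y) + x
      return x if y == 1 else add(mul(x, y - 1), x)

  assert add(2, 5) == 7 and mul(3, 4) == 12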


Sorry, but that’s incorrect. Multiplication is defined in the Peano axioms, in terms of S(x).

2 of the axioms are:

1. For all x, x*0 = 0

2. For all x, y: x*S(y) = x*y + x


In the usual terminology, these are not axioms; as your wording itself says, they are definitions. (Indeed, I'd argue that it's almost ungrammatical to say something is "defined in the axioms"; axioms may, and probably must, be stated in terms of definitions, but the definitions are not themselves axioms.) As I say, one can quibble about terminology, since what's important is less what's axiom and what's definition, and more what we can build on top of both; but the usual mathematical presentation separates out the axioms (numbered 1–9 at https://en.wikipedia.org/wiki/Peano_axioms#Historical_second..., though things like 2–5 wouldn't usually be stated as an axiom of the theory but rather of the ambient logic) from the definitions (see https://en.wikipedia.org/wiki/Peano_axioms#Defining_arithmet...).

(Now having written that and looking back, I see that, in my previous post https://news.ycombinator.com/item?id=43442074, I wrote "Despite the name, in the usual mathematical meaning of the term, Peano arithmetic does not define arithmetic at all, only the successor operation, and everything else is built from there." Perhaps this infelicitous-to-the-point-of-wrong wording of mine is the source of our difference? I meant to say that Peano arithmetic does not axiomatize arithmetic at all, but that arithmetic can be defined from the axioms. Thus the specific definition x*[pt] = [pt] is eminently sensible if we consider the distinguished point [pt] to be playing the usual role of 0; but the definition x*[pt] = x is also sensible if we consider it to be playing the usual role of 1, and even things like x*[pt] = x + x + x + x + x can be tolerated if we think of [pt] as standing for 5, say. The axioms cannot distinguish among these options, because the axioms say nothing about multiplication.)


No, they are axioms. Peano arithmetic itself is a first-order theory, and a theory is just a recursively enumerable set of axioms.

Enderton, “A Mathematical Introduction to Logic, 2nd Ed.”, pp. 203, 269-270

Kleene, “Mathematical Logic”, p.206

EDIT: It seems like you're talking about Peano's original historical formulation of arithmetic? That's all well and good but it is categorically not what is meant by "Peano Arithmetic" in any modern context. I've provided two citations, from editions pretty far apart in time, of common logic texts (well, "Mathematical Logic" is a bit of a weird book, but Kleene is certainly an authority) and I hope that demonstrates this.

There's a lot of reasons that the theory is pretty much always discussed as a first-order theory. The biggest, of course, is that when taken as a first-order theory it fits neatly into the proof and statement of Gödel's Incompleteness Theorems, but iiuc it's just generally much less useful in a model-theoretic context to take it as a second-order theory (to the point where I only ever saw this discussed as a historical note, not as a mathematical one).

EDIT 2: This is all a digression anyway. Both first- and second-order PA label the start of the Z-chain as 0; so any model of PA contains 0 when interpreted as a model of PA.


I'm away from my library, but fortunately the books you referenced are a Google away, so I could consult them and confirm that they say what you say. I'm not quite willing to accept Kleene's word as an authority on common modern mathematical practice, since he was a theoretical computer scientist before the term existed, but, though I'm not familiar with Enderton's book, it certainly looks like a reasonably standard one.

But these are all referring to Peano arithmetic as a model of the theory of the natural numbers. And that seems a bit silly: the impact of Peano's work wasn't because he showed that there was a model of the theory of the natural numbers, which everybody believed if they bothered to think about it, but because he showed that all you needed to make such a model was a successor operation satisfying certain axioms. Yes, they may be less model-theoretically congenial because they're second order, but to change Peano's work from what he did historically and still call it Peano's seems strange to me. (I'm fine with dressing it up in modern language, and calling it an initial object in the category of pointed sets with endofunctor, which perhaps is biased but still seems to me to be capturing the essential idea.)

Certainly I was taught the second-order approach, though it was as an undergraduate; I've never taken a model-theory class. As I say, I'm away from my library and so can't consult any other sources to see if they still teach it this way, and anyway I am a representation theorist rather than a logician; but, if the common logical approach these days really is to discard Peano's historical theory and to call by Peano's name something that isn't his work, even if it is more convenient to use, then I think that's a shame from the point of view of appreciating the novelty and ingenuity of his ideas. But just because I think something is a shame doesn't mean it's not true, and so far you've produced evidence for your view and I can't for mine, so I can't argue any further.


It’s not really considered throwing away Peano’s work. Peano was working in the very infancy of logical formalism.

As it turns out, further work building on his ideas found that using a recursively enumerable schema for induction rather than a second-order induction axiom gives rise to a simpler abstraction that still has all the properties that Peano actually desired, and which makes further developments in the space much easier.

Continuing to call it Peano Arithmetic is respect for the fact that the guy got it mostly right, and it took the mathematics community many more years to refine the ideas to their current point.

Is it a shame that Galois theory isn’t presented as a historical fossil and frozen to its state of development in Galois’s lifetime? I may be making a rather big assumption, but I like to think he would be proud, and so would Peano.


> Continuing to call it Peano Arithmetic is respect for the fact that the guy got it mostly right, and it took the mathematics community many more years to refine the ideas to their current point.

> Is it a shame that Galois theory isn’t presented as a historical fossil and frozen to its state of development in Galois’s lifetime? I may be making a rather big assumption, but I like to think he would be proud, and so would Peano.

Oh, by no means do I object to calling an updated and generalised version by the name of the person who originated the subject! Since you've brought up Galois, I hardly think that he'd recognize the modern theory of Galois connections, but I think that the name is wholly appropriate.

No, what I thought was a shame is if the original theory doesn't get discussed at all. If my only exposure to Peano's work was, for example, the axiom schema in Enderton, then I don't think I'd be able to appreciate why it's such a big deal. That would feel to me like teaching the theory of Galois connections without ever saying anything about field theory! Whereas, on the other hand, I did immediately understand as an undergraduate the magic of being able to define everything in terms of the pointed set using induction, and I think I'd appreciate even more having seen that first and then seeing how it is updated for modern mathematical logic.

In fact, at a casual glance, I still don't see why L1, L3, and the A, M, and E axioms can't be omitted in the presence of the axiom(s) on p. 269, which has been the whole substance of my objection. I believe that there's an answer, but, if I don't see it as a professional mathematician (though not a logician), then surely it can't be true that every undergraduate will appreciate it!


Second addendum, chapter 4 is about second order logic and apparently I just forgot that exercise 1 is simply showing that you get all of the structure built up in Chapter 3 with Peano's original formulation in second-order logic. Seems that here I'm the one suffering from a lack of historical context!

I think from a logic standpoint this also makes sense -- getting to undecidability quickly makes taking the direct route through first-order logic more appealing.

If I'm being honest, I now do feel a little bit deprived, I probably would have enjoyed the categorical view when I was learning this too.


Oh, then yeah, I totally agree. In general I think it's a shame that so little emphasis is placed on the history of mathematics, though at the same time I appreciate that most of my peers just didn't care :(


> EDIT 2: This is all a digression anyway. Both first- and second-order PA label the start of the Z-chain as 0; so any model of PA contains 0 when interpreted as a model of PA.

Ah, good point that this was the actual source of the discussion. This one at least can be argued, because the question is about how things should be axiomatized/defined, not how they are. And certainly the theory of the "natural numbers starting with 1" can be axiomatised just as well as the "natural numbers starting with 0." All these axioms are made by humans, and an appeal to existing axioms here can only say what's been done, not what should be. (And I say this as someone who does start my naturals at 0.)


There is no consensus on that, and it's not just about teachers. It depends on the mathematical field and tradition. It usually starts at 1 in German, at 0 in French due to the influence of Bourbaki, and in English I think it's more field-dependent.

The original formulation of Peano started at 1.


+1 for peano arithmetic club.

I never realized it was controversial. I think I've always included 0 in the nat numbers since learning to count.

But there are some programming books I've read, I want to say The Little Typer or similar, that say "natural number" or "zero". Which actually confuses me.


IMO zero represents an absence of quantity and doesn't appear in Nature, so it cannot be classified as a Natural number

Just like negative numbers, it's a higher-level abstraction or a model, not a direct observation from the Nature

Likewise, the digit "0" originating from the Hindu-Arabic numeral system[1] is merely a notation, not a number

---

1. https://en.wikipedia.org/wiki/Hindu%E2%80%93Arabic_numeral_s...


> zero represents an absence of quantity and doesn't appear in Nature

From one point of view, zero never appearing in nature is exactly an example of it appearing in nature!

From another point of view, do you not think a prairie dog has ever asked another prairie dog, "how many foxes are out there now?" with the other looking and replying "None! All clear!"? Crows can count to at least 5, and will count down until there are zero humans in a silo before returning to it. Zero influences animal behavior!

From a third point of view, humans are natural, so everything we do appears in nature.

From a fourth point of view, all models are wrong, but some models are useful. Is it more useful to put zero in the natural numbers or not? That is: if we exclude zero from the natural numbers, do we just force 90% of occurrences of the term to be "non-negative integers" instead?


> From another point of view, do you not think a prairie dog has ever asked another prairie dog, "how many foxes are out there now?" with the other looking and replying "None! All clear!"?

  type PrairieDogFoxCount = NoFoxesAllClear | SomeFoxes 1..5 | TooManyFoxes

  type CrowCount = Some 1..5 | UpsideDown 5..1

  type HumanProgrammerCount = 0..MAXINT

  type HumanMathematicianCount = 0..∞
My point is: "No Foxes - All Clear" is not the same thing (the same level of abstraction) as 0.

> From a third point of view, humans are natural, so everything we do appears in nature.

using this definition everything is Natural, including for example Complex numbers, which is obviously incorrect, and thus invalidates yr argument

> From a fourth point of view, all models are wrong, but some models are useful. Is it more useful to put zero in the natural numbers or not? That is: if we exclude zero from the natural numbers, do we just force 90% of occurrences of the term to be "non-negative integers" instead?

all models are wrong, but some are really wrong

If all u care about is the length of the terms, i.e. "Natural" vs "non-negative integers", then what's wrong with 1-letter set names, like N, W, Z?

I think the usefulness of including 0 into the set of natural numbers is that it closes the holes in various math theories like [1,2]

1. https://en.wikipedia.org/wiki/Peano_axioms

2. https://en.wikipedia.org/wiki/Set-theoretic_definition_of_na...


> using this definition everything is Natural, including fore example Complex numbers, which is obviously incorrect

No, that's not "obviously incorrect", nor does it invalidate my argument: that is my exact argument. Complex numbers appear in electromagnetism, in exactly the same sense of "appear", as whole numbers appear in herds of sheep. Which is to say, it's the simplest and most useful model of the situation. And what's more natural than one of the four fundamental forces of nature? And the weak & strong nuclear forces have even more esoteric math structures appearing in their most parsimonious models as well.

> "No Foxes - All Clear" is not the same thing (the same level of abstraction) as 0.

In your model. In my model, it is the same thing. All models are wrong; some models are useful. Which one is more useful? Almost always, the one with 0 as a natural number. What about this:

    type PrairieDogFoxCount = NoFoxesAllClear | JustOneFox | ACoupleOfFoxes | SeveralFoxes 3..5 | ManyFoxes
I can make any model as complex as I want; that does not prove some other model wrong.


> using this definition everything is Natural, including for example Complex numbers, which is obviously incorrect, and thus invalidates yr argument

Except you’re wrong here; should we thus call your argument “obviously incorrect”?

Complex numbers are natural; they’re fundamental in quantum mechanics. Ever since Schrödinger’s equation fundamentally required them for time evolution of states, physicists (and philosophers) wondered if they could be removed. Recent experiments say “no.” QED and QFTs are the most precise theories known in all of science.

https://physicsworld.com/a/complex-numbers-are-essential-in-...


No doubt they're useful to explain some theories, but did anyone ever observe in Nature 6-4i apples or -7+3i particles?


They’re more than useful: they’re required, hence the research demonstrating it that I linked. That you don’t understand it doesn’t mean the rest of science is at your level of ignorance on the topic.

Your repeated, willful ignorance on a topic, especially when shown to you, is why you have such low understanding of the incorrect claims you make.

Take a moment and learn. Then maybe you’ll not repeat claims shown to be wrong.


I never agreed or disputed whether they're required or not, what I asked was whether they (complex numbers) were ever observed in Nature


Yes, they have been observed with the same rigor as any number you claim has been observed. If you’d spend a moment and read instead of repeated willful ignorance, you’d learn something.

Now go do your homework. Attaching idiot phrases like complex apples is as stupid as claiming we don’t see radio waves so they can’t exist or that matter cannot be mostly empty space because you can stack books.

Your limited imagination, understanding, and unwillingness to learn, even when given a source and phrases to look into, doesn’t apply to those scientists that have done the work.


> Yes, they have been observed

Any references?


A symbol being arbitrary doesn't influence the reality of the meaning behind a thing. I've always thought about `zero` while counting, it never was about `0`.

I observe zero.

I don't think zero is an absence of quantity. I don't think zero is the null set.

You can write types in a programming language, but there are other type theory books that do include zero in the natural numbers. And type theory comes from number/set theory. So it's ok if you decide to exclude it, but this is just as arbitrary.

In fact I'd be happy to write `>=0` or `>0` or `=0` any day instead of mangling the idea of zero representing 0 and zero representing something like `None`, `null` or any other tag of that sort. I don't think the natural world has anything like "nothing"; it just has logical fallacies.


> I don't think zero is the null set.

zero is the cardinality of the empty set

> I observe zero.

it cannot be observed directly at any static point in time, but it can be observed as a dynamic process when some quantity goes down to empty and back up over time

> In fact I'd be happy to write `>=0` or `>0` or `=0` any day instead of mangling the idea of zero representing 0 and zero representing something like `None`, `null` or any other tag of that sort. I don't think the natural world has anything like "nothing" it just has logical fallacies.

N, W, R, etc. - r just well-known names for sets of numbers, nothing stops us from defining better or additional names for them (with self-describing names)

We can discuss Empty type[1] vs Unit type[2], but I think it goes off-topic

---

1. https://en.wikipedia.org/wiki/Empty_type

2. https://en.wikipedia.org/wiki/Unit_type


      Numbering should start at π (2025) (umars.edu)
It's funny because pi is the joke compromise between 0 and tau (the circumference of the unit circle and the period length of the trigonometric functions).


I think we can pretend now that anyone talking about pi is just being sarcastic? It is such a random and useless number, a perfect candidate for a funny meme


> It is such a random and useless number

Random ? Useless ?


Yes to both!

If you haven't yet, read the tau manifesto: https://www.tauday.com/tau-manifesto


Actually, the duality arises from counting (there can be 0 items) and ordering (there is only a 1st item), conceptually. Which is why the year 2000 can and cannot be the start of the 3rd millennium, for instance.


Dates and times are prime examples of modular systems that make the most sense when they start at 0, but most commonly start at 1. Think how stupid it is that the day starts at 12 o'clock and then goes back to 1 o'clock; at least 24-hour clocks do away with this absurdity.

My personal take is that we should not let one short-sighted decision 1500 years ago mess us up: the first century covers years 0 to 99, and the 21st century 2000 to 2099.

https://madeofrome.com/how-was-it-made/how-did-we-choose-the...

I have a database where I tried keeping once-per-year periodic events (like birthdays or holidays) as Postgres intervals, i.e. number_of_months:number_of_days_in_month past the start of the year, or effectively a 0-based month-day. This looks weird but solved so many calculation problems that I kept it and just convert to the more normal 1-based form on display. And a shout-out to Postgres intervals: they do a heroic job at tracking months, a thankless job to program I am sure.
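
(The stored-vs-displayed conversion is just a pair of ±1s at the edges; a minimal Python sketch of the idea, names invented and nothing Postgres-specific:)

  def to_stored(month, day):         # 1-based calendar month/day -> 0-based offset past Jan 1
      return (month - 1, day - 1)    # (whole months, days into the month)

  def to_display(months, days):      # back to the familiar 1-based month/day
      return (months + 1, days + 1)

  assert to_stored(3, 14) == (2, 13)     # March 14 -> "2 months 13 days" past the start of year
  assert to_display(2, 13) == (3, 14)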


Fun fact: the words Ordinal and Cardinal respectively derive from Ordering and Counting.

So Ordinal quantities represent the ordering of coordinates and can be negative, while Cardinal quantities describe the counting of magnitudes and must be positive.

You can transform between ordinal domains and cardinal domains via the logarithm/exponential function, where cardinal domains have a well-defined “absolute zero” while ordinal domains are translation-invariant.


I don't follow the distinction you're making. The number line is ordered and contains a 0....

The GP's explanation seems more fitting for the year 2000 ambiguity. Are you measuring completed years (celebrate millennium on NYE 2001) or are years the things happening between 0 and 1, 1 and 2, etc (celebrate on 2000, because we're already in the 2000th "gap")?


Order as in "first, second, third", whereas counting is "none, one, two".

This is not a formal distinction, it is a conceptual one for things (not mathematical models).


Thought about this a bit more... Not sure if this is what you're saying, but the concept of "space between" I alluded to seems to arise naturally whenever you have ordered items, and vice versa. Because once you have order you have the concepts "greater than"/"less than", and once you have that you have a border between your items, and your items are between those borders. This connects back to Dijkstra's consideration of <, <=, etc....


Yep, same idea.


Can you elaborate? What duality, and how?


That makes sense for the ID system or a database, but for arrays in a language I still prefer starting at 0. It makes frame buffers easier to index


I prefer thinking from the first principles, and not according to the current computer architecture fashion.

And BTW that ID system was used in the system processing PBs of data in realtime per day back in the early 2000s, so it’s not that it was super-inefficient.


Well, EWD gave a solid argument from first principles, which you choose to ignore.


> The former uses natural numbers, while the latter uses non-negative integers

In fact “natural numbers” is ambiguous, as it can either include zero or exclude it, depending on who uses it.

See https://en.m.wikipedia.org/wiki/Natural_number


The center of the debate is that outside of pure mathematics numbers and number systems can only be signifiers for some physical or conceptual object. It is the signified object that determines the meaning of the number and the semantics of the mathematics.


I totally disagree, but it's only my opinion and probably not scientific at all.

From a logical point of view I think it's totally unnatural to start at 1. You have 10 different "chars" available in the decimal system. Starting at 1 mostly leads to counting up to 10. But 10 is already the next "iteration". How do you explain to a kid who's learning arithmetic that the decimal system is based on 10 numbers while at the same time you always refer to this list as 1 to 10?


I think it's totally natural to start counting at 1, because you start with one of something, not zero. How do you explain to a kid that although they're counting objects, the first one is labelled zero, and that when they've counted 10 objects, they use the number 9?


The decimal system is simply a notation chosen by Humans to communicate and preserve information; it's not a Law of Nature.

Also, you use circular reasoning.


The reminds me of the pain I felt seeing how the blacktop was painted at the local elementary school:

They had a 6 x 6 grid with 26 letters, then the digits 1-9, then an extra X to fill in the space left over. :facepalm:


"Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration." — Stan Kelly-Bootle


Python supports a third way: start at -1. And if you think about it a little (but not too much) then there's some real appeal to it in C. If you allocate an array of length n and store its length and rewrite the pointer with *(a += n) = n, then a[0] is the length, a[-1] is the first element (etc) and you free(a - a[0]) when you're done. As a nice side effect, certain tracing garbage collectors will never touch arrays stored in this manner.

Upshot: if you take the above seriously (proposed by Kelly Boothby), the median proposed by Kelly-Bootle returns to the only sensible choice: zero.


He was right. If the first fencepost is centered at x=0 and the second at x=1, and you want to give the rail in-between some identifier that corresponds to its position (as opposed to giving it a UUID or calling it "Sam" or something), 0.5 makes perfect sense.

In computer programming we often only need the position of the gap to the left, though, so calling it "the rail that starts at x=0" works. Calling it "the rail that ends at x=1" is alright, I guess, if that's what you really want, but leads to more minus ones when you have to sum collections of things.


I can't find a reference, but I have a vague memory that in original Mac OS X, 1-pixel-width lines drawn at integer locations would be blurred by antialiasing because they were "between" pixels, but lines drawn at e.g. x = 35.5 were sharp, single-pixel lines. Can anyone confirm/refute this?


Not sure about old Mac OS, but I think HTML canvas works that way.


First person to get a postgraduate degree in CS. I still miss his satirical writing.


Perhaps ideally we'd change English to count the "first" entry in a sequence as the "zeroth" item, but the path dependency and the effort required to do that is rather large to say the least.

At least we're not stuck with the Roman "inclusive counting" system that included one extra number in ranges* so that e.g. weeks have "8" days and Sunday is two days before Monday since Monday is itself included in the count.

* https://en.wikipedia.org/wiki/Counting#Inclusive_counting


> At least we're not stuck with the Roman "inclusive counting" system that included one extra number in ranges* so that e.g. weeks have "8" days

French (and likely other Latin languages?) are not quite so lucky. "En 8" means in a week, "une quinzaine" (from 15) means two weeks...


Hmm, "en 8" makes sense to me in that you're using it to reference the next Whateverday that is at least 8 days apart from now.

If we're on a Tuesday, and I say we're meeting Wednesday in eight, that Wednesday is indeed 8 days away.

Now I'm fascinated by this explanation, which covers the use of 15 as well. I'd always thought of it as an approximation for a half month, which is roughly 15 days, but also two weeks.

To partially answer the other Latin languages, Portuguese also uses "quinze dias" (fifteen days) to mean two weeks. But I don't think there is an equivalent of the "en huit". We'd use "na quarta-feira seguinte" which is equivalent to "le mercredi suivant".


It is definitely my engineer myopia, but octaves in music should be called dozens


Music only settled on 12 equal tones after a lot of music theory and a lot of compromise. Early instruments often picked a scale and stuck with it, and even if they could produce different scales, early music stuck to a single scale without accidentals for long stretches. Many of these only had 5 or 6 notes, but at the time and place these names were settling down, 7-note scales were common, so we have the 8th note being the doubling of the 1st.

Most beginners still start out thinking in one scale at a time (except perhaps Guitar, which sorta has its own system that's more efficient for playing basic rock). So thinking about music as having 7 notes over a base "tonic" note, plus some allowed modifications to those notes, is still a very useful model.

The problem is that these names percolated down to the intervals. It is silly that a "second" is an interval of length 1. One octave is an 8th, but two octaves is a 15th. Very annoying. However, it still makes sense to number them based on the scale, rather than half-steps: every scale contains one of every interval over the tonic, and you have a few choices, like "minor thirds vs. major thirds" (or what should be "minor seconds vs. major seconds"). It's a lot less obvious that you should* only include either a "fourth" (minor 3rd) or a "fifth" (major 3rd), but not both. I think we got here because we started by referring to notes by where they appear in the scale ("the third note"), and only later started thinking more in terms of intervals, and we wanted "a third over the tonic" to be the same as the third note in the scale. In this case it would have been nice if both started at zero, but that would have been amazing foresight from early music theorists.

* Of course you can do whatever you want -- if it sounds good, do it. But most of the point of these terms (and music theory in general) is communicating with other musicians. Musicians think in scales because not doing so generally just does not sound good. If your song uses a scale that includes both the minor and major third, that's an unusual choice, and unusual choices requiring unusual syntax is a good thing, as it highlights it to other musicians.


I think we should call them doubles


Sure, and we can call major sixths "one-point-six-eight-one-seven-nine-threes". Or "one-point-six-sevens" if you're a bit flat.


A fifth of a decade you say?


> At least we're not stuck with the Roman "inclusive counting" system that included one extra number in ranges* so that e.g. weeks have "8" days and Sunday is two days before Monday since Monday is itself included in the count.

Yes, we are. C gives pointers one past the end of an array meaningful semantics. That's in the standard. You can compare them and operate on them but not de-reference them.

Amusingly, you're not allowed to go one off the end at the beginning of a C or C++ array. (Although Numerical Recipes in C did it to mimic FORTRAN indices.) So reverse iterators in C++ are not like forward iterators. They're off by 1.

[1] https://devblogs.microsoft.com/oldnewthing/20211112-00/?p=10...


> C gives pointers one past the end of an array meaningful semantics

Nelson Elhage wrote about an alternate interpretation: https://blog.nelhage.com/2015/08/indices-point-between-eleme...


Note that 'first' and 'second' are not etymologically related to one or two, but to 'foremost'. Therefore, it would make sense to use this sequence of ordinals:

first, second, twoth, third, fourth, ...

or shortened:

0st, 1nd, 2th, 3th, 4th ...


In terms of another thread the item is the "rail" between the "fence posts". The address of the 'first' item starts at 0, but it isn't complete until you've reached the 1.

Where is the first item? Slot 0. How much space does one item take up* (ignoring administrative overheads)? The first and only item takes up 1 space.


Who knew those bodybuilders were such history buffs?


A day needs to finish before you count the next one. It makes perfect sense.


The 1980s were not a particularly enlightened time for programming language design; and Dijkstra's opinions seem to carry extra weight mainly because his name has a certain shock and awe factor.

It isn't usual for me to agree with the mathematical convention for notations, but the 1st element of a sequence being denoted with a "1" just seems obviously superior. I'm sure there is a culture that counts their first finger as 0 and I expect they're mocked mercilessly for it by all their neighbours. I've been programming for too long to appreciate it myself, but always assumed it traces back to memory offsets in an array rather than any principled stance because 0-counting sequences represents a crazy choice.


I've heard the statement "Let's just see if starting with 0 or 1 makes the equations and explanations prettier" quite a few times. For example, a sequence <x, f(x), f(f(x)), ...> is easier to look at if a_0 has f applied 0 times, a_1 has f applied 1 time, and so on.
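
A quick sketch of that convenience in Python (the helper is made up):

  def iterate(f, x, n):                 # returns [x, f(x), f(f(x)), ..., f applied n times]
      seq = [x]
      for _ in range(n):
          seq.append(f(seq[-1]))
      return seq

  a = iterate(lambda v: 2 * v, 3, 4)
  assert a[0] == 3 and a[2] == 12       # a[k] is f applied exactly k times to x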


0-based indexing aligns better with how memory actually works, and is therefore more performant, all things being equal.

Assuming `a` is the address of the beginning of the array, the 0-based indexing on the left is equivalent to the memory access on the right (I'm using C syntax here):

  a[0] == *(a + 0)
  a[1] == *(a + 1)
  a[2] == *(a + 2)
  ...
  a[i] == *(a + i)
For 1-based indexing:

  a[1] == *(a + 1 - 1)
  a[2] == *(a + 2 - 1)
  a[3] == *(a + 3 - 1)
  ...
  a[i] == *(a + i - 1)
This extra "-1" costs some performance (through it can be optimized-away in some cases).


But then again Fortran preceded C, is known for being very performant, and is 1-based by default.


The comment you are replying to essentially said exactly that:

> but always assumed it traces back to memory offsets in an array rather than any principled stance because 0-counting sequences represents a crazy choice.


> Dijkstra's opinions seem to carry extra weight mainly because his name has a certain shock and awe factor

So you claim this is just an appeal to authority and as a rebuttal you give appeal to emotion without being an authority at all?

> the 1st element of a sequence being denoted with a "1" just seems obviously superior

> I'm sure there is a culture that counts their first finger as 0 and I expect they're mocked mercilessly for it by all their neighbours

> 0-counting sequences represents a crazy choice

5G chess move.


> The 1980s were not a particularly enlightened time for programming language design; and Dijkstra's opinions seem to carry extra weight mainly because his name has a certain shock and awe factor.

Zero based indexing had nothing to do with Dijkstra's opinion but the practical realities of hardware, memory addressing and assembly programming.

> I'm sure there is a culture that counts their first finger as 0

Not a one, because zero as a concept was discovered many millennia after humans began counting.


For math too, 0-based indexing is superior. When taking sub-matrices (blocks), with 1-based indexing you have to deal with + 1 and - 1 terms for the element indices. E.g. the third size-4 block of a 16x16 matrix begins at (3-1)*4+1 in 1-based indexing, at 2*4 in 0-based indexing (where the 2 is naturally the 0-indexed block index).

Also, the origin is at 0, not at 1. If you begin at 1, you've already moved some distance away from the origin at the start.
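
For instance, a small numpy sketch of that block example (assuming numpy; b is the 0-based block index):

  import numpy as np

  A = np.arange(16 * 16).reshape(16, 16)
  b = 2                                   # the "third" size-4 block, 0-based
  block = A[:, b * 4:(b + 1) * 4]         # starts at column 2*4 = 8; 1-based needs (3-1)*4+1
  assert block.shape == (16, 4)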


Just speaking anecdotally, I had the impression that math people prefer 1-based indexing. I've heard that Matlab is 1-based because it was written by math majors, rather than CS majors.


Indeed. I was going to point out that mathematicians choose the index based on whatever is convenient for their problem. It could begin at -3, 2, or whatever. I've never heard a mathematician complain that another mathematician is using the "wrong" index. That's something only programmers seem to do.


Yes but I think it might be just habit, and it's exactly in matlab that dealing with for loops over sub matrices is so annoying due to this


In mathematics, if it matters what index your matrix starts on then you're likely doing something wrong.

Besides, in the rare cases where it does matter you're free to pick whichever is convenient.


Zero-based counting works better with modular arithmetic. Like

  arr[(i++ % arr.length)] = foo;
Is certainly nicer than the equivalent in one-based subscripting

  arr[(i++ % arr.length) + 1] = foo;
(The above is actually wrong, which helps the idea)

I'll concede that it's not all that significant as a difference, but at least IMO it's nicer.

Also, one could argue that modular arithmetic and zero-based indexing make more sense for negative indexing.
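
For the record, the corrected 1-based form needs a -1 before the modulo and a +1 after, something like arr[((i - 1) % arr.length) + 1]; a quick Python check of that cycle (for a made-up length of 4):

  n = 4
  assert [((i - 1) % n) + 1 for i in range(1, 10)] == [1, 2, 3, 4, 1, 2, 3, 4, 1]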


But on the other hand,

  last := vector[vector-length(vector)];
is nicer than

  last := vector[vector-length(vector) - 1];
so in the end I'd say 'de gustibus non est disputandum' and people who prefer 0-based indexing can use most languages and I can dream of my own.


That's arguably one of the only downsides of zero-based, and can be handled easily with negative indexing. Basically all indexing arithmetic is easier with zero-based.


Using an array as a heap is also easier with 1-based indexing:

  base-element := some-vector[i];
  left-child := some-vector[i * 2];
  right-child := some-vector[i * 2 + 1];
where the root element is `some-vector[1]'.
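
For contrast, a quick Python sketch of the 0-based version, which picks up +1/+2 on the children and a -1 on the parent:

  def children(i):                  # root at index 0
      return 2 * i + 1, 2 * i + 2

  def parent(i):
      return (i - 1) // 2

  assert children(0) == (1, 2) and parent(2) == 0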


Yes, negative indexing as in e.g. Python (so basically "from the end") can be incredibly convenient and works seamlessly when indexes are 0-based.


Not quite seamlessly, unfortunately.

`l[:n]` gives you the first `n` elements of the list `l`. Ideally `l[-n:]` would give you the last `n` elements - but that doesn't work when `n` is zero.

I believe this is why C# introduced a special "index from end" operator, `^`, so you can refer to the end of the array as `^0`.
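
One workaround in plain Python, no special operator needed: index from the front instead, which behaves fine at n == 0 (toy example):

  l = ['a', 'b', 'c', 'd']
  assert l[len(l) - 0:] == []             # the last 0 elements
  assert l[len(l) - 2:] == ['c', 'd']     # the last 2 elements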


So you're saying a negative index value should work like a count of elements to return and not an index?

Then you couldn't do things like l[-4:-2] to get a range of elements, which seems slightly useful.


No - negative indexing is fine as it is. You just need to be careful about the special case of negative zero.


> Yes, negative indexing as in e.g. Python (so basically "from the end") can be incredibly convenient and works seamlessly when indexes are 0-based.

I'd claim 0-based indexing actually throws an annoying wrench in that. Consider for instance:

    for n in [3, 2, 1, 0]:
        start_window = arr[n: n+5]
        end_window = arr[-n-5: -n]
The start_window indexing works fine, but end_window fails when n=0 because -0 is just 0, the start of the array, instead of the end. We're effectively missing one "fence-post". It'd work perfectly fine with MatLab-style (1-based, inclusive ranges) indexing.


What culture would that be? Because I was under the impression that counting "nothing" generally makes little sense in most of the practical world.


Pretty much any algorithm that involves mul/div/mod operations on array indexes will naturally use 0-based indexes (i.e. if using 1-based indexes they will have to be converted to/from 0-based to make the math work).

To me this is a far more compelling argument for 0-based indexes than anything I've seen in favor of 1-based indexes.
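
A typical example, sketched in Python with made-up names: flattening a 2D index row-major with mul/divmod is clean with 0-based indices and sprouts ±1s with 1-based ones.

  COLS = 5

  def flatten(r, c):                # 0-based (row, column) -> 0-based flat index
      return r * COLS + c

  def unflatten(i):
      return divmod(i, COLS)        # back to (row, column)

  assert flatten(2, 3) == 13 and unflatten(13) == (2, 3)
  # the 1-based version would be (r - 1) * COLS + (c - 1) + 1, plus matching +1s on the way back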


Forget multiplication, even addition becomes simpler.


Both are fine, IMO. In a context where array indexing is pointer plus offset, zero indexing makes a lot of sense, but in a higher level language either is fine. I worked in Smalltalk for a while, which is one-indexed, and sometimes it made things easier and sometimes it was a bit inconvenient. It evens out in the end. Besides, in a high level language, manually futzing around with indexing is frequently a code smell; I feel you generally want to use higher level constructs in most cases.


I've always appreciated Ada's approach to arrays. You can create array types and specify both the type of the values and of the index. If zero based makes sense for your use, use that, if something else makes sense use that.

e.g.

  type Index is range 1 .. 5;
  type My_Int_Array is
     array (Index) of My_Int;
It made life pretty nice when working in SPARK if you defined suitable range types for indexes. The proof steps were generally much easier and frequently automatically handled.


Many BASIC dialects had this too, which could make some code a bit easier to read e.g.

    DIM X(5 TO 10) AS INTEGER
I recall in one program I made the array indices (-1 TO 0) so I could alternate indexing them with the NOT operator (in QuickBASIC there were only bitwise logical operators).


On the other hand, if you receive an unconstrained array argument (such as S : String, which is an array (Positive range <>) of Character underneath), you are expected to access its elements like this:

    S (S'First), S (S'First + 1), S (S'First + 2), …, S (S'Last)
If you write S (1) etc. instead, the code is less general and will only work for subarrays that start at the first element of the underlying array.

So effectively, indexing is zero-based for most code.


I think lower..higher index ranges for arrays were used in Algol-68, PL/1, and Pascal long before Ada

At least in standard Pascal arrays with different index ranges were of different incompatible types, so it was hard to write reusable code, like sort or binary search. The solution was either parameterized types or proprietary language extensions


Pascal has it too.


I found it devastating that there are no distinct agreed-upon words denoting zero- and one-based addressing. Initially I thought that the word "index" clearly denotes zero-base, and for one-base there is "order", "position", "rank" or some other word, but after rather painful and humiliating research I stood corrected. ("Index" is really used in both meanings, and without prior knowledge of the context, there is really no way to tell what base it refers to.)

So to be clear, we have to tediously specify "z e r o - b a s e d " or "o n e - b a s e d" every single time to avoid confusion. (Is there a chance for creating some new, terser consensus here?)


AFAIK "offset" (i.e. from the beginning of the array/file/etc) is commonly used to indicate a zero-based index.


I like this. I feel like "offset" hints at the reason for starting at 0. "How far do you have to offset your feet (from the beginning of whatever space we're talking about) before you're touching the thing in question?" If it's the first thing, you don't have to move at all, so zero offset


Of course you could also "offset your feet" until they're past the end of the last thing, and then you've counted the number of things. But the offset of the thing itself (as opposed to that of your feet) could be considered zero, assuming the natural position of the thing is for its left edge to be at the left edge of the space.

But maybe its natural position is to be centered at x=0 and it had to be moved by 0.5 for the left edges to line up, in which case see my other comment.

In any case, I think the argument over 0 or 1 or 0.5-based indexing can be resolved just by being clear about what it is you're counting.


I humbly submit '1ndex' and 'ind0x', or '1dx' and 'i0x'.


While most people ran screaming from programming assembly, aero142 thought it was a rather swell idea.


Frankly it's no worse than "mebibyte".


Pronounced "one-dex" and "in-dox"?


I always thought:

- Offset: 0-based

- Index: 1-based


That sounds reasonable at first, but humans are messy and so the distinction is not always clear.

For example, in music we use numbers for the notes of a scale (G is the 5th of C, because in the C major scale: C=1, D=2, E=3, F=4, G=5, A=6, B=7). The numbers are clearly indices for the notes of the scale.

But we often think of stuff like: what's the 3rd of the 5th -- that is, get the note at the 5th position (G in our example) and then get the 3rd of that note (the 3rd of G is B). But note that B is the 7th of C major, not the 8th you'd get from adding 5 and 3.

The problem, of course, is that we ended up using the numbers as offsets, even though they started as indices.
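
In rough Python terms (C major as a list, purely illustrative): the 1-based names each need a -1 to become usable offsets, and "adding" two names overshoots by one.

  scale = ['C', 'D', 'E', 'F', 'G', 'A', 'B']        # C major
  fifth_of_C = scale[5 - 1]                           # 'G'
  third_of_that = scale[(5 - 1) + (3 - 1)]            # 'B' -- the 7th of C, not the 5+3 = 8th
  assert (fifth_of_C, third_of_that) == ('G', 'B')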


Yeah, because the scale is an index (e.g. A=6) but the operations are offsets in that case. Because if you moved from G/5 to B/7 that is clearly 2.


Yes, that's the point: the numbers are used as both indices and offsets. To any musician or music student, moving from G to B is "going up a (major) third", even though it's obviously going up by 2 notes. The name of that offset ("interval" in music speak) is "third", even though it has a distance of 2 notes.

My point (more generally) is that even though it looks reasonable to make indices start from 1 and offsets from 0, in practice these things can get mixed together. It's not reasonable to get people to use two different numbers for what they see as the same thing (because their use got mixed).


Music doesn't count, it has its own numbering.


Index is ambiguous. I think "Offset" and "Itemized".


Index is like your finger, you count from 1, right?


I think that "zero-based" and "one-based" expressions are the distinct agreed-upon "words" and they are terse enough.

I can suggest "z8d" and "o7d" otherwise. (/jk)


I don't think "index" by itself should imply any starting value. After all many induces start at higher numbers and then you'd have to invent words for 2-based, 3-based and so on as well.


Maybe we can call it 0ndex and 1ndex. Really rolls off the tongue.


Zerodex and onedex?


I mean, I think "ordinal" and "zordinal" would work.


or Ordinal and Cardinal, meaning ordering and counting respectively


"offset" and "ordinal".


Ordinals in math also start at zero. It is just that common people accepted zero for cardinals but not (yet) for ordinals.


This article is one of my pet peeves. It always shows up in discussions as "proof" that 0 indexing is superior, but it hides under the carpet all the cases where it is not. For instance, backwards iteration needs a "-1" and breaks with unsigned ints.

    for (i=N-1; i>=0; i--)
I like the argument that 0-based is better for offsets and 1-based is better for indexes: https://hisham.hm/2021/01/18/again-on-0-based-vs-1-based-ind...


  for (unsigned i = N - 1; i < N; --i)  /* relies on unsigned wrap-around: when i is 0, --i wraps to UINT_MAX, which is >= N, ending the loop */


Always beware the word should. I agree with Dijkstra's logic in the context that he presents it, but there are other contexts where I don't think it applies.

Personally, I find that in compiler writing, which is the only programming I do these days, the only things I use indexes for are line numbers and character offsets into strings. Calling the first character the zeroth character is ridiculous to me, so I just store a leading 0 byte in all strings and then can use one based indexing with no performance hit. Alternatively, since I am the compiler writer, I could just internally store the pointer to the string - 1 to avoid the < 1 byte per string average overhead (I also machine word align strings so often the leading zero doesn't affect the string size in memory).

If you are often directly working with array indices, you are likely doing low level programming. It is worth asking if the task at hand requires that, or if you would be better off using higher level constructs and/or a higher level language. Low level details ideally should not leak into the application level.


Not only is this preference not restricted to compiler programming, but it's not even restricted to programming.

Try to count 4 seconds: if you start at 1, you messed up. Babies start at 0 years old. Etc.

I do agree it's a convention though. Months and years start at 1, but especially for years, only intervals are meaningful, so it doesn't really matter what zero is (even though christ is totally king)


Whenever I explain to someone when or why to use 0-indexing, I like to say:

Start from 0 if you are counting boundaries (fenceposts, memory addresses)

Start from 1 if you are counting spaces (pages in a book, ordinals)

Floors are a case where both make intuitive sense, which is maybe how we ended up with European vs American floor numbering.


That's a very confused way of thinking about it IMO. I say:

* Start from 0 if you are indexing. I.e. you are identifying an item or its position.

* Start from 1 if you are counting. I.e. you are saying how many items there are.

It doesn't matter what it is. I don't know why you think pages in a book are somehow different to memory addresses.


You know what, I like this much better... rule of thumb updated.


I like my numbering to start like my tape measure: at zero.


I like my numbering to start like my elevator panel: at zero.


American elevators are 1-indexed!


at the whitney art gallery (nyc), the ground floor is 1, but the basement is -1. there's no 0 in between

it bothers me way more than it should. i have to tell myself that the point of art is often to evoke emotions, and the rage i feel is included in that


In the Boole Library in Ireland (which has entrances on different floors) they use an algebraic (affine) system. There is a floor designated "Q" and then other floors are labelled relatively, "Q+2", "Q-1", etc.


A possible interpretation....

Using the insight from the top comment that "it all depends on whether u're counting the items themselves (1-based) or the spaces btwn them," the American way of numbering floors is based on counting actual floors (i.e., the things you stand on) -- and the one at earth level is one floor. If you go up a flight of stairs, there is a second floor to stand on, and so on.

For buildings that go underground, the "-" sign can now act as a signifier of being underground, and the counting works as normal. If you take the stairs down one level, you are on the first underground floor, -1.

Of course, you want to interpret it like a y-axis number line, where 0 is the earth, 1 is "1 floor unit" above the earth, -1 is "1 floor unit" below the earth, etc. This is the "space between" model.

Elegance aside, both can be viewed as logically consistent depending on your lens.


When my badge doesn't scan at work, that is great art


At the University of Arizona (or at least in most of the buildings there), the lowest floor of the building is always 1, even if it’s a basement. So the ground floor is often 2. Maddening.


In the '80s the ground floor of the EE building (Steele) at Caltech was labeled ⏚. Anyone here happen to know if it is still labeled that way?


I'd certainly hope any floor was ⏚'ed.


Depends how many basement levels there are.


G, 1, 2, 3... :D


mine starts at -1


Array elements are not a continuous measure. The level of water, for example, is.

The mark on your measure tape corresponds to the total sum/amount.

If you count from zero, the number of elements no longer corresponds to the size/length. So already here you deviate from your tape principle.

You have one whole element when your tape measure shows 1, not zero.


Funny. I like my numbering to start at one for the same reason.


1-based numbering is nonsense. How many years old are you when you’re born?

I notice almost all defenses of 1-based indexing are purely based on arbitrary cultural factors or historical conventions, e.g. “that’s how it’s done in math”, rather than logical arguments.


> How many years old are you when you’re born?

You have lived zero full years and are in the first year of your life. In most (but not all) countries the former is considered "your age".

That's consistent with both zero-based and one-based indexing. Both agree on cardinal numbers (an array [1, 2] has length 2), just not on ordinal numbers (whether the 1 in that array is the "first" or "zeroth" element).

> I notice almost all defenses of 1-based indexing are purely based on arbitrary cultural factors or historical conventions, e.g. “that’s how it’s done in math”, rather than logical arguments.

I think it's largely a matter of taste in either direction. But, I'd raise this challenge:

    arr = ['A', 'B', 'C', 'D', 'E', 'F']
    slice = arr[3:1:-1]
    print(slice)
If you're unfamiliar with Python (zero-based, half-open ranges exclusive of end), that's taking a slice from index 3 to index 1, backwards (step -1). How quickly can you intuit what it'll print?

Personally I feel like I have to go through a few steps of reasoning to reach the right answer - even despite having almost exclusively used languages with 0-based indexing. If Python were instead to use MATLAB-style indexing (one-based, inclusive ranges), I could immediately say ['C', 'B', 'A'].


> Both agree on cardinal numbers (an array [1, 2] has length 2), just not on ordinal numbers (whether the 1 in that array is the "first" or "zeroth" element).

I think whenever people say "zeroth" they speak in jest, and I doubt that there is any disagreement about the fact that the element without a predecessor (nowadays in programming most often assigned index 0) is the first element.

You used "first" in that sense naturally just in the sentence before without the slightest notion of ambiguity.

> You have lived zero full years and are in the first year of your life.

What people wrongly get riled up about is the fact that the ordinal (first) is not in sync with the cardinal (index 0), but it rarely is anyway. If you go to an administrative office and pull number 5357, no one assumes that there are 5356 people in the queue before them. You are still the 5th or 10th in line, even if your index is 5357.


> I think whenever people say "zeroth" they speak in jest and doubt that there is any disagreement on the fact that the element without predecessor (nowadays in programming most often assigned index 0) is the first element.

"Zeroth" sounds silly because English has generally settled on one-based indexing, and so typically you'd convert to one-based indexing when speaking out loud or writing something other than code (`users[3]` is the "4th user").

Maybe you could argue that the words "first"/"second", as just some letters/sounds, are not inherently one-based? But I feel that gets a bit dubious from "third" onwards where the connection to numbers is obvious, and with the ordinals frequently spelled as 1st/2nd/etc.

But, if you want (and if I'm understanding what you mean), you could replace my statement with: "whether the '1' in that array is the index-1 or index-0 element" - purpose was just to distinguish between ordinals and cardinals, in response to a comment that seemingly implied the cardinality of an empty collection would be 1 with one-based-indexing.


I don’t know python but I figured out immediately that it should print ['C', 'B']. Does it? If not, Python is just wrong.


No, it doesn't, it prints ['D', 'C'].

I agree that it should be ['C', 'B'] - the way that Python handles negative `step` values is wrong.


I don't think it's "wrong" per se - it's zero-based indexing and ranges are consistently inclusive of start, exclusive of end. But it breaks the "fence-post" mental model that some people use, and is less intuitive than MATLAB's indexing approach IMO.


> I don't think it's "wrong" per se

I do. I think it only makes sense for ranges to be inclusive at the lower end and exclusive at the higher end. A slice with a negative step `-n` should contain the same elements as the same slice as with step `n`, just reversed.


> How many years old are you when you’re born?

Consider the following from the Irish Constitution:

> 12.4.1° Gach saoránach ag a bhfuil cúig bliana tríochad slán, is intofa chun oifig an Uachtaráin é.

and the official translation to English:

> 12.4.1° Every citizen who has reached his thirty-fifth year of age is eligible for election to the office of President.

For those unfortunate few who do not understand Irish, the Irish version says "Every citizen who is at least thirty-five years old", whereas the English translation should in principle (arguably) allow a thirty-four-year-old, since you are in your thirty-fifth year from your thirty-fourth birthday.

Luckily the Irish version takes precedence legally. A constitutional amendment which would have lowered the minimum age to 21 and resolved the discrepancy was inexplicably rejected by the electorate in 2015.


> 1-based numbering is nonsense. How many years old are you when you’re born?

Typically 3 months shy of 1 year, so about 0.75.


The key word in "birthday" is "birth". Not conception.

For human cultural purposes, you are 0 days old at the time and date recorded for your birth. It goes on your birth certificate. If you find a culture that celebrates conceptiondays and issues conception certificates, let me know.


> For human cultural purposes, you are 0 days old at the time and date recorded for your birth

While that is the norm, there both are and have been cultures where you are 1 when you are born (as in you are living your first year of life).


The comment I’m replying to said nothing about “birthday”. They asked how many years old you are when you’re born, which is ambiguous.

You know some cultures don’t even celebrate birthdays, right? It’s not uncommon for people to not know their date of birth. My own grandmother, born in Chicago in the 1920s, didn’t know exactly what year she was born in.


When you're born. That is your birth day, because the person to whom you were born gave birth to you that day.

No culture I know of tracks the date of your conception as a measure of your age. If they measure it at all, the starting point is your birth.

You are zero days old on the day of your birth. Even if those cultures think you are "in the first year of your birth" for days 0-365 of your existence, they would accept that 1 day after your birth, you are 1 day old, not 9 months old.


So if you go into a bar and order a beer, you drink the zeroth beer before the first?


You would, if English were logical as opposed to being a randomly-evolved cultural practice. With computer languages we have the opportunity to fix the mistakes of the past and do better.


When I was doing actuarial work all ages were age next birthday. So you start at age 1!


Another idiot confusing a continuous measure (time) with counting discrete units.

If it helps: you have one whole array element only if one year has passed

What are you trying to do: do you want to know where each element starts, or do you want to measure the total sum/accumulated amount?


We taught our toddler to count from zero. Their kindergarten teacher was not amused.


The question arises when people get confused between a cut and a span. These are opposite concepts; they make up a continuum and define each other.

So, it depends on what you understand as "numbering". If it is about counting objects, the phrase "first object" refers to the existence of a non-zero number of objects. This shows why the first one can't be called zero, as zero is not equal to non-zero.

If the numbering is about continuous scale such as tape measure, then the graduations can start with zero. But still the first meter refers to one meter, not zero meters.

It looks silly when people number their book chapters beginning with zero. They have no clue whether the chapter refers to a span or a cut. Sure, they can call the top surface of their book cover zero, though. But they still can't number a page as zero.

The use of a zero index for memory locations comes from the possible magnetic states of an array of bits. Each such state is a kind of cut, not a span. It's like a graduation on the tape measure, or a milestone on the side of the road. So it can start with zero.

So, if you are counting markers or separators, with zero magnitude, you can start at zero. And when you count spans or things of non-zero magnitude, you start at one. If you count apples, start at one. If you count spaces between apples start at zero.


I see people bringing up arrays. An array index is represented by a number, and you can do math on it, but it's not a regular number for counting a sequence of items. It's a unique reference to a location in memory, and it's dangerous to treat an array index like it's just any old number.

Behold, the really stupid things you can do in Javascript:

  let myArr = [];
  let index = 0;
  myArr[--index] = 5;        // --index makes index -1; this creates a plain
                             // "-1" property on the array object, not an element

  console.log(myArr.length); // 0, because length only counts non-negative integer keys
  console.log(myArr[index]); // 5, reading the "-1" property back


An index is a specific kind of number, but so is a count.

Indexing should clearly start from 0. It leads to far more elegant code and lower risk of off-by-one mistakes.


Then why do the scientific computing languages start at 1? Fortran started at 1 before C was invented.


It's a misguided attempt to appeal to mathematicians.


1 or 0-based index...

I recently picked up Lua for a toy project, and I have to say that decades of training with 0-based indexes make it hard for me to write correct Lua code on the first try.

I suppose 1-based index is more logical, but decades of programming languages choosing 0-based index is hard to ignore.


> decades of programming languages choosing 0-based index is hard to ignore

Yes - many file formats also work with zero-based indices, not to mention the hardware itself.

One-based indexing is particularly problematic in Lua, because the language is designed to interoperate closely with C - so you're frequently switching between indexing schemes.


Interestingly, it also poses great challenges for LLMs. GPT-4 can translate Perl into Python almost flawlessly, but its Lua is full of off-by-one errors.


Haha, then my lua code is reaching LLM code quality...


This was the thing that turned me off from Lua unfortunately.


>I suppose 1-based index is more logical

I wouldn't say it's more logical. More intuitive perhaps.


It is not more intuitive.

It just matches the convention used in the language that one has learned as a child, so one is already familiar with it.

The association between ordinal numbers and cardinal numbers such that "first" corresponds to "one" has its origin in the custom of counting by uttering the string "one, two, three ..." while pointing in turn to each object of the sequence of objects that are counted.

A more rigorous way of counting is to point with the hand not to an object, but to the space between 2 successive objects, when it becomes more clear that the number that is spoken is the number of objects located on one side of the hand.

In this case, it becomes more obvious that the ordinal position of an object can be identified either by the number spoken when the counting hand was positioned at its right or by the number spoken when the counting hand was positioned at its left, i.e. either "0" or "1" may be chosen to correspond to "first".

Both choices are valid and they are mostly equivalent, similarly to the choice between little-endian and big-endian number representation. Nevertheless, just as little-endian has a few advantages for some operations and has therefore mostly replaced big-endian representations, the choice of "0" for "first" has a few advantages, and it is good that it has mostly replaced the "1 is first" convention.

For people who use only high-level programming languages, the differences between "0 is first" and "1 is first" are less visible, exactly like the differences between little-endian and big-endian. In both cases the differences are much more apparent for compiler writers or hardware implementers.

Besides "1 is first" vs. "0 is first" and little-endian vs. big-endian, there exists another similar choice, how to map the locations in a multi-dimensional array to memory addresses. There is the C array order and the Fortran array order (where elements of the same column are at successive addresses).

Exactly like "1 is first" and big-endian numbers match the conventions used in writing the European languages, the C array order also matches the convention used in the traditional printed mathematical literature.

However, exactly as in the other 2 cases, the convention opposite to that of traditional written texts, i.e. the Fortran array order, is the superior convention from the point of view of implementation efficiency. Unfortunately, because far fewer people are familiar with the implementation of linear algebra than with the simpler operations on numbers, one-dimensional arrays, or strings, they are not aware of the advantages and disadvantages of each choice, and the Fortran array order is used in only a minority of programming languages. (An example of why the Fortran order is better is the matrix-vector product, which must never be implemented with scalar products as misleadingly defined in textbooks, but with AXPY operations; these are done with sequential memory accesses when the Fortran order is used, but require strided memory accesses if the C order is used. There are workarounds when the C order is used, but with the Fortran order it is always simpler.)
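
A rough sketch of the stride difference in plain Python (sizes and indices invented purely for illustration):

    R, C = 3, 4   # a 3x4 matrix stored in a flat buffer of length 12

    def c_order(i, j):        # row-major: element (i, j) lives at flat position i*C + j
        return i * C + j

    def fortran_order(i, j):  # column-major: element (i, j) lives at flat position i + j*R
        return i + j * R

    # Walking down one column (fixed j = 1, varying i), as an AXPY over a column would:
    print([c_order(i, 1) for i in range(R)])        # [1, 5, 9]  -> stride of C = 4
    print([fortran_order(i, 1) for i in range(R)])  # [3, 4, 5]  -> stride of 1, sequential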


I have a similar experience with Python's negative indexing. In Python, you can access elements counting from the back by using negative numbers. But for this, they start at -1, not -0, which is inconsistent with the normal forward indexing that starts at 0. I guess it comes from reducing len(arr) - 1 to -1, but it's still kind of annoying to have two different indexing systems at work.


It makes more sense if you think of indices as pointing between elements:

   0   1   2   3   4
   -----------------
   | A | B | C | D |
   -----------------
  -4  -3  -2  -1  -0
Except, of course, -0 doesn't exist. AFAIK that's why C# chose to add a special "index from end" operator, `^`, instead of using negative indices.
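
A quick Python sanity check of that picture (just a sketch):

    arr = ['A', 'B', 'C', 'D']
    assert arr[-1] == arr[len(arr) - 1] == 'D'   # -k behaves like len(arr) - k
    assert arr[-4] == arr[0] == 'A'
    assert arr[-0] == 'A'                        # -0 == 0, so it can't mean "from the end"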


> Except, of course, -0 doesn't exist.

Not for integers on modern hardware. If only hardware used ones’ complement (https://en.wikipedia.org/wiki/Ones'_complement)… ;-)

Meanwhile, the workaround is to use -1 through -5 to index from the end of the array.


That makes sense; in the diagram of the 4-element array, 3 points at the same point as -1.


Your visualization makes sense if you always count them going from left to right. But with negative indexes you naturally count them going from right to left, from the last element backward to the first element. So -0 is the natural starting point, except it's -1 in Python.


> Your visualization makes sense if you always count them going from left to right.

Yes - personally, I do always count them left-to-right.

I don't think negative indices implies counting right-to-left. A negative step does, but I never use one because IMO it doesn't make sense to have an exclusive LHS and inclusive RHS.


-1 is correct, there is no -0. The math works correctly the Python way.


Indexing is not really math, it's a reference-system which is using numbers for convenience. You could use letters or even emojis to get the same result.


[closed, open) intervals on the number line are well-known in math, as are operations on them. -0 is not, nor supported by hardware.


Yes, but Python is not math. This is a syntax-feature we are talking about, math is here just a tool, not the purpose. And it's also not running on hardware, but multiple layers higher.


Python already has the best design (fewest tradeoffs) in this small area.

You're proposing to break the expectations of millions and break offset math while you're at it. Not very compelling if you ask me.


I'm proposing nothing. I'm pointing out a flaw, for me. Nobody will change this at this point. Nobody should change it at this point. There is no benefit in this, just harm.


Values take up space. When you manipulate a value or pass it around, it makes no sense to sometimes refer to the beginning of the value and sometimes to the end.

It makes a little more sense if you have an array of something other than plain integers. Let's say you have 2-tuples:

     0        1        2        3        4        5
     | (0, 1) | (1, 1) | (1, 2) | (2, 3) | (3, 5) |
    -5       -4       -3       -2       -1       -0
or perhaps a better display would be more memory-based, where a tuple is represented as a pair of bytes:

     0   1   2   3   4   5
     00010101010202030305
    -5  -4  -3  -2  -1  -0
Now -3 clearly refers to the beginning of an array element. At least I wouldn't expect -3 to refer to (1, 1), even though I'm mentally traversing right to left for negative indexes.

Or another way to think about it: arr[5] does not exist in the above example. It's the end of the array, and the end is exclusive. Negative indexes count from the end. -0, as a result, refers to the (unmodified) end, which is the nonexistent thing, same as arr[5].

And yet another way: think of positive indexes as going forward, negative as going back. Imagine a syntax arr[3][-2] where arr[3] gives you the subarray starting at offset 3. (In C or C++, this would be like (&arr[3])[-2] with an array type that supported negative indexes, which implies it tracks subarray length.) Where should you end up? Start with the simpler case of arr[3][-0] -- clearly that should be the same as arr[3], not arr[2], if you are "going back 0". And if you're starting out with 0-based indexes, then the "going forward"/"going back" interpretation is inescapable.

As a bonus, arr[-n] is the same as arr[arr.length() - n]. But that's just a lucky happenstance; I wouldn't argue that the semantics of negative indexes should depend on it. Well... one could argue that arr[arr.length()] is the (nonexistent, exclusive) end.


You can visualize it as wrapping around to the sequence's other side. That is, you start at element 0, and going backwards to -1 gets you to the other side (down to -len(seq), which returns you to element 0). Kind of like border wrapping in most modern Snake variants. Although this only applies to negative indexing.


I don't particularly like negative indexing, but if we assume that -0 represents the address of the first element past the end of the array (the logical "end" of the array's span in memory), then -1 is naturally the starting address of the last element in the array, as measured in its offset from the end.


I don't use Lua as much anymore, but there were a few years where I used Lua and C++ both daily and very quickly you can easily handle both zero and one-indexing, even while switching between languages frequently. As with most things it's just practice.


> I suppose 1-based index is more logical

Why?


It's interesting how most people find learning 0-based indexes confusing, but after a few years of programming, they don't even notice how odd it is.

How do you number things in real life? If you have two options, do you number them "option 0" and "option 1" or "option 1" and "option 2"? If you create a presentation with numbered bullet points, do you start numbering them from 0 or 1?

    1. This is my first point

    2. This is my second point
It would be odd to have your first point be point number 0 and your second point be point number 1, wouldn't it?

Outside of programming, even most programmers use zero-based indexing only when they're making a programming joke.

Zero-based indices create odd effects even in programming, although we don't really notice them anymore.

Consider:

    $array = ['a', 'b', 'c'];

    for ($i = 0; $i < 3; $i++) {
        echo $array[$i];
    }
$array has 3 entries, but your for-loop says you should stop iterating before you reach 3. This isn't really consistent with how we intuitively understand numbers to work outside of programming.


if you see how it's implemented in machine code (on most modern archs) you will see how it is logical. once you see that, you cannot unsee it.

people call these things an index, but really it is an offset from the base of the array, not an index into it. hence, base + 0 is the first entry, as the offset to that entry is 0. that's how it will work in generated machine code on all the machines i saw (i did not see all of them, obviously).

i think people struggle with these concepts because they never bother to see what's under the hood. a bit of an assumption on my part, of course; i can't see into people's minds, nor do i know all architectures. x86 and x64 (amd/intel) definitely work like this.


Yes, it's the offset from a pointer in some languages. But to the user, it's presented as an index, not an offset. In normal syntax, you don't access an array value using a construct like

*(arraypointer+offset)

If you did that, then using 0 as the start offset would make intuitive sense.

In fact, if not for this technical reason, I doubt that any programming language would have 0-based indices.


i'm not debating the user-level stuff. honestly i think there's a fair argument for not letting users get bogged down in system-level details. but in my opinion, if people don't understand this "why", it's a useful thing to learn. it's only a small detail in the end.

i think, because software is built up as abstractions over abstractions, it's unavoidable that such details creep into languages from the depths of the system. it's not too long ago that people didn't even have C or such high-level constructs. so coming from there and building up, it's logical. now, making something new, from above, one might put 1 as the index for the first element, but it's likely that during the writing of a language you'll end up in the depths and come up with 0 anyway.

think of how a 1-based index will work vs a 0-based index on pointers (pseudo code):

ptr += index * size_of_object

ptr += (index - 1) * size_of_object

if you wanna run that on the cpu, either the compiler needs to come up with the first one or it will have an additional sub or decrement.

or do you want the compiler to have the duty of doing the index - 1? it will result in the original code again, also possible. it will take extra compiler or interpreter time.

(or am i missing something completely?)
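
to make the arithmetic concrete, here's a rough Python simulation of the byte math (the struct module stands in for raw memory; the values are made up):

    import struct

    buf = struct.pack('<4i', 10, 20, 30, 40)   # four 32-bit ints laid out contiguously
    size = 4                                   # bytes per element

    # 0-based: byte offset = index * size
    assert struct.unpack_from('<i', buf, 2 * size)[0] == 30

    # 1-based: byte offset = (index - 1) * size, i.e. an extra subtraction somewhere
    assert struct.unpack_from('<i', buf, (3 - 1) * size)[0] == 30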


I don't disagree with anything you say. I'm not saying that C should have 1-based indices, for example. I'm only saying that from a language design perspective, ignoring technical limitations and historical precedents, 1-based indices would be preferable.


0 is logically superior regardless of how any of it is implemented under the hood. It just makes more sense.


You are just citing cultural factors like how we happen to do things in the English language, not any logical argument.


Are there any languages that call the first element the zeroth element in everyday speech? I can't think of one, Google can't come up with one, and neither can ChatGPT. This isn't just cultural; it's universal or almost universal.


Not zeroth but ground floor vs 1st floor in buildings is common.


That's a good example! Elevators sometimes have a 0 for the ground floor, but they often have an "E" in German-speaking countries or a "C" in English-speaking countries.

In this example, people also call the floor with index 1 the "first floor," although they don't call the ground floor the zeroth floor, as you say.

Since floors can go below ground, the first underground floor is floor -1, so everything works out. There's a floor for every number, unlike with 1-based floor numbering, where floors go from -1 to 1.


"C"? It's more likely to be "G" round here.

By the way, has anyone else had this problem: you're on floor 10, say, and you want to get to floor 15, so you run up 5 flights of stairs and try to find the room, but after a while you realise that somehow you've ended up on floor 16, so you think you're going demented and can't count, but after this has happened a few times you realise ... THE IDIOTS HAVE OMITTED FLOOR 13!

If there's one thing worse than 1-based indexing it's leaving-out-13-based indexing ... Has any programming language tried doing that with arrays, I wonder?


> It's more likely to be "G" round here

I see what you did there.

Somebody should create a programming language that implements all real-world number idiosyncrasies. Don't have 4 or 13; define π as 3 and τ as 6, print 6 three times every time it occurs for good luck (or bad luck, depending on where you live), replace all numbers close to 8 with 8 in money values for great prosperity, have a constant for "dozen" that's 13 in case you're counting loaves of bread, have a rounding function that rounds to a close number that's easy to say depending on your locale...

And let's go with 0.5-based indexing as a compromise.


Your arguments using "first" and "second" are invalid, because those words have nothing to do with numbers, just like "last".

If you argue based on English, you should point only to the ordinals "third", "fourth" and the like.

In the older European languages, the initial position in a sequence was named using words like "first", which mean "closest to the front". The next position was named with words meaning "the other", e.g. "alter" in Latin (the old European languages had 2 distinct words for "the other from two" and "another one from many", e.g. "alter/alius" in Latin).

However, when the need arose to name other positions in a sequence besides the first, the second, the last, or the next to last (penultimate), ordinals derived from numbers were invented, like third, fourth, etc.

In later Latin, the word meaning "the other" was replaced with a word meaning "the following". This was inherited in the Romance languages and borrowed into English as "second". In Old English too, "other" had been used for "second", before "second" was taken from French.


> Your arguments using "first" and "second" are invalid,

> because those words have nothing to do with numbers

"First" is the ordinal number corresponding to the number one, while "second" is the ordinal number corresponding to the number two. You can represent "first" as 1st and "second" as 2nd. I believe the origins of these two words do not significantly impact my argument.


Nothing in the word "first" indicates that it corresponds to "one". The same for "second". Those words are not numerals, "first" is a superlative adjective, while "second" is an active participle. They are perceived as ordinal numerals only because English does not have ordinal numerals for the 2 initial positions of a sequence and "first" and "second" are used instead of the missing ordinal numerals.

The abbreviations "1st" and "2nd" are very recent and they cannot be used as an argument that there is a long tradition of correspondence between "first" and "one".

Like I have said, the correct argument based on English is that the position after the second is called "third", which is derived from "three", and the next position is called "fourth" from "4", so extrapolating backwards that sequence results in decrementing "3", which gives a correspondence between "2" and "second", and decrementing the number once more gives a correspondence between "1" and "first".

This is the exact reasoning that has led to the abbreviations "1st" and "2nd", which have no relationship with the pronunciation or the meaning of the abbreviated words, which mean "closest to the front" and "the following", meanings that are unrelated to any numbers.


It doesn't matter whether there is a long tradition of correspondence between first and 1 and second and 2. What matters is that the correspondence exists today, because I'm making my argument today.


It's only weird in certain cultures. For example, it's common in Germany to start counting at zero.

"Null, eins, zwei.."


Das erste Element in dieser Aufzählung ist jedoch null, und das zweite ist eins. Deutsche verwenden einsbasierte Indizes, genau wie die Amerikaner, auch wenn sie manchmal bei null zu zählen beginnen.

(The first element in this list, however, is zero, and the second is one. Germans use one-based indices, just like Americans, even though they sometimes start counting from zero.)


Half-open ranges are the most intuitive reason for me:

[0,1,2) + [2,3,4) = [0,1,2,3,4)

meanwhile

[0,1,2] + [2,3,4] = [0,1,2,2,3,4] — this double counting is just ugly
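
Python's half-open range() makes the same point (a tiny sketch):

    assert list(range(0, 2)) + list(range(2, 4)) == [0, 1, 2, 3]        # seams line up cleanly
    assert list(range(0, 3)) + list(range(2, 5)) == [0, 1, 2, 2, 3, 4]  # closed-style ends double count the seam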


It is easy to miss that his argument boils down to the claim that zero-based is "nicer" in one specific case. The paper is written in the style of a mathematical proof but hinges on a completely subjective opinion.


>> when starting with subscript 1, the subscript range 1 ≤ i < N+1; starting with 0, however, gives the nicer range 0 ≤ i < N.

What about the range 0 < i ≤ N which starts with 1? Why only use ≤ on the lower end of the range? This zero-based vs one-based tends to come up in programming and mathematics, and both are used in both areas. Isn't it obvious that there is no universally correct way to index things?


I believe the main argument (from the OP) is that you have to specify the range with two bounds, and that it is common to want a 0 (assuming a 0-based indexing world), and so in order to refer to a range that includes index 0 you'll need to use a number that is not in the set of valid indexes to define the bound.

I would note that the argument is weakened when you look at the later bound, since you have the same problem there, it's just more subtle and less commonly encountered -- yet it routinely creates security bugs!

It's because we don't work with integers; we work with fixed-size intervals within the set of integers (usually a power-of-two number of consecutive integers). So `for (i = 0; i < 256; i++)` is just weird when you're using 8-bit integers: your upper bound is an inexpressible value, and could easily be compiled down to `for (i = 0; i < 0; i++)`, since 256 wraps to 0 in a `uint8_t`, e.g. if you did `uint8_t upper = 256; for (uint8_t i = 0; i < upper; i++)`. That case is simple, but it gets nastier when you are trying to properly check for overflow in advance and the actual upper value is computed. `if (n >= LIMIT) { return error; }` doesn't work if your LIMIT is based on the representable range. Nor does `if (n * elementSize >= LIMIT) { return error; }`. Even doing `limit = LIMIT / elementSize; if (n >= limit) { return error; }` requires doing the `LIMIT / elementSize` intermediate calculation in larger-width numbers. (In addition to the off-by-one if LIMIT is not evenly divisible by elementSize.)

So when dealing with overflow checks, 0 ≤ i ≤ N may be better. Well, a little better. `for (i = 0; i <= LIMIT; i++)` could easily be an infinite loop if LIMIT is the largest in-domain value. You want `i = 0; while (true) do { ...stuff...; if (i == LIMIT) break; i++; }` and at that point, you've lost all simple correspondence with mathematical ranges.
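
A quick simulation of that 8-bit behaviour in Python (masking with & 0xFF to mimic uint8_t; obviously not real C semantics, just an illustration):

    MASK = 0xFF

    upper = 256 & MASK                  # uint8_t upper = 256;  -> 0
    assert upper == 0                   # so `i < upper` is never true and the loop body never runs

    LIMIT = 0xFF                        # the largest in-domain value
    i, steps = 0, 0
    while i <= LIMIT and steps < 1000:  # `i <= LIMIT` never goes false once i wraps 255 -> 0
        i = (i + 1) & MASK
        steps += 1
    assert steps == 1000                # only the safety cap stopped it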

> Isn't it obvious that there is no universally correct way to index things?

I don't know about "obvious", but I agree that there is no universally correct way to index things.


I love the uncertain history of both 0 and 1: https://win-vector.com/2020/09/18/clearly-the-author-does-no...


love this :D

"The above has been triggered by a recent incident, when, in an emotional outburst, one of my mathematical colleagues at the University —not a computing scientist— accused a number of younger computing scientists of "pedantry" because —as they do by habit— they started numbering at zero. "

:')


In France, street-level is the 0th floor, and the one above is the first floor.

You see zero in elevators all the time.


Same in Germany, just that we usually call it ground floor instead of 0th floor.

You could argue it's a bit of a translation error. The French and German words for floor refer to ways of adding platforms above ground, whether by reference to walls, wooden columns, or floor joists. Over the course of language evolution those words have both broadened and specialized, coming to refer to building levels in general. But the way they are counted still reflects that they originally refer to levels built above ground. The English "floor", on the other hand, counts the number of levels that are ground-like, which naturally starts at the actual ground.


It’s the same in non-American Anglo countries as well. Zero-indexed with the zeroth floor being called “ground”.


In my (physics faculty) building 0 is the lowest floor. 2 is ground level.

I used to work in Civil Eng which started at either G or 0.


A good observation that explains everything!


Zero means nothing (not that it has no importance :-) but that it symbolises the void). So the symbol 0 could also be a single space or any other predetermined symbol. So, it is not a number and should not be used like one (pun intended).


Let's also add a link to the handwritten version for good taste:

https://www.cs.utexas.edu/~EWD/ewd08xx/EWD831.PDF


Perhaps we can extend this to everyday language? Taylor Swift had a number zero hit, one's company, two's a crowd, I won the race and came in at number zero, and so on?


Numbers are a joke. I count from A-Z and then Aa Ab Ac ...

I have Ar apples for sale.

Only $A.J each!


I appreciate Dijkstra's arguments, but the fact remains that no non-technical user is ever going to jibe with a zero-indexed system, no matter the technical merits.

Languages aimed at casual audiences (e.g. scripting languages like Lua) should maybe just provide two different ways of indexing into arrays: an `offset` method that's zero-indexed, and an `item` method that's one-indexed. Let users pick, in a way that's mildly less confusing than languages that let you override the behavior of the indexing operator (an operator which really doesn't particularly need to exist in a world where iterators are commonplace).
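
A minimal Python sketch of what that could look like (the `offset`/`item` names are just the ones proposed above; this isn't any real library):

    class Seq:
        def __init__(self, items):
            self._items = list(items)

        def offset(self, n):
            """Zero-indexed access: n items lie before the one returned."""
            return self._items[n]

        def item(self, n):
            """One-indexed access: returns the n-th item."""
            return self._items[n - 1]

    s = Seq(['a', 'b', 'c'])
    assert s.offset(0) == s.item(1) == 'a'
    assert s.offset(2) == s.item(3) == 'c'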


Dijkstra's objective was to make programming into an intellectually respectable, rigorous branch of mathematics, a motivation he mentions obliquely here. He was, generally speaking, opposed to non-technical programmers and languages aimed at casual audiences, such as BASIC and APL; they were at best irrelevant to his goal and at worst (I suspect, though he never said) threats to its credibility.


There is no good reason to cater to non-technical users when designing programming languages. In what way is Lua “aimed at casual audiences”?


A lot of Lua users are kids playing Luanti or Roblox or WoW who also spend a little time modding them—mostly editing textures or 3-D model meshes, but also scripting. Lua is a small and simple language which can be learned easily, prefers to produce incorrect answers instead of throwing exceptions when confronted with ambiguous situations (for example, permitting undeclared variables), has an interactive REPL, is memory-safe to avoid crashes, and uses dynamic typing (thus avoiding type declarations) and has garbage collection, as well as using 1-based indexing.

All of these design features seem to be helpful to casual programmers and are common in languages and programming environments designed for them, such as BASIC, Smalltalk, sh, Python, Tcl, and Microsoft Excel.

pansa2's comment https://news.ycombinator.com/item?id=43435736 also has a citation to Ierusalemschy, who said in https://old.reddit.com/r/lua/comments/w8wgqb/complete_interv...:

> And at that time, the only other option would be Tcl, "tickle.” But we figured out that Tcl was not easy for nonprogrammers to use. And Lua, since the beginning, was designed for technical people, but not professional programmers only. In the beginning, the typical users of Lua were civil engineers, geologists, people with some technical background, but not professional programmers. And "Tcl" was really, really difficult for a non-programmer. All those substitutions, all those notions, etc. So we decided to create a language because we actually needed it.

(Tcl, of course, was designed for chip designers.)


Okay, fair enough. I stand corrected.


I didn't intend to be correcting you. You weren't factually incorrect about anything; you just have different objectives than Lua's designers.


I mean, I was incorrect that Lua wasn't intended to be used by beginners/non-programmers.


From Roberto Ierusalimschy himself [0]:

"one of the design goals, was for Lua to be easy for nonprogrammers to use"

[0] https://www.reddit.com/r/lua/comments/w8wgqb/complete_interv...


'start counting on your zeroth finger' 'can I have none apples please'

- statements dreamed up by the utterly deranged.

They have played us for absolute fools.


Where I live, and maybe where you live, our ages are zero-based, although no one seems to like me calling their baby zero years old.


Well, "three months old" does sort of imply "zero years and three months".

But thank you for reminding me that I am zero centuries old. More decades old than I would like, but zero centuries.


Just a youngster!


If it's about collections of things, use collection iterators (`each`, etc.) and avoid the problem entirely.


Starting from zero saves memory. If I have a variable used as an index for an array of 256 elements, starting from 0 allows me to store it in a single byte. If I start from 1, I need two bytes, effectively doubling the memory usage—an unnecessary 100% increase. Now, multiply this inefficiency across every instance where programs encounter a similar situation.


> Starting from zero saves memory.

Computer memory.


Consider n real numbers a_0, ..., a_{n-1}. That's not very elegant.


Sure it is.

This discrepancy appears in physics too. It's common to use 1,2,3 for spatial indices, but when you reach enlightenment and think in terms of spacetime you add a zero index and not a four.


That's only because you insist on explicitly mentioning the last element, which you can only do when the sequence is finite and non-empty (more generally, when it is indexed by a successor ordinal). So your choice of notation is not only inelegant, it cannot even express all possible sequences.


as a software engineer I see this all day long haha

but good point: remembering academic linear algebra, seeing 0..n-1 in sigma/sum notation would not be convenient


Hard ask, considering there are effectively 2 Americas: only the scientific one uses scalable units like mg/g/kg and cm/m/km; everyone else uses randomized trash... ft, mile, yard, inch, pound...


I'll upvote this Dijkstra note every time it appears. :-)

It settles the discussion of array numbering. F*ck off Visual Basic, MS Javascript, and all the languages that said you should start with 1.


Matlab, Fortran, Julia, R, SAS, SPSS, Mathematica, and the whole field of mathematics. F*ck off all mathematicians, what do they know about counting?


Yeah, they didn't even manage to get the circle constant right.


Swift uses 0-based indexes, but offers, IMHO, a nice choice for specifying ranges,

  Array(1...5) == [1,2,3,4,5]

  Array(1..<5) == [1,2,3,4]


Explains why he didn't like APL...


He also hated Lisp


I don't know of evidence that he did. But Dijkstra left us a famous quote:

"LISP has jokingly been described as “the most intelligent way to misuse a computer”. I think that description a great compliment because it transmits the full flavour of liberation: it has assisted a number of our most gifted fellow humans in thinking previously impossible thoughts."

This is obviously a compliment; it even mentions that word.

Even a less positive remark than this would still be a resounding compliment from a computer scientist who said things such as that BASIC causes irreparable brain damage!

So count this as a piece of evidence that he liked Lisp.

Lisp emphasizes structured approaches, and from the start it has encouraged (though not required) techniques which avoid destructive manipulation. There is a lot in Lisp to appeal to someone with a mindset similar to Dijkstra's.


"I must confess that I was very slow on appreciating LISP’s merits. My first introduction was via a paper that defined the semantics of LISP in terms of LISP, I did not see how that could make sense, I rejected the paper and LISP with it."

https://www.cs.utexas.edu/~EWD/transcriptions/EWD12xx/EWD128...


Even McCarthy initially rejected the idea that the Lisp-in-Lisp specification could simply be translated into working code so that an interpreter pops out; at first he thought Steve Russell was misunderstanding something.


I know I'll get downvoted to Hell for this, but I have a mental list of traits poor programmers have, and one of them is "Excessively complains about 1-based indexing".



