
Why numbering should start at zero (1982) - feynma
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html
======
randcraw
Dijkstra's argument is obsolete. Iterators have largely replaced the explicit
specification of integer ranges in loop control.

Also, the recent rise in programming with multidimensional data invites the
question of why matrix integer indices _should_ include zero, which Dijkstra's
essay doesn't address.

Likewise, the increasing use of real variables in software invites asking
whether an elegant syntax for specifying integer ranges applies equally well
when used for real ranges, since <= operators make less sense when
constraining floats. Also not addressed.

~~~
avian
> Iterators have largely replaced the explicit specification of integer ranges
> in loop control.

Yes, explicit indices are less common than they used to be due to iterator
support in languages. However they are far from being obsolete. I've been
recently going through a lot of code dealing with numerical calculations and
there you see hardly any iterator use.

Also, MATLAB code, which uses 1-based indices, is an absolute pain to deal
with compared to Python/numpy, for example.

~~~
tnecniv
Maybe I've been drinking the Kool-Aid for too long (and it's not a good
flavor), but the 1-based indices in MATLAB are not bad once you get used to
them. They're especially nice when translating mathematical papers that
normally index from 1.

~~~
goatlover
Yeah, I don't get the love for 0 base, unless you're doing C level stuff,
which is the real reason for 0 based indexing.

~~~
avian
I guess it depends on the field you're working with. I'm definitely more at
home with 0-based languages, although at this point I have worked a lot with
MATLAB/Octave code as well.

In my case, it is very common that i-th element in an array is somehow related
to say (x0 + i*dx). This fits naturally with 0-based indices.

With 1-based indices you tend to have (i-1) or (x0-dx) everywhere. It gets
extremely annoying when people get clever and rearrange this -1 in equations.
Then it is no longer clear what is there as a side effect of the indexing and
what is perhaps some constant in the numerical method.
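
A minimal Python sketch of the grid case above, for illustration only. The helper names and the sample values of `x0` and `dx` are made up; the point is just where the `- 1` shows up.

```python
def grid_point_zero_based(x0: float, dx: float, i: int) -> float:
    """Position of sample i when the first sample has index 0."""
    return x0 + i * dx


def grid_point_one_based(x0: float, dx: float, i: int) -> float:
    """Position of sample i when the first sample has index 1."""
    # This -1 comes purely from the indexing convention,
    # not from the numerical method itself.
    return x0 + (i - 1) * dx


x0, dx = 2.0, 0.5  # hypothetical grid origin and spacing
zero_based = [grid_point_zero_based(x0, dx, i) for i in range(4)]
one_based = [grid_point_one_based(x0, dx, i) for i in range(1, 5)]
```

Both lists describe the same four grid points; only the 1-based helper carries the extra correction term.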

------
wfo
Dijkstra is of course brilliant and I recognize much of his tone is tongue in
cheek and irreverent but sometimes I find his writing so nasty, pompous and
arrogant that it becomes irritating; claiming, after a series of opinions
about what he personally finds "nice" or "natural" that there is a "most
sensible" way of doing things and that everyone else is being ridiculous. A
tamer one of his offenses but still unpleasant.

I agree on the numbering but many people don't. He doesn't mention how nicely
it works with the modulo operator, which is I think the most compelling
reason. But having L[2] be the third index in an array is horribly
counterintuitive and leads to countless mistakes. CS students when they begin
do not like it, it is not natural. It is beaten into them by scary red error
messages and instructors over months and years until they finally accept it.

~~~
zAy0LfpBZLC8mAC
> But having L[2] be the third index in an array is horribly counterintuitive
> and leads to countless mistakes.

Really, that's just a confusion caused by language and historical accidents.

There is nothing that inherently links "one" to "first", or "two" to
"second"--it just so happens that for some weird reason, it's common in
English to abbreviate "first" as "1st", "second" as "2nd", and so on.

But just as "A" is the first letter of the alphabet (and not, say, the Ast),
"0" is the first cardinal number, and "first" is the first ordinal number.

And there is a very simple reason why zero is the correct default first array
index: Zero is the neutral element of addition, and (sub)array indexing is an
additive operation.

~~~
51Cards
I have to offer a counter view to this. Using letters of the alphabet as an
analogy isn't accurate to me. Each letter we accept as an element... so a true
analogy would be to put a Space as the first character of the alphabet, to
indicate the absence of a letter.

Zero indicates the absence of a value in our numbering system. If I have a
single apple, I have 1 Apple, not 0 Apples. Thus if I have 1 element in an
Array I have element[1], not element[0]. It is more natural for us to use
numbers not by their cardinal position in a set, but by their inferred
value... because they represent quantities.

I am one of those who feels the confusion around zero bounded indexes leads to
countless issues because it is counter-intuitive to how we use numbers in real
life. I agree it is beaten into developers that this is how it should be, but
to me we would have been further ahead to let developers accept what comes
intuitively.

~~~
zAy0LfpBZLC8mAC
> I have to offer a counter view to this. Using letters of the alphabet as an
> analogy isn't accurate to me. Each letter we accept as an element... so a
> true analogy would be to put a Space as the first character of the alphabet,
> to indicate the absence of a letter.

So, you would say that "A" is not the first element of the alphabet or that
"0" is not the first cardinal number?

> Zero indicates the absence of a value in our numbering system.

You are utterly confused. If I have an account balance of zero, that indicates
an absence of a value? How is zero mathematically any different than +1000 or
-1000 that justifies making it a special case?

> If I have a single apple, I have 1 Apple, not 0 Apples. Thus if I have 1
> element in an Array I have element[1], not element[0].

You are mixing up ordinal and cardinal numbers.

> It is more natural for us to use numbers not by their cardinal position in a
> set, but by their inferred value... because they represent quantities.

There is no such thing as a "cardinal position", positions are ordinal,
cardinality describes sizes.

> I am one of those who feels the confusion around zero bounded indexes leads
> to countless issues because it is counter-intuitive to how we use numbers in
> real life.

Nope, what leads to problems is that people don't understand the difference
between cardinal and ordinal numbers. You cannot fix that by changing the
first index, all that causes is a shift to different bugs. Including stuff
like equating zero with the absence of a value.

~~~
ivanhoe
> How is zero mathematically any different than +1000 or -1000 that justifies
> making it a special case?

Zero is actually always a special case in math, especially in algebra, the only
number where division by it is not defined.

> Nope, what leads to problems is that people don't understand the difference
> between cardinal and ordinal numbers.

Actually, by Von Neumann definition of ordinals, ordinal 0 represents an empty
set, thus it's equivalent of null, not single element set. So strictly
mathematically speaking for non-empty sets we should use ordinals starting at
1.

But using zero-based indices is just a matter of convention, so let's not make
a big philosophy out of it...

~~~
zAy0LfpBZLC8mAC
> Zero is actually always a special case in math, especially in algebra, the
> only number where division by it is not defined.

And what does any of this have to do with division?

> Actually, by Von Neumann definition of ordinals, ordinal 0 represents an
> empty set, thus it's equivalent of null, not single element set. So strictly
> mathematically speaking for non-empty sets we should use ordinals starting
> at 1.

So, strictly mathematically speaking, we should use "2 apples" to describe an
apple within another apple, where both of those apples also contain nothing?

I'm sorry, but this just makes no sense whatsoever. What relevance does von
Neumann's construction of the natural numbers have to this question?
Especially so, given that the empty set, aka "0", is the first(!) von Neumann
ordinal. We should use 1-based ordinals because von Neumann constructed
0-based ordinals, maybe?

> But using zero-based indices is just a matter of convention, so let's not
> make a big philosophy out of it...

No, it's not. You yourself stated that the algebraic/arithmetic behaviour of 0
is quite different from that of 1, so how do you suddenly come to the
conclusion that it's merely a matter of convention?

~~~
ivanhoe
> No, it's not. You yourself stated that the algebraic/arithmetic behaviour of
> 0 is quite different from that of 1, so how do you suddenly come to the
> conclusion that it's merely a matter of convention?

Because in programming it is, it's just a number chosen by some languages to
start enumerating from because it makes things easier in some real-life cases.
In Pascal you could set an arbitrary start and end index and it worked just
as well. It's like arguing which is a more precise way to measure temperature,
the Fahrenheit or the Celsius scale...

> What relevance does von Neumann's construction of the natural numbers have
> to this question? Especially so, given that the empty set, aka "0", is the
> first(!) von Neumann ordinal. We should use 1-based ordinals because von
> Neumann constructed 0-based ordinals, maybe?

You talked of ordinal numbers, so I cited one of the definitions of ordinal
numbers, the one from the set theory. The similar reasoning applies for other
definitions as well. Ordinal number zero is always the ordinal of an empty
set. We use null for that in programming. So the next one is a single element
set and it has ordinal 1. So how from this you conclude that index 0 is "the
correct way" to count non-empty sets? You completely ignore the empty array
that way...

~~~
zAy0LfpBZLC8mAC
> It's like arguing which is a more precise way to measure temperature, the
> Fahrenheit or the Celsius scale...

Nah, bad analogy. Let me suggest a better version: It's like arguing about
whether degree Fahrenheit or Kelvin is a more useful default unit to use for
physics calculations. What do you think?

It's not just a number that's used to start enumerating from, it's a set of
numbers used to calculate with as well, and array indexing, as I mentioned, is
an additive operation, and thus using a set of indices that's not a group, for
lack of an identity element, is as sensible as doing most of your physics
calculations in degree Fahrenheit.

> You talked of ordinal numbers, so I cited one of the definitions of ordinal
> numbers, the one from the set theory. The similar reasoning applies for
> other definitions as well. Ordinal number zero is always the ordinal of an
> empty set.

Exactly. And you realize that that is the first ordinal number, right? "First"
means "the one without a predecessor", and the empty set, also represented as
a zero in that context, is the element without a predecessor, and as such the
first ordinal number of that construction. Now, what is your point?

> We use null for that in programming.

We use null to represent the first ordinal number? What kind of weird
languages are you using?!?

> So the next one is a single element set and it has ordinal 1. So how from
> this you conclude that index 0 is "the correct way" to count non-empty sets?
> You completely ignore the empty array that way...

First of all, it seems like you are utterly confused about the relevance of
von Neumann's constructions of ordinal numbers to how to index arrays. There
is none. It's simply an exercise in finding an isomorphism between sets and
"normal" numbers.

Also, no, I don't ignore the empty array. An empty array has no positions,
therefore, there is no ordinal number that could refer to a position in an
empty array. If you have an index pointing into an empty array, you are doing
something wrong. You cannot point to the fifth element in a four-element
array, you cannot point to the second element in a one-element array, and you
cannot even point to the first element in a zero-element array.

If you want to be able to have a data structure that can represent an index
into an array, or alternatively the absence of an index (for whatever reason--
one reason might be because the array is empty), that's your problem, not the
problem of the index type. Why should we reserve perfectly useful values of
data types to represent stuff that's logically separate from the type? How
about we reserve the value "2" in floating point values for cases where you
want to store the word "honey" as a floating point value? That's just crazy.
You can possibly use such sentinel values in your software for performance
reasons or whatever, but limiting sensible use by the general public because
you want to mix concerns?!?

The third element of the sub-array starting at the second element of an array
is which element of the base array?

With zero-based indices, that is a perfectly normal additive operation: 1 + 2
= 3 --the element is to be found at index 3.

With one-based indices, you resurrect all the craziness of algebra and
arithmetic before the zero was invented, just because you want to use the
zero for something else: 2 + 3 = 5 ... and the index is 4, obviously!
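
A small Python sketch of that sub-array question, under the assumption that "composing" indices means finding the position in the base array. The helper names are invented for illustration.

```python
def compose_zero_based(sub_start: int, inner: int) -> int:
    """Base-array index of element `inner` of a sub-array starting at `sub_start`."""
    return sub_start + inner


def compose_one_based(sub_start: int, inner: int) -> int:
    """Same question with 1-based positions: one of the two 1s must be removed."""
    return sub_start + inner - 1


base = ["a", "b", "c", "d", "e"]
sub = base[1:]  # sub-array starting at the second element

# Third element of the sub-array, looked up directly and via the base array.
direct = sub[2]
via_zero_based = base[compose_zero_based(1, 2)]
# Convert the 1-based result back to a Python index only at the very end.
via_one_based = base[compose_one_based(2, 3) - 1]
```

With 0-based indices the composition is plain addition (1 + 2 = 3); with 1-based positions it is 2 + 3 - 1 = 4.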

------
waynecochran
I always thought the best argument, one that every assembly programmer knows,
is that starting at i=0 allows i to be used as an offset from the base address
of an array. I bet that's why K&R chose this for C (unlike Wirth for Pascal,
which allowed any index range for arrays).

~~~
elevensies
This makes the most sense to me if you look at it backwards. Given that array
access starts with zero, what do you call the access parameter? If you call it
"offset" from the beginning of the array, then starting at zero makes sense.
If you call it "index", or "element", then we have this confusion. The problem
to me isn't the choice of the numbering scheme, it is the names and
explanation around it.

~~~
goatlover
Which makes sense that it's offset for C, but higher level languages like
Python, JS, etc should be index based, which would be 1. That's where R,
Julia, Matlab get it right.

~~~
waynecochran
Actually Niklaus Wirth did the best thing a long time ago with Pascal ... let
the programmer decide the index range for arrays (although he screwed up
making the size of the array part of the type). K&R made the best decision for
C since it is the most transparent for the hardware.

------
mayoff
Dijkstra's handwriting is quite charming. If you've never seen it, it's worth
clicking the “EWD831” link at the top right to see the scan of the handwritten
original.

~~~
logfromblammo
Egads, it looks like a handwriting font. I have never seen its match for
legibility.

~~~
gumby
Last month I went to my parents' house and cleaned out a bunch of my old crud.
Which included school notebooks and books back to 1980 (at which time I had
access to computers and typed all my papers). I was shocked (as was my kid) at
how clear and legible my old notebooks were.

Nowadays I do write stuff but summarize it into the computer. I've tried
various tools for automatic capture (e.g. livescribe) but they don't work that
well. I realized that they probably would have worked great with my
35-years-ago handwriting.

~~~
Sean1708
I do wonder whether given enough effort you could train an OCR program to be
pretty good at recognising one person's handwriting. Like if you meticulously
corrected the output and added those corrections to your training set, would
it eventually be perfect (or close enough anyway) or would it hit some sort of
fundamental limit?

~~~
gumby
Clearly _some_ people's handwriting, sure. Good handwriting recognizers take
advantage of stroke recognition, which you can't do by OCRing after the fact.

In addition my handwriting has become terrible. If I check my notes soon after
writing them (within the same day) I can often tell by making out a few words
and remembering what was being discussed when I was marking that part of the
page :-(. A couple of days later I can barely understand them and longer than
that I might as well forget it.

I could probably get by making random marks on the paper!

It's so sad, because when I was a scientist and had to maintain a lab notebook
other people could easily read my notes (and they had to be clear for IP
reasons). Or maybe I should look at it the other way: it's not sad, because it
doesn't matter as much any more.

------
pjungwir
I answered a StackOverflow question a while ago about why date ranges in
Postgres are [x, y), or in other words include all t where x <= t < y:

[http://stackoverflow.com/questions/37953786/why-does-
postgre...](http://stackoverflow.com/questions/37953786/why-does-postgres-
upper-range-function-for-a-daterange-return-an-exclusive-boun)

It is interesting to see the same question applied to natural numbers, and
Dijkstra landing on the same [x, y) recommendation. I'd say for natural
numbers the argument is even stronger, since there is a least possible value.

------
tannhaeuser
0-based arrays are IMHO simply a consequence of C's array access operator
given transparent semantics in terms of pointer arithmetic.

Try some string manipulation using the awk language, which has 1-based strings
and arrays, and you'll see that programs and string expressions are generally
shorter and more idiomatic.

Later languages building on awk such as Java and JavaScript (awk is almost a
subset of, and _the_ inspiration for early JavaScript) have cargo-culted 0
into the language as base offset. Though probably it may have prevented some
errors to make commonly used languages behave same in this respect.

~~~
asveikau
> ... 1-based strings and arrays, and you'll see that programs and string
> expressions are generally shorter and more idiomatic.

Depends very much on what you're doing. If you're doing, say, a filesystem
driver, and you need to do math or reasoning on [offset, length] pairs quite
frequently, you will find 0-based indexing much more natural. At least that's
been my experience. I think some of this applies to strings and memory too.
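
A rough Python sketch of the kind of [offset, length] reasoning meant here. `Extent`, `split`, and `are_adjacent` are hypothetical names, not taken from any real filesystem driver.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    offset: int  # 0-based position of the first byte
    length: int  # number of bytes covered

    @property
    def end(self) -> int:
        """One past the last byte, so the extent covers [offset, end)."""
        return self.offset + self.length


def split(extent: Extent, at: int) -> tuple[Extent, Extent]:
    """Split an extent `at` bytes from its start into two adjacent extents."""
    left = Extent(extent.offset, at)
    right = Extent(extent.offset + at, extent.length - at)
    return left, right


def are_adjacent(a: Extent, b: Extent) -> bool:
    """True when b begins exactly where a stops."""
    return a.end == b.offset


left, right = split(Extent(offset=4096, length=1000), at=300)
```

With 0-based offsets none of these expressions needs a `+ 1` or `- 1`; a 1-based variant would need one in `end` or in `split`.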

~~~
coreyd
The way I tried to resolve 0 vs 1 based indexing in my head was that they have
slightly different contexts. 0-based indexing tells you how far you are from
the origin, the first element, whereas 1-based indexing is useful when
thinking about how many elements you have.

~~~
dTal
Ah, counting vs measuring! That makes a lot of sense.

In the west, we _measure_ our age and start from 0. In China, they _count_
their age (number of distinct years someone has been alive in) and thus start
from 1. In fact, in China, if you are born on New Year's Eve you can be 2 years
old at 2 days old!

------
divbit
If you represent base-b coefficients as such: a = a[0] b^0 + a[1] b^1 + a[2]
b^2 + ... , then the exponent agrees with the coefficient's index, so that's
nice...
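
As a quick illustration in Python (function names are made up for this sketch), storing the digits least-significant first makes the list index equal to the exponent:

```python
def to_digits(n: int, b: int) -> list[int]:
    """Digits of n in base b, least significant first, so digits[i] multiplies b**i."""
    if n < 0 or b < 2:
        raise ValueError("expects n >= 0 and b >= 2")
    digits = []
    while n > 0:
        n, remainder = divmod(n, b)
        digits.append(remainder)
    return digits or [0]


def from_digits(digits: list[int], b: int) -> int:
    """Inverse of to_digits: sum of digits[i] * b**i, with i starting at 0."""
    return sum(d * b**i for i, d in enumerate(digits))


thirteen_in_binary = to_digits(13, 2)
```

`enumerate` starts at 0 by default, so no index adjustment is needed in `from_digits`.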

------
_vya7
This makes a lot of sense in the context of programming languages, where
counting usually starts "where you are", thus 0 moves forward.

But it doesn't really apply in the natural world, where 1 is a much better
starting number for probably all contexts where you have to number or
reference things in an order.

Maybe this clarification is redundant, I don't know. But when I read the
headline, I assumed it was talking about _everywhere_ , which is why I found
it interesting and even clicked it in the first place.

~~~
Al-Khwarizmi
It does make sense in the natural world for distance and time, though. For
example, when you "count to ten" you are not counting 10 seconds if you start
at one, but 9 seconds, as you say "1" on second zero. To count 10 seconds, you
need to start at zero ("where you are", as you mentioned). Same for distance.

I'd say that the only situation where it makes sense to count from 1 is when
you are counting _objects_. Of course, counting objects is so important that
it dominates people's perception of numbers and counting.

~~~
x1798DE
If you're counting off seconds, you say the number of seconds that elapsed
before your declaration:

" _pause_ one, _pause_ two, ..."

So when you get to 10, you have said 10 numbers and there have been ten
pauses. The "you are only counting 9 seconds" thing would only make sense if
the counting period starts when you say the first number, which is not how I
usually do it...

~~~
Jtsummers
That first pause being preceded by what? "Go"? A button press? A pistol shot?
All surrogates for "0".

~~~
junke
I agree that counting from zero has nice properties. But I must concede that
e.g. in music, the surrogate for zero is the "first" beat. You are basically
counting when intervals of time start, not when they end. The first second
ends with the second beat (pun intended).

    
    
         |------------|-------
         t0           t1
         ^            ^
         First        Second

------
grimoald
He argues that 1 < i … is bad because it doesn't really work if your lower
bound is 0 (because then you'd write -1 < …). Funny thing is, in most
programming languages, there _is_ a biggest natural number, too. For example,
in Rust, you have a problem, if you want to enumerate all possible byte
values:

    
    
        for i in 0u8..256 { }  // <-- doesn't compile, because 256 is not a valid value for a u8

~~~
steveklabnik
Yeah, this one will be fixed once ... is stable. It's been dragged down in
minor details for a while...

------
protonfish
Or we could just accept and implement the "pernicious three dots".

When you see `-1` in code it is often a failure of the expressiveness of
0-indexed lists. For example, the last item in an array:

    
    
        arr[arr.length - 1]
    

Or a found item by index number:

    
    
        if (arr.indexOf("thing") > -1) {
    

This is hardly an elegant convention.

~~~
qwertyuiop924
> when you see `-1` in code it is often a failure of the expressiveness of
> 0-indexed lists.

...And when you see '+1' or '(x%ArrayLen)+1' or 'x<=arrayLen' in Lua or
FORTRAN, it is often a failure in the expressiveness of one-indexed lists.
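
To make the modulo case concrete, here is a hedged Python sketch of stepping around a ring of `n` slots under each convention. The function names are invented for this example.

```python
def next_slot_zero_based(i: int, n: int) -> int:
    """Successor of slot i in a ring of n slots numbered 0..n-1."""
    return (i + 1) % n


def next_slot_one_based(i: int, n: int) -> int:
    """Successor of slot i in a ring of n slots numbered 1..n."""
    # The modulo result lands in 0..n-1, so it has to be shifted back up by one.
    return i % n + 1


def walk(start: int, n: int, step) -> list[int]:
    """Visit n slots starting at `start`, following `step` each time."""
    slots = [start]
    for _ in range(n - 1):
        slots.append(step(slots[-1], n))
    return slots


zero_based_walk = walk(2, 4, next_slot_zero_based)
one_based_walk = walk(3, 4, next_slot_one_based)
```

Both walks visit every slot once and wrap around; the 1-based version just carries the extra `+ 1`.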

~~~
Sean1708
I agree with 'x%ArrayLen+1', but how is 'x<=arrayLen' a failure?

Also, I don't think the parent's point is "1 is better than 0" but rather "0
is not better than 1".

------
Walkman
A little bit about 0-based indexing on the practical side from Guido van
Rossum: [http://python-history.blogspot.hu/2013/10/why-python-
uses-0-...](http://python-history.blogspot.hu/2013/10/why-python-uses-0-based-
indexing.html)

------
desireco42
As much as I respect Dijkstra, I never accepted this as something that makes
sense. It is an ugly remnant from older times when calculations were precious
and it was considered smart to use the number that you would add to the base
address to calculate the array address.

This is not a commonly held belief. I am in the minority, but this is what
makes sense, and I always believed that computers should serve us, not the
other way around.

~~~
dainichi
> I always believed that computers should serve us, not the other way around.

Amen! The idea that we should change our way of thinking because it somehow
makes more sense for computers is preposterous.

------
marknadal
Absolutely not, quoting from here:
[https://github.com/amark/theory#notes](https://github.com/amark/theory#notes)

1. Naturally, the first element in a list cardinally corresponds to 1.
Contrarily, even official documentation of JavaScript has explicit disclaimers
that the "first element of an array is actually at index 0" - this is easily
forgotten, especially by novices, and can lead to errors.

2. Mathematically, a closed interval is properly represented in code as for(i
= 1; i <= items.length; i++), because it includes its endpoints. Offset
notation instead is technically a left-closed right-open interval set,
represented in code as for(i = 0; i < items.length; i++). This matters because
code deals with integer intervals, because all elements have a fixed size -
you can not access a fractional part of an element. Integer intervals are
closed intervals, thus conclusively proving this importance.

3. Mathematically, matrix notation also starts with 1.

4. The last element in a list cardinally corresponds to the length of the
list, thus allowing easy access with items.length rather than having
frustrating (items.length - 1) arithmetic everywhere in your code.

5. Negative indices are symmetric with positive indices. Such that -1 and 1
respectively refer to the last and first element, and in the case where there
is only one item in the list, it matches the same element. This convenience
allows for simple left and right access that offset notation does not provide.

6. Non existence of an element can be represented by 0, which would
conveniently code elegantly as if( !items.indexOf('z') ) return;. Rather, one
must decide upon whether if( items.indexOf('z') == -1 ) return; is
philosophically more meaningful than if( items.indexOf('z') < 0 ) return; with
offset notation despite ignoring the asymmetry of the equation.

~~~
mangodrunk
These are excellent points. I find that many programmers will assume that zero
based indexing is better because that's what was drilled into them until they
stopped making the mistake and authorities in computer science (like Dijkstra)
said that it's better.

------
gohrt
Offset (distance, 0-based) and Ordinal (position, 1-based) are two different
things, and the confusion disappears when your program/language properly
treats different types.

[https://en.wikipedia.org/wiki/Zero-
based_numbering](https://en.wikipedia.org/wiki/Zero-based_numbering)

------
guelo
The cost of index off by one bugs must be many billions. Probably more than
null pointer bugs.

~~~
solipsism
I doubt it, but even if you're right, 1-based indexing wouldn't eliminate off-
by-one errors. They're easy to make regardless.

Ask ten people to implement binary search with 1-based indexing, for example.
Watch the off-by-ones roll in.

In fact, from [https://en.m.wikipedia.org/wiki/Zero-
based_numbering](https://en.m.wikipedia.org/wiki/Zero-based_numbering) _With
zero-based numbering, a range can be expressed as the half-open interval,
[0,n), as opposed to the closed interval, [1,n]. Empty ranges, which often
occur in algorithms, are tricky to express with a closed interval without
resorting to obtuse conventions like [1,0]. Because of this property, zero-
based indexing potentially reduces off-by-one and fencepost errors._
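
For reference, a minimal Python sketch of binary search written against the half-open window [lo, hi). This is one common formulation, not something taken from the article or from Wikipedia, and `binary_search` is just an illustrative name.

```python
from typing import Optional, Sequence


def binary_search(items: Sequence[int], target: int) -> Optional[int]:
    """Return the 0-based index of target in sorted items, or None if absent."""
    lo, hi = 0, len(items)  # the candidate window is items[lo:hi]
    while lo < hi:  # lo == hi means the window is empty
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1  # target, if present, is strictly right of mid
        else:
            hi = mid  # target, if present, is at mid or to its left
    if lo < len(items) and items[lo] == target:
        return lo
    return None


found = binary_search([1, 3, 5, 7], 5)
missing = binary_search([1, 3, 5, 7], 4)
```

The empty window is simply `lo == hi`, and the initial window for an empty list is already empty, so no special case is needed.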

~~~
Sean1708
I find that statement very suspect, and I highly doubt there is any data to
back it up. In my opinion [1,0] is far more obviously an empty range than
[1,1) but that's just what it is, my opinion. And I suspect what we are seeing
in that Wikipedia article is the author's opinion, not anything backed up by
facts or data.

Edit: Lol, the Wikipedia article cites Dijkstra's article.

~~~
JadeNB
> In my opinion [1,0] is far more obviously an empty range than [1,1) but
> that's just what it is, my opinion.

On its own, I think that's a reasonable contention, but it doesn't work when
combining intervals: [1, 1) combined with [1, 2) is still [1, 2), but [1, 0]
combined with [0, 2] isn't [1, 2].
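
One way to read that in Python, assuming "combined" means joining two intervals that share an endpoint. The helpers below are hypothetical and only meant as a sketch.

```python
def join_half_open(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Join [a0, a1) and [b0, b1) when the first stops where the second starts."""
    if a[1] != b[0]:
        raise ValueError("intervals do not share an endpoint")
    return (a[0], b[1])


def elements_half_open(interval: tuple[int, int]) -> list[int]:
    """Integers in [lo, hi)."""
    lo, hi = interval
    return list(range(lo, hi))


def elements_closed(interval: tuple[int, int]) -> list[int]:
    """Integers in [lo, hi]; an interval like [1, 0] comes out empty."""
    lo, hi = interval
    return list(range(lo, hi + 1))


joined = join_half_open((1, 1), (1, 2))
closed_pieces = elements_closed((1, 0)) + elements_closed((0, 2))
```

Joining on a shared endpoint keeps working when one half-open piece is empty, while the closed pieces [1, 0] and [0, 2] do not add up to [1, 2].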

------
Animats
Dijkstra mentions Mesa offering all four options. Mesa's interval notation was

    
    
       [0..5]   0,1,2,3,4,5
    
       (0..5)   1,2,3,4
    
       [0..5)   0,1,2,3,4
    
       (0..5]   1,2,3,4,5
    

This reflects the concept of closed and open intervals in mathematics. It
might have been useful when porting FORTRAN programs to Mesa. In FORTRAN,
arrays start from 1. Pascal required a range in the declaration. When C
started consistently from 0, that was considered radical.

For programming, as Dijkstra mentions, a consistent start from zero seemed to
be helpful.

------
ruraljuror
I assumed that indexing started at zero because of binary representations. For
example if you're using two bits to represent four states, the first one will
be 00 and the last one will be 11, which is three.

------
MichaelBurge
Some people prefer 0, and some people prefer 1, so it should be a per-module
setting that anyone can change.

Let's call it $[, so a statement like '$[ = 17' causes arrays to start
indexing at 17.

[http://search.cpan.org/dist/perl-5.17.1/ext/arybase/arybase....](http://search.cpan.org/dist/perl-5.17.1/ext/arybase/arybase.pm)

------
JohnStrange
I like Ada's arbitrary array indices. If you want to loop through an array,
you loop from Array'First to Array'Last.

------
benlorenzetti
I agree with Dijkstra on using one-side open, one-side closed boundaries to
denote a range of integers.

(C.S. way) 0 <= a < N

(Math way) 0 < b <= N

As he says, both ways have the property that "upper bound - lower bound =
Number of elements", both can represent an empty set "lower bound = upper
bound", and both make partitioning easy "(0 <= a < M)U(M <= a < N)".

However, I disagree that the way that includes the lower bound is preferable
to the math way, which starts at 1. Math is the older science, and children are
still taught to count in this way.

As many of the benefits of one way or the other are mostly just network
effects (i.e. in math subscripts typically start at 1 or in C, ints and
structs are pointed to by their lowest char), we should have used the notation
of the older science.

------
paulmd
How about we split the difference and start at 0.5? That sounds like a fair
compromise.

~~~
ScottBurson
"Should array indices start at 0 or 1? My compromise of 0.5 was rejected
without, I thought, proper consideration." -- Stan Kelly-Bootle

------
kruhft
Is this whole comment train 'bike shedding'? Like tabs/spaces, brace indent
levels and all the others, these arguments and explanations about why one way
is better is like watching people run around a tree.

~~~
jdbernard
Bike shedding as I understand it is more about non-experts arguing endlessly
about some trivial detail because everyone can understand it and has an
opinion.

What's happening here is experts arguing about minutiae that feel important to
them, but looks trivial from the outside.

~~~
adwhit
I await a bikeshed about the precise meaning of bikeshed.

~~~
paulddraper
Whether the conversation itself would qualify as a bikeshed could be the
matter at hand.

------
gungsukma
Only a few days ago I was thinking that numbering should start with zero, then
things will be easier.

[1-based index] Year 2016 -> Century 21 -> Millennium 3

New year on 2017-01-01 01:01:01

[0-based index] Year 2015 -> Century 20 -> Millennium 2

New year on 2016-00-00 00:00:00 :-)
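
The arithmetic behind that comparison, as a small Python sketch. The function names are made up, and the 0-based calendar is of course hypothetical.

```python
def century_zero_based(year: int) -> int:
    """Century number when years and centuries both count from 0."""
    return year // 100


def millennium_zero_based(year: int) -> int:
    """Millennium number when years and millennia both count from 0."""
    return year // 1000


def century_one_based(year: int) -> int:
    """Century number when year 1 is the first year of century 1."""
    # Shift down to a 0-based year, divide, then shift the result back up.
    return (year - 1) // 100 + 1


def millennium_one_based(year: int) -> int:
    """Millennium number when year 1 is the first year of millennium 1."""
    return (year - 1) // 1000 + 1
```

With both scales starting at 0 the conversion is a single integer division; the 1-based version needs a shift on the way in and another on the way out.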

------
gnufx
Dijkstra was wrong about Fortran at that time. The standard describing arrays
with arbitrary bounds is dated 1978.

Why do people make the fuss about indexing and not about column-major arrays?

------
anigbrowl
There is a certain elegance to starting at zero, most easily appreciated when
writing assembler. And I agree with Dijkstra that _Adhering to convention a)
yields, when starting with subscript 1, the subscript range 1 ≤ i < N+1;
starting with 0, however, gives the nicer range 0 ≤ i < N._

Such minimalism is elegant and efficient, to be sure, and I'm a big believer
that beauty often shows the path to truth. So from a computer science
standpoint, I wholly agree.

From a software engineering standpoint it's really fucking stupid. People
generally learn to count on their fingers and they need to grasp the concept
of counting and numbers as an abstraction before they can get with the concept
of a zero.

Counting on your fingers is one of those very basic things like learning the
alphabet (or its equivalent in other languages) or learning to tie shoelaces.
Nobody I have ever met defaults to counting from zero, not even people who
claim to do so - easily tested by buying them a drink at some later time and
tricking them into counting something.

If you're a software engineer your job is to build something that works and is
maintainable. This is not the same thing as doing computer science or
mathematical computing, even though it may all be math from the computer's
point of view - just as architects are not mathematicians despite their
reliance on geometry. Array bound errors are omnipresent in software projects
because counting everything from zero is directly at odds with how we count
things in the real world, and when we're in a hurry or under pressure we
_default to learned behaviors_. Even people who have been bilingual for years
will burst into their default language or switch between two default languages
when they're excited. Building your code to go from 1 to n+1 may not be quite
as beautiful, but it is a hell of a lot easier for other people to read and
(in my experience) you make fewer mistakes if you switch away from starting at
zero.

Yes, this requires developing some new habits that feel really awkward at
first. Since I've only ever programmed for myself rather than as part of a
team I haven't had to deal with the inevitable resistance to this. But it is
worth making the switch. Just as a lab and a construction site are very
different environments [1], so are the practices of computer science and
commercial software development. It would also make a huge difference in
programming education, where many prospective students are alienated by being
asked to do something highly counterintuitive very early on (typically while
trying to grasp the concept of looping) and lead to errors which they are
likely to _keep making forever_ because nobody defaults to counting from zero
in any other context.

If you're a computer scientist, carry on as you were. If you're an engineer,
then build for the needs and instincts of your end users - some of whom will
eventually be programmers - and not for the computer gods. They're not going
to be around to help when your code breaks down.

I'm not very optimistic about this plea (especially not in the USA where y'all
won't even adopt the damn metric system), but please at least give it a try.
Technology should adapt to the needs of the people that use it, rather than
the other way around. If you're working in a high level language, then use
high level concepts.

1\. Remember the story about the three little pigs who all built their houses
out of subatomic particles formed into atoms formed into molecules and whose
houses were topologically identical but had different coefficients of
structural stability? Me neither.

~~~
ScottBurson
I've spent years writing in 0-origin languages and years writing in 1-origin
languages. I do not agree with your assertion that 0-origin is more error-
prone; quite the contrary. If you're manipulating sequences using indexing,
there are _more_ cases in a 1-origin system where you have to remember to add
or subtract 1; a 0-origin system has fewer of these, and if you learn good
habits, you can get rid of almost all of them.

(And for the record, I started out in the 1-origin Basic and Fortran world; my
preference isn't just a matter of which one I learned first.)

~~~
anigbrowl
I started out with 0-origin and for pure programming I agree. The problem is
that real-world indices almost invariably start at 1 and that is how everyone
who isn't a programmer counts, so automating anything based on a real-world
process requires a translation. Conversely, when testing and debugging against
real-world stuff everything is going to be off by one. It's just asking for
trouble.

------
paulddraper
> Different tasks call for different conventions

[https://xkcd.com/163/](https://xkcd.com/163/)

------
Pica_soO
Sometimes the only way to get people to take your ideas seriously is to get
them to want to take them apart. 500 pages of proof, but to silence this
grating, always arrogant ####### at the next conference, so worth it.

Surely not the noble kind of science, but if animosity leads to thoroughly
read papers and found errors, maybe the dark side is just willing to work
harder.

------
johan_larson
Are PL geeks _still_ arguing about this?

~~~
JadeNB
But Dijkstra was writing in 1982 ….

------
polarvortex
No. Just no. Mathematically, it just doesn't make sense in so many use cases.
But, hey, this is CS, math is an afterthought, right.

