> Depends on whether you're measuring or counting.
You run into issues when using 1-based counting as well as measuring.
Our calendar years use 1-based counting, which results in things like "2020" being the "20th year" of the "21st century". If we used 0-based counting instead, we would call the current year "2019" and it would be "year 19" of "century 20".
Explanation of conventional year counting:
1st century began in year 1
2nd century began in year 101 (100 years later)
..
21st century began in year 2001
If you run the same sequence as above, but start counting from 0, you get seemingly much more sensible numbers (0, 0; 1, 100; 20, 2000; x, x*100...). I suspect most people probably think that the 21st century began in 2000 anyway.
It should also be noted that the use of the number 0 in general is relatively recent. The Roman numeral system (which was initially used for our year numbering) has no representation for it, as it was only used conceptually hundreds of years later. It seems to me that the main reason people count from 1 is historical.
Counting from 0, 2019 would still be the 19th year of the 21st century. I'm puzzled why you completely changed grammar to make it seem like counting from zero also solved the century mismatch. If anything it suggests that counting from 1 is more consistent. Then 2020 is year 20 of century 20.
But then 2000 becomes year 0 of century 20, which is confusing English. But "century 20" isn't great English anyway. Calendars should be 0-based like any other measuring tool. The 3rd minute of the 2nd hour of a marathon is 1:02. The 3rd day of the 2nd month of my marathon is 1/02. If that happens to be the 3rd day of the second month of the year, it's 2/03?
> Counting from 0, 2019 would still be the 19th year of the 21st century.
I think you meant to say the "20th" year (off-by-one error!), since 2000 would be the "1st" year under your grammatical assumptions.
I think it's safe to assume that if we conventionally counted from 0 instead of 1, we wouldn't refer to the initial thing as "first" (or "1st"). If we did maintain that construction, we would probably just refer to the initial century as the "0th century" and then 2019 would be the "19th year of the 20th century".
> But then 2000 becomes year 0 of century 20, which is confusing English.
I think it's only confusing because it's unconventional, not because there's something inherently more confusing about it. There have been plenty of languages that haven't even had counting systems that do anything more than "alone, pair, many" .. sure, you can find contexts where counting in general is confusing.
> The 3rd minute of the 2nd hour of a marathon is 1:02.
Right, under current 1-based counting convention, but the point of this discussion is that this inconsistency exists for basically no reason (or rather, historical reasons—look up the history of "0"). When people want to do arithmetic on ordinal numbers, they end up subtracting 1 to turn them into a zero-based natural number, then add 1 again to turn them back into an ordinal:
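For instance, the century arithmetic from upthread, sketched in Python (my example, just to show the subtract-then-add-back dance):
year = 2019
century = (year - 1) // 100 + 1   # subtract 1 to get a zero-based value, do the arithmetic, add 1 to get an ordinal back
print(century)                    # 21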
I suspect most off-by-one errors can indeed be seen as due to this inconsistency between conventional ordinals and the more arithmetic-friendly zero-based naturals. Again, since the former is simply convention, I claim it would be better if our convention were different.
It doesn't go from -1 to 1, it goes from 1 BCE to 1 CE. Negatives aren't used at all. Personally I think we should use zero and negative CE to extend backwards instead. So the year we currently refer to as 1 BCE becomes 0 CE, 2 BCE becomes -1 CE and N BCE becomes -N+1 CE.
That's because we don't count decades. We say we live in the third millennium, and in the 21st century, but nobody calls this the 203rd decade. Instead we call it based on a common property of all the years in the decade (the count of these years can be abbreviated as twenty-something).
I would think that depends on whether that gets taught at schools early in life.
Intervals that are closed at the low end and open at the high end are also convenient when using zero-based indexing (you want an easy way to express 0,1,2,…,n-1, and if you’re used to the notation, [0,n) is nicer than [0,n-1]).
I would think overloading the meaning of the various parentheses makes parsing and generating clear error messages harder, though (but not as hard as when one would follow ISO 31-11 and allow such things as ]3,7] for a half-open interval excluding 3, but including 7).
I think the desire to not overload parentheses with yet another meaning is why ‘Modern’ languages tend to use an infix operator for ranges, e.g. ‘1 to 5’ or ‘1...5’, with Swift having the half-open variant ‘1..<5’ (IMO ugly, but clear, so I guess one would get used to it).
Swift may have borrowed it from E, where the half-open variant is `1..!5`, read "from 1 up to, but not including, 5". It is extremely useful for 1-to-n sorts of counting tasks.
After having many fencepost errors, I finally came to the conclusion that all my code shall henceforth be zero-based. I'm much happier not having those errors anymore. I.e.:
for (i = 1; i <= N; ++i)
The <= is always a huge red flag for me, and I rewrite it as:
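for (i = 0; i < N; ++i)   /* the zero-based, half-open form: no <= needed */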
It's usually a C++-ism, since in C++ generic iterators overload operator++, and when writing generic code you must allow for this, since ++i is equivalent to i.operator++(), while i++ is equivalent to `auto x=i; i.operator++(); return x`.
Given the author, I assume that this is also common in D.
You can override the postfix increment operator as well.
The argument I've heard for why in C++ prefix is preferred is that compilers have a harder time optimizing out the temporary object cruft (especially older compilers).
In theory extremely naive compilers may copy `i` if you use `i++` (since `i++` evaluates to the old value, whereas ++i can always be a destructive update), so some programmers have a habit of defaulting to ++i.
I think that is arguable. The form "array[idx++] = foo;" has undeniable elegance, and I still use it sometimes, even though I've developed a pretty verbose coding style in general.
maybe? It avoids repeating i and j twice and arr thrice, at least. Admittedly, there probably are cases where you really do want a range [x..^y] for arbitrary x and y, but it seems like most cases aren't.
The set of natural numbers should have zero, otherwise it's not even a semiring. However, from the perspective of Peano's axioms, it doesn't matter as far as I am concerned what the symbols are called, as long as you have a successor function.
Apart from that, I agree that it depends whether you are counting [1] or doing something else.
Except that even Turbo Pascal for MS-DOS allowed it.
People should actually stop using late-70's Pascal examples, especially when ISO Extended Pascal in 1990 fixed most of them, not to mention famous dialects like UCSD, Apple's Object Pascal and Turbo Pascal, or its modern variants.
Python follows the first convention, just as Dijkstra recommends. Mostly this worked out well, except for the case of descending sequences.
With half-open intervals, that case proved to be cryptic and error-prone, so we added reversed() to flip the direction of iteration. That allowed people to easily bridge from the comfortable and common case of looping forwards. Instead of range(n-1, -1, -1) we can write reversed(range(n)).
I was intrigued to see how this worked out in practice for Python 3, where range returns a lazy range object rather than a list - like, would reversing the iteration require allocating the whole list in memory?
$ python3 -c 'print(reversed(range(10)))'
<range_iterator object at 0x7fdb53ae96f0>
It turns out that reversed() only works on objects that expose the __len__ and __getitem__ methods:
If you try to use reversed() with your own generator it will fail with "TypeError: argument to reversed() must be a sequence", until you wrap the generator invocation in an explicit list().
Alternatively the object can expose a __reversed__ method, which is why reversed(range(10)) is a "range_iterator object" rather than a "reversed object".
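For example, with a toy Countdown class (hypothetical, just to show the __len__/__getitem__ sequence protocol at work):
class Countdown:
    def __len__(self):
        return 3
    def __getitem__(self, i):
        return 10 * (i + 1)      # pretend sequence: 10, 20, 30

print(list(reversed(Countdown())))   # [30, 20, 10] -- no __reversed__ needed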
The first time I read this, I was ready to be convinced and left disappointed. His entire argument is based on avoiding "the pernicious three dots."
What's so pernicious about them? They are short, clear, unambiguous, easy to type. Why don't we just get our computer languages to understand what "2, 3, ..., 12" means? Is there any other argument to start counting at 0 other than not defining a range using the three dots? If not, maybe starting at 1 (like everyone does outside of computing) is the better option.
I would argue that starting at zero comes more naturally if you’ve been working in binary and are often using bits to select or signal things. In that case the single bit zero is a rather important piece of information and where everything starts. Then when you translate these ideas up into higher languages it continues to feel somewhat natural — just as the hardware may be selecting for register addressed as 000000000 etc, so the items in an array would start from zero. We go ahead and represent the indexes after one in decimal, for convenience, but there’s a feeling that they are like the hardware memory addresses under the hood.
This isn’t so compelling in today’s world of mostly people who live in high level programming languages with no real notion of what goes on in the hardware. But backwards compatibility is a thing, and “following conventions” as well, so I see zero based indexing used in modern high level languages as a logical extension of where this all came from.
What’s hard, really, isn’t using zero based indexing, it’s switching back and forth between systems.
Agreed, what's weird to me is how high level languages like Python and JS adopted C's indexing. But I guess that's the huge C influence on the programming community. Most of us learn programming being taught zero-based indexing.
His logic is that we should cater to the empty sequence case, and it would be "unnatural" to write 2 <= x <= 1, so we have to write 2 <= x < 2. That is just not an improvement. Ask somebody to write down an empty sequence, starting at 2, they'll think you're crazy. The Common Case is give the First and Last.
Option C is, to me, obviously best. How you can replace 2 .. 12 and not use 2 & 12 is beyond me. (/s Dijkstra's opinions considered harmful /s)
You mean mathematically, or specifically for programming?
For programming, I imagine it's better for it to be infinity than 0 since having 0 in the denominator implies that it's changing (who would write a constant 0 in a denominator?), and so as the denominator is approaching 0, the fraction is getting closer and closer to pos/neg infinity, not 0. Making it evaluate to zero would imply a change of direction for whatever the fraction represents.
So, if it's a position, as the position goes 1000/1000, 1000/500, 1000/100, 1000/1, etc. it's getting closer to infinity. Having it suddenly go to 0 would break the pattern of movement.
Mathematically, there is no x that satisfies x = 1/0 -> x * 0 = 1, since anything multiplied by 0 is 0, so it's undefined.
Yes, there's a discontinuity between the positive and negative numbers, that was part of the argument. Another part was that certain mathematical formulas get simpler when x/0 = 0.
They’re not unambiguous when the endpoints are variables. What is [1,2,...,x] when x=1? You probably want just [1] in this case if you’re iterating over indices.
When I did maths in high school and university, most indices were 0-based, not 1-based. 0-based indexing is quite common in physics and engineering (e.g. the initial time of a system is T0, not T1).
Note also that, in maths, both the cardinal and the ordinal numbers start at 0 - defined as the cardinality of the empty set and as the empty set itself, respectively.
Lots of people have a strong opinion on this: those who say that 0-based indexing is the only logical way of doing things, and those who don't see what the fuss is about and prefer 1 because that's how we count objects. I suspect the people in the first category did some low-level programming that needed to do arithmetic on indices. For example, let's say you want to take a string "abc" and repeat it until the length is 10, getting "abcabcabca". Assuming some Python-like language you would start with:
a = "abc"
b = [" "]*10
With 0-based indexing you would do:
for i in range(0,10):
b[i]=a[i%3]
In a 1-based language that becomes:
for i in range(1,11):
b[i]=a[(i-1)%3+1]
So you need to shift the index twice. This is because modulo arithmetic needs 0 to form a ring. As a result, in situations where the choice between 0- and 1-based indexing makes a difference, it's usually 0-based indexing that leads to simpler code.
Julia actually just includes a dedicated function for this case (mod1), and that covers the vast majority of places where 0 base is easier.
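(mod1 maps 1..n back to 1..n rather than 0..n-1; roughly, in Python terms, and this is my sketch, not Julia's actual implementation:)
def mod1(i, n):
    return (i - 1) % n + 1    # mod1(1, 3) == 1, mod1(3, 3) == 3, mod1(4, 3) == 1

# the 1-based loop body above then becomes: b[i] = a[mod1(i, 3)]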
In my numerical code, 1-based indexing causes fewer ±1 adjustments than 0-based indexing. (Nitpick: in modulo arithmetic 0 = N, so having the index run from 1 to N forms a ring just as well. The problem is that the conventional representative of the equivalence class …,-N, 0, N, 2N, 3N,… is 0, which arises from arithmetic's definition of modulo.)
In pretty much any language other than python you'd have to be careful for the case that i might become negative anyway.
And in python you could just go:
import itertools as it
a = ''.join(it.islice(it.cycle("abc"),10))
Also you don't need 0 to form a ring. Modulo arithmetic forms a ring no matter which representatives you pick. Languages just tend to implement one that includes 0 because it's more convenient.
This used to be implementation-defined behavior. C99 then codified this wrong behavior.
Assume positive b for a moment.
We want b * (a/b) + (a%b) = a. If a%b is to always be within [0..b), then a/b has to round toward -infinity. C99 instead chose rounding towards 0.
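To make the difference concrete, a quick Python check (Python floors where C99 truncates):
a, b = -7, 2
print(a // b, a % b)    # -4 1  : floor division keeps the remainder in [0, b)
q = int(a / b)          # truncation toward zero, as in C99 -> -3
print(q, a - q * b)     # -3 -1 : the remainder takes the sign of a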
That's right. Why do you suppose they chose that behavior to standardize, rather than Python's? Conceivably it's because nobody on the C99 standards committee had enough technical expertise to make the argument you're making, but can you think of another explanation? Because the prior probability on that one is pretty low.
> In C89, division of integers involving negative operands could round upward or downward in an implementation-defined manner; the intent was to avoid incurring overhead in run-time code to check for special cases and enforce specific behavior. In Fortran, however, the result will always truncate toward zero, and the overhead seems to be acceptable to the numeric programming community. Therefore, C99 now requires similar behavior, which should facilitate porting of code from Fortran to C.
Ha - you’re correct of course. I saw the a^2 term and thought that can’t be right. Note to self - before attempting to correct others, check the “correction”.
According to wahern's quote above it was to "avoid incurring overhead in run-time code to check for special cases", like the ability to accelerate mod power-of-two by converting it to an AND.
In two's complement? You'd need to add extra masks and shifts to replicate the sign bit. It puts you in much more complex territory than throwing an AND at it.
Some things are more naturally numbered from 1, others from zero. Fence posts versus fence spans.
Pointer arithmetic is a great example of a situation well adapted to 0 based numbering, and was quite relevant to many programmers when this was written. However, far fewer programmers nowadays interact with raw pointers directly.
Personally, I run into more cases where I’d rather have 1 based numbering than zero, but your mileage may vary. It’s valuable to have a language which can support both.
> However, far fewer programmers nowadays interact with raw pointers directly.
It doesn't have to be literally a memory pointer to run into the same phenomenon. For example, offsets into a file. I can think of an example when I was doing work on a file format, and 0 based indexing certainly helped simplify some logic.
Correct, I was merely pointing out that the main motivator of Dijkstra’s argument is not nearly as relevant as it was when this was written.
I have no idea what indexing style is more appropriate for the hypothetical ‘average’ programmer, but I find myself preferring 1 to N the most, followed by -N to N, and finally 0 to N.
As I said originally, your mileage may vary, but I think it’s important that we use languages where many different indexing styles can be supported ergonomically, and in a way that avoids silly errors due to using the wrong index style.
Type systems and generic functions help a lot with this. I think of it a lot like say, signed versus unsigned integers. Multiple useful ways of looking at the same bitwise data. Type systems generally save us from nasty errors with the various types of integers. They can do it with arrays too.
In the business domain one is often counting and tracking explicit things: widgets, contracts, customers, complaints, pizzas, etc. When counting, you almost always start with 1. "1" fits the domain more naturally, and if you deviate, you'll likely spend extra code converting between the end-user's view and the code's view. The translation layer creates risk of errors and more code. Maybe weather forecasting or space-probe orbit calculation is different, but 1 better fits biz. Dijkstra was not (primarily) a business-domain coder. (Smarter languages, like Pascal, allow you to define the index range, although I'm not sure whether Pascal supports dynamic upper bounds.)
I think you'd still start counting those things from 0, because it's a convenient and natural way to express that you don't have complaints, widgets or pizzas.
Or, I suppose, you could start counting at 1, but use an optional type and use None for that case. But that makes doing arithmetic on those numbers ("widgets per pizza") harder than if you'd just used 0 to begin with.
Why would you need any special case? If 1 means there is 1 item then you can still use 0 to mean there are 0 items.
The 1-based counting would be to index or label them like item 1, item 2, item 3, ... which is quite natural to humans. If you had zero items, you'd express that by your list being empty, not by having a number 0 somewhere. It's not a special case.
Let's consider a measuring tool for fluids which is based on volume. What is a useful range for the index on the side that measures the current non-air capacity?
Many such devices, e.g. a measuring cup I've got at home in the kitchen, use a series of tick marks within a bounded accuracy range, with marks on both the top (of course) and also the bottom (it looks nice).
While 0 isn't expressly labeled on many scales, it is part of the inclusive range by implicit nature. Thus as you point out the case of 0 units, and 'empty set' are one and the same in this real world example.
I haven't done enough "fluid" applications to give domain-related suggestions. That's not really an integer "count". I'm only saying it doesn't fit well in typical business applications as I see them. Each domain is different.
Many of our existing conventions derived from military, science, and academic applications. Business applications came along later, emphasizing discrete counting and categorizing things along the lines of set theory, such as in the set or not in the set, not half in. (Money has decimals, but you don't typically increment through it with indexes.)
(Although COBOL was published around 1960, it took roughly 5 years for computers to get powerful enough to make it practical and widespread. The earliest COBOL compilers were dog-slow resource hogs relative to the hardware of the day.)
You forgot to explicitly mention that the bottom mark on your measuring cup is indeed 0, and without it you could only measure "at least this volume of fluid".
That doesn't change anything that I see. Indexes and counters are usually not used for "continuous" metrics such as liquid quantities. And again, the biz domain typically does not count fractional quantities, at least not in an indexed way.
It's not a practical problem for most coding. If you disagree, can you demonstrate a problem caused by indexing arrays starting at one in a typical code situation/scenario?
There are two separate concepts here: counting (how many things do I have) vs numbering/indexing (which one in a sequence am I talking about).
If I have five pizzas, would anyone argue that the variable storing that fact should have a value of 0x0004? On the other hand, if there’s a stack of five pizzas, they’re either numbered 0..4 or 1..5 depending on your religion.
I concur, because I have observed this exact problem for years now. I work for a company that has been producing modular hardware for decades. Until like 5-10 years ago all countable stuff was counted from 1, as is only logical: processing blades, interface ports, chassis, internal modules of the same type, logical elements (sessions, channels etc.). And then on the seventh day the devil came by :) . The biggest customer, along with several others, wrote a new revision of the standard we are using, and there everything is 0-based. And now we are in hell. Yes, it is mostly contained, but even after years of development of the new product and years in production we still sometimes find bugs around 0-based numbering. There are multiple places where 0-based is mixed with 1-based or just used incorrectly. And talking to humans about the hardware has become much more error-prone and inefficient - "please connect card one, port one to the switch" - first as in 0 or 1? Should I specify it explicitly this time? But he probably knows what I mean. Or maybe not? And if he is wrong I will waste another day waiting to change the wiring again. Screw it, I'll tell him explicitly. Every goddamn time.
All of this because some arrogant programmers (or ex-programmers) think that they know better than everyone else and that changing legacy and/or logically correct conventions is good because of their religious beliefs that "0-based is better for everyone and every task".
Re: All of this because some arrogant programmers (or ex-programmers) think that they know better than everyone else ...
It could be they lack real-world experience or work in "esoteric" domains. Theorem proving is a different animal than making Boss Bob's billing summary reports come out right.
I find the opposite. UI issues dominate too much since The Web murdered client-server stacks. Client-server was like parking a passenger car: you aim the front wheels where you want to go, and then you are there. Web stacks are like driving an 18-wheel truck: you have to plan your multi-point swings in advance and move carefully and slowly because rework is expensive.
The syntax I always want but no language I know of supports:
for 1 <= i <= 10:
do stuff
for 0 <= i < 10:
do stuff
for 1 <= i < j <= 10:
do stuff
(In the last case, the exact order in which the 45 iterations happen should maybe be left unspecified; at any rate, it would be bad style to depend on it.)
Downward iteration:
for 10 >= i >= 1:
do stuff
Obviously this doesn't cover every case of "arithmetic for loop" that would be useful: sometimes you want a step size that's neither 1 nor -1. I'd be quite happy with a language in which I had to do that using a more general iterate-over-an-arbitrary-sequence construction; I'm tempted by options like
for 100 <= i <= 1000 where i%3==1:
do stuff
but it's probably too clever by half; either you only support the simple "specify the value of the variable mod something that doesn't change while iterating" case, in which case users will complain that some obvious generalizations fail to work, or you support arbitrary predicates, in which case you have to choose between making the "easy" cases efficient and the "hard" cases not (in which case users will be confused when what seem like small changes have drastic effects on the runtime of their code) and making all of them inefficient (in which case users will be confused by how slowly their code runs in some cases).
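For comparison, rough Python spellings of those loops (just a sketch; the whole point of the proposal is the nicer syntax):
from itertools import combinations

for i in range(1, 11):                        # 1 <= i <= 10
    ...
for i in range(10):                           # 0 <= i < 10
    ...
for i, j in combinations(range(1, 11), 2):    # 1 <= i < j <= 10 (the 45 pairs)
    ...
for i in range(10, 0, -1):                    # 10 >= i >= 1
    ...
for i in range(100, 1001):                    # 100 <= i <= 1000 where i%3 == 1
    if i % 3 == 1:
        ...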
You may enjoy Common LISP's looping: it's a whole minilanguage where you can say stuff like
(loop for i from 1 upto 10 ..)
(loop for i from 1 below 11 ..)
(loop for i from 10 downto 1 ..)
(loop for i from 10 above 0 ..)
(loop for i from 3.5 to 7.5 by 0.25 ..)
It does much more than that, I should mention; it's a whole mini-language/DSL (like format).
> (In the last case, the exact order in which the 45 iterations happen should maybe be left unspecified; at any rate, it would be bad style to depend on it.)
Relying on something unspecified isn't just bad style; it's a bug.
Why would you introduce requirements into a higher level language that leave important aspects unspecified, opening the door to bugs?
1. It might be best not to specify what happens, because
2. if it were specified then people would rely on the specific order, which would be bad style because it would make code harder to read and be more liable to errors if anyone misremembers what order the iteration happens in.
(But it's true that people might rely on the observable behaviour even if it weren't officially specified, and that would be even worse. Which is why I said "should maybe be left unspecified"; it isn't obvious to me which way is better overall. If it's unspecified then fewer people will rely on it but the consequences each time will be worse.)
Indices should start at 1, but in programming we mostly use offsets and call them indices which is one source of confusion that leads to this endless debate.
This is a real problem, and I'm pretty sure quite a few bugs in mathematical toolsets happen because of it. It was a real pain translating formulae to code for numerical computing.
There is worse. In Perl, you have the global variable $[, which lets you specify the first index of an array. One could imagine setting it to 0.5. I don't think that would work, but with Perl being Perl, we never know.
This is, of course, a terrible idea, and that feature is now deprecated. $[ is always 0 and setting it to any other value is an error.
Sadly missed. Stan Kelly-Bootle was the first person to get a postgraduate degree in computer science. For more with the flavour of the quote above, check out his Devil's Advocate column in UNIX Review¹ magazine (not much UNIX, promise) and the post-paper Son of Devil's Advocate.²
On the assembly language level, you have compare instructions. These subtract, and throw away the result, but leave the flags.
Many CPUs have a flag that is automatically raised if a value is 0.
This means you don't have to execute a compare instruction to test if a value is 0, because the "Zero" flag will be set as soon as the 0 value is loaded.
That means you can load a value, and directly go to a BEQ (which is really shorthand for "branch if the zero flag is set") and save a few cycles by avoiding the CMP instruction.
So this is why numbering starts at zero. Testing if your list is empty, which is probably a common thing if you loop through each element, is slightly quicker.
In scientific-oriented languages (R, MATLAB, Julia, Fortran, etc.), array indices tend to start at 1. I think it is a culture thing. Software engineers prefer 0. Scientists prefer 1.
In science indexing can be quite inelegant too. We often denote time steps as t0, t1,...,tn. But if we put these steps in a vector v, we would get: v[1]=t0,...,v[n+1]=tn.
I think that's the case only if we're stuck in a programming language like Matlab. Usually we would get v[0] = t0, ..., v[n] = tn. Mathematicians are perfectly happy starting with zero, even for things like matrix indices, when it makes sense.
But note that the potential for underlying confusion is not limited to the choice of indexing in programming language syntax. Casually ask someone who lives on the 8th floor of an apartment building how many floors they go up in the elevator and you'll probably discover that people don't think in detail about the precise labeling of elements in a collection or how arithmetic with intervals work.
This is silly. Different domains have different needs. If you think of arrays as pointer + offset calculation, and your only for loop has an explicit index increment in it, then starting at 0 is natural. If you are a systems programmer then a language targeting you should accommodate that.
If you are anyone who doesn't want to care (too much) about how things are implemented, then it's a lot less error prone to mark the 1st element of an array a[1]. Having taught and written fairly complex scientific code in both 0 and 1 indexed languages, there really is no good reason to do anything else.
for a in array[3:6]
Should iterate over the third to sixth element of the array.
Nah everyone should just listen to the mathematicians and allow for arbitrary index sets.
Seriously though you save yourself a lot of trouble if you don't worry that much about what range your indices have but rather what they represent. The fact that they're integers starting at 0 is just an implementation detail.
I'm usually a start-at-zero guy, except if I'm implementing matrix routines... going from the 1-based indexing from the math theorems to 0-based code is so tedious in order to ensure the translation is correct.
I totally agree with this. However, I'm trying to teach my 1 year old how to count. He's got 1, and 2 down, and we're working on 3 (He can say the numbers 1 to 10, I'm talking more about the concept of numbers). I would love to get him to understand 0 too, but I'm not sure how.
People almost always start "stopwatch" counting at 1. If you time how long a juggling ball stays in the air, or how long it takes to run a short distance, a stopwatch will read 0.9 or something, whereas most people counting aloud start "1, tw-.." and end up saying 1.5 or so.
Yeah, it's pretty incredible, isn't it? Makes you realize that it's not just the training of the brain, that brain is growing and it is simply able to comprehend more as it grows, no training will do that. For some things, you just wait.
Yeah, I think I posed the question just to get ideas for how others think about it. I don't have a need for him to learn it per se, I just wanted to start a discussion about how zero is a difficult concept to teach. Heck, humanity had to "invent" the concept of zero.
I alluded to this in another comment, but what astonishes me about my son is how rapidly he grows and changes. Every night, my wife and I talk about how amazing he is, etc. and I can't help but think about the fact that 1 year ago, he was barely able to roll over, and that he's now running around, "reading," eating solid foods, and a full-blown chatterbox. If he's doing this much now, what the heck is he going to be doing next year?
It's forced me to reevaluate my own life. If my son can accomplish all of that in one year with proper support, nurturing, and guidance, and we are genetically related (i.e. 50% of him is me), then what could I do in the same amount of time?
It sure can be motivating. But unfortunately we do not benefit from any growing of our "hardware" anymore (or at least not as much). Where your son gets some extra RAM and CPUs plugged in every month, we have to make do with what we have.
He's actually 17 months if we want to get specific. On his 1st birthday, I managed to get him to answer the question "X, how old are you?" with "Wah!", which only became an articulate "One!" around Christmas/16 months, which was also about the time he started being able to say 1 to 10.
He actually is a pretty amazing kid. He's miles ahead of other kids his age, even some who are older. I wasn't expecting to be able to teach him counting until next year, so I'm being very relaxed about it right now and making it fun. He's 1, not 5, as far as I'm concerned, he can do what he wants, but if he's into it, I'll teach him.
Consider P₁ ∈ H and P₂ ∈ G, where H is the set of HN posters and G is the set of persons in the general population. Let p(P, U) be the probability that a person P will produce an utterance U using mathematical terms and/or notation. Let q(U) be the probability that U will be better understood in mathematical terms than in plain language.
I’m an embedded C programmer who typically avoids dynamic allocation... so I know the answer is zero apples... but it doesn’t matter because I have to hold my hand out making a space for an apple to go just in case some day you do decide to hand it to me.
It's trying to replace/inter-operate with Excel, so it has to maintain backwards compatibility; bug for bug, including design bugs.
I couldn't tell you offhand if the logic you speak of is distinct to Sheets or if it's Sheets maintaining the same interfaces other spreadsheet software defined for those macro names long ago.
Counting a physical process, so you have to allocate space. 0 is that allocation, and as you move along the number line, you update that space (using "add 1"). Note that you only update the space at discrete intervals, which has surprisingly deep implications.
Having a kid I've been thinking a lot about counting, and what it really is. It seems totally wrapped up in repetition, and so I'm wondering if teaching counting as a function of, say, circular motion doesn't give a better intuition than the usual "count this clump of things" approach. (Counting clumps requires the person to simultaneously introduce an ordering and then implement a kind of internalized repetition as they point and count rhythmically. My kid seems to struggle with the ordering part, and no wonder: N objects have N! orderings.)
Please note that “counting” there is really “indexing”, as in 1st, 2nd, etc. Apart from indexing in programming languages, some commenters in this thread seem to have an idea of counting (indexing) real things from zero, as if it solved the -1 problem easily. But that is a natural offset between a quantity and a position. We could in theory rename our ordinals one backwards like: zeroth, first, second, and so on. But that would shift the meaning temporarily, and 0th would become new 1st. With new counting in mind, for an empty set [] count is 0, for [x] count is one and an ordinal for x is “zeroth”, -1 again dammit.
Between 0 and 10 there are 10 unit-sized intervals touching 11 connecting/borderline points (the integers). You cannot make this fact go away, no matter which language you choose.
If the author had contemplated this sentence before writing this pained and narrow-minded treatise: "Why thinking should start before writing", engineers would be slightly more capable of interacting with and designing for other human beings. Numbering is highly domain specific -- the concept of zero is not always relevant. And sometimes the lay perspective has primacy over others. I remember the intense arguments engineers had regarding when the new millennium was to start. There weren't celebrations on New Year's 2001 that could compete with the scale of those for the year 2000.
Worse than starting at 0 or 1 is to give the option. In old Visual Basic the "Option Base" statement changed the starting index for a whole module. I had to debug a program with a mix of code with indices starting at 0 and at 1.
I (almost) always fall victim to off-by-one errors in competitive programming when working with counting numbers in a range. Sticking to a convention like 2 ≤ i < 13, as Dijkstra observed for Mesa programmers, sounds like a good idea.
One thing that’s nice about it is that you can use a positive difference as an array index directly. This is nice when you want to store a function of the “distance” between two things, e.g. potential[] = {-10, -5, -3, -2.2, ...}; ... potential[r1-r2]. It also works nicely with modular arithmetic as ‘0-_-0 mentions in their post. More generally this corresponds to the array being a function from the natural numbers to some other values, and this function can only start at zero if the arrays start at zero. But if you want a function that starts at 1, you can just set x[0] = invalid().
"In corporate religions as in others, the heretic must be cast out not because of the probability that he is wrong but because of the possibility that he is right." Antony Jay
« Exclusion of the lower bound forces for a subsequence starting at the smallest natural number the lower bound as mentioned into the realm of the unnatural numbers »
There's a parallel argument for being inclusive in the upper bound: it lets you specify a range which is the entire size of your integer type.
I can see that that might seem like a poor trade for losing the ability to represent an empty range starting at zero, but it seems a shame he didn't mention that this is the tradeoff you're making.
The only reason to count from zero is for the sake of array processing -- given a pointer to an array, the first element will be at pointer plus zero. That's significant for low-level code as it avoids an extra add one operation.
However for high-level programming, starting at one has advantages. For instance what index to use to represent insertion of an element at the beginning? Being able to use zero for this is cognitively easier.
That convention seems confusing to me, especially to someone new to the convention. If you need an explicit "insert at the start" operation, then do something like "x.insertAtStart(y)" instead of "x[0] = y".
Insertion at the beginning is insertion at zero in most zero-based systems too. Inserting at zero means that the new element will end up at index zero, and the other elements are shifted forward. In a one-based system, similarly, insertion at the beginning should be insertion at one, not zero; doing it any other way would just be a source of off-by-one bugs.
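Python's list.insert already behaves this way, for what it's worth:
xs = [10, 20, 30]
xs.insert(0, 5)      # insert at the beginning: the new element ends up at index 0
print(xs)            # [5, 10, 20, 30]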
It was an app I wrote, probably in algorithmic category. But even in business code, I very rarely see off by one problems that would be easier to handle in 1 based indexing.
I'm sorry, the "recent study" was "an app I wrote"? Sorry, but unless I'm misunderstanding... that's not a study. And the whole point of a study is to back up assertions like "I very rarely see" so they aren't just opinion subject to all the normal human biases we all have.
But if you ran such an analysis across public GitHub repositories per-language and wrote the results up in a blog post, I'm sure HN would love to see it. Definitely front-page material.
It wasn't a serious comment. I thought it was obvious from the tone.
> But if you ran such an analysis across public GitHub repositories per-language and wrote the results up in a blog post, I'm sure HN would love to see it.
I'll see what I can do. I do have some repos in sight that could be used for it.
But even then this won't be as straightforward as you say, since different languages have different applications (e.g. C++ for games, Julia for scientific computing). This would require writing the same code in both indexing patterns and then comparing them.
My experience strongly disagrees. I have done scientific computing in Matlab, Python, and C++, and Matlab is the one with the awkward +1, -1 everywhere.
Fair enough. I was thinking of operations like getting row i from a n*m matrix, which would be A(i,:) in 1-based compared to A(i-1,:) in 0-based (or A(i,:) with a 0..n-1 index).
While I agree with the conclusion, I think he could have expanded it even more; not everyone has as developed an aesthetic sense. Measuring, counting, combinatorics, indexing, arithmetic with ranges:
In 95% of cases the choice does not matter or make things more elegant or simple, but when it does, it's always zero-indexing that is more elegant.
At the end of the day they are both great and they both suck depending on context. I use 0 based when programming and 1 based pretty much everywhere else.
And in both cases I always run into an issue where I think, "This would be so much easier if we just started at [the other index]".
I use 0 based when programming in some languages, and 1 based in others. Everywhere else I use 1 based.
When I'm programming in 1-based languages, I'm almost always happy with that choice, because they're almost always languages designed for solving problems in a domain that were standardized on 1-based counting before Charles Babbage was born.
When I'm programming in 0-based languages, I'm sometimes very happy with that. Those times are when I'm working in C or C++ . When I'm using other languages, I don't care much maybe 1/3 of the time. And the majority of the time I still wish it were 1-based, because I'm using that language to solve problems in a domain that had standardized on 1-based counting before Charles Babbage was born.
Case: indexing an array in ASM, C, C++, Swift,…
Case: using an offset
0 <= i < N
Case: indexing an array in Pascal, Ada, …
Case: writing a math paper
Case: talking with a non-programmer
1 <= i <= N
Silly article. Most of its points are actually arguments against JavaScript's type system and semantics, not zero indexing per se. (Consider how they would apply if we used letters for indexing, where there's no longer a superficially appealing zero value to special-case: ‘We should use B-based indexing, because A is a falsey value in JavaScript! Also, now we can use A to denote missing elements!’.) Others are empty appeals to vaguely defined ‘mathematics’ that ultimately boil down to historical legacy.
It is true that most matrix theorems are formulated with 1-based indices. But mathematical notation is full of unfortunate historical accidents (like defining π to be half the circumference of the unit circle), so one should be cautious when drawing inspiration from it. And for what it's worth, set theorists start ordinal numbers from zero, and each ordinal is the half-open interval of the numbers which precede it:
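0 = {}, 1 = {0}, 2 = {0, 1}, and in general n = {0, 1, …, n-1}, i.e. the half-open interval [0, n).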
Purely functional languages have no reason to use 0-indexed arrays. Imperative languages that deal with for loops are more logically consistent with 0-indexed arrays. I think.
One is the first number. Until there is one of something there is no point to counting at all. Zero is a regression from one and only significant for the lack of that something which we started counting at 1.
But there absolutely is a point in setting out to count something when it turns out there is zero of that thing. If you don't know how many of something there is, but you know that this information is important, you will want to go out and count. If it's really important, then you will likely want to keep records of the count every time you perform it. This is, of course, very important in commerce.
It's numbering that's being talked about here. A guy finishes a marathon in first place; with zero-based numbering he would be the zeroth runner, while one runner has finished. Most people would intuitively like those numbers to line up.
Of course, when you build the natural numbers on a computer and use those as indices to arrays, it makes sense that the first index is zero. So the first (common usage) element would then naturally be called the zeroth element. This, of course, gives you the problem that array.size == x means that array.last_index == x - 1 and all the off by one errors that entails.
Sure, but I don't think anyone's complaining about how you would assign race completion position labels in a programming language. Surely you would do something like this:
// Start the position at one, because that's how we
// assign number labels to finishers in a race.
let position = 1
// Create a callback that will assign a label to
// each competitor when that competitor finishes
// the race.
let assignLabel = (competitor) => {
competitor.label = position
position = position + 1
}
// Begin the race, and pass it a callback that will
// be run each time a competitor finishes the race
// (in the order they finish).
race.begin(assignLabel)
Note that this "label" really is just a label. It happens to be an integer here, but we could also use a library that outputs English strings like "first", "second", "third", etc. This debate is focused more on how elements in an ordered collection are accessed.
Before any runners have completed the race, there are 0 runners in the set of runners that have completed the race (empty set/null set).
As soon as the first runner crosses the finish line 1 runner has finished the race, increasing the counter by one increment and it is customary to assign them the designation of the upper end of the unit range rather than the lower number. Thus, they are the First (1st) runner to have completed the race.
We probably count rooms the same way out of a similar convention: the first room on a floor is numbered 1, rather than 0, because it is more useful to annotate the end of the range than the start. It makes a lot of math easier too. If there are 10 rooms on a floor and 10 guests, there are 0 rooms free.
It does, ask any shepherd, accountant, quartermaster, etc. It has had a well-defined meaning and use over the centuries, and for hysterical reasons of economy it took on a second meaning equal to a physical 1. Luckily it's the only case in computing where established meanings in base ten were changed to fit a base-2 world.
CPU0 would disagree, for example. A non-technical person might be forgiven for thinking a system that says CPU0 has no CPUs, when in fact it has one, again, for example. Many logical objects start their enumeration at zero and increment from there. There are four rings of operation in Intel CPUs, but they are labeled rings 0 through 3.
I'm sorry, but you clearly haven't heard of unary, or base 1, which is probably the earliest counting system, in which 1 is 1, 2 is 11, 3 is 111, etc. Tally marks. You are correct that base 0 would make no sense, however.
But yes please explain to me the very thing we're talking about which is exactly what I was referencing when pointing out that using 0 as an identifier rather than a value is moderately counterintuitive.
The article mentions Mesa's interval syntax, which gave programmers all the possible options:
[1..100]   (0 .. 101)   (0 .. 100]   [1 .. 101)
were all the same interval, 1 to 100 inclusive. Bad idea. Great way to get off-by-one errors.
Another bad idea from that era - Wirth's insistence that you shouldn't be allowed compile-time arithmetic in Pascal. Like writing "array[0..tabsize-1]".
No, you had to have a separate named constant for "tabsize-1", and it couldn't be initialized as "tabsize-1", either.