
Statistics with Julia [pdf] - aapeli
https://people.smp.uq.edu.au/YoniNazarathy/julia-stats/StatisticsWithJulia.pdf
======
superdimwit
I'd really recommend anyone doing mildly numerical / data-ey work in python to
give Julia a patient and fair try.

I think the language is really solidly designed, and gives you ridiculously
more power AND productivity than python for a whole range of workloads. There
are of course issues, but even in the short time I've been following & using
the language, these are being rapidly addressed. In particular: a generally
less rich library ecosystem (though some Julia libraries are state of the art
across all languages, mainly thanks to easy metaprogramming and multiple
dispatch), and generally slow compile times (though this is improving rapidly
with caching etc).
I would also note that you often don't need as many "libraries" as you
do in python or R, since you can typically just write down the code you want
to write, rather than being forced to find a library that wraps a C/C++
implementation like in python/r.
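
To make that last point concrete, here is a hypothetical example (mine, not
from the book): an exponentially weighted moving average written as a plain
loop, the kind of thing that in python/R usually means hunting for the right
library function.

```julia
# An exponentially weighted moving average as a plain Julia loop.
# In python/R you'd typically reach for a library function;
# here the straightforward loop is already fast.
function ewma(x::AbstractVector{<:Real}, alpha::Real)
    out = similar(x, Float64)
    out[1] = x[1]
    for i in 2:length(x)
        out[i] = alpha * x[i] + (1 - alpha) * out[i-1]
    end
    return out
end

ewma([1.0, 2.0, 3.0, 4.0], 0.5)  # [1.0, 1.5, 2.25, 3.125]
```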

~~~
opportune
>you can typically just write down the code you want to write, rather than
being forced to find a library that wraps a C/C++ implementation like in
python/r.

I don't think this is really a feature. It's nice that you can write more
performant code in Julia directly and don't need to wrap lower level
languages, without question, but the lack of libraries or library features is
not a good thing. It's always better to use a general purpose library that's
been battle tested than to write your own numerical mathematics code (because
bugs in numerical code can take a long time to get noticed)

For specialized scientific computing applications, which would normally be
written in C/C++, I would absolutely look into using Julia instead (though I'm
not sure what the OpenMP/MPI support is like). But I would also recommend
against rolling your own numerical software unless you need to.

~~~
jjoonathan
I don't just think it's a feature, I think it's a killer feature.

You are much less likely to reinvent the wheel if you can add your one
critical niche feature / bugfix to an existing library. In python, learning C
and C build systems and python's C API are gigantic barriers to doing that.

More importantly, if every fast data manipulation needs to be written in C, a
few of them can be profitably shared, but you need more than a few of them.
Pretty soon you wind up with a giant dumping ground of undiscoverable API
bloat. See: pandas.
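
As a sketch of what "adding your one niche feature" can look like in Julia,
here is a made-up `Interval` type that hooks into existing generic functions
via multiple dispatch (the type and methods are mine, purely illustrative):

```julia
# A made-up type, purely for illustration.
struct Interval
    lo::Float64
    hi::Float64
end

# Add methods to generics that already exist in Base; any code that
# calls + or `in` generically now works with Interval as well.
Base.:+(a::Interval, b::Interval) = Interval(a.lo + b.lo, a.hi + b.hi)
Base.in(x::Real, iv::Interval) = iv.lo <= x <= iv.hi

iv = Interval(0.0, 1.0) + Interval(2.0, 3.0)  # Interval(2.0, 4.0)
3.0 in iv                                     # true
```

No C API, no build system: you extend the library's own verbs in the language
you already speak.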

~~~
tomrod
Maybe I don't understand what API bloat is in this context -- can you give
some more detail regarding your thoughts on pandas?

~~~
jjoonathan
Here's one of the fifteen API ref sections in pandas:

[https://pandas.pydata.org/pandas-docs/stable/reference/serie...](https://pandas.pydata.org/pandas-docs/stable/reference/series.html)

Even though it's long, it undersells the problem, because many of these
methods have nontrivial overload semantics that open up like a fractal when
you look in turn at their docs. The link also undersells the problem because
this junkheap is evidently so incomplete that people are frequently forced to
rely on numpy to extend it.

APIs should make hard things easy, but API gloveboxes like this make easy
things hard. Minimal API + Performant Glue >> We do everything for you + You
can't ever touch your own data or your perf dies + Good luck reverse
engineering _these_ semantics if you've forgotten the context and need to port
it.

~~~
tomrod
Okay, I think I see your point. You're viewing the different object methods as
API calls, and because they are granular and can do many common and uncommon
tasks, that reads as bloat. Makes sense from that perspective. Thanks.

------
jointpdf
This looks like a good reference for the fundamentals of both statistics and
Julia, as claimed. I have a small critique, since the authors asked for
suggestions.

The format for the code samples goes like (code chunk —> output/plots —>
bullet points explaining the code line-by-line). This creates a bit of a
readability issue. The reader will likely follow a pattern like: (Skim past
the code chunk to the explanation —> Read first bullet, referencing line X —>
Go back to code to find line X, keeping the explanation in mental memory —>
Read second bullet point —> ...). In other words, too much switching/scrolling
between sections that can be pages apart. Look at the example on pages 185-187
to see what I mean.

I’m not sure what the optimal solution is. Adding comments in the code chunks
themselves adds clutter and is probably worse (not to mention creates
formatting nightmares). I think my favorite format is two columns, with the
code on the left side and the explanations on the right.

Here’s what I have in mind (doesn’t work on mobile):
[https://allennlp.org/tutorials](https://allennlp.org/tutorials). Does anyone
know of a solution for formatting something like this?

~~~
j88439h84
I'm not sure how that allennlp site is doing it, but the source is here:
[https://github.com/allenai/allennlp/blob/b0ea7ab6be2787495fa...](https://github.com/allenai/allennlp/blob/b0ea7ab6be2787495fa52efd0f603659197c7d76/tutorials/tagger/basic_allennlp.py)

~~~
j88439h84
Here's what they're doing:
[https://github.com/allenai/allennlp/blob/master/tutorials/ta...](https://github.com/allenai/allennlp/blob/master/tutorials/tagger/convert.py)

------
xvilka
Note that Julia 1.2[1] is on the verge[2] of being released. It is also
interesting to see the list[3] of GSoC and JSoC (Julia's own Summer of Code)
projects; a lot of them target ML/AI applications. Personally, I am waiting
for proper GNN support[4] in FluxML, but there seems to be not much interest in it.

[1]
[https://github.com/JuliaLang/julia/milestone/30](https://github.com/JuliaLang/julia/milestone/30)

[2] [https://discourse.julialang.org/t/julia-v1-2-0-rc2-is-now-av...](https://discourse.julialang.org/t/julia-v1-2-0-rc2-is-now-available/26170)

[3]
[https://julialang.org/blog/2019/05/jsoc19](https://julialang.org/blog/2019/05/jsoc19)

[4]
[https://github.com/FluxML/Flux.jl/issues/625](https://github.com/FluxML/Flux.jl/issues/625)

------
caiocaiocaio
Julia looked interesting to me, so I tried 1.0 after it came out. I have an
oldish laptop (fine for my needs), and every time I tried to do seemingly
anything, it spent ~5 minutes recompiling libraries or something. So I've been
waiting for newer versions that hopefully stop doing that, or for me to buy a
better computer.

~~~
anonova
Yes, this is one of my problems with Julia. It seems to be optimized for long
runs and REPL/notebook usage.

Take, for example, a simple program that creates a line plot
([https://docs.juliaplots.org/latest/tutorial/](https://docs.juliaplots.org/latest/tutorial/)):

    
    
        using Plots
        x = 1:10
        y = rand(10)
        plot(x, y)
    

After installing the package, the first run has to precompile(?), and
subsequent runs use the package cache. But ~25 s to create a simple plot is
incredibly slow and frustrating to work with.

    
    
        $ julia --version
        julia version 1.1.1
        $ time julia plot.jl
        julia plot.jl  73.71s user 4.45s system 110% cpu 1:11.04 total
        $ time julia plot.jl
        julia plot.jl  24.41s user 0.39s system 100% cpu 24.633 total
        $ time julia plot.jl
        julia plot.jl  23.38s user 0.36s system 100% cpu 23.519 total

~~~
ViralBShah
The time to second plot will be a few milliseconds in the same process - in
the same Julia session. So, while the time to first plot is frustrating, it is
less of an issue if your interactive sessions are long.

Of course, we continue to work on improving compile times. About half of the
time is spent in LLVM compilation, which has actually become slower over time.

~~~
tomrod
What prevents the plot compilation from being pre-compiled at install?

------
ChrisRackauckas
This is a very good resource. The one thing I would ask: I'd like to see
examples of using DifferentialEquations.jl when you get to the section on
dynamical systems, especially for discrete event simulation and stochastic
differential equations. I opened an issue in the repo and we can continue
discussing there (I'll help write the code; I want to use this in my own
class :P)!

~~~
Cybiote
I agree it's a wonderful resource, which is exactly why I disagree with your
suggestion. The book is uncommonly clear in how it explains fundamentals, and
bringing in such a powerful library moves quite a bit away from that. It will
no longer be just about the fundamentals of Julia on the one hand, and on the
other, the algorithms will no longer be implemented in a language-invariant
way. Losing that invariance IMO makes it less of a text on fundamentals.

~~~
ChrisRackauckas
I would say calling an ODE solver is pretty fundamental to a lot of real
scientific workflows, but I am pretty biased on that.

~~~
iamcreasy
I do not remember using much calculus other than using it to pass my college
courses. Can you point me to some resources that would teach me how to use
calculus (or ODEs, if that's more interesting) to solve interesting problems?

------
adamnemecek
I invite everyone to check out Julia. The language is pleasant and gets out of
the way. The interop is nuts. To call, say, numpy's FFT, you just do:

    
    
        using PyCall
        np = pyimport("numpy")
        np.fft.fft(rand(ComplexF64, 10))
    

That's it. You call it with a native Julia array, and the result is a native
Julia array as well.

Same with cpp

[https://github.com/JuliaInterop/Cxx.jl](https://github.com/JuliaInterop/Cxx.jl)

Or matlab

[https://github.com/JuliaInterop/MATLAB.JL](https://github.com/JuliaInterop/MATLAB.JL)

It's legit magic

~~~
fny
How does Julia handle typing for interop?

~~~
StefanKarpinski
If I understand your question correctly, the answer is that there are a fixed
number of native types supported by Python and NumPy, all of which correspond
naturally to Julia types and are converted bidirectionally by PyCall. Julia
and NumPy arrays are memory-compatible and Julia knows how to handle arrays
with memory allocated by other systems, so conversion back and forth between
Julia arrays and NumPy arrays is zero-copy. Other types like Python dicts are
proxied in Julia as special types that Julia knows how to work with as
dictionaries (user-defined data types are common in Julia), while general
Python objects are just proxied transparently and `obj.method` calls are
passed through to the embedded Python runtime. You can even define a function
object `f` in Python and call it using `f()` syntax in Julia and vice versa.
It's all highly transparent and smooth.

------
bdod6
Can someone explain how this is more powerful than a Python/R-based workflow?
E.g., I currently use a combination of .ipynb notebooks, Python scripts, and
RStudio, and this feels like it covers everything I need for any data science
project.

~~~
jointpdf
I think Julia has a cleaner focus on scientific and mathematical computing
than either R or Python (both for performance and understanding). i.e. the
language is designed in such a way that corresponds more directly to
mathematical notation and ways of thinking. If you’ve been in a graduate
program that’s heavily mathematical, where you spend equal time doing pen and
paper proofs and hacking together simulations and such (and frantically trying
to learn a language like R/MATLAB/Python while staying afloat in your
courses), you’ll appreciate the advantage of this. To my eyes, Python is too
verbose and “computer science-y” and R is too quirky to fulfill this niche (I
say this as someone that bleeds RStudio blue, and enjoys using Python+SciPy).
I don’t think Julia is aimed at garden-variety / enterprise data science
workflows. Caveat—I’m not a Julia user currently, so this is sort of a hot
take.

The “Ju” in Jupyter is for Julia, so it’s designed to be used as an
interactive notebook language also. The Juno IDE is modeled after RStudio.

~~~
anthony_doan
> R is too quirky to fulfill this niche

I'd like to offer a counterpoint, or add to this.

It's quirky enough to have many packages backed by expert statisticians.

I hope Julia gets to be successful in this regard too.

~~~
jointpdf
The way I wrote that comes off as more dismissive than I intended. I think
it’s quirky in the sense that there is a wide variance in styles of
accomplishing things in (base) R, so something that appears perfectly natural
to me can look foreign to someone else. I think this is partly the user base
and partly the language itself, and of course the two are interdependent. To
me, it’s a joy to write R code because of its flexibility and power, but I
have often dreaded sharing it with others (especially as a beginner). It’s
easy to look at someone else’s R scripts and think “this is horrifying”. By
the way, this is referring more to scientific/statistical workflows—for more
general purpose data science in R, the Tidyverse (or even just the pipe
operator %>% around which the Tidyverse is built) goes a long, long ways
towards helping people write expressive but readable code.

By contrast, Python feels a bit too rigid/standardized. Everyone’s code looks
like it was copy+pasted from a book of truth somewhere. This is good for
sharing and engineering, not as good for expressing mathematical ideas.

So whereas R has evolved organically over decades and Python is for everyone
(and alternatives like MATLAB or SAS are first and foremost software for
industry rather than languages), Julia seems to be thoughtfully purpose-built
to be a modern language for numerical/scientific computing. It polishes off
the rough edges and blends some of the best features of each language. Again,
this is just an impression from someone who already thinks in R but is
learning both Python/Julia.

More to your point, maybe Julia is at a stage of development where it’s good
for both students (for developing computational and mathematical thinking) and
experts (for slinging concise but performant code), but not yet for the rank-
and-file users looking to just get things done.

------
aapeli
Accompanying code here:
[https://github.com/h-Klok/StatsWithJuliaBook](https://github.com/h-Klok/StatsWithJuliaBook)

------
Merrill
In section "1.2 Setup and Interface" there is a very short description of the
REPL and how it can be downloaded from julialang.org, as well as a much longer
description of JuliaBox and how Jupyter notebooks can be run from juliabox.com
for free.

Although JuliaBox has been provided for free by Julia Computing, there has
been discussion that this may not be possible in the future. However, Julia
Computing does provide JuliaPro, a free distribution of Julia, the Juno IDE,
and supported packages.

For new users, would the free JuliaPro distribution be a good alternative to
JuliaBox and/or downloading the REPL and kernel from julialang.org?

~~~
improbable22
No, I think you should simply download the ordinary version. Jupyter, Juno,
etc. are easy enough to install locally. I forget the precise details, but I
think JuliaPro comes with certain versions of packages, and it's less
confusing just to get the latest of what you need (using the built-in package
manager).

JuliaBox (and [https://nextjournal.com/](https://nextjournal.com/)) are cloud
services, but if you have a real computer and want to do this for more than a
few minutes, just install it. (There's also no need for virtualenv etc.)

------
cwyers
For people who have more Julia experience -- is this (thinking mainly of
chapter 4) representative of how most Julia users do plotting? It looks like a
lot of calling out to matplotlib via PyPlot. I know Julia has a ggplot-inspired
library called Gadfly.jl; is PyPlot more commonly used?

~~~
chrispeel
There is not yet a universally used package for plotting. One recent tool is
Makie.jl [1]. Many use Plots.jl [2] as an interface to PyPlot, GR [3], and
other backends, i.e. you can change the backend with a single command.

[1]
[https://github.com/JuliaPlots/Makie.jl](https://github.com/JuliaPlots/Makie.jl)

[2]
[https://github.com/JuliaPlots/Plots.jl](https://github.com/JuliaPlots/Plots.jl)

[3] [https://github.com/jheinen/GR.jl](https://github.com/jheinen/GR.jl)

------
dlphn___xyz
whats the selling point with Julia? why would i use it over something like R?

~~~
cwyers
In R, most of the high performance code isn't written in R, it's written in
Fortran or C or C++ (R has really good C++ integration via Rcpp). Python has
something similar. The value prop of Julia is supposed to be that you have a
language flexible enough to do the high-level stuff you'd normally do in
R/Python, plus the ability to write high-performance code without having to
drop into another language.

I remain skeptical that this solves a lot of real-world problems (I know a lot
of users of R/Python who never need to resort to writing their own C/C++
code), but that's the sales pitch.

~~~
superdimwit
I think if you're just plugging together reasonably "vanilla" components from
python / R libraries, and only using vectorised operations, those languages
are fine and you can get away with using vectorised libraries wrapping C++.

The moment Julia shines is when your workload can't be phrased by stringing
together the limited set of vectorised verbs that python / r libraries give
you: anything stateful and loopy, like reinforcement learning, systematic
trading, Monte Carlo simulations, etc. It's also useful if you really care
about performance and are doing "vanilla" computations at a truly large scale:
if you want to avoid copying memory (which vectorised operations tend to
force), or want to tightly optimise / fuse some numerical operations, it's
great.

The other issue with python / r wrapping C++ libraries is that different
libraries will generally not play well together (without coming back out into
python / r space and doing a lot of copying / allocation). This tends to
encourage large monolithic C/C++ codebases like numpy and pandas, which are
pretty impenetrable and difficult to extend / modify.
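
As a concrete (if toy) sketch of that kind of loopy workload: a Monte Carlo
estimate of π written as a plain loop, with no vectorised verbs in sight (the
seeded RNG is my choice, for reproducibility):

```julia
using Random

# Estimate pi by sampling points in the unit square and counting
# how many fall inside the quarter circle. A plain loop, no vectorisation.
function estimate_pi(n::Integer; rng = MersenneTwister(42))
    inside = 0
    for _ in 1:n
        x, y = rand(rng), rand(rng)
        inside += (x^2 + y^2 <= 1)
    end
    return 4 * inside / n
end

estimate_pi(1_000_000)  # ≈ 3.14
```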

~~~
improbable22
One more advantage to these libraries being written in Julia is that, if they
almost do what you need but not quite, it's often pretty easy to reach inside
and patch the function that needs changing. You already speak the language and
don't need to stop the world to do this. The barrier to doing that with (say)
numpy is much higher.

------
jbee618
Would love to see chapter exercises to test comprehension and reinforce
learning objectives.

------
chakerb
I was going to ask whether there is a Kindle version of this, but then I
skimmed the book, and I don't think it will be readable on a Kindle. Even if
it were, the reading experience would definitely be inferior.

~~~
ynazarathy
The book will be published by Springer (at which point the online draft will
be removed).

Yoni Nazarathy.

------
mruts
Julia is everything Python could have been, and much more. I'm stuck with
Python right now, as a lot of people in the data science/ML community are, but
it's becoming increasingly viable to use Julia for "real" work. The
Python-Julia interop story is pretty strong as well, which allows you to
(somewhat) easily convert pandas/pytorch/sklearn code into Julia using Python
wrappers. Julia has some unconventional things in it, but they are all growing
on me:

1. Indices by default start with 1. This honestly makes a ton of sense, and
off-by-one errors are less likely to happen. You have nice symmetry between
the length of a collection and the index of its last element, and in general
you just do fewer "+ 1" or "- 1" adjustments in your code.

2. Native syntax for creating matrices. Nicer and easier to use than ndarray
in Python.

3. Easy one-line mathematical function definitions: f(x) = 2*x. Also, being
able to omit the multiplication sign (f(x) = 2x) is super nice and makes
things more readable.

4. Real and powerful macros a la Lisp.

5. Optional type annotations. Sometimes when doing data science work typing
can get in your way (more so than for other kinds of programs), but it's
useful most of the time.

6. A simple and easy-to-understand polymorphism system. Might not be
structured enough for big programs, but more than suitable for Julia's niche.

Really the only thing I don't like about the language is the begin/end block
syntax, but I've mentioned that before on HN and don't need to get into it
again.
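
A few of the points above in one quick sketch (a minimal illustration of mine,
not from the parent comment):

```julia
f(x) = 2x            # one-line definition with implicit multiplication
A = [1 2; 3 4]       # native matrix literal
v = [10, 20, 30]

f(3)                 # 6
v[1]                 # 10: 1-based, first element
v[length(v)]         # 30: the length doubles as the last index
v[end]               # 30: `end` works too
```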

~~~
kbd
I can't believe I'm jumping into the inevitable 1-based indexing discussion,
but I'm surprised to see you say that one-based indexing results in fewer
"+ 1" or "- 1" things in your code. Most arguments I've seen come down to
"it's fine" (certainly) or "it's more comfortable for mathematicians" (which I
can't speak to).

Besides Dijkstra's classic paper[1] showing why 0-based indexing is superior,
in practice I find myself grateful for 0-based indexing in Python because of
how slices and things just work out without needing +1/-1.

I'd like to understand. Could you give an example of when 1-based indexing
works out better than 0-based?

[1]
[http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF](http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF)

~~~
mruts
The classic example is getting the last element of an array. With 1-based
indexing the length of the array is the index of the last element. It has a
nice symmetry to it.

Also, I find it elegant that with 1-indexing the start and end values of a
slice are both inclusive, instead of the first being inclusive and the last
exclusive.

Also, isn’t it just weird that the index of an element is one less than its
“standard” index? If I take the first n elements of a list, it would stand to
reason that the nth element should be the last one, right?

The reason for zero indexing is historical, related to pointer offsets. I
don’t think anyone chose them to be easier for people. They just made them
that way because it maps closer to how contiguous values in arrays are
accessed.

Also, with 1-indexing, scaling an index gives reasonable results: 1 x 3 is
three, so scaling the first element's index by three lands on the third
element. With 0-indexing, 0 x 3 is still zero, which leaves me at the same
element, clearly inconsistent.

There are some good reasons for 0-indexing and I have been using it in every
language for my entire career. The amount of code I’ve written in Julia is
marginal compared to my 0-indexing experience, so I might be missing
something.

One nice thing about 0-indexing is that I can slice a list in half using the
same midpoint. For example, with a Python array of 10 elements:

    
    
        fst, snd = arr[0:5], arr[5:10]
    

A little nicer than:

    
    
        fst, snd = arr[1:5], arr[6:10]
    

You could have inclusive slices with 0-indexing, but it would be inconvenient
and suffer from the same problem as 1-indexing.
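
For what it's worth, the Julia (1-based, inclusive) restatement of those
slices looks like this (a minimal sketch of mine):

```julia
arr = collect(1:10)

# Inclusive on both ends, so the midpoint shows up as both 5 and 6:
fst, snd = arr[1:5], arr[6:10]

# The length/last-element symmetry mentioned above:
arr[length(arr)] == arr[end]  # true
```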

~~~
kbd
> The classic example is getting the last element of an array.

Good point, in Python I don't notice that the last element is arr[len(arr)-1]
because Python provides arr[-1]. I think in general your point is that it's
natural for the nth element to be arr[n].

> The reason for zero indexing is historical, related to pointer offsets.

There is that, but Dijkstra's paper makes the case from first principles that
the half-open interval [0, n) is the most appropriate for sequences.

> with 1-indexing I can multiply numbers by arrays... 3 x 1 is three...

Sorry, I don't understand this. It makes sense that the point I don't
understand is probably most-related to why Julia chose its indexing scheme and
why Matlab et al. do the same.

> One nice this about 0-indexing is that I can slice a list in half with the
> same midpoint.

Yeah, arr[0:index] + arr[index:len(arr)] is the full list. And to your point
earlier ("if I take the first nth elements of a list"), len(arr[:n]) == n
seems natural.

Edit: I've been trying to formalize why Python's indexing scheme, along with
its negative indexing, is optimal:

    
    
        l = ['a', 'b', 'c']
        n = len(l)
        for i in range(-n, n):
            print(l[i])
    

prints "a b c a b c". That code makes no reference to any bound but 'n', nor
any constants (1, 0) or offsets, yet it iterates over the list twice through
its range (first negative indices, then positive).

~~~
iamed2
Dijkstra's write-up is full of subjective aesthetic judgements that certain
things are ugly. I personally don't find `1:0` for an empty sequence to be
ugly, and I do find using `1:1` to refer to an empty sequence and `1:2` to
refer to the sequence `{1}` to be ugly. I would encourage everyone to read
over his reasoning and see if you agree with his aesthetic judgements.

~~~
eigenspace
I’m constantly baffled by the way people hold that paper up as some sort of
objective proof that 0 based indexing is superior.

~~~
kazinator
Zero based indexing objectively has various convenient properties which one-
based indexing doesn't.

The value placed on convenience over inconvenience isn't objective, that's
all.

Objectively speaking, if I find the least positive residue of some integer
modulo M, I get a value from 0 to M-1. If my M-sized array is from 0 to M-1,
that is objectively convenient:

    
    
      hash_table[hash(string) % table_size]
    

Objectively speaking, if I have some files in a directory and I give them zero
based names like file000, file001, ... then I can objectively refer to the
first volume of ten using the single pattern file00?, and the next volume as
file01?. If they are numbered from 001, I need file00? file010 to match the
first ten. For the next ten, the file01? pattern unfortunately matches file010
so I objectively need some way to exclude it.

Objectively speaking, if I have zero based indices to a 3D array, I can find
the address of an element <i, j, k> using the homogeneous Ai + Bj + Ck rather
than Ai + Bj + Ck + D, which objectively adds an extra term.

Objectively speaking, the zero-based byte index B can be converted to a four
byte word index using B / 4 (truncating division), and within that word, the
zero-based byte local offset is B % 4. Objectively speaking, the same
conversion from 1-based bytes to 1-based words requires (B-1)/4+1 and
(B-1)%4+1, which is objectively more syntax and more nodes in the abstract
syntax tree.
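
That byte-to-word arithmetic is easy to check mechanically; a quick Julia
sketch (the function names are mine):

```julia
# 0-based: byte index b -> word index and in-word offset
word0(b) = b ÷ 4
off0(b)  = b % 4

# 1-based equivalents, with the extra -1/+1 shuffling
word1(b) = (b - 1) ÷ 4 + 1
off1(b)  = (b - 1) % 4 + 1

# Byte 5 (0-based) is the same byte as byte 6 (1-based):
word0(5), off0(5)  # (1, 1)
word1(6), off1(6)  # (2, 2)
```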

There is no reason I should like shorter, simpler, faster; that's a purely
subjective aesthetic. After all, a short poem isn't better than a long one; a
bacterium isn't better than a rhinoceros; and so on.

Hey, how about those one-based music intervals? C to E is a major third and
all that? We have a diatonic scale with _seven_ notes, right? As we ascend,
whenever we cycle through seven notes, we have passed ... one more _octave_.
And to invert an interval, we subtract from ... why, _nine_ of course! And a
fifth stacked on top of a fifth is a major ninth. There is objectively more
cruft in 1 based music theory than 0 based. But there is no accounting for
people liking it that way, right?

~~~
goto11
"Objectively speaking" there are pros and cons to each system. The largest pro
of 0-based indexing is of course that it can correspond to a memory address
plus an offset, which is the reason C (and derived languages) use 0-based.

But it is also an objective fact that using 1-based indexing means that the
index corresponds to the ordinal numbers, e.g. index 1 is the first element,
index 2 is the second element, and so on. This _also_ has a number of
convenient properties.

For example, February is the 2nd month, so if you have a list of the names of
the months, you would expect month_names[2] to be February. With zero-based
indexing you would have to do month_names[month_number - 1]. And if you want
to get the month number from the name, you have to do
month_names.index_of(month_name) + 1. Be careful not to mix up the +1 and -1!

As for music theory, a third is called so because it _spans_ three half-notes.
It describe the _size_ of a range which is independent of the offset of the
indices. By the same token decades are 10 years (not 9) and centuries are 100
years (not 99).

~~~
kazinator
Some machine-level implementation convenience is the _smallest_ advantage.
Zero based would be better even if it cost more at the machine level. Of
course it doesn't cost more because the advantages are relevant at the
implementation level also.

> _For example February is the 2. month, so if you have a list of the names of
> the months, you would expect month_names[2] to be February._

That's not 1-based indexing being _good_ ; that's _conforming to_ (or
reflecting) an externally imposed 1-based system that is _itself_
questionable.

Should the seconds of a minute go from 1 to 60 instead of 0 to 59? Dates and
times are full of poorly chosen conventions, including ones that don't match
people's intuitions. For instance, many people celebrated the new millennium
in January 2000. People also want decades to go from 0 to 9; the "eighties"
are years matching the pattern 198?, not 1981 to 1990. Yet the 20th century
goes from 1901 to 2000.

In many situations when 1 based numbering is used, it's just symbols in a
sequence. It could be replaced by Gray code, greek letters, or Japanese kana
in i-ro-ha order.

When the arithmetic properties of the index matter to the point that it's
involved in multiplication (not merely successor/predecessor relationship), it
is advantageous to make the origin zero.

If month_name[1] must be "January", I'm okay with wasting month_name[0];
that's better than supporting a one-based array (let alone making that
default).

> _As for music theory, a third is called so because it spans three half-
> notes._

No it doesn't; in that very same music theory, a "major second" interval is
also known as "one step" or a "whole step"! A third is "two steps"; that's
what it spans. (I don't know what you mean by "half-notes"; I sense
confusion.) This nonsense was devised centuries ago by innumerates, just like
various aspects of the calendar.

~~~
goto11
But it is not some arbitrary historical accident that months are numbered from
1. It is the same reason the days of the month are numbered from 1. It is how
ordinal numbers work!

Neil Armstrong was the 1st man on the moon - not the zero'th. Everywhere you
have a sequence of discrete units, they are numbered starting from 1.

The thing with array indices in C is that they are not ordinal numbers. They
are offsets. Which means you can (at least in theory) do x[-1] to get the element
_before_ x. So a C array is not actually an array in the mathematical sense,
it is just syntactic sugar for relative offsets in a larger array.

So what makes most sense? It really depends on what you want to achieve.

------
abakus
I find Julia's .> , .==, .*, ./ (dots for element-by-element ufunc)... really
ugly. Numpy's design is cleaner and better.

~~~
ddragon
Why? When I see the '.', I immediately know it's a broadcasted function (for
example, * for matrix multiplication vs .* for the Hadamard product), and I
get the vectorized version of any function I write for free, with no extra
boilerplate (and the compiler will even automatically fuse chained broadcasts
to avoid wasting allocations). You can even customize the broadcasting and the
fusion.
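
A minimal sketch of the broadcasting and fusion being described (mine, not
from the thread):

```julia
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

g(t) = 2t + 1
g.(x)               # any function broadcasts: [3.0, 5.0, 7.0]

# Chained dots fuse into a single loop with one output allocation:
x .* y .+ 1         # [5.0, 11.0, 19.0]

# @. adds the dot to every call in an expression:
@. sqrt(x^2 + y^2)
```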

------
plouffy
Commenting to find later.

~~~
grzm
You can effectively bookmark submissions by using the "favorite" link or just
upvoting. The submission will show up in your profile under "favorite
submissions" or "upvoted submissions", respectively.

~~~
the_duke
In addition, I hear that modern browsers support a ground-breaking
functionality called "Bookmarks".

~~~
iamcreasy
I would not rely on browser bookmarks too much. I recently lost a large number
of bookmarks when Google Chrome sync overwrote my local copy. The
bookmarks.bak file was missing too.

