Julia Language Co-Creators Win James H. Wilkinson Prize for Numerical Software (siam.org)
335 points by yarapavan 31 days ago | 239 comments

From the interview with the award recipients:

“A programming language cannot be derived from first principles alone. Language design is applied psychology—the computer program is the ultimate human-computer interface. Sometimes you have to try a design out and see how people interact with it and iterate based on that real-world feedback.”

I wish more people viewed PL like this.

That's why C is the way it is, for all the hate it gets. I've been teaching my son the beginnings of C now that he's proficient in JavaScript and Lua, but making sure that every step of the way, we jump from a concept he already understands, to a new but related concept, and how they're connected.

For example, when I explained signed and unsigned to him, since we had already covered the concept of groups of bits having 2^n possible values, he quickly understood that signed has just as many values as unsigned, but the upper half just becomes the negative inverse of the lower half.
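To make the signed/unsigned mapping concrete, here is a minimal Python sketch. It assumes two's complement and an 8-bit width, and `to_signed` / `to_unsigned` are names made up for illustration, not anything from C itself:

```python
def to_signed(value: int, bits: int = 8) -> int:
    """Reinterpret an unsigned bit pattern as a two's-complement signed integer."""
    mask = (1 << bits) - 1          # e.g. 0xFF for 8 bits
    value &= mask                   # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)      # e.g. 0x80 for 8 bits
    # Patterns with the top bit set form the "upper half" and map to negatives.
    return value - (1 << bits) if value & sign_bit else value


def to_unsigned(value: int, bits: int = 8) -> int:
    """Reinterpret a signed integer as its unsigned bit pattern."""
    return value & ((1 << bits) - 1)


# Both views cover exactly 2^8 = 256 distinct values.
signed_values = {to_signed(pattern) for pattern in range(256)}
print(min(signed_values), max(signed_values), len(signed_values))
```

The lower half (0 to 127) is unchanged, while the upper half (128 to 255) lands on -128 to -1.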

C evolved from actual needs, and the fact that it's still widely used today shows how incredibly well designed it was for practical but relatively general problem solving. The C committee's job is mostly just a matter of saying "no" to every new feature request, because C has proven to be versatile enough to work in an extremely diverse array of situations.

We've been using Lua's C API and the Win32 API[1] to make cool things, such as a primitive Love2d clone, and just last night he said "Wait, you're telling me this is the old way of programming Windows? It just makes so much sense and seems so well designed!" C isn't perfect, but if you start from where C started, and trace it until now, you'll learn all the reasons behind its decisions, and they make sense in practical situations.

Kind of like the difference between metric and imperial measurement systems. Metric makes sense in an isolated mathematical world (i.e. science), but imperial units make sense when you're working with real physical objects (e.g. construction).

[1] http://laurencejackson.com/win32/

Well designed? I'm not so sure. Once you're used to something, you become 'blind' to its faults. Big: automatic signed/unsigned conversion. Minor: conflating unsigned with modulo arithmetic (why not allow a signed modulo int?). Huge: no way to pass an array, only a pointer. ...

> I wish more people viewed PL like this.

I wish fewer people abbreviated "programming language" to PL.

If anyone is interested, I was snooping around and found one of the co-creators' PhD thesis, which explains in detail some of the ideology behind Julia: https://dspace.mit.edu/handle/1721.1/99811#files-area

Can be downloaded here (PDF): https://github.com/JeffBezanson/phdthesis/raw/master/main.pd...

For a shorter read with many of the same highlights, this is a paper about the design and implementation of Julia (PDF):


Thanks! Awesome work on Julia!

congrats Stefan :)

Or its forerunner, the master's thesis: https://dspace.mit.edu/handle/1721.1/74897

Can't wait for the language to be as widely used as Python or R. Been playing around with it at home since the 1.0 release and it really is a joy to use.

I think Python has a large advantage in this space since it is not as specialized. For many projects that I work on the actual core number crunching code is small (or part of a library I don't have to worry about). The rest of the code could be scraping data, or a web application. These are things that there are numerous packages available for in Python but seem to fall outside the focus of Julia.

I would highly recommend reading this recent post in the Julia forums [1]. It's from a "relatively new" Julia user explaining why Julia is actually a very good general purpose language, e.g. for string manipulation of scraped data. You do have a good point about packages, though: The problem is that the user base is smaller than Python's so some packages have yet to be created. And since the userbase is currently largely numerical programmers, non-numerical packages are more likely to be missing.

[1] https://discourse.julialang.org/t/not-only-for-technical-com...

Julia would be ideal for those tasks as well, but alas the libraries aren't there. Julia can easily call Python functions with PyCall, although I still wouldn't want to extensively use a complicated Python library in this way.

Seems like it would be better to call Julia from python.

They're way ahead of you: https://github.com/JuliaPy/pyjulia

Python doesn't have the typing support, or even many of the types. Calling Julia from Python would in some cases require something like Cython, or even (ahem) using the C API.

I wish that Python feels a bit of Julia's heat, as that might be the only way for more PyPy love.

PyPy is basically an abandoned project. Its main version targets Python 2 and it has poor support for Python 3; the latest version it supports is Python 3.5. Besides, even with PyPy, Python wouldn't come close to Node.js. Julia is on par with Fortran or C. The reason behind Julia's amazing performance is its type system, e.g. multiple dispatch and value types. Python lacks both. Python also has a parallelism problem with the GIL. The mypy team is currently working on a compiler based on static type annotations: https://github.com/mypyc/mypyc I think they could gain much more performance than PyPy.

This is not correct. Pypy is actively developed.

Numerical code is always written in numpy which can drop the GIL.

I recall that the main issue with PyPy is that its interop with C isn't as good, and a lot of CPython code just calls into C, so PyPy is difficult to adopt widely given how much relies on CPython.

Python without GIL to provide real shared-memory threading would address 90% of my issues with the language.

In my experience, CPython is 100 to 1000 times slower than C (or C++, Rust, Fortran, etc...). CPython would be better off fixing the speed (perhaps by embracing PyPy) and staying single threaded than removing the GIL. At least if you're choosing threads for performance gains.

Think of it this way: A single threaded C++ or Rust program can do more work per second than a hypothetical GIL-free CPython with 64 cores...

I'm using compiled libraries for all my heavy lifting; the Python code is there to glue those libraries together in a convenient manner that allows for exploratory programming. GIL-free CPython would let me distribute computations across multiple cores without having to deal with the overhead of multiprocessing (serializing objects across process boundaries, etc.).

If your compiled code is thread safe, you can release the GIL in your library and achieve parallelism with current CPython. If your compiled libraries are not thread safe, you can't get parallelism anyway.
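As a rough illustration of compiled code releasing the GIL, here is a small sketch using only the standard library. It relies on the behaviour described in the `hashlib` documentation, where CPython releases the GIL while hashing buffers larger than 2047 bytes; the buffer sizes and the `digest` helper are arbitrary choices for the example:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(buffer: bytes) -> str:
    """Hash one buffer; the heavy work happens in C, outside the interpreter loop."""
    return hashlib.sha256(buffer).hexdigest()


# Four independent 4 MB buffers (sizes picked arbitrarily for illustration).
buffers = [bytes([i]) * 4_000_000 for i in range(4)]

# Plain threads: the hashing calls can overlap because the C code drops the GIL.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, buffers))

# Same work done one after another, for comparison of results.
sequential = [digest(b) for b in buffers]
print(threaded == sequential)
```

Whether this actually runs faster than the sequential loop depends on the machine and the Python build, so treat it as a sketch of the mechanism rather than a benchmark.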

Maybe, if your heavy lifting functions are very short lived, releasing and acquiring the GIL is holding you back...

Btw, I'm not defending the GIL. Python has enough visibility and significance to deserve a good implementation. It's just that most people don't realize the slowness of CPython itself is a bigger impediment to performance than the GIL. Python should be in the same speed league as JavaScript, but there are too many things hindering progress.

> CPython is 100 to 1000 times slower than C

That's very context dependent. With annotations, typical calculations compile to C code roughly equivalent to native. The more of the actual runtime you use, the more you call into cpython which can't be sped up this way. Cython compiled code will be somewhere between C and cpython in terms of speed, but putting a single number to it will always be misleading.

>> CPython is 100 to 1000 times slower than C

> That's very context dependent. With annotations, typical calculations compile to C code roughly equivalent to native. The more of the actual runtime you use, the more you call into cpython which can't be sped up this way. Cython compiled code will be somewhere between C and cpython in terms of speed, but putting a single number to it will always be misleading.

OP said "CPython", not "Cython". CPython is pretty reliably 100X to 1000X slower than C.

Oops, you're right. One day I'll learn to read :-)

No worries. I suspect many of your downvoters were confused as I was that you appeared to be arguing that CPython could compile your Python code to efficient C! Took me a few passes to spot the misunderstanding.

I built a very simple neural-network app a few years ago with Julia, and while the project was fun and I didn't think the language was bad by any means, as someone who does software for a living I had trouble seeing why compsci people really got into it.

I could totally see someone like my dad using it (he's an aerospace engineer, not software), but I have friends who work in compsci in academia trying to evangelize the language.

Not trying to start a war here, but I'm curious what a more seasoned Julia vet might have seen that someone with only ~12 hours with the language wouldn't.

It's extremely expressive. Notably, Julia is homoiconic, with full lisp-style macros. It also has multiple dispatch, which is a far more general technique than OO single-dispatch. This makes it very easy to define modular interfaces that work much like statically-typed type classes in Haskell. This allows you, for example, to define a custom matrix type for your bespoke sparse matrix layout and have it work seamlessly with existing linear algebra types.
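For readers who haven't met multiple dispatch, here is a toy Python sketch of the idea. It is not how Julia implements it: `MultiMethod`, `Dense`, `Sparse` and `multiply` are hypothetical names, and this version simply picks the first registered match instead of the most specific one:

```python
class MultiMethod:
    """Toy dispatcher: chooses an implementation from the runtime types of ALL arguments."""

    def __init__(self, name):
        self.name = name
        self.methods = []  # list of (argument types, implementation)

    def register(self, *types):
        def decorator(func):
            self.methods.append((types, func))
            return func
        return decorator

    def __call__(self, *args):
        # First registered match wins; a real system would pick the most specific.
        for types, func in self.methods:
            if len(types) == len(args) and all(
                isinstance(arg, t) for arg, t in zip(args, types)
            ):
                return func(*args)
        names = tuple(type(arg).__name__ for arg in args)
        raise TypeError(f"no method {self.name} for argument types {names}")


class Dense:
    pass


class Sparse:
    pass


multiply = MultiMethod("multiply")


@multiply.register(Dense, Dense)
def _(a, b):
    return "dense*dense kernel"


@multiply.register(Sparse, Dense)
def _(a, b):
    return "sparse*dense kernel"


@multiply.register(Sparse, Sparse)
def _(a, b):
    return "sparse*sparse kernel"


print(multiply(Sparse(), Dense()))
```

The point is that the choice depends on both arguments, so a new matrix type can add its own combinations without touching the existing ones.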

I've done a lot of work in both Python with Scipy/Numpy, and Julia. Python is painfully inexpressive in comparison. Not only this, but Julia has excellent type inference. Combined with the JIT, this makes it very fast. Inner numerical loops can be nearly as fast as C/Fortran.

Expanding on the macro system, this has allowed things like libraries that give easy support for GPU programming, fast automatic differentiation, seamless Python interop, etc.

I'm not sure how I feel about multi-dispatch...I've had a few headaches chasing down problems with multimethods in Clojure...I'd have to try using Julia full-time to see how it feels.

I was unaware that Julia was homoiconic...I'm somewhat of a Lisp fanboy so I might need to give the language another chance.

There are pretty big differences in usage between multimethods in Clojure and Julia. I've used both a decent amount. All functions in Julia are multimethods by default. If you don't use type annotations, a new method will be generated whenever you call the function with new argument types. This explicit type specialization is a very important part of why Julia can have such a consistently fast JIT despite its dynamicity.

Errors from missing or conflicting methods tend to not happen much in practice.
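A very rough Python analogy for specializing on argument types at first call, assuming nothing about Julia's real compiler: the hypothetical `specialize` decorator below only records which type signatures have been seen, where a real JIT would generate machine code for each one:

```python
from typing import Callable, Dict, Tuple


def specialize(func: Callable) -> Callable:
    """Keep one cached entry per tuple of runtime argument types."""
    cache: Dict[Tuple[type, ...], Callable] = {}

    def wrapper(*args):
        signature = tuple(type(arg) for arg in args)
        if signature not in cache:
            # A real JIT would compile code for this signature here;
            # this sketch only records that a new specialization was needed.
            cache[signature] = func
        return cache[signature](*args)

    wrapper.specializations = cache
    return wrapper


@specialize
def add(a, b):
    return a + b


add(1, 2)       # first (int, int) call: new specialization
add(1.5, 2.5)   # first (float, float) call: new specialization
add(3, 4)       # (int, int) again: reuses the cached entry
print(len(add.specializations))
```

The work happens once per new combination of argument types and is then reused, which is the trade-off being described.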


> If you don't use type annotations, a new method will be generated whenever you call the function with new argument types.

Damn. That's a pretty clever trade off between dynamic and static types.

I vaguely recall a talk by Stefan Karpinski where he mentions meeting one of the big names in compiler land (working on V8 or something) and they said Julia's JIT is just a lazily evaluated AOT compiler, and as a result much simpler than the JITs typically seen in other languages.

Forgive a bit of ignorance here, but that doesn't sound terribly different than overloading functions in C++. Am I way off on that?

The difference is run-time (semantically) vs compile time. If you had overloading at runtime in C++, you wouldn't need virtual functions or any of the thousands of OO "patterns" (visitor, factory, etc.), that are working around the lack of this capability.

> Julia is homoiconic

It depends on what you understand by homoiconic:


This is why the language creators usually avoid using the word 'homoiconic' because every time one uses that word there's a finite probability of being bogged down in an incredibly uninteresting semantic argument.

Instead, people prefer to say that Julia code is just another (tree-like) data structure in the language. It can be manipulated at runtime with functions, at compile time with macros, or at parse time with string macros, and now with Cassette.jl[1] we can even manipulate code that has already been written and shipped by other packages, all with first-class metaprogramming tools. It seems to me that even if Julia is not 'truly homoiconic', we get the touted benefits of homoiconicity to the point that the distinction seems unimportant.

[1] https://github.com/jrevels/Cassette.jl
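To show what "code is just another tree-like data structure" looks like in practice, here is an analogous sketch using Python's standard `ast` module (this is Python's mechanism, not Julia's macros, and `SwapAddForMul` is a made-up transformer for illustration):

```python
import ast


class SwapAddForMul(ast.NodeTransformer):
    """Rewrite every `+` in the tree into `*`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)          # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


# Parse source text into a tree, rewrite the tree, then compile and run it.
tree = ast.parse("result = 2 + 3 + 4")
tree = SwapAddForMul().visit(tree)
ast.fix_missing_locations(tree)

namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)
print(namespace["result"])  # (2 * 3) * 4
```

Julia's version of this is more direct because quoted expressions and macros are part of the language syntax, but the underlying idea of transforming a tree before it runs is the same.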

Then why not just say 'Julia has macros'? Lightly perusing the description of Julia's features, that seems like a clear way of expressing what it is.

(I also vaguely recall Julia describing its type system as "dependent" in a way that goes against convention. Maybe they just liked controversy in the early days!)

What I just said is why Julia people don't tend to call it homoiconic or dependently typed. It led to so many semantic arguments that most just talk about actual features such as macros and various multiple dispatch features instead of using words like homoiconicity and dependent typing.

"macros" are unfortunately used by C and lisp to describe two different things, and both usages are as widely popular as their parent languages (i.e. very).

That's a good point that it doesn't have the direct homoiconicity of Lisp, but it is not buried deep, since we have full access to the underlying AST as just an Expression type. In practice, this makes macros much less painful than in other non-lisp languages. This means that macros are used all the time, and idiomatically. The most prominent non-lispy language with a macro system that I can think of is Scala. Macros there are a nightmare, and so tend to be only used in library code that really, really needs them.

I agree that macros are better than in other non-lisp languages, but I find them ugly in Julia. Maybe not painful, but they don’t integrate seamlessly into the language.

They are ugly, I agree. Personally, I prefer lisp (especially Clojure) syntax, but not many people agree with me there.

What's actually sad (I learned this quite recently) is that the inventor of Lisp wanted to include a mechanism to evaluate formulas along the lines of FORTRAN. Something like:

(math "x = sin(a) * b ^ 2")

Lack of something like that turned me away from Lisp (LISP at the time) a (cough) long time ago, since I didn't like the impedance mismatch between Lisp code and math for numerics.

At any rate, Julia looks like a fine general purpose language, I'm going to see if I can contribute somehow. I'm hopeful that GC pauses can be avoided for large classes of Julia programs, but we'll see...

Throughout Lisp history, this has been done.

(Let's not count things like computer algebra systems written in Lisp that have some sort of actual math notation for input and output, just alternative syntax for programming in Lisp itself.)

As early as 1973, there was https://en.wikipedia.org/wiki/CGOL

That can actually be made to work today: http://abcl-dev.blogspot.com/2010/04/cgol-on-abcl.html

Please don't vilify Python or R thinking that will help with Julia's adoption or popularity. If Julia is as awesome as its evangelists say, it will gracefully displace its competitors without the need of a smear campaign.

Criticisms of languages should not be taken as criticisms of their creators or users. I don't see anything wrong with saying "Language X does Y better than language Z in my use cases."

Languages are toolsets and programmers are always looking for appropriate tools for a given task. It seems crazy to demand that people not be able to talk honestly about their experiences using a given tool. Python doesn't have feelings, it won't be sad if you don't like it.

Even if it does, what's wrong with stating an opinion? Why does an honest opinion have to be branded "smear"?

You seem to be suggesting it’s not fair to make superior comparisons to python until Julia has mainstream popularity. That’s rubbish. Python is much less expressive.

Python does have excellent libraries and it fits many people's mental model of programming, but not mine. I still use it often. However, for a dynamic language, it is simply not very expressive. It has no multi-line lambdas, OO syntax that I find a bit awkward, and some other limitations. These can be seen as advantages, since they enforce the "there's only one way to do it" philosophy, but that doesn't work for me.

Hmm ... I remember when Perl was king and Python was the shiny new thing. I watched Python supporters belittle Perl as "line noise" and other quite silly descriptions. They were all proud of "there is only one way to do it" versus the linguistic Perl way of "there is more than one way to do it."

I am experiencing something of a sense of schadenfreude over defense of Python against a newcomer. 20 years on, Perl is still in widespread use, still unbeatable for its growing use cases. I don't wish to speculate where Python will be 20 years from now. I do know that Fortran will still be around. That's about the only sure thing.

Wish advocates of all new languages stuck with this principle.

I wasn't being disrespectful of Python. We are allowed to criticize languages. The fact is, by any reasonable measure, Julia is more expressive than Python, which was all I stated. This is an advantage of Julia. Both languages have advantages and disadvantages, and comparing them honestly is necessary for people to make a good choice when selecting a language for a project. C is less expressive than Python or Julia, but no one would say C is a bad language based on that. And no C programmer would disagree with that statement.

It's not about putting any language or community down.

By your argument, it sounds like we cannot talk about any advantages a new language may have over existing ones. If we stuck to that, how could we possibly make any argument over why one may want to select a newer language?

Can you give concrete examples of where Julia is more expressive than Python? I can think of defining custom units https://medium.com/@Jernfrost/defining-custom-units-in-julia...

I give examples of expressivity in my comment above. The multiple dispatch is completely ubiquitous in Julia, being the default. This allows for the definition of flexible and extensible interfaces similar to Haskell's typeclasses. Macros are used to make things like PyCall or generate fast and flexible CUDA code.

This is not easily possible in Python, especially not with high performance.

In a nutshell: it combines the modern language features a CS person expects with the numerical excellence of Matlab.

Is it a better language, from a pure CS perspective, than Python/Scheme/pickyourfavorite? No, not really. If your job is writing web pages, Julia will be fine but it won't excel.

But if your job is writing trajectory planning routines for robots or climate simulations or so on, you aren't coming from those languages. You are coming from Matlab. Matlab has great libraries -- every numerical method under the sun. It has a very shallow learning curve. It's good for prototyping. But the actual Matlab language, as opposed to the toolboxes, is absolutely painful for writing programs longer than a few pages. As a numerical methods guy with a CS background, it literally makes me want to tear my hair out. It feels like writing in BASIC, circa 1989.

Julia can also have a shallow learning curve. If you want to prototype a couple of pages of numerical methods, it works nearly as well as Matlab does. But if you then need to expand that code into production, it can do that too.

> Is it a better language, from a pure CS perspective, than Python/Scheme/pickyourfavorite? No, not really.

Could you be thinking more of a "pure developer perspective" than "pure CS perspective" perhaps? Because if anything I find Julia's type system a lot more interesting from a computer science point of view than Python or Scheme.

If I want to get things done, I'd stick to Python for now though due to sheer momentum.

"Could you be thinking more of a "pure developer perspective" than "pure CS perspective" perhaps?"

Yes, sure. Julia is an interesting language, CS-wise. My only point, inasmuch as I had one, was that there are other languages out there that have those same features.

"If I want to get things done, I'd stick to Python for now though due to sheer momentum."

I tried python, because you're right, it has the momentum and is the obvious first choice for numerical computing after you've had your fill of Matlab. Unfortunately, for a lot of the code I write, speed matters. In large part I'm prototyping my own numerical methods, not totally relying on numpy/scipy. The last time I tried python for numerical code development I ended up with a routine that was two orders of magnitude slower than it needed to be. I rewrote the entire thing in Julia, with little to no effort towards optimizing it, and it was immediately 50x faster than python and within a factor of two of what I needed. I put a couple of days of effort into optimization and I got it down to 2-3x faster than required.

You could argue that I should have gotten better at optimizing python, and maybe you'd be right. I'm no python expert. But in my limited interaction with the python community, the answer to faster python seems to be, at the end of the day, to write the parts of the code you really need to be fast in C and call them with the python FFI. Which is what numpy does, if I understand correctly. Whereas I can write that same code directly in Julia and it's generally fast enough.

For a lot of use cases speed doesn't really matter. But for robotics (my field) or self driving cars or machine vision or a lot of other embedded applications, it does matter.

I was playing around at work with some Galois field stuff (very CS) and was very pleased to find that the LU decomposition for Julia "just worked" with the custom Galois field type that I implemented. All in all about 75 lines of highly performant code, which is what I needed because I was conducting searches over 2^32 elements.
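The "generic algorithm just works with my custom number type" idea can be sketched in Python too, though without the performance. The example below is an assumption-laden illustration, not the commenter's code: `GF` is a hypothetical minimal prime-field type with p = 7, and it uses a plain Gaussian-elimination determinant in place of a library LU routine:

```python
from fractions import Fraction


class GF:
    """Element of the prime field GF(p); minimal type for illustration only."""
    p = 7

    def __init__(self, v):
        self.v = v % self.p

    def __add__(self, o):
        return GF(self.v + o.v)

    def __sub__(self, o):
        return GF(self.v - o.v)

    def __mul__(self, o):
        return GF(self.v * o.v)

    def __truediv__(self, o):
        if o.v == 0:
            raise ZeroDivisionError("division by zero in GF(p)")
        # Fermat's little theorem: o^(p-2) is the inverse of o modulo a prime p.
        return GF(self.v * pow(o.v, self.p - 2, self.p))

    def __eq__(self, o):
        return isinstance(o, GF) and self.v == o.v

    def __repr__(self):
        return f"GF({self.v})"


def determinant(matrix, zero, one):
    """Gaussian elimination using only +, -, *, / and ==, so any field type works."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = one
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != zero), None)
        if pivot is None:
            return zero                      # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = zero - det                 # a row swap flips the sign
        det = det * m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] = m[r][c] - factor * m[col][c]
    return det


rows = [[3, 4], [2, 5]]
over_rationals = determinant([[Fraction(x) for x in row] for row in rows], Fraction(0), Fraction(1))
over_gf7 = determinant([[GF(x) for x in row] for row in rows], GF(0), GF(1))
print(over_rationals, over_gf7)
```

The same `determinant` function gives 7 over the rationals and zero over GF(7), since 7 is congruent to 0 mod 7. In Julia the equivalent generic code would also be compiled to fast specialized code for the custom type, which Python does not do.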

Did you ever try Magma? I graduated in pure maths from University of Sydney where Magma was born. It was great for doing Galois theory stuff.

Does magma compile down to highly efficient code?

That's cool! What were you using the Galois fields for?

To name a few: beautiful type system, optional typing, multiple dispatch...

edit: as another comment mentioned, metaprogramming is also great!

Julia is regarded as similar in syntax to python, or at least they have similar readability. What I love about it though (fyi not a CS degree person here) is the elegant way it handles types, and generics. User defined types perform just as fast as the built-ins, and I never seem to labor over design choices needed to generate fast computation.

Multiple dispatch is such a great way not to have to write boilerplate code for every method, somehow java seems very awkward in comparison.

Julia feels like a thoughtfully designed and carefully constructed language. I don't have the training to understand why exactly.

I really like the multiple dispatch design.

Congrats! JuliaCon 2019 registration also just opened (22 July, Baltimore)


Julia deserves way more attention than it's currently getting. I'm hoping now that it's past 1.0, a few big-name companies will start using it and raise its profile.

Attention is better in small doses, I think it's clear to many that Julia is a strong project, let it grow peacefully.

The fact that you have numpy-like matrix manipulation directly out of the box is really cool, and it's even better than numpy because of the dot operator for element-wise operations, which is just awesome.

The authors claim Julia is a general programming language, but after a few tries I am not totally sure about that. Yes, it's a nice programming language for scientific computing but I am afraid its usefulness stops there.

Matlab is (was?) the default tool for computational science, and Julia borrows lots from it. I'd say it's safe to put Julia in the same category with Matlab and Mathematica

People are using JavaScript or Python in real production. How are those general purpose while Julia is not?

What about the Julia 1.0 language design do you find prevents general purpose programming?

I think he's just trolling...

Conceivably you can see a backend server being built in Julia e.g. https://github.com/JuliaWeb/Mux.jl

But I can't imagine anyone would choose Matlab or Mathematica to build a backend...

>Conceivably you can see a backend server being built in Julia e.g. https://github.com/JuliaWeb/Mux.jl

>But I can't imagine anyone would choose Matlab or Mathematica to build a backend...

Julia is a clean, concise language that has so far mainly been focused on numerics. People are looking at using it in many other areas, for instance real-time robotic control: http://www.juliarobotics.org/

This means that at least some Julia folk are looking at ways of avoiding garbage collection during critical code sections...

Data center efficiency is a concern, and there is real interest in better languages for web applications. Rust and Swift both have projects exploring this area. Julia would fit well in that space, and in my opinion its syntax is much cleaner than either.

I'm always looking for good computer tools to assist with learning Math with computers. I found this: [1].

Are there any other resources like this to learn fundamentals of Math with Julia? ([Linear] Algebra, Calculus, Geometry, Basic Physics, Discrete Math).

1: https://calculuswithjulia.github.io/

This course looks fantastic! Thanks for sharing :-)


I'll say only one thing: 1 based indexing.

Jokes aside, congratulations to Jeff Bezanson, Stefan Karpinski, and Viral B. Shah, well deserved!

>I'll say only one thing: 1 based indexing.

You mean you'll say only zero thing?

Oh well, can't let this pass...

Introducing TwoBasedIndexing.jl (https://github.com/simonster/TwoBasedIndexing.jl)

The point is, n-based indexing isn't really that big a deal.

Want to drive adoption? Introduce it in schools.

Evil but true. I think the easiest way in would be to rally up some university students into starting a "Julia Club" or maybe even a "Julia Evangelism Strike Force".

Even further than that, you need faculty to teach with it. I know I use R a lot primarily because that's what I learnt at Cal, it was easy, and it worked. Things are probably shifting towards Python in schools now, but I think that's the best way you're going to get a lot of users using it persistently in their careers.

As a current R user, and former SAS user, can someone give me an elevator pitch as to why Julia is worth my time? I use a lot of the tidyverse and some more niche epidemiology/stats packages in R, I'm also pulling in datasets on the order of 1-10 million rows from MS SQL.

The elevator pitch is: it's an easy-to-use language, you can vectorize things if you want, but if you hand-write loops they will also be very fast. Whereas in R, hand-writing loops brings you to the 3rd circle of hell [1].

The more general version of the elevator pitch is that being in an environment where all the code is written in one high-level language that is still fast is qualitatively different. You can combine things in all kinds of strange ways with surprisingly little work. E.g. the DiffEqFlux package takes about a hundred lines of code to glue together a GPU neural network library and some very advanced differential equation solvers.

Realistically, though, it sounds like you are currently hitting all of R's strongest points. I suspect right now you would be pretty dissatisfied with Julia.

1: https://www.burns-stat.com/pages/Tutor/R_inferno.pdf

One use case: Do you want to do interesting cutting-edge modeling of disease propagation dynamics or other complex systems simulation? Then check this out: https://julialang.org/blog/2019/01/fluxdiffeq

Here's context : https://jontysinai.github.io/jekyll/update/2019/01/18/unders...

There's nothing else like it.

Also here's the julia equiv to dplyr: https://github.com/queryverse/Query.jl

On top of what else has been said: you can call R from within Julia with almost negligible overhead via RCall.jl[0], so you can still use code you have already written.


Wow, congratulations to the Julia team!

The prize lecture will be given at the SIAM Computational Science & Engineering conference in a little over a month (https://www.siam.org/conferences/CM/Main/cse19). SIAM hasn't been doing live streams, but does sometimes put up video or audio+synchronized slides a little (a couple months?) after the conference, in case anyone is interested.

Julia is interesting, but the block syntax is a little off-putting. I'm not sure why someone would design a language that uses 'end' to delineate a block. Curly brackets make sense. Tabs make sense. But 'end'? All it does is make code harder to read and harder to write.

Another compelling argument is that {}, while very easy to type on an English keyboard, is highly awkward on keyboards for almost any other language in the world. E.g. I'm Danish and on my Mac { involves flexing the right thumb down to hit right-alt while typing 7 with my middle finger and holding down shift with my left pinky. `end`, though, being actual letters, is completely easy and fast to type (in languages that use the Latin alphabet).

Ruby and Elixir both made the same decision, and I don't feel like it's had an actual negative impact, for me at least; I've found both languages to be delightfully readable. I haven't dabbled very much in Julia since the pre-1.0 days, but I don't recall it being any more or less jarring.

I suppose it could indeed be jarring if you're not already used to languages which use 'end' to terminate blocks, but after a while your brain will just start to treat 'end' synonymously with '}' and all will be right with the world.

I mean, maybe it's not a big deal. But what value does it add? What if we had a 50 character string of random digits be the terminating identifier for a block? What about 100 characters? What about 200? Syntax shouldn't create visual noise, it should be as lightweight as possible.

Maybe someone can justify why a 3 character word (end) doesn't really matter. But can someone give an argument on how it adds value versus one character (a bracket), or zero characters (visible at least), a tab?

Because 'end' has a specific meaning in English, and that meaning translates well to ending a block. If you decide you want a programming language geared toward Spanish speakers, 'end' would indeed be gibberish (unless they also speak English), but 'fin' would not be.

You can certainly argue that '}' is language-agnostic and therefore a better choice, and I wouldn't disagree with you. But 'end' is by no means an arbitrary choice for a character sequence that ends a block.

it's easier to read.

If every sentence ended in an "end" instead of a period, would you posit that that would be easier to read as well?

The crux of the issue is the signal to noise ratio. The more noise there is (syntax), the more difficult it is to pick up the signal (the semantics).

A sentence is not code! By your logic, when do you see {} in prose?

begin
    some code
    some more code
end

It's easier to read.

Because Pascal. And as Pascal was used as a teaching language, I don't think your 'make code harder to read and harder to write' argument holds. But certainly nowadays 'end's are not seen often; maybe this has an influence in your statement?

Why do you think that "end" is superior to more lightweight syntax? One might argue that using "end" doesn't really matter, but I am extremely skeptical one could argue it is superior to a curly bracket or a tab in any way at all.

PS: And in regards to your last statement, maybe the reason "end"s aren't seen very much anymore is that language designers agree with me, not that my opinion is somehow influenced by the syntax of modern languages.

If you just think about it, that is much more likely than what you are proposing.

Given the historical precedent of Pascal, Ruby, Matlab and others, using `end` to close blocks is not exactly controversial or uncommon. 5 of the current top 20 languages on the Tiobe Index use `end` to close blocks versus 11/20 for curly braces and 1/20 for indentation. Most of the `{}` languages are C descendants: using `{}` for blocks is popular because C has been wildly successful, not the other way around. Using `{}` for blocks wasn't some brilliant language design choice that caused the success of C—it was a fairly mundane choice that happened to be made by a language that succeeded for entirely unrelated reasons and the syntax spread from there.

As to the benefits of not using `{}` for blocks, pairs of ASCII brackets are one of the most precious commodities in programming language syntax design. Wasting them on blocks is not ideal. For example, C++ was out of brackets to use for template parameters, and had to use `<>` instead—despite the syntactic clash with their meaning as less than and greater than operators. This has caused no end of parsing problems and irregularities in C++ and related languages and has only been made to work because they're all static languages with separate "expression context" and "type context" and inequality operators don't make sense in type contexts. In a dynamic language, there's only one context and you can't distinguish the use of `<>` as brackets from its use as inequality operators based on that. Julia uses `()` for function calls (pretty important), `[]` for array indexing (also pretty crucial), and `{}` for type parameters.

Using indentation is a fairly clever choice but not without its drawbacks. It causes lots of problems with cutting and pasting code. It means that your IDE cannot generally autoformat or autoindent your code for you. There are also many people who find the "trailing off into space" visual appearance of Python code unsettling and prefer the symmetry and closure provided by `end` or other block delimiters.

I actually agree with you on the last part, I do find python's tab indention unsettling. I just think it's better than using "end"s.

In regards to your use of brackets, array indexing and function calls can use the same syntax: Lisp and Scala seem to manage just fine. There's a lot of unnecessary syntax in a lot of languages that doesn't really serve any purpose (besides historical).

I'm not trying to argue about any particular syntax. Just that, syntax should be as concise as possible and not include extra identifiers that confuse the eye.

As an aside, I also think that programming languages should probably use fewer words and more symbols, as it lets people who are familiar with other spoken languages adopt them more easily.

Also, using more keywords (like end), restricts the name space of variable names.

> In regards to your use of brackets, array indexing and function calls can use the same syntax: Lisp and Scala seem to manage just fine. There's a lot of unnecessary syntax in a lot of languages that doesn't really serve any purpose (besides historical).

Yes, Matlab does this too. However, not syntactically distinguishing array indexing from function calls has problems in a language that provides as much array functionality as Julia does. The classic example in Matlab is that when the interpreter sees `a(b(c(end)))` it needs to dynamically look up whether `c` is an array or not to decide if the `end` refers to the length of `c` or not; if not, then it has to check if `b` is an array or not, etc. That would be even worse in Julia since the question of "is it an array or not" can be somewhat fuzzy. It's perfectly possible to implement an array-like type that uses indexing syntax and does not subtype AbstractArray. In Julia, when you see `log(v[end÷2])` it's syntactically unambiguous that this means `log(v[length(v)÷2])` which is only possible because of the distinct syntax for array indexing and calling functions. There are other cases where this distinction is essential as well, such as broadcasting [1].

[1] https://docs.julialang.org/en/v1/manual/arrays/index.html#Br...

> I actually agree with you on the last part, I do find python's tab indention unsettling. I just think it's better than using "end"s.

Ok, well we made the opposite judgement—we found `end` preferable to indentation. When you create a language you get to pick syntax that appeals to you.

> I'm not trying to argue about any particular syntax. Just that, syntax should be as concise as possible and not include extra identifiers that confuse the eye.

> As an aside, I also think that programming languages should probably use fewer words and more symbols, as it lets people who are familiar with other spoken languages adopt them more easily.

> Also, using more keywords (like end), restricts the name space of variable names.

You may appreciate K and other APL derivatives, although they are not known for their readability. But they certainly don't favor English speakers over anyone else.

>However, not syntactically distinguishing array indexing from function calls has problems in a language that provides as much array functionality as Julia does.

The best example is probably the differential equation solution type in Julia. When you have a solution `sol`, the interface gives you `sol[i]` as the values in the solution's time series and `sol(t)` as the continuous solution of the differential equation. These are natural ways to describe both the discrete array-like and continuous function-like nature of an ODE's numerical solution, and that syntax wouldn't be possible without distinguishing a function call from indexing.
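
To make the `sol[i]` versus `sol(t)` distinction concrete, here is a rough Python sketch of the same idea. The `Solution` class, its fields and the linear interpolation are all assumptions made up for illustration; this is not how the Julia differential equations packages implement it, and note that Python indexes from 0.

```python
from bisect import bisect_right

class Solution:
    """Hypothetical container for a sampled ODE solution (illustration only)."""

    def __init__(self, ts, us):
        self.ts = list(ts)  # sample times, assumed strictly increasing
        self.us = list(us)  # solution values saved at those times

    def __getitem__(self, i):
        # Discrete, array-like view: the i-th saved value of the time series.
        return self.us[i]

    def __call__(self, t):
        # Continuous, function-like view: linear interpolation between samples.
        if not self.ts[0] <= t <= self.ts[-1]:
            raise ValueError("t is outside the solved interval")
        j = min(bisect_right(self.ts, t), len(self.ts) - 1)
        t0, t1 = self.ts[j - 1], self.ts[j]
        u0, u1 = self.us[j - 1], self.us[j]
        return u0 + (u1 - u0) * (t - t0) / (t1 - t0)

sol = Solution([0.0, 1.0, 2.0], [1.0, 3.0, 7.0])
last_sample = sol[2]   # indexing: a stored value
midpoint = sol(0.5)    # calling: a value interpolated between samples
```

Because brackets and parentheses are different syntax, one object can offer both views without any ambiguity about which one is meant.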

I love the use of end for blocks! It avoids the arguments about brace placement on one hand and problems with space sensitive indention on the other. It is simple, consistent, easy to read and write.

Yes, easy to write. Seriously, I can type short words like end just as quickly as I can type shift-}, two keys off in the extremes of the keyboard layout.

I think it gets a bad reputation from languages like bash that have a horribly inconsistent mix of `fi`, `done`, `esac`, and require opening words like `then` and `do`, which people can argue about where to place.

end worked great in pascal and matlab, and it works great in julia as well.

I mean, what about auto-generated code? Not only does using words forbid potential variable names, but it makes code generation more difficult as well (as do tabs). A good positive example is Haskell, which uses indentation for normal block delineation, but also allows curly brackets (which are usually only used for auto-generated code).

By sharing block delineation with variable namespaces, the language creates a whole bunch of problems (a block identifier is essentially a newline followed by a “end” vs the much simpler “}”).

Also, the more English-centric a language becomes, the harder it is to learn for non-English speakers.

Moreover, words are used for variable names, while symbols are disallowed in many/most languages. By having block delineators share the variable namespace, you detract from the readability of the language (as well as restricting variable names as I mentioned earlier).

One might ask: Does any of this matter in the scheme of things? The fair and honest answer is: not really, it’s a minor syntactic difference. But I firmly believe using words to end blocks is still inferior, even if it’s not a big difference.

Autogenerated code is very easy in Julia — there is no string processing at all. You can splice things directly into quoted code (working at the literal syntax level) or manually modify the `Expr` representation in a lispy manner (in which case there aren't any `end`s at all). Code generation is _really_ easy, thanks to its lispy inspiration.

Block delineation in Julia doesn't require the newline. You can just throw a ` end` at the end of the line. Often you'll see folks put the `;` line delimiter there, too, to make things super-clear (as that's generally how you delineate multiple expressions on the same line), but it's not required for an `end`.

And while it is indeed a little sad that you can't use `end` as a variable name, we make the most of it by using it to also represent the last index in an indexing expression like `A[end]`.

So, yeah, not a big deal at all, but very few of your detractions even apply in the first place. :)

Because MATLAB. And I don't think that's a very compelling argument.

The problem with Julia is not (only) 1-based array indexing, but its use of ranges. Closed ranges are not composable. With Python-style ranges, for a lot of operations (mean, sum, ...) I could define a monoid, provide composition rules and be done with it. WTF am I supposed to do with Julia ranges?!

What do you mean by composition of closed ranges?

Suppose I have to compute the sum for an array of length N. In Python it is something like

    s = 0
    for k in range(0, N):
        s += a[k]

Ok, I have more cores/CPUs, could I do it in parallel? Sure

    v1 = Sum(0, N // 2)
    v2 = Sum(N // 2, N)
    s = v1 + v2

I even could do it on asymmetric cores (like most phone CPUs today)

    v1 = Sum(0, K)
    v2 = Sum(K, N)
    s = v1 + v2

I could do a bit more complicated things, like calculating means. For that, I have to make a monoid

    def mean(start, stop, a):
        # `from` is a reserved word in Python, so the bounds are named start/stop
        m = 0
        for k in range(start, stop):
            m += a[k]
        N = stop - start
        return (m/N, N)

Here I'm returning tuple, and for that to be a monoid I have to state composition law:

    def compose(M1, M2):
        A1, N1 = M1
        A2, N2 = M2
        N = N1 + N2
        return ((A1*N1 + A2*N2)/N, N)

If my identity is (0,0) tuple, it is quite easy to verify that indeed I have a monoid. Nice, simple, composable ranges.
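
For anyone who wants to check the monoid claim, here is a self-contained sketch along the same lines. The names `mean_part` and `compose` are just illustrative, and the explicit guards for the empty case are an assumption added here so that the (0, 0) identity does not trigger a division by zero.

```python
def mean_part(a, start, stop):
    """Mean over the half-open range [start, stop), returned as (mean, count)."""
    n = stop - start
    if n == 0:
        return (0.0, 0)  # empty range: the identity element
    return (sum(a[start:stop]) / n, n)

def compose(m1, m2):
    """Combine two (mean, count) results from adjacent ranges."""
    a1, n1 = m1
    a2, n2 = m2
    n = n1 + n2
    if n == 0:
        return (0.0, 0)  # identity combined with identity stays the identity
    return ((a1 * n1 + a2 * n2) / n, n)

a = [2.0, 4.0, 6.0, 8.0, 10.0]
k = 2  # any split point works; the halves need not be equal
whole = mean_part(a, 0, len(a))
parts = compose(mean_part(a, 0, k), mean_part(a, k, len(a)))
```

The same `k` is the end of the left range and the start of the right one, which is what makes the two calls line up without any adjustment.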

For closed [1...N] ranges this is NOT nice, NOT simple, just plain fugly exercise. Sorry, looks like code samples are screwed up a bit, don't know how to fix it

But why should you have to care, why not just sum(range)? Or

    @threads for i in eachindex(object) 
and let the details about how many cores be written once, correctly, elsewhere?

> But why should you have to care, why not just sum(range)?

Sure, if you have single infinitely fast CPU. In real life, we have to decompose problem, run it in parallel and compose it back.

> Or @threads for i in eachindex(object) & let the details about how many cores be written once, correctly, elsewhere?

Here I'm explicitly talking about details, how it should be done, importance of composition, monoids etc. I understand that if someone did it for you, you might not care - well, more power to you then. But some basic rules about Python ranges are pretty good:

1. [0...K) + [K...N) = [0...N)

2. Given range [K...N), number of elements in the range is N-K, in case of [0...N) number of elements is N-0=N.

Simple, elegant, composable. I can't say the same about Julia
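
Both rules can be checked directly with Python's built-in `range`; the values of `N` and `K` below are arbitrary picks for the sketch.

```python
N, K = 10, 4  # arbitrary example sizes

left = range(0, K)    # [0...K)
right = range(K, N)   # [K...N)
whole = range(0, N)   # [0...N)

# Rule 1: adjacent half-open ranges concatenate with no overlap and no gap.
rule1 = list(left) + list(right) == list(whole)

# Rule 2: the number of elements is simply stop - start.
rule2 = len(right) == N - K and len(whole) == N
```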

1-indexed ranges can still be composable if we define them as open on the right at N+1 (which is how I imagine one would want to define things in their implementation).

[1...N] = [1...N+1) = [1...K+1) + [K+1...N+1)

So the real loss is your point (2) - which, I agree, makes the implementation much less elegant and simple.

> 1-indexed ranges can still be composable

You could make them work, but the improper abstraction leaks out in so many fugly ways. E.g. in Python you could compose functions, not only ranges. What do you pass in? Simple, [0, len(a)). What do you get out? Simple, len(a). So you could operate on ranges by composing function calls like in FP. It even works for sequences/streams whose length is unknown beforehand: just count along the sequence how many events you processed, return that as the length, and it can go into the composition function. With Julia you have to think about what is passed and what is returned. Shall I always return len(a)? Or maybe len(a)+1? What to return in case of streaming events? Sometimes len(a) and sometimes len(a)+1, with tons of comments and warnings? It is not a good way to deal with all that and not a good way to build an API. It could be done and probably was already done, sure. But simplicity, elegance and composability are missing in the base design.

I don't buy it. Inclusive ranges are much more intuitive than left side inclusive and right side exclusive ranges. And they compose just fine; you just have to add a plus one at the right places. E.g.

    function mean(a, l, r)
        s = 0
        N = r-l+1
        for k in l:r
            s += a[k]
        end
        s/N, N
    end
    function mean2(a)
        N = length(a)
        lr1 = (1,div(N,2))
        lr2 = (div(N,2)+1,N)
        v1, N1 = mean(a, lr1...)
        v2, N2 = mean(a, lr2...)
        N = N1+N2
        (v1 * N1 + v2 * N2) / N, N
    end

    function compose(M1, M2)
        v1, N1 = M1
        v2, N2 = M2
        N = N1+N2
        # return (mean, count) so the result can itself be composed again
        (v1 * N1 + v2 * N2) / N, N
    end
    function mean3(a)
        N = length(a)
        lr1 = (1,div(N,2))
        lr2 = (div(N,2)+1,N)
        compose(mean(a, lr1...), mean(a, lr2...))
    end
You have to think about what is passed in and out anyway. Arguments of elegance etc. are most of the time personal preferences and very biased.

> Arguments of elegance etc. are most of the time personal preferences and very biased.

No, they are not. I'm talking about ability to build (and build upon) underlying algebraic structure.

ok, could you make a monoid out of your/Julia closed ranges? What would be a unit in this monoid? How would you make a null/empty range?

Empty range is 1:0 or a+1:a or a:a-1 for any integer a. Closed ranges (for integers) are equivalent to open ranges since [a,b] is the same as [a,b+1). You just put the plus ones in the right places.
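
The "plus ones in the right places" equivalence can be sketched in Python by layering an inclusive range on top of the built-in half-open one. The helper name `closed` and the values of `N` and `K` are made up for this example.

```python
def closed(a, b):
    """Inclusive integer range [a, b], built on Python's half-open range."""
    return range(a, b + 1)

N, K = 10, 4  # arbitrary example sizes

whole = list(closed(1, N))
parts = list(closed(1, K)) + list(closed(K + 1, N))  # [1, K] followed by [K+1, N]
empty = list(closed(1, 0))                           # analogue of Julia's 1:0
count = len(closed(K + 1, N))                        # b - a + 1 elements
```

The split still composes; the cost is that the right-hand piece starts at `K + 1` and the element count is `b - a + 1` instead of `b - a`.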

Use them like the first-class vectors they are. You can use _any_ array manipulation on ranges.

please take a look at my rant below, do not want to write it twice

Your rant comes down to a distaste of +1s in certain situations. My point is that Julia's emphasis on arrays allows you to compose ranges with mathy operations in very powerful ways. You're concerned about composing APIs that take endpoints as arguments — I say you should just accept a range in the first place!

Say I have an index `idx` into a very large array and I want to select a symmetric window of `N` indices before and after that index. In Julia:

    A[idx .+ (-N:N)]
In Python:

Python cannot even index lists by ranges (or other lists). And if I go out of bounds, well, it'll happily just give me back a list of a size I didn't expect.

Heck, in Julia I can even easily determine what the OOB behavior should be. It's an error above, but I can easily change it. This, for example, will ensure no indices go out of bounds by repeating the final endpoint as necessary.

    A[clamp.(idx .+ (-N:N), 1, end)]
Sometimes you want to work in terms of the fenceposts, and sometimes by the length of the fence. There certainly are cases where 0-based offsets are nice — or even arbitrary offsets. In those cases we have OffsetArrays.
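
For comparison, a plain-Python sketch of the clamped window, next to what an ordinary slice does at the edge. The helper `clamped_window` is hypothetical, and indices here are 0-based as usual in Python.

```python
def clamped_window(a, idx, n):
    """Return a[idx-n .. idx+n], repeating the end elements when out of bounds."""
    last = len(a) - 1
    return [a[min(max(i, 0), last)] for i in range(idx - n, idx + n + 1)]

a = [10, 20, 30, 40, 50]

inner = clamped_window(a, 2, 1)   # window fully inside the list
edge = clamped_window(a, 4, 2)    # window runs past the end, last element repeats
sliced = a[4 - 2 : 4 + 2 + 1]     # a plain slice silently truncates instead
```

The clamped version always returns `2 * n + 1` elements, while the slice quietly returns fewer near the boundary.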

It is not about distaste, though the idea of keeping and carrying a 1 around is a recipe for bugs. It is all about composability.

Could you make proper monoid out of Julia ranges? What would be a unit in that monoid? Could you tell me how empty range looks in Julia (in Python/C++ it looks trivial)?

> Python cannot even index lists by ranges (or other lists)

sure, Python has deficiencies as well.

C++ STL iterators/ranges are made in the same way, and for a good reason - composability first. It is described very well in Alex Stepanov's book (https://www.amazon.com/Mathematics-Generic-Programming-Alexa...), as well as how it helps when we move to parallel algorithms (http://stepanovpapers.com/p5-austern.pdf).

It is even more explicit in the upcoming Ranges library in C++20.

An empty range in Julia is `1:0`. So yes, they can form monoids just fine. I highly recommend you work with them _directly_ instead of carrying around the two endpoints separately. For example, you can divide any range into two parts with:

    split(r) = r[1:end÷2], r[end÷2+1:end]
That'll happily recurse and eventually spit out empty ranges given an input like `split(32:56)`.
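
Python's `range` objects can also be sliced directly, so a rough equivalent of that `split` looks like the sketch below. The `leaves` helper is an assumption added only to show the recursion bottoming out.

```python
def split(r):
    """Split a range into two halves; slicing a range yields another range."""
    mid = len(r) // 2
    return r[:mid], r[mid:]

def leaves(r):
    """Recursively split until only single-element or empty ranges remain."""
    if len(r) <= 1:
        return [r]
    left, right = split(r)
    return leaves(left) + leaves(right)

left, right = split(range(32, 57))   # same values as Julia's 32:56
pieces = leaves(range(32, 57))
tiny_left, tiny_right = split(range(5, 6))  # splitting one element leaves an empty half
```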

A zealous adherence to one strategy or another will simply blind you to the ways in which the other might be easier at different times.

> Python cannot even index lists by ranges (or other lists)

True for python lists, but not true for numpy arrays.

Oh, I'm well aware. That gets at my point, though: Julia's ranges are _just_ arrays, and behave like all other arrays in the language.

Agreed. The fact that python has several 'array-like' data structures with subtly different interfaces is not my favorite feature of the language.

Multi-methods really are a lot of fun, but I'm missing the Trait system (instead of the object hierarchy) that makes Rust so modular. I know that there are some tries to put it into Julia, but none of them is well supported.

I thought Tim Holy's Trait Trick was considered the de facto way to use traits in Julia?


Maybe it works great, but the syntax doesn't feel native to me, and I don't see it used in the standard library, which means that it's not embraced by the language developers.

Holy Traits are used extensively in the standard library (for iteration and collections among other things).

Not many folks use the macros that those macro-trait packages define, but the technique itself is in widespread use throughout the core language and standard library. That's where it was first posited and implemented, in fact: https://github.com/JuliaLang/julia/pull/8432 (and see the link to the "trait trick")

Great, thanks!

With computational power becoming cheaper and cheaper - will we still have the two-language problem?

Computational power doesn't become cheaper and cheaper.

If anything it's the inverse:

Not only has Moore's law stopped working for several years now ([1]), but we have also greatly increased what we do with computers (e.g. the whole dataset of an 80s company could fit on a 1GB disk -- today we routinely need to process terabytes of data both streaming and offline -- see also 4K video, webasm vs simple websites of 1999, VR, AR, and so on).

But even if computational power increased continuously and became cheaper and cheaper it would only help if our data and processing needs remained static or increased at a significantly slower pace.

[1] https://steveblank.com/2018/09/12/the-end-of-more-the-death-...




I don't know if I'd call processing terabytes of data "routine" even in this day and age. Same deal with 4k video, VR/AR, etc. Yeah, some particularly large or cutting-edge companies might be doing that, but the average business will likely not have nearly as extreme of needs.

Re: the "webasm vs simple websites of 1999" factor, I think that's a case of designers (and the programmers enabling them) self-inducing such problems. Most business sites don't actually need all those newfangled features and tools; plain old HTML5+CSS3 (plus maybe a little bit of JS here and there) is more than good enough.

The target of Julia is computational sciences. Sure, for a web app you don't care if your language is 30x slower: 1ms or 30us, who cares? You can just rely on the raw speed of your overpowered computer and enjoy your favorite language.

But when it's between waiting 1 day or 1 month for your results you start caring quite a lot

Sometimes the wait isn't the computational time wait, but the developer time wait, too.

Yes, in the sciences and engineering faster computers = larger, more accurate models one can feasibly study. So while the old models might run faster, many researchers will transition to solving more complicated problems that become just as slow as what we solve now.

We're not going to have a computer that can solve the Schrodinger Equation for a brain anytime soon.

Yes (if you're asking what I think you're asking). Computers are never fast enough. Even if they get 1000x faster, we'll want to squeeze every drop of performance out by using C or whatever.

As someone who remembers when C compilers for home computers weren't able to compete with junior devs hand-writing Assembly, it always feels interesting to see these kinds of remarks.

Thankfully the changes in modern hardware architectures have triggered many runtime devs to start having another look at their compiler backends, improving vector instructions support and access to GPGPU features, instead of sticking with just good enough for CRUD apps.

I personally don't like the fact that you have structs but no classes and no instance methods, but I know this is personal and don't want to start a religious war this morning

Don't worry, someone started a war over the 1-indexing so you likely won't get enough attention that it counts as a war :)

Multi-methods are much more powerful than instance methods, but yeah, let's agree to disagree. :)

I feel like they're a significant departure from the OO concept of sending a message to an object (for example, "Foo ! bar()" in Erlang or "Foo.bar()" in a more typical OO language would be interpreted as "send message bar() to object/actor/whatever Foo"). Julia's approach is more "call this function on this object", which is indeed perfectly useful, but is semantically different.

That OO concept is pretty much Simula specific; Lisp languages got influenced by the work done in LOOPS and Flavors, which led to the design of CLOS, the origin of the multi-methods idea.

Building on top of your remark, Julia's approach is more "call the visible implementation of this function that better matches the types of all given parameters".

And these are only two possible ways of doing OOP; there are a few other ways of approaching the ideas of writing extensible modular code with polymorphism.

That OO concept is also what Alan Kay et al. were envisioning when they coined the term "object-oriented". It's all about message passing between blackboxed instances with isolated/private states. Multi-methods and such are cool and useful, but they're kinda secondary to that core concept (which Julia doesn't really have, to my knowledge at least).

In other words: object orientation != strong typing (even if they do tend to go hand-in-hand).

I never said that object orientation == strong typing, rather that Smalltalk/Simula OOP isn't the only way of doing OOP.

CLOS, Beta, SELF, Oberon (the first version), Component Pascal all have explored different ways of doing OOP.

Sorry to continue, but could you elaborate on that? I understand both concepts, but not their relative merits...

Instance methods are bound to a specific object definition, thus they can only be applied to instances of that specific object or its descendants.

Also, in static languages the set of available instance methods is closed, and extending it requires approaches like extension methods, now available in a couple of OOP languages.

Whereas multi-methods are dynamically dispatched taking into consideration the whole set of parameters, and can be defined after the fact. Meaning that they aren't necessarily written in the same module that defines the respective data structures, which opens the door to more extensible designs.

Taking Alan Perlis's quote "It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." to the next level.

Because not only can you define all those functions that operate on one data structure, they actually apply to the tuple made by the dynamic type of all parameters at call site.

Naturally it also might make it harder to follow what actually happens with a given method dispatch.
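
A toy illustration of dispatching on all arguments, written in Python since it has no built-in multi-methods. The registry, the `method` decorator and the shape classes are all invented for this sketch, and the lookup simply walks both class hierarchies in order; it does not reproduce the specificity and ambiguity rules of CLOS or Julia.

```python
_methods = {}  # maps (name, type_a, type_b) -> implementation

def method(name, type_a, type_b):
    """Register an implementation of `name` for a pair of argument types."""
    def decorator(fn):
        _methods[(name, type_a, type_b)] = fn
        return fn
    return decorator

def dispatch(name, a, b):
    """Pick an implementation from the runtime types of BOTH arguments."""
    for ta in type(a).__mro__:
        for tb in type(b).__mro__:
            fn = _methods.get((name, ta, tb))
            if fn is not None:
                return fn(a, b)
    raise TypeError(f"no method {name} for {type(a).__name__}, {type(b).__name__}")

class Shape: pass
class Circle(Shape): pass
class Square(Shape): pass

@method("overlap", Shape, Shape)
def overlap_generic(a, b):
    return "generic shape test"

# Defined after the fact, outside the classes, possibly in another module.
@method("overlap", Circle, Square)
def overlap_circle_square(a, b):
    return "circle-square test"

specific = dispatch("overlap", Circle(), Square())
fallback = dispatch("overlap", Square(), Circle())
```

Neither `Circle` nor `Square` had to be edited to gain the specialised behaviour, which is the extensibility point made above.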

This is my insight from having gone through "The Art of the Metaobject Protocol" and dabbling with multi-methods in Clojure, so I might not be 100% correct here.

You mean you don't like that the methods are declared outside of the struct?

Or that you have to say foo(x) instead of x.foo()?

The latter makes writing code using autocomplete less doable, but reads more naturally in many cases.

I can't think of compelling pragmatic reason to mind the former, but if you have one I'd like to hear it.

Pipes are a nice compromise perhaps: x |> foo. You get the clear left-to-right flow and easy autocomplete.

Class methods provide namespace and scope.

I think this mainly makes a difference with regard to reading the code. I'm looking at a struct in someone's code and ask myself "How does this thing get used?" With classes I'll be able to see its methods right away, and if it had inherited then I could see right away which parent to go check.

But the problem is not that big. Maybe the methods are right next to the struct, or if I search the codebase for the struct I can find which methods use it. Still, would be cool if Juno had some feature like "show which methods use this."

With multiple dispatch, if a library has a function that just works on abstract array types, then you as the user could write your own sub-type of the abstract array and it will still work with the library function.
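
The "write your own subtype and the library function still works" part can be approximated in Python with an abstract base class. This shows only the abstract-type aspect, not multiple dispatch, and `library_mean` and `Squares` are names made up for the sketch.

```python
from collections.abc import Sequence

def library_mean(xs: Sequence) -> float:
    """Stand-in for a library function written against the abstract type."""
    return sum(xs) / len(xs)

class Squares(Sequence):
    """User-defined sequence: element i is i**2, computed on demand."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i

# The library code never heard of Squares, yet it works: (0 + 1 + 4 + 9) / 4
result = library_mean(Squares(4))
```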

Julia needs something like IPython/Jupyter but without Python.

I think this is what you're looking for: https://github.com/JuliaLang/IJulia.jl

The "Ju" in Jupyter is for Julia. Do you mean you want Jupyter without the Python support?

I was curious about this and it turns out you're (partially) right: https://github.com/jupyter/design/wiki/Jupyter-Logo#where-do...

Yes, this is why I said "without Python".

Wouldn't it be easier to just use Jupyter and not use the Python-supporting stuff?

Just because Visual Studio supports Visual Basic doesn't mean I can't use it when I'm only interested in writing things in C#.

It already has it. Jupyter has been multi-language for ages...

That might be possible with nteract (Javascript, not sure if pure) and the IJulia kernel.

HackerNews’ traffic broke this website’s database:


500 - Internal server error.

SQL Exception

Error Details


Error Index #: 0

Source: .Net SqlClient Data Provider

Class: 17

Number: 1105

Procedure: AddEventLog

Message: System.Data.SqlClient.SqlException (0x80131904): Could not allocate space for object 'dbo.EventLog'.'PK_EventLogMaster' in database 'DNN-PROD' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup. at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction) at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption, Boolean shouldCacheForAlwaysEncrypted) at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry) at 
System.Data.SqlClient.SqlCommand.ExecuteNonQuery() at PetaPoco.Database.Execute(String sql, Object[] args) at DotNetNuke.Data.PetaPoco.PetaPocoHelper.ExecuteNonQuery(String connectionString, CommandType type, Int32 timeout, String sql, Object[] args) at DotNetNuke.Data.SqlDataProvider.ExecuteNonQuery(String procedureName, Object[] commandParameters) at DotNetNuke.Data.DataProvider.AddLog(String logGUID, String logTypeKey, Int32 logUserID, String logUserName, Int32 logPortalID, String logPortalName, DateTime logCreateDate, String logServerName, String logProperties, Int32 logConfigID) at DotNetNuke.Services.Log.EventLog.DBLoggingProvider.WriteLog(LogQueueItem logQueueItem) ClientConnectionId:84098483-739c-4cd8-bf62-38ff4b256361 Error Number:1105,State:2,Class:17

what good is philosophy if you choose arrays based on 1? https://groups.google.com/forum/?hl=en#!topic/julia-dev/tNN7...

Please don't post unsubstantive flamebait to HN. We don't need yet another low-quality spat about array indexing.

So does:

APL, AWK, COBOL, Fortran, Lua, Mathematica, MATLAB, R, Smalltalk, Wolfram

Because it is mathematics, and having 1 be based on 0 makes no sense; you then have to switch between the two and it is easy to make a mistake. This is why I don't use Python and Pandas. I got burnt once and that was enough, and I switched to R. Sadly we are stuck with 0-based arrays in programming due to a historical issue.

It's the difference between numbering the contents of the list (1 indexing), and measuring the distance from the beginning of the list to the start of the item (0 indexing).

Personally I'm happy to switch between both, and they both have positives and negatives when I actually write code.

0 indexing is not incorrect, or incompatible with maths, it is just a different way of conceiving of lists/arrays by considering the index to be more like the distance from the beginning. The first item begins 0 spaces from the start position.

That is why in that context array length is usually the highest index + 1, because length is the distance from the start position to the end of the list, whereas the index is the start position of the item in the list.
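
The two views can be put side by side in a few lines of Python; the list contents are arbitrary.

```python
items = ["a", "b", "c"]

# Offset view: how far each item starts from the beginning of the list.
offsets = list(enumerate(items))

# Ordinal view: first, second, third item.
ordinals = list(enumerate(items, start=1))

# With offsets the length is the highest index + 1;
# with ordinals it is the highest index itself.
highest_offset = len(items) - 1
highest_ordinal = len(items)
```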

I will concede that 0 indexing is a little too close to the storage paradigm for arrays (i.e. a sequential list of items of n bytes) and that when so often these arrays point to more complex data-structures / objects, the distance doesn't make as much sense as the item number... but meh - I'd take Python generators / iterators at the expense of 0 indexing any day.

>It's the difference between numbering the contents of the list (1 indexing), and measuring the distance from the beginning of the list to the start of the item (0 indexing).

The problem is that one is indeed an indexing (numbering the contents 1...X...N and asking for item X, customers[X]), whereas the other is not, but is used as an indexing (e.g. customers[5] is not getting the 5th item but the sixth).

Where does your definition of "an indexing" originate? The word "index" literally means "to point at" (hence, index/pointer finger), and in this sense, every element may indeed be indexed -- in this case, by means of a unique integer.

If I had to give a name to the concept you're talking about, it would be an "ordinal index".

Index finger, or first finger, right next to the zeroeth finger, the thumb... ;)

>If I had to give a name to the concept you're talking about, it would be an "ordinal index".

Which is both the naive understanding of a numeric index in everyday life in general, and the most common case in mathematics (and most math/scientific software).

In 1-based arrays the number is an index, but in 0-based arrays it is an offset. Languages like C or Rust make this behavior very explicit.

You have forgotten Algol, PL/I, Pascal, Oberon, Oberon-2, Active Oberon, Component Pascal, Mesa, Mesa/Cedar, Modula-2+, Modula-3, Eiffel, Basic,...

Although in Wirth inspired languages, actually the indexes are flexible, but by convention 1 gets used.

I had a job about 7 years ago doing Coldfusion and Flash. Coldfusion is 1 indexed and Actionscript is 0 indexed... this threw me off to the point of having a near-religious aversion towards explicit indexing, and I've since largely drunk the functional Kool-aid, where I almost exclusively use maps, filters and reduce.

I really do wish 1 indexing had stayed around, since I feel like it's more natural to say the 1st element, instead of the 0th.

>> I really do wish 1 indexing had stayed around, since I feel like it's more natural to say the 1st element, instead of the 0th.

It's not the 0th element. It is the 1st element and has an offset of 0 from the beginning of the array.

Not saying it's better or worse, but that if you change the words you use to describe it, you may find it easier to use.

To my surprise, I just learned that ranges in Rust don't include the end number, which felt incredibly stupid from a conceptual point of view (Python too). But it is syntactically useful when doing:

for x in 0..len(my_array) {

My preference would be that a range include the end number and the language provide more python-like iterators and list comprehensions. I am aware that there is a library that provides more python-like syntax, but that doesn't change the oddness of ranges in either language. I guess I just need to remember that the interval is open on one end. Which brings to mind the notion of using [0..n] or [0..n) in a language...

> It's not the 0th element. It is the 1st element and has an offset of 0 from the beginning of the array.

This is why I think the terminology should be "indexed arrays" and "offset arrays" (and variables perhaps even called index/idx/i or offset/ofst/o as appropriate) instead of using the terms "0-indexed" or "1-indexed."

But I have a hard enough time getting out of the habit myself, and some people are going to hate using o as a variable, so....

Rust actually has both: `0..n` excludes the end and `0..=n` is inclusive.


Python almost always does what I want.

If I say `for x in range(10)` it iterates exactly 10 times, starting at zero (which is convenient for 0-indexed lists in Python). It would be extremely counter-intuitive if it did [0,10] as it would iterate 11 times. I agree though that it would be much less ambiguous if you could just do `for x in [0..10)`, for example.

Current implementation requires the least amount of "off-by-one" corrections for most tasks, IMO.
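
For what it's worth, here is a rough Python sketch of the "fewest off-by-one corrections" point. The list `items` and the values of `a`, `b` and `k` are made up for illustration.

```python
items = ["a", "b", "c", "d", "e"]  # hypothetical data

# range(n) yields exactly n values, 0 through n-1, matching the valid list indices.
indices = list(range(len(items)))

# The number of values in range(a, b) is b - a, with no +1 correction.
a, b = 2, 5
span = len(range(a, b))

# Splitting at any k gives two adjacent half-open slices that cover
# every element exactly once, so they rejoin without overlap or gap.
k = 2
left, right = items[:k], items[k:]
rejoined = left + right
```

With an inclusive end, the same three cases would each need a `- 1` or `+ 1` somewhere.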

>> If I say `for x in range(10)` it iterates exactly 10 times

Right and I expect that. The issue I have is that the values do not include 10. It turns out great in Python and Rust where things are 0-indexed, but that does not change the fact that the range construct itself is strange in not including 10.

I'm sure the notion of using mathematical notation where [1..10], (0..10], [1..11), and (0..11) all mean the same thing would drive the language parsing people nuts. Mostly because some of those involve unmatched [) or (]. But that's why we have a computer right?

>It's not the 0th element. It is the 1st element and has an offset of 0 from the beginning of the array.

It should be indexed with its position then, a[1], as we do in all other aspects of life (and in math), and not with its offset.

> as we do in all other aspects of life

You mean like elevators? You may be surprised to learn that elevators in America are 1-indexed (ground floor == floor 1), but in Europe they are 0-indexed (ground floor == floor 0).

As a European I'm not really surprised, but it's not exactly the same thing. It's not like the ground level is considered "the 0th floor" and thus we use 0-based indexing.

In most European countries the ground floor is not considered the same thing as the others. As Wikipedia puts it: "In most of Europe, the "first floor" is the first level above ground level".

We consider and count as floors the "layers" above the ground level. In french for example, the ground level is called "rez-de-chaussée" and the floors above it "XX étage". The ground level is not an "étage" (in American terms, the "ground level" is not considered a floor, and is not counted among the floors).

For this reason, apart from elevator labels nobody calls the ground level "the 0th floor" when speaking/writing.

And even in elevators, 0 is just used as a convenient way to convey the ground level (since the buttons don't fit much) -- and not everywhere. In many elevators it gets its own designation (e.g. G or GRD or some other national variety).

It's "the ground level" (with different names per country) + "N floors".

A multi-floor building with ground floor + 3 floors can even be named "3rd-storey building" in some countries (and the ground level is just taken as granted) -- e.g. une maison à deux étages in the UK --> 3 story building in the US.

>> A multi-floor building with ground floor + 3 floors can even be named "3rd-storey building" in some countries

I think you just refuted your own previous assertion that 1-indexing is universal in all other aspects of life.

Not sure what you mean. There are 3 floors (as Europeans define a floor, i.e. anything ABOVE what Americans call the ground floor) and they are named floor 1, floor 2 and floor 3.

So, no 0-based indexing here.

We call the house a "3-storey house" not because we zero index, but because the cardinality of its floors (as Europeans define a floor) is 3. In "3-storey building" for Europeans, 3 is the "length", not the last index.

Ground floor, 1st floor, 2nd floor still sounds very much like 0-based indexing. Is the objection just that people don't literally say the word "zero", but instead say ground?

“Ground floor” is not the European name; the names are things like “Ground of the roadway” (literal word-by-word translation of the French name), “first floor”, “second floor”, etc.

It's no more zero-based indexing than naming relations as “sibling”,“first cousin”,“second cousin” is, just because some hypothetical culture that also uses 1-based indexing but counts siblings as first cousins might rationalize our system as starting with “zeroth cousins” where they have “first cousins”.

There are also differences of building styles, maybe this is part of how we ended up with different terminology. The prototypical Parisian building has a paved driveway to the courtyard which is on the ground level. The prototypical New England house has a first floor which is a wooden thing with floorboards, 3 feet or so above ground.

If I heard that in other countries their use of 1st floor was 1 off from me, and they had a name for whatever was before 1, then what they have is like a zero, especially if some elevators use the literal 0 to mean the ground floor.

>and they had a name for whatever was before 1

What they have merely names what Americans name as zero.

But it's not a stand-in for zero, for us, it's a different entity (a different thing).

A floor (etage, piano, orofos, etc) is a "layer of building where people live above the ground level layer".

So there's no zero-based indexing of floors, because the first item in the set of what are considered as floors is called "first" (e.g. primo in italian).

So, the thing one must understand before they say "same difference", is the other cultural difference: that most European countries don't consider the ground level "floor" to be a floor/story.

I.e, we don't consider the upper floors and the "ground floor" to be in the same set. Which is also why we don't count them when we say a "2 story house" (and we mean a house with 3 such layers, ground story + 2 stories).

Is there a zero-based indexing of the whole "heterogenous" set of ground level + floors?

Well, it's common to have the ground level designated with G or national specific names: BG, BV, E, IS, PB, PT, RC, S, P, PK, etc.

It's just that in some elevators 0 is a substitute for this -- which is probably due to the few common brands and parts being international and nobody bothering to change them (e.g. Otis). That's more a red herring than how Europeans think about floors or index them.

>> So there's no zero-based indexing of floors, because the first item in the set of what are considered as floors is called "first" (e.g. primo in italian).

So the floors are numbered according to their distance above the ground level (whatever that's called). This matches my suggestion of treating zero-based indexing as a distance from the first element.

>> Is there a zero-based indexing of the whole "heterogenous" set of ground level + floors?

Yes there is! It's how the elevator buttons are labeled with the possible variation of using a label instead of "0".

The European elevator numbering scheme according to your own comments matches more closely with a zero-based indexing scheme than a 1-based. Or more precisely it matches my proposal to consider the numbers used to reference array elements as distances rather than indexes.

I think you misunderstand how the European numbering works. Think of it like Europeans count the separations between levels. There is no 0th separation, the first separation is the ceiling above the ground level. Saying that the ground level is the "first element" is begging the question - this is just not what Europeans count. (Well some do, but then they count similar to Americans, there is still no 0th floor either way).

It is true that some elevators label the ground level 0, but this is usually when there are also negative levels (cellars).

It's not zero-based indexing, it's measuring the distance from a reference level. That matches my suggested change in terminology for indexing quite nicely (for zero-based indexing).

Yup, for Europe I'd actually be curious which countries don't consider the first floor the one above the ground floor, since they're probably the minority.

It is not 0-indexed, it is just that "floor" means something different than in American English. It means a level above the ground. So the first level above ground is the first floor.

The German word Stockwerk is even clearer. It describes the wooden planks which separate two levels. So if you live on the first Stockwerk you live one level above ground.

0 based arrays is objectively the preferred way for me and not for historical reasons. I write a lot of low level graphics algorithm stuff, and 1-based arrays would complicate index arithmetic.

like now with everything 0 based having something like:

arr[offset1 * 32 + offset2]

would be the following if all my offsets would be 1-based and arrays would be 1-based:

arr[(offset1 - 1) * 32 + offset2]

which is pretty arbitrary...
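
To make the comparison concrete, here is a small Python sketch of the two index formulas above. The width of 32 comes from the example; the function names and the sample coordinates are made up, and the "1-based array" is only simulated by computing the index it would use.

```python
WIDTH = 32  # row width taken from the example above

def flat_index_0(row, col):
    """Offsets and array are both 0-based: no correction term."""
    return row * WIDTH + col

def flat_index_1(row, col):
    """Offsets and array are both 1-based: the row must be shifted by one."""
    return (row - 1) * WIDTH + col

# First and last cell of a two-row grid under each convention.
first_0, last_0 = flat_index_0(0, 0), flat_index_0(1, 31)
first_1, last_1 = flat_index_1(1, 1), flat_index_1(2, 32)
```

Both versions address the same 64 cells (0..63 versus 1..64); the only difference is where the `- 1` has to live.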

“Objectively preferred for me” sounds an awful lot like it’s subjective

Historically 0 based is for low level languages. 0 based makes sense for C.

My issue is with higher level languages (Python, Java, C#): there are plenty of things that are just complicated, and doing arrays off 0 is one of them. Doing data science or statistics just makes it obvious that you have two different sets of numbers. 1 doesn't mean the same thing in every instance in your programming, and the functional programming side of me hates that.

You could reshape it (https://docs.julialang.org/en/v1/base/arrays/#Base.reshape). But if you're interfacing with libraries written with a C interface you'd still have to subtract for both offset1 and offset2.

that may be your impression, because you dont know julia's powerful, no overhead indexing abstractions yet ;) i also write lots of graphics low level code, and i couldn't be more glad that i don't need to do those error ridden indexing calculations anymore, since they're simply not needed... note, that you can also seamlessly create new array types that store 0 indices into other arrays - and because of julia's great composability, i can make them work with my indexing agnostic algorithms while being 0 overhead compatible with opengl's memory layout :)

A ton more people do high level programming, than low level programming. You guys build the tools and many more people use them.

Dijkstra makes an interesting argument for 0 based indexing in EWD831.


For what it's worth mathematics is more than flexible enough to handle both 0 and 1 based indexing. Using 1 based just looks neater since you can count x_1, ..., x_n rather than x_0, ..., x_{n-1}. Sometimes both are used at the same time e.g. if you add an extra coefficient at the start rather than at the end, this happens particularly when dealing with wedge products where the parity of the new index matters.

Also it's usually a bad sign if you need to worry too much about what index you used; you might be better off using index sets with whatever structure you need (and only the structure you need). In programming languages that support it, generous application of ranges and generators tends to get rid of most complications.

VBA actually lets you decide at the top of each module whether to use 0 or 1 based indexing. The default value is 0, but you can specify "Option Base 1" to change that.

There is an interesting note in the Wikipedia page for "Dartmouth BASIC" which says the second edition "also allowed arrays to begin as subscript 0 instead of 1. This was especially useful for representing polynomials."
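
A guess at why subscript 0 helps with polynomials: if the coefficient of x^i is stored at index i, the constant term naturally lands at index 0. A minimal Python sketch of that idea follows (the function name and the sample polynomial are made up, and this says nothing about how Dartmouth BASIC programs actually did it).

```python
def eval_poly(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with Horner's rule.

    coeffs[i] is the coefficient of x**i, so coeffs[0] is the constant term.
    """
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

coeffs = [5, 0, 3]            # hypothetical polynomial: 5 + 0*x + 3*x**2
value = eval_poly(coeffs, 2)  # 5 + 0*2 + 3*4
```

With arrays that must start at 1, the coefficient of x^i would have to sit at index i + 1 instead.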

The issue is 1 will mean different things. That should be a bigger issue than anything else. When you are doing math, 1 + 1 = 2 always means two. When you are getting column 1, you are getting the second column. While this makes sense when doing certain lower level programming, it makes no sense with functional or higher level programming.

+ Elixir:

Regex and Binary indexes are zero-based, List and Tuple are one-based

Are they, though? The relevant functions in Enum all seem to be zero-based, for example:

    iex> [:foo, :bar, :baz] |> Enum.at(1)
    :bar

Not that it really matters anyway. Like with Erlang, in the vast majority of cases where I'd normally reach for querying a list element by index, pattern matching is the better / more idiomatic option. Compare:

    foo = list |> Enum.at(0)
    bar = list |> Enum.at(1)
    baz = list |> Enum.at(2)
    rem = list |> Enum.slice(3..-1)

    [foo, bar, baz | rem] = list

Both give you the exact same values of foo/bar/baz/rem, but the latter is arguably more readable and concise, and entirely avoids the 0 v. 1 debate.
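
The same contrast can be sketched outside Elixir too. Here is a rough Python analogue (not Elixir, and `items` is made up), where starred unpacking plays the role of the `| rem` tail pattern:

```python
items = ["foo", "bar", "baz", "qux", "quux"]  # hypothetical list

# Index-based version: every access carries an explicit 0-based index.
foo_i, bar_i, baz_i = items[0], items[1], items[2]
rest_i = items[3:]

# Unpacking version: no index appears anywhere.
foo, bar, baz, *rest = items
```

Both forms bind the same values; only the first one would need rewriting if the language's base index changed.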

0-based indexing is closer to the memory representation, but 1-based indexing is closer to how we actually think about arrays. We say the first element, not the zeroth element. Nothing wrong with 1-based indexing except old habits die hard.

I once had occasion to implement several of the algorithms from the second part of Knuth volume 2, in C++.

I don't recall the details, as this was a long time ago, but there were definitely times I was glad C++ used 0-based arrays, and there were definitely times when 1-based would have made for cleaner code.

So I overloaded the () operator to make a(n), where a is an array, equivalent to a[n-1].

It looked a little odd at first if I mixed the a[] and a() forms in the same program, but it wasn't hard to get used to.

I doubt I would recommend this in general, but it worked well for this particular application.
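
For anyone curious what that looks like, here is a rough Python analogue of the idea (not the original C++; the class name `DualIndexList` is invented for illustration): `a[i]` stays 0-based while `a(n)` is 1-based.

```python
class DualIndexList:
    """Hypothetical wrapper: a[i] is 0-based, a(n) is 1-based."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, i):
        # Ordinary 0-based access, delegated to the underlying list.
        return self._items[i]

    def __call__(self, n):
        # 1-based access: a(n) is equivalent to a[n - 1].
        if not 1 <= n <= len(self._items):
            raise IndexError("1-based index out of range")
        return self._items[n - 1]

a = DualIndexList([10, 20, 30])
first_by_offset = a[0]
first_by_position = a(1)
```

The explicit range check in `__call__` is a design choice of this sketch: without it, `a(0)` would silently become `a[-1]` and return the last element.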

Can't reply anymore to flagged-to-death sibling, but wrt "1-based array indexing":

I've actually found "zero-avoidance" to be a good, battle-tested heuristic for coding. A lot of numeric spaces can be mapped so as to be >0 or >0<. Obviously (?) not all of them can, but it's helpful in reducing cyclomatic complexity in logic code, as well as to consumers so they can uniformly handle all falsey values.

Arithmetic is much safer without 0.

offset 0 is logical if you look at the machine code / data in memory.

if you have an array [a,b,c,d], array[0] is a, which is logical for a computer. why? because it is stored in memory as 'abcd' (forgetting endianness for the sake of argument...) and the offset from the base pointer to a is 0. so it makes perfect sense for arrays to start at 0, as there is no offset from the base... the first item is not 1 item away from the base :s if you find it confusing i would say learn how computers actually operate instead of arguing from some theoretical frame of mind which has nothing to do with how computers work. a computer isn't mathematics or some arithmetic machine.

just because arithmetic is safer without 0 doesn't mean it's logical for a computer to suddenly have different array indexing. array indexing on a computer has nothing to do with maths. just memory layout and pointers...


Ok, but why should that influence a managed language with no pointers?

It shouldn't. One-based indexing is what all the research uses, so it is undeniably the better choice for Julia. But zero-based indexing isn't just a memory trick, it makes lots of math easier when you're concatenating ranges or slicing arrays or doing modular arithmetic anywhere near an array.
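
As one illustration of the modular arithmetic point, here is a small Python sketch of wrapping a running counter into a valid index under each convention. The helper names and the size 3 are made up; the 1-based version only computes the index a 1-based language would use.

```python
def wrap_0(k, n):
    """Map any counter k onto the 0-based indices 0..n-1."""
    return k % n

def wrap_1(k, n):
    """Map any counter k onto the 1-based indices 1..n."""
    return (k - 1) % n + 1

n = 3  # hypothetical buffer size
wrapped_0 = [wrap_0(k, n) for k in range(0, 7)]
wrapped_1 = [wrap_1(k, n) for k in range(1, 8)]
```

Both work, but the 1-based form needs a shift down before the modulo and a shift back up after it.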

GOTO is logical when you look at machine code - that does not mean it is the best possible abstraction. There is a reason we use higher level languages.

Call it bikeshedding but I totally agree. As someone who used to use Matlab (since it was what more or less everyone else in machine learning did at the time) and was immensely relieved to switch to NumPy a few years back, I’d love to give Julia a whirl but I just can’t stomach going back to a 1 indexed language. I’m surprised nobody maintains a 0 indexed fork of Julia yet (Juli0, anyone?).

If you want to take it for a spin, challenge yourself not to write anything which cares about this. It will force you to learn some neat features (e.g. for N-dim arrays) instead of indexing with your bare hands like you're in C. And it is good style anyway to be generic over things like https://github.com/JuliaArrays/OffsetArrays.jl

Math uses base 1, and so do Fortran and Matlab. While Julia is an incredibly powerful language with things like user-defined unboxed structs and homoiconicity, a key design goal is to make it easy to learn for scientists and engineers who are not professional programmers. Since the vast majority use Matlab, the one-based indexing makes sense.

One-based indexing works just as well as 0-based. I've done a lot in the past in Fortran 90 and (regrettably) Matlab. Some things are very slightly harder, some things are very slightly easier. At any rate, this isn't Fortran 77. You shouldn't be fiddling around with indices much.

The bigger difference IMO is Julia uses Fortran-style column-major ordering, instead of row-major like C and Python. I actually

as someone who's written a lot of Fortran and Python for the same project I fail to see the problem. I find something like row- vs. column order differences harder to deal with. 1 vs. 0 based is just a matter of choice, a developer should be able to adapt, just like tabs vs. spaces etc.
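
For readers who haven't hit the ordering difference, here is a rough pure-Python sketch of the two layouts. The function names and the 2x3 shape are made up, and indices are 0-based throughout so that only the ordering differs.

```python
def row_major_offset(i, j, n_rows, n_cols):
    """Flat offset when each row is contiguous (C order)."""
    return i * n_cols + j

def col_major_offset(i, j, n_rows, n_cols):
    """Flat offset when each column is contiguous (Fortran order)."""
    return j * n_rows + i

n_rows, n_cols = 2, 3  # hypothetical 2x3 matrix
cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]

# The order in which the (i, j) cells sit in memory under each layout.
c_order = sorted(cells, key=lambda ij: row_major_offset(*ij, n_rows, n_cols))
f_order = sorted(cells, key=lambda ij: col_major_offset(*ij, n_rows, n_cols))
```

The practical consequence is which loop should be innermost: the one over `j` for row-major data, the one over `i` for column-major data.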

I think the exact amount of spaces that should make up a tab is an even more controversial point.

I'd love to see what happens to the tabs vs spaces debate once someone makes tabs that aren't an integer multiple of spaces wide.

Tabstops are a really old concept and were never really a set number of spaces wide. It used to be that you had physical stops on the typewriter that a tab would advance to.

So, that world was and is a long time ago. We just don't typically give a way for people to set tabstops in most environments, therefore it was chosen that tabs would advance a certain number of spaces.

Fun fact: arrays in Fortran are also based on 1 by default, but they can be defined to start at 0, or even arbitrary negative numbers. The following are all valid:

    integer :: array1(5)
    integer :: array2(0:4)
    integer :: array3(5:10)
    integer :: array4(-4:4)

Julia also supports this (called Offset Arrays)
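
The underlying idea is simple enough to sketch in a few lines. Below is a toy Python version of an array with an arbitrary lower bound; the class `OffsetList` is invented for illustration and is not how Fortran or OffsetArrays.jl implement it.

```python
class OffsetList:
    """Hypothetical list whose valid indices run from lower to upper inclusive."""

    def __init__(self, lower, upper, fill=0):
        self.lower, self.upper = lower, upper
        self._items = [fill] * (upper - lower + 1)

    def _pos(self, i):
        # Translate a user-facing index into a 0-based storage position.
        if not self.lower <= i <= self.upper:
            raise IndexError(f"index {i} outside {self.lower}:{self.upper}")
        return i - self.lower

    def __getitem__(self, i):
        return self._items[self._pos(i)]

    def __setitem__(self, i, value):
        self._items[self._pos(i)] = value

    def __len__(self):
        return len(self._items)

array4 = OffsetList(-4, 4)  # mirrors `integer :: array4(-4:4)` above
array4[-4] = 7
array4[4] = 9
```

Note that negative indices here mean actual positions, not Python's usual "count from the end", which is exactly the kind of assumption generic code has to avoid.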

Interesting. Doesn't this solve the problem then for people who can't get into 1-indexed arrays?

I suppose it kind of does for your own code, but you have to explicitly 'cast' all your arrays to 0-index arrays and it might break anything that assumes it's dealing with a normal 1-index array.

And realistically in many cases you won't even notice the indexing as you'll either be iterating over the array or using an array operator rather than accessing elements via an index. And in the remaining cases, much of the time whether the index is 0-based or 1-based won't affect the way the final code looks. Still, it's nice to have the option to 0-index in the few cases where it does make your code cleaner.

The problem is that it's not the default.

Perl has a global variable that does the same thing.

It's the only sensible choice for a numerical programming language

I'm ready to burn a part of my karma to say the following :

- Naming a PL "Julia" is sexist and creepy

- The language doesn't add any substantial value to PL theory or programming methodology compared to say, Python

- The future of computer-human interaction is not in programming. Python will be good enough until we reach that point where we can use neural networks/AI to do most of the business tasks.

> Naming a PL "Julia" is sexist and creepy

How? (I kind of assume that a big reason computer scientists would pull out that name without a specific referent in mind is the influence of the Julia set, named for Gaston Julia.)

> The language doesn't add any substantial value to PL theory or programming methodology compared to say, Python

Maybe; not all practical benefits in programming come from advances in methodology or PL theory.

> The future of computer-human interaction is not in programming.

I've been hearing that since the early 80s. When someone makes that future reality, we can discuss what it requires, until then programming is what we have and I'm glad it wasn't neglected for the last four decades, and I hope it won't be for the next however many it takes, either.

I also thought it was related to Gaston Julia or his sets, but apparently not: it was just chosen since it sounds nice (https://www.infoworld.com/article/2616709/application-develo...)

I don't even know or use Julia and find this confusing.

- I don't think Python added anything to PL theory or methodology either. At least Julia introduced multiple dispatch to a new audience.

- How is the name sexist or creepy?

Out of curiosity, what's wrong with the name? Using a human name doesn't seem much stranger than using the name of (say) a comedy troupe or fungus.
