The Phix Programming Language (phix.x10.mx)
51 points by pitr 11 days ago | 41 comments

What I need to know about a language: Does it do static types? Does it depend on GC? Can I pass functions to functions? Can I build an aggregate by listing the members, and pass it around? Can I vary the implementation of a function according to the types passed, as determined at compile time?

If your description just lists (howsoever appealing) platitudes, I don't get the answers I need.

It is dynamically typed with optional built-in validators. The cool part (and also the main weakness) is usage of resizable array-based vectors for all composite data types. Other distinguishing traits are RC-based GC, no higher-order functions (but the ability to pass function pointers), and direct memory access via BASIC-like poke. The original Euphoria language wasn't a scripting language; it was more like C on steroids (for MS-DOS), and its main use was game prototyping.

> usage of resizable array-based vectors for all composite data types

Doesn’t this describe many of the modern languages - python’s lists, ruby’s arrays, etc...? Or is the cool part the lack of more specialized classes like dict or set?

Python, actually, is older than Euphoria. :) The cool part is the focus on cache-local data structures first. Far more popular scripting languages like JS and Lua use hash tables as their primary data structure; Python and Ruby offer arrays as a secondary option but focus on using structs (classes) everywhere. Euphoria's answer is to use arrays of cells for everything and enum your way out of situations where you need to group things.
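
To make the "arrays of cells plus enum" idea concrete, here is a rough sketch in Python rather than Euphoria. The field names and the `make_person` helper are invented for illustration, and Euphoria's own syntax and 1-based indexing differ; this only shows the shape of the approach.

```python
from enum import IntEnum


class Person(IntEnum):
    """Named slots into a plain list (hypothetical fields, 0-based in Python)."""
    NAME = 0
    AGE = 1
    EMAIL = 2


def make_person(name, age, email):
    # The "record" is just a flat list; the enum supplies the field names.
    return [name, age, email]


people = [
    make_person("Ada", 36, "ada@example.com"),
    make_person("Linus", 28, "linus@example.com"),
]

# Field access is ordinary indexing with a named constant.
ages = [p[Person.AGE] for p in people]
people[0][Person.AGE] += 1
```

There is no class per record type here: every composite value is a list, and the enum is the only thing that gives the slots a meaning.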

Exactly. What sets it apart. The current landing page screams the answer: nothing.

Okay, the design whispers to me that "lack of community/ecosystem" sets it apart.

An interesting artifact from the times the only platform was wintel, the filenames were 8 characters long and “conventional languages” meant “C, C++, Ada, etc.”


The only strange thing is the release date and the references to 64-bit arch.

My antivirus says: "Access has been blocked as the threat Mal/HTMLGen-A has been found on this website."

Maybe it's a false alarm, but I thought I had better share it here.

I don't understand why anyone would create a one-indexed language these days. I feel like Dijkstra laid out pretty well why zero-indexing is objectively better [1]. If that wasn't enough, it's also most common by far.

[1] https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/E...

Because zero-indexing is in many cases not natural. Yes, a noted computer scientist made an argument years ago, and it's been cited many times. Perhaps to help you understand, nobody talks about row zero of a matrix. But yes, absolutely, there was a noted computer scientist who made that argument years ago.

That’s utterly unconvincing.

Zero-based indexing makes sense only in light of the underlying mechanics of memory address offsets (which, IIRC, isn’t even how eg C literally works). Natural numbers are counting numbers. It makes “more sense” that a sequence up to N contains N numbers, that is, its members can be put in a bijection with {1,...,N}

These are all conventions, of course. But the latter convention has a finer pedigree.

You'll typically see it in computer algebra systems whose syntax is intended to be typeset as traditional mathematical notation. Mathematica, for example, wants an array to look like what's being typeset, and it would be confusing (moreso than the potential off-by-one errors) to display `x[3]` while the actual code said `x[2]`.

I think other languages aimed at mathematicians, like Julia[1], do this for similar reasons but, honestly, unless you're writing a CAS, I think it's a bad idea.

Having lived with Dijkstra's ideas for a while, I will say that people find closed-open sequences highly counter-intuitive.

I work on a financial simulator, so we use dates quite a bit. Representing a year as 1-Jan-X up to but not including 1-Jan-(X + 1) is perfectly natural, and really works well with date-time libraries. It still works if you're using dates and decide to extend it to handle datetimes, and even months 1-M-X to nextmonth(1-M-X) do what you'd expect. So he's right, the math works beautifully. You also avoid writing a successor function when you split closed-open ranges; easy with integers, but gets tricky with almost anything else.

But I've had to nag people many times to not use Dec 31. I think the intuition for a closed-closed range is deeply ingrained and that extra "Jan 1" seems to stick out and bother people.
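
A minimal sketch of the closed-open date ranges described above, using Python's standard `datetime.date`. The helper names (`year_range`, `month_range`, `contains`, `split`) are made up for this example and are not taken from any particular date library.

```python
from datetime import date


def year_range(year):
    """Half-open range [Jan 1 of year, Jan 1 of year + 1)."""
    return date(year, 1, 1), date(year + 1, 1, 1)


def month_range(year, month):
    """Half-open range for one month; the end is the first day of the next month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def contains(rng, day):
    # The start is included, the end is excluded.
    start, end = rng
    return start <= day < end


def split(rng, at):
    """Split at a boundary; the two halves share `at`, so no successor function is needed."""
    start, end = rng
    return (start, at), (at, end)


first_half, second_half = split(year_range(2020), date(2020, 7, 1))
```

Dec 31 never appears as an endpoint: the year ends at the next Jan 1, and the end of one range is exactly the start of the next.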

From that discussion, here's an interesting argument in favor of 1-based indexing:

> Let me try to briefly convey how I learned to stop worrying and love 1-based indexing. Basically it comes down to the fact that there are three significant numbers you have to think about when it comes to an array: it's initial index, final index, and length. In 0-based indexing, these are all different: 0, n-1, and n. In 1-based indexing, the final index and length are the same: 0, n, n. That's one less thing to think about and keep track of when coding. It seems trivial, but when you're doing something tricky and juggling a lot of complicated things in your head, even the tiniest alleviation of cognitive load helps, I find.

[1]: https://groups.google.com/forum/?hl=en#!topic/julia-dev/tNN7...

In the case of Mathematica, 1-based indexing also makes sense in the context of it being a lisp-like language where everything is an expression that can be represented as a tree.

So in the case of

exp = x + y + z

this is syntactic sugar for

exp = Plus[x, y, z]

As expected with 1-based indexing exp[[1]] returns x, but the zero index can also be used to access the head of the expression, so exp[[0]] returns Plus.

Another nice property of 1-based indexing in Mathematica's case is that by using negative indices you can access elements in the reverse direction, so exp[[-1]] would return z, and exp[[-3]] would return x.
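
For readers without Mathematica at hand, the indexing behaviour described above can be imitated with a toy Python class. This is only a model of the idea (head at index 0, arguments at 1..n, negative indices counting from the end); the `Expr` class and its `part` method are hypothetical and say nothing about how Mathematica is implemented.

```python
class Expr:
    """Toy expression node: a head plus a list of arguments."""

    def __init__(self, head, *args):
        self.head = head
        self.args = list(args)

    def part(self, i):
        # Index 0 is the head, 1..n are the arguments, -1..-n count from the end.
        n = len(self.args)
        if i == 0:
            return self.head
        if 1 <= i <= n:
            return self.args[i - 1]
        if -n <= i <= -1:
            return self.args[i]
        raise IndexError(f"part {i} out of range for {n} arguments")


exp = Expr("Plus", "x", "y", "z")
```

With this model, `exp.part(1)` gives `"x"`, `exp.part(0)` gives `"Plus"`, and `exp.part(-1)` gives `"z"`, mirroring the `exp[[...]]` examples above.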

Shouldn't the tuple for 1-based indexing be 1,n,n ?

Yes, you're right. I think that slip says a lot about the tricky issue at hand.

First position, last position, and length for an array of [1, 2, 3].

  0-based: 0, 2, 3  (0, n-1, n)

  1-based: 1, 3, 3  (1, n, n)

Actually, that's a pretty convincing argument. I've played with Lua before, and did get the feeling that 1-based index has some advantages (but couldn't articulate why).

Having the last position and the length be the same seems to fit the common intuition, like how children count: "What is the position of the last thing" and "How many things do we have?".

So it may make the language easier to learn. On the other hand, it may make the learner struggle to pick up most other languages, which use 0-based indexing. I suppose it's been argued in depth on both sides.

Having the last index and the length be the same may not be a good thing. For instance, if you translate Python's negative indices (x[-i] = x[len(x)-i]) directly, the last element would be... x[-0]? But then having the last element be x[-1] in Python is also not entirely obvious.
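
The `x[-0]` problem is easy to check in Python itself. The snippet below verifies the `x[-i] == x[len(x)-i]` identity for 0-based lists, shows what `x[-0]` actually returns, and then sketches one possible 1-based accessor; `get_1based` is a hypothetical helper, not a feature of any language mentioned here.

```python
x = ["a", "b", "c"]
n = len(x)

# 0-based Python: x[-i] is x[n - i] for 1 <= i <= n, so the last element is x[-1].
for i in range(1, n + 1):
    assert x[-i] == x[n - i]

# -0 is just 0, so x[-0] is the FIRST element, not the last.
first_via_minus_zero = x[-0]


def get_1based(seq, i):
    """Hypothetical 1-based accessor: 1..n from the front, -1..-n from the back."""
    if i == 0:
        raise IndexError("index 0 is not valid in this 1-based scheme")
    return seq[i - 1] if i > 0 else seq[i]
```

In the 1-based sketch the front and back indices are symmetric (1 and -1 are the two ends), at the cost of index 0 having no element behind it.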

I think so, yeah. I copied that quote verbatim.

Though I've grown up with programming languages that have zero-based indexing, I've never become a "native speaker", as I always mentally add or subtract a one as needed when trying to understand code involving list indices. Zero-based indexing may be common, but I'm pretty sure it's far from natural to most people. It is likely a main cause of many off-by-one bugs. I wouldn't blame a language designer for choosing one-based indexing instead.

I don't have any trouble with it, I just imagine the index as what it is: an offset, and the first element's location has the offset 0 from the start of the array.

For me it's far more confusing when I have to write code in a language with 1-based indexing, although I can see why e.g. Julia does it as it's more focused on mathematics and also in some cases the indexing is actually simpler when specifying a range.

It depends on what you're trying to describe.

If you're trying to describe an offset from a starting position, then zero-indexing makes sense.

If you're trying to describe the rank/position of something in a list, then IMO one-indexing is not unreasonable.
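
Python happens to support both views through the `start` argument of `enumerate`; a small illustration, with a made-up `items` list:

```python
items = ["gold", "silver", "bronze"]

# Offset view: how far each element sits from the start of the list.
offsets = list(enumerate(items))           # numbering starts at 0

# Rank view: the position a person would say out loud.
ranks = list(enumerate(items, start=1))    # numbering starts at 1
```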

This seems to be a euphoria fork, which can be installed on linux, and which runs fine. But how can I bootstrap phix on linux with euphoria?

Ah, got it. You need to download the executable p64, see http://phix.x10.mx/download.php Like with old lisps.

Out of curiosity, what is the purpose of this language? A learning exercise or an attempt to create just another language in the pool of over 1000 already there?

Writing an entire language as a way to hone someone's skills is great, even if it is very time consuming vs. the output (skills learned). But I wonder what better options exist.

Probably working together with a group, on an interesting and relevant project. The output will likely be better, and your skills will improve faster - because of the feedback.

Or, for medium scale ideas, maybe taking another language and forking it would be useful.

This one is a loose reimplementation of the Euphoria language, with a long enough history of its own. Sadly, development of OpenEuphoria, the successor to the original RapidEuphoria interpreter, somewhat stalled after 2014.

This is related to the Euphoria language which has a long history so I gather this is not something that was quickly slapped together. http://phix.x10.mx/docs/html/eucompat.htm

Looking at "Core Language" I was dismayed not to find any builtin floating point data type.

In "Phix vs Conventional Languages" I'm told that "1/2 is always 0.5". And "Library Routines" / "Math" includes sin, cos and tan. What's going on here?

The type "atom"[1], is described in "Core Language" as follows:

> An atom can hold a single floating point numeric value, or an integer

I don't know why it isn't called "number", but it exists.

The page on atom also mentions:

> It can also hold a raw pointer, such as allocated memory or a call_back address, but that is typically only used when interfacing to external code in a .dll or .so file.

[1]: http://phix.x10.mx/docs/html/atom.htm

You're right, I missed that.

But that still leaves the ASCII tree diagram agreeing with the sentence "Phix has just five builtin data types:" while showing (and describing, in the following bullet point list) five data types that don't include "number" or a floating-point type.

So what's left is a minor but consistently repeated error in the doc, on the "Core Language" page.

I'm still reading the doc, so nothing to say on the language but the website is great!

The quotes are misattributed (at least one): https://www.brainyquote.com/quotes/e_f_schumacher_148840

It's broken on low-resolution screens.

Let alone mobile.

> my language is simple, unlike those other 1001 "simple" languages that aren't really


In other words, it seems it's easier to make your own programming language than learn the ones that already exist.

I can see why people think this way. If you know C, it sounds easier to build a new scripting language (it’s just another C program!) than to learn all the quirks of Python or Ruby.

And if you know Go it's [0] even easier. It all depends on just how general purpose you need it to be.

If you ask me, people should build more simple languages, and learning how should be considered an essential part of becoming a programmer. It's not rocket surgery.

[0] https://github.com/codr7/gfoo

I know C, and implementing a new scripting language for sure doesn't sound easier than using an existing one.

Did you ever write an arithmetic evaluator? A calculator essentially? Or any other kind of evaluator? A text template engine perhaps?

They all have plenty in common and exist along a continuum of complexity that includes scripting languages and goes all the way to general purpose.

> exist along a continuum of complexity

Isn't that the point? Building a serious scripting language is a huge project. It's far more ambitious than writing a simple text-based calculator.

If you want to build a scripting language worth actually adopting, it's going to have to be far better than Python. That's a very high bar.

The same applies for everything in technology. You could try to write your own OS rather than use an existing one, but it's almost certainly a terrible idea.

Well, the whole idea is to keep it simple. Look at Lua for a language that can be implemented in a weekend and is still a very serious language. Admittedly, you would probably end up spending a lot of time in the design phase if you were to create your own language, but once you have that, if it truly is a simple language, the implementation is not a problem.

Worth adopting by who?

I once replaced a convoluted pricerule framework based on fixed options in a booking system with a simple evaluator that exposed variables for number of days, number of guests etc. That was very much worth adopting for that specific use case.
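
For a sense of scale, a small evaluator of that kind might look like the sketch below. It is not the booking system's code: it leans on Python's standard `ast` module for parsing, and the rule string, the variable names `days` and `guests`, and the prices are all invented for the example.

```python
import ast
import operator

# Binary and unary operators the toy evaluator accepts.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expression, variables):
    """Evaluate an arithmetic expression over named variables."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, comparisons etc. are rejected outright.
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


# Hypothetical price rule: base rate per night plus a surcharge per extra guest.
rule = "days * 80 + (guests - 2) * 15 * days"
price = evaluate(rule, {"days": 3, "guests": 4})
```

Anything outside numbers, names, and the four arithmetic operators raises `ValueError`, which keeps the rule language deliberately tiny.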

Your line of arguing always assumes that any solution has to cover all bases, which is just not true when you're solving a specific problem.

And no, the same doesn't apply for everything in technology. The continuum for operating systems starts at a much higher level of effort and cost of maintenance. Nothing is black or white.

I've written all of these and I agree with the GP that it's easier to use an existing one. Why recreate someone else's work when you can just use it?

I think it boils down to the fact that the last 10% of any project takes at least 90% of the work. It's easier to make the 90% of a programming language that meets your need than it is to learn a more complete language.
