Factor 0.96 now available – over 1,100 commits (re-factor.blogspot.com)
129 points by mrjbq7 on Apr 21, 2013 | 64 comments

Having been a long-time user of HP series of calculators (HP48g, HP50g), postfix notation is not foreign to me. So I tried Factor, and gave up on it for several reasons.

First, there were obligatory stack effect declarations on each word. Want to refactor your program? Sure, rewrite your words, together with stack effect declarations. That part was extremely annoying when it came to exploratory programming. For the uninitiated: stack effect declarations are akin to function prototypes in C, only lacking type information. IMO, once you have them, and they're obligatory, you get all the drawbacks of postfix languages, and no benefits.

Second, the language and standard library rely heavily on stack combinators. Reading the standard library code requires intimate familiarity with what words like bi, bi@ and bi* do (among a whole lot of others).

Third, I find Factor's documentation extremely confusing. For example, there are "Vocabulary index" and "Libraries" sections, with little or no overlap between them. But a vocabulary is a library (or package, whatever, same thing), so WTF?!

Then there are important and powerful features like generic words, but if you click on the "Language reference" link on the docs homepage, you get a menu with no mention of generics, and you have little clue in which section to look for them. (It's under objects.) Then you eventually find out (sorry, I was unable to dig up reference to the docs) that generics support only single dispatch, and only "math" generics (plus, minus, etc.) support double dispatch.

In short, the manual is a maze of cross-references with no head or tail.

Fourth, I dislike writing control structures in postfix. This is the part that, IMO, RPL on HP's calculators got right. Instead of forcing you to write something like

    "Less than 10"
    "Greater than 10"
    [ < ] 2dip if ! 2dip is a stack combinator, guess what it does!
you could write

    IF 10 < THEN "Less than 10" ELSE "Greater than 10" END
(Postfix conditionals were available for the rare cases where they were the most convenient form.)

Last but not least, it supports only cooperative threads.

You wouldn't write a conditional like that, you'd do:

    10 < [ "less than 10" ] [ "greater than 10" ] if
This is pretty easy to read. If the boolean condition is true, run the first quotation, if it is false, run the second.
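
For anyone who doesn't read Factor, here is a rough Python model of how that line evaluates. It is only a sketch, not how Factor is implemented: the stack is a plain list, a quotation is modelled as a function that takes the stack, and `factor_if` and `describe` are names made up for this illustration.

```python
def factor_if(stack, true_quot, false_quot):
    """Model of `if`: pop a boolean, then call exactly one of the two quotations."""
    condition = stack.pop()
    chosen = true_quot if condition else false_quot
    chosen(stack)


def describe(n):
    """Model of: n 10 < [ "less than 10" ] [ "greater than 10" ] if"""
    stack = [n]
    stack.append(10)                 # the literal 10
    right = stack.pop()
    left = stack.pop()
    stack.append(left < right)       # the word <
    factor_if(
        stack,
        lambda s: s.append("less than 10"),
        lambda s: s.append("greater than 10"),
    )
    return stack.pop()


print(describe(3))    # less than 10
print(describe(42))   # greater than 10
```

The point of the model is that the condition is computed first and left on the stack, and `if` only has to pick which quotation to run.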

The unusual names like 'dip', 'bi', etc. are just common names that, once learned, make code easier to read. 'dip' is from Joy IIRC.

I do agree that combinator heaviness can make things difficult for a beginner. Especially if code uses a combinator that isn't common it requires time to look up and see what it does. You really need to immerse in Factor development for a while to get familiar with them.
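
As a cheat sheet for the combinators named in this thread, here is a small Python model of what they do to the stack. This is a sketch under assumptions, not Factor's implementation: the stack is a list, quotations are Python functions that take the stack (and are passed as arguments rather than popped from it), and `unary` and `run` are helper names invented for the example.

```python
def dip(stack, quot):
    """x quot dip: hide the top item, run quot on the rest, put the item back."""
    x = stack.pop()
    quot(stack)
    stack.append(x)


def two_dip(stack, quot):
    """x y quot 2dip: like dip, but hides the top two items."""
    y = stack.pop()
    x = stack.pop()
    quot(stack)
    stack.extend([x, y])


def bi(stack, p, q):
    """x p q bi: apply p to x, then apply q to x."""
    x = stack.pop()
    stack.append(x)
    p(stack)
    stack.append(x)
    q(stack)


def bi_star(stack, p, q):
    """x y p q bi*: apply p to x, then apply q to y."""
    y = stack.pop()
    p(stack)              # x is on top now
    stack.append(y)
    q(stack)


def bi_at(stack, quot):
    """x y quot bi@: apply the same quotation to x, then to y."""
    bi_star(stack, quot, quot)


def unary(fn):
    """Hypothetical helper: wrap a one-argument function as a quotation."""
    def quot(stack):
        stack.append(fn(stack.pop()))
    return quot


def run(items, word, *quots):
    """Hypothetical helper: run one combinator on a fresh copy of the stack."""
    stack = list(items)
    word(stack, *quots)
    return stack


double = unary(lambda n: n * 2)
negate = unary(lambda n: -n)

print(run([5], bi, double, negate))          # [10, -5]
print(run([5, 7], bi_star, double, negate))  # [10, -7]
print(run([5, 7], bi_at, double))            # [10, 14]
print(run([1, 2, 3], dip, double))           # [1, 4, 3]
print(run([1, 2, 3], two_dip, double))       # [2, 2, 3]
```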

Stack effect declarations exist to make refactoring safer. Before the compiler enforced them, it was common to change a word's stack effect only to find that it broke random code using it elsewhere. Enforcing the check guarantees that you haven't forgotten to update a usage of the word somewhere else.

Oh, I've mixed up order of arguments to if.

The refactoring problem could have been solved in a more practical way (though more difficult implementation-wise) by implicit versioning of symbol definitions. Plus, I find that stack effect declarations read like line noise (see for example the declaration for if; IIRC from browsing the docs it's one of the moderately bad ones.)

Factor is an impressive piece of work for sure, but, in my opinion, like many "modern" derivatives of Forth it loses the major strength of Forth: brutal simplicity. That simplicity has drawbacks, but it also has advantages. Introducing new "modern" features to Forth adds complexity and changes this balance, and I doubt these hybrids will ever attract 'hardcore' Forth programmers or non-Forth programmers. At least traditional Forth systems are small and simple enough for someone to understand (I learned myself from a tiny Forth named PygmyForth, but there are many others); it provides you with a kit you can modify and customize at will once you've absorbed it.

The postfix notation, even for conditionals, is really not an issue once you dive into the language. However, with most systems you should be able to translate IF into 'nop', THEN into IF, and END into THEN, and it should do the trick. But then there are DO WHILE loops, etc. At the end of the day, everyone prefers to preserve the postfix logic. It's like the infix notation systems in Lisp and Scheme that never caught on (and, I believe, never will).

About your second point: most library code, in any language, looks like gibberish and like a mess of arbitrary idioms. One simply cannot hope to inspect and understand non-trivial library code just by casually reading it. With Forth the situation is indeed even worse, because implementers generally cannot resort to spaghetti code too much, whose continuous nature, at least, helps the casual reader.

On the topic of cooperative multithreading, I think the trend in recent programming languages in general is that nobody wants to deal with preemptive threads anymore. It hurts kitties, it is dirty. It is like manual dynamic memory management. Cooperative threads (if I'm not mistaken, they're also sometimes called "lightweight threads", "green threads" or "fibers") avoid most of the problems of preemptive threads (because the programmer can very easily control when execution may switch to another thread and when it should not), while retaining some benefits (background execution of low-priority tasks). For a "modern" Forth system, I think it could be interesting to keep cooperative threads at user level (really hardcore implementers may just remove them completely, because it is expected that the programmer may prefer to implement some sort of cooperative tasking fitted to his needs), and use preemptive/OS threads in library code (typically implemented in C).

Now that's the point of view of a 'hardcore' kind of hobbyist Forth user (and implementor). There is another category that finds brutal simplicity simply not practical, and I can understand that. But really, for educational/curiosity purposes I would recommend the hardcore way. Like, abandon all the bloat you're addicted to, shut the up, try hard and sincerely. I tried Forth this way because, when reading Moore (inventor of Forth) and Fox (one of his collaborators and a friend; Moore isn't very good at communication, so Fox [RIP] was sort of his Plato), I was very sceptical that their approach could work. These experiments in extreme simplification brought me a lot when I was a young programmer.

>(It's under objects.)

That would be because Factor is OO.

It takes a few hours to get used to, but once you get the hang of it, it becomes an amazing language. For most things, the postfix notation that Factor uses reads better than the applicative (prefix) notation other languages use. It's like jQuery's or LINQ's method chaining, only better :) and with less syntax. For example, here is the solution to problem 22 in Project Euler (http://projecteuler.net/problem=22):

    : names ( -- seq )
         "names.txt" ascii file-contents [ quotable? ] filter "," split ;
    : score ( str -- n ) [ 64 - ] map sum ;
    : solution-22 ( -- n )
        names natural-sort [ 1 + swap score * ] map-index sum ;
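
For comparison, here is a word-for-word Python rendering of the same pipeline. It is a sketch, with one assumption called out: instead of reading "names.txt" it takes the file's text as a parameter so it can be run on a small sample; the function names simply mirror the Factor words.

```python
def names(text):
    """Drop the double quotes and split on commas (the `names` word, minus file I/O)."""
    return text.replace('"', "").split(",")


def score(name):
    """Sum of letter positions: ord('A') is 65, so subtracting 64 maps A to 1."""
    return sum(ord(ch) - 64 for ch in name)


def solution_22(text):
    """Sort the names, weight each score by its 1-based rank, and add them up."""
    ranked = sorted(names(text))
    return sum((index + 1) * score(name) for index, name in enumerate(ranked))


# Tiny made-up sample in the same format as the real input file.
sample = '"MARY","ANNA","LISA"'
print(score("COLIN"))        # 53
print(solution_22(sample))   # 283
```

The Factor version reads strictly left to right as a data flow; the Python version needs `enumerate` and a generator expression to say the same thing.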

Reverse polish notation means you have to start reading from the end of the expression:

    "hello world" print
This is also why ternary expressions in C-like languages are hard to read.

I find the ternary expressions very easy to read and understand and never understood why others find it difficult at all.

I see no semantical difference between

    if(expr1) expr2 else expr3;

    expr1 ? expr2 : expr3;
Okay you need to learn the syntax, but it should be second nature after a few times. I also think the question mark is very intuitive.

If it was that simple, people wouldn't have any issue with the conditional operator. The problem is there actually is a semantic difference between the two lines you wrote in most languages.

If-Then-Else is a statement. It doesn't return a value. It's just a branching operation between the two blocks.

The conditional operator is an expression. It returns a value, hence it has a type. And that's where things start to get messy. Typing rules for the conditional operator are, well, not trivial and vary from language to language. Let's use an example from Java Puzzlers (Joshua Bloch and Neal Gafter):

        char x = 'X';
        int i = 0;
        System.out.print(true ? x : 0);
        System.out.print(false ? i : x);

will print X88 but the same thing in C++

        char x = 'X';
        int i = 0;
        cout << (true ? x : 0);
        cout << (false ? i : x);
will print 8888.

And I am not even talking about the priority mess when you try to use a conditional operator in a conditional operator.

That has nothing to do with the semantics of the ternary operator per se, but rather with how Java and C++ respectively type it. If you inserted the relevant casts to make sure both result halves have the same type, the difference would disappear.

You could expose the same semantic difference your example relies on with overload resolution, without needing to refer to the ternary operator at all. But I hope you wouldn't take that to mean that overloads are also a bad idea.

I also believe that the ternary operator is particularly nice when used in combination:

    x = a ? foo :
        b ? bar :
        c ? baz :
The idiom tastes a little like pattern matching. Of course, you need to be a little cautious about your precedence, but I've never used a language supporting infix operators that didn't have gotchas around precedence.

Yes, you are absolutely right. I forgot about this one major difference. However, you cannot generally say if-then-else is a statement; there are languages such as CoffeeScript in which if-then-else works exactly like the conditional operator and therefore has a type and return value.

As for the priority mess, I spend my brain cycles elsewhere and use parentheses, which also helps the next guy as well as me.

Or virtually every language that is functional in a way. For example Lisp and Haskell.

How is this any different than saying "Avoid the + operator at all costs, because behavior is non-trivial and varies from language to language". Of course there are going to be differences. You just have to learn the rules for the language you are working in.

> How is this any different than saying "Avoid the + operator at all costs, because behavior is non-trivial and varies from language to language".

I never ever remotely said that. You are putting words in my mouth and I don't like it. I just rightfully pointed that there actually is a semantic difference between the two mentioned lines. I never implied using any of them was wrong. I didn't even emit a judgement of value on either construct.

Now, I don't have a problem per se with the conditional operator. It's just not as simple as some people like to pretend, and it makes your code depend on a non-obvious and not broadly known language feature for what is usually a minimal gain. So, yes, I do tend to avoid it in anything which is going to end up in the hands of others (in the same way I try to avoid any non-obvious language idiom as much as possible). It's like nested comprehensions in Python: the slightly shorter code produced is just not worth the added complexity.

But you can read left to right to get the order in which operations/transformations are performed:

    "hello world" uppercase print
While in languages where function application is prefix, you have to go inside to outside, right to left:

    print(uppercase("hello world"))
Additionally, if there are infix operators, the direction is switched mid statement:

    print(uppercase("hello" + "world"))
hello -> world <- uppercase <- print
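
The same reading-order point can be tried in a prefix language by adding a small helper. This is only an illustration: `pipe` is a hypothetical name, not a standard library function, and it simply threads a value through functions from left to right.

```python
from functools import reduce


def pipe(value, *functions):
    """Feed value through the functions left to right, like words in a postfix program."""
    return reduce(lambda acc, fn: fn(acc), functions, value)


# Prefix style: read from the inside out.
nested = str.upper("hello" + "world")

# Pipeline style: read left to right, in the order the steps actually run.
piped = pipe("hello", lambda s: s + "world", str.upper)

print(nested == piped)   # True
```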

I like the way Haskell uses its function composition operator (.); it's very clean, e.g.

  λ>putStrLn . reverse .  map toUpper $ "hello world"

And yet that's very much how object-oriented languages are typically structured.


  "hello world".print()
If you're used to speaking a language like English where the word order is subject-verb-object, you'd probably consider a subject-object-verb language to be "backwards" too. But native speakers of that language certainly wouldn't have a problem.

I'm sure it's just a matter of what you're used to.

I must say, that the syntax and semantics are not intuitive at all. I've spent half an hour trying to learn how does this thing work, and have not been able to. I guess, a damn good tutorial is in order. I think one of the primary reasons Python took off really well, was because of very good tutorials available. The language of Factor's documentation is still only tuned for language designers, not language users.

Concatenative language syntax is actually pretty easy at the basic level. I don't know Factor, but I used to dabble a lot in Forth. The single syntactical rule with Forth is that a symbol (called a "word") is one or more non-space characters, delimited by spaces. So the following three things are all words: SWAP ." -1

Next, there's the idea of the stack. Every word is executed, one after the other, and the words operate on the stack. So the four-word sequence 3 4 + . puts a 3 on the stack, then puts a 4 on the stack, then pops the top two numbers from the stack and puts their sum on the stack, then prints the top number on the stack. So the result, naturally, is 11. (Well, it is if you're operating in base 6 at the time. Didn't want to make it TOO easy for you!)

Oh, and there's also the concept of interpreting vs compiling. The colon word, which is spelled :, reads the very next word from the input and begins the definition of that word. Every word it encounters up to the next semi-colon is compiled, not executed. So if I did this

    3 4 + : FOO SWAP DROP ; .
... it would still print out the sum of 3 and 4, but in between it would also define a new word called FOO. Which would be a silly way to program, but it demonstrates the point.
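
The rules described so far are small enough to model in a few lines of Python. This is a toy sketch, not a real Forth: it only knows the words used in this comment, it handles bases 2 to 10 only, and `run_forth` and `to_base` are names invented here.

```python
def to_base(n, base):
    """Render an integer in the given base (2..10 is enough for this sketch)."""
    if n == 0:
        return "0"
    sign, n = ("-", -n) if n < 0 else ("", n)
    digits = []
    while n:
        n, digit = divmod(n, base)
        digits.append(str(digit))
    return sign + "".join(reversed(digits))


def run_forth(source, base=10):
    """Interpret a tiny Forth subset; returns (printed output, user dictionary)."""
    stack, output, user_words = [], [], {}
    primitives = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "SWAP": lambda: stack.extend([stack.pop(), stack.pop()]),
        "DROP": lambda: stack.pop(),
        ".": lambda: output.append(to_base(stack.pop(), base)),
    }

    def execute(word):
        if word in user_words:
            for inner in user_words[word]:
                execute(inner)
        elif word in primitives:
            primitives[word]()
        else:
            stack.append(int(word, base))   # anything else must be a number

    words = iter(source.split())            # a word is any run of non-space characters
    for word in words:
        if word == ":":
            name = next(words)              # the colon reads the very next word as the name
            body = []
            for inner in words:             # compile, don't execute, up to the semicolon
                if inner == ";":
                    break
                body.append(inner)
            user_words[name] = body
        else:
            execute(word)
    return " ".join(output), user_words


printed, defined = run_forth("3 4 + : FOO SWAP DROP ; .", base=6)
print(printed)   # 11
print(defined)   # {'FOO': ['SWAP', 'DROP']}
```

Note that there is no parser: the interpreter just consumes one word at a time, and the colon word is the only thing that looks ahead.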

So the short form is: begin reading at the top left, continue rightward and downward until you reach the bottom. There's no syntax as such, like Perl's die "Can't open file" unless $fileopen; because Forth-like languages don't read ahead to the end of the line.

[Edited: HN doesn't do Markdown. WTF? Also, I don't know my left and right.]

That's what's cool about the Forth idea, and why it can be so effective. It's not that it's unintuitive, it just doesn't take Algol-style semantics as a given as we do today. (It was being used before C was even developed.) If we're interested, it's our job to see if we can fit our thought process into the Forth idea, not vice versa.

And it's a very coherent idea. It's meant to respect processor architecture while maintaining a good level of comprehensibility to the human. And it keeps a MUCH closer correspondence between "word" (essentially a routine) and actual instruction. That got WAY lost with C, enter the beast-compiler.

I think of it as a concept layer above assembly, that has features far beyond just mnemonic value. It's like Lisp to me- a language whose "syntax" is dictated by the functional necessity of a computing methodology, not the decisions of language designers. For me at least, the approach leads to much higher productivity for a variety of reasons.

There's also a very high probability that Forth is the first thing to be run when you turn on the computer or cellphone you're reading this from.

Now as to Factor... see my post above and help me out!

Actually, Forth (the predecessor of concatenative languages) has the best tutorials I've ever read: Starting Forth and Thinking Forth by Leo Brodie.



To be fair, Factor/Forth's way of thinking is really foreign to the usual Pascal/C derivatives. Learning how to program from scratch certainly took you more than 30 minutes.

It has nothing to do with python being particularly intuitive and everything to do with it being very similar to every other imperative language you already know. If your first language was forth or PS you'd find factor intuitive too.

Postscript was around my 9th language[1] and I find Factor pretty easy to read. I might be a special case as I hate reading Python and every time I try an S-expr language I really never get the hang of it.

Thinking about Python, looks wise it is kind of the opposite of any of the stack languages. I get the feeling that you can like one or the other.

1) various 8-bit BASICs, 6502 ASM, Modula-2, 370 ASM, C, Ada, foxbase, SQL, Postscript, Forth, Objective-C (weird listing it out)

I started with C++ and thought I was just good at picking up languages because I picked up Java and Python in a day or two. Then I realized it was because they were so similar. When I learned Common Lisp I had to do some real work, even though by then I had been programming for 7-8 years.

>> I guess, a damn good tutorial is in order.

Firing up the Factor GUI environment you can access help via F1 where two options are the cookbook and the "Your First Program" tutorial.

This same documentation available in the environment is also available online[1].

[1] http://docs.factorcode.org/content/article-first-program.htm...

I think that's true. Forth is so called "ultimate low-level language".

It's interesting how Factor avoided "library fragmentation" by trying to include everything in the main distribution. That's definitely a win, given that smaller communities can't afford duplication. The average quality of the libraries is higher because of that...

I could never get over the syntax though. Even though the #factor folks say it's a matter of getting used to it and theoretically not "harder" than Lisp, Lisp feels very natural to me, but Factor always required a big overhead.

For those looking for "Why Factor" material you might like to look at some of Manfred von Thun's writings on the Joy programming language [1]. Joy is similar to Factor and Manfred wrote a lot of theory on the "why" of concatenative programming languages.

[1] http://www.latrobe.edu.au/humanities/research/research-proje...

This is really cool!

I have always admired the engineering work on Factor, but had thought the project would die when Slava Pestov took his job at Google. Great to see it continue to improve.

I like the old logo better, http://re-factor.blogspot.com/2012/11/new-logo.html is waaay too busy.

Ah, the new logo is hideous! I don't see anything better about it other than the fact that it fits in a square. What's the purpose of the late-90's computer graphics style round triangle button thing? It's like a big skeuomorphic upvote button with an airbrushed Mario level in the background. Just get rid of it!

That nitpick aside--wow! Factor is getting more and more impressive every day.

My TA for a cryptography class worked on this. I had the pleasure of messing with it one afternoon. Pretty cool stuff.

Factor the language or factor the Unix command?

The language

As cool as an approach that Factor has, I cannot fathom why it is being developed for x86 for the life of me. (Besides claims of cross-platform native executable compilation.)

On ARM it would be interesting- or if it had a highly modular structure that would allow it to be "easily" ported to other microprocessors.

But on x86? What niche is it trying to serve? Is it a language-lover's-language? I guess what I'm asking is, why would I use a "high-level" Forth on x86? There has to be a reason, I'm just missing it.

Factor used to run on ARM [1]. Back then Windows Mobile only had 64MB of ram and it wasn't quite enough to do anything useful in Factor.

I used to run a video sharing website written in Factor [2], sadly the server crashed a few weeks ago and I haven't recovered the data from the disk yet.

Factor feels more like a Scheme than a Forth to me when programming.

[1] http://web.archive.org/web/20111117015025/http://www.bluishc... [2] http://web.archive.org/web/20130118204336/http://tinyvid.tv/

Woah thanks for that link, there could be some assembly worth salvaging there.

Currently working on an ARM forth-idea with some scheme-ideas. There's this nagging intuition I have that there's a symbiosis between Lisp atom/Forth word and S-expression/Forth dictionary.

I have to get something going pretty quickly so I can't get too pure about the idea, as I don't want to get bogged down in the necessary GC yet.

Thanks for your reply, it already made me more interested in Factor from an implementation standpoint, even if I wouldn't use it on x86.

Isn't Factor (at some level of abstraction) basically Lisp/Scheme with more minimal syntax? It replaces prefix notation and parentheses with postfix notation and a stack. All the benefits Lisps get from minimal syntax, Factor gets too.

That's exactly what I'm thinking, but I haven't had time to stress-test the idea or find any precedent. And remember, I'm talking about a ~36 kilobyte Forth, not this large and resource-intensive kind of x86 Forth that I don't understand the reason for.

Theoretically you would have the HLL incredibleness of a lisp with the low level genius of Forth. I asked some old timers if the idea made sense and they were silent in specifics but not discouraging. Which would indicate to me they think it is a huge learning opportunity for me one way or the other. Have you done any deeper analysis?

They are working on reimplementing ARM support in the next release. It is in an incomplete state ATM.

Under new libraries I see a reddit API lib listed. Why are they including things like this in Factor itself? Doesn't it kind of bloat things? Factor can strip this out if it isn't used, of course, but can you really maintain all of these libraries? If you just add random things to Factor itself, you're eventually going to end up with something that is impossible to maintain without a very large number of invested people.

The 'extra' directory basically acts as a package manager. The "Batteries Included" approach Factor uses here makes it much more likely that libraries work out of the box.

This was very useful in the early days of Factor when language and base library changes caused bitrot in contributions. If the contributed library wasn't updated it was moved to an 'unmaintained' directory for later correction.

Because there are like 10 people using it. Until the language grows and acquires a package manager, it's easier to just put everything in the stdlib.

Sure, but I'd vote for doing things right to begin with rather than waiting for complaining users. But I guess we might disagree on what 'right' means in this context. :)

There is another aspect too: when everything is in the stdlib, it forces developers to collaborate rather than create their Nth implementation of the same lib because theirs is better. There is a breaking point in the number of developers where it becomes more efficient to distribute the work through a package manager, but then you also introduce package hunting, which can become quite a time-consuming task.

Where is Slava Pestov now? Haven't seen him for a long time.

He got a job at Google back in 2010 and pretty much handed over the reins at that time due to lack of free time to work on Factor. He still posts on the mailing list though periodically.

Is there a HelloWorld example I can try?

  "hello world" print
...although, more realistically, you're probably interested in something like the getting started section of the docs [1]. There are also some great short example posts on the planet feed [2] from various blogs.

EDIT: I'll plug my own blog for this too, as I started writing a Beginning Factor series of posts a long time ago [3], but never got beyond a couple entries.

[1] http://docs.factorcode.org/content/article-handbook.html

[2] http://planet.factorcode.org/

[3] http://elasticdog.com/2008/11/beginning-factor-introduction/

Hey Aaron,

From your second article[1] introducing Factor

"Next time we’ll stick with a theme of “flow” and discuss control flow for your words and after that, the typical work flow when developing code with Factor. Hope you enjoyed this installment!"

I am still waiting for that article. Your introductions to Factor are awesome and more would be delightful.

[1] http://elasticdog.com/2008/12/beginning-factor-shufflers-and...

Does anyone know how one could compile to the Factor VM? Slava once said that would be a good idea. Imagine having a Lisp in that modern environment! (I love Slime, don't hit me.)

What is it? Why should I learn it? What are its "killer" features? Ain't nobody got time to learn about every new prog. language.. :)

My suggestion is that learning to write programs in a concatenative language will teach you to think in combinatory logic, a fundamental basis for computation. I outlined this idea a while back and mentioned Factor in "Finding Joy in Combinators:"


You don't have to learn it. It is probably mostly interesting to users of concatenative languages, or those who just want to broaden their horizons..

Personally, I've fallen in love with it a few years ago. Even though I don't use Factor for anything practical, the language itself (as well as Forth) has taught me a completely new way of thinking about problem solving. Something that I benefit from in other languages I use.

So if there is one reason you should at least check it out, it is that it will make you a better programmer. And I think anyone who's been writing code long enough, will know that there is always room for self-improvement.

Ok, but that is not true for any new programming language. So why downvote my question?

Looks very interesting. What distributions package it?

Arch Linux has it in AUR, I'm currently in process of updating it to 0.96.

Oh look, they're basically dissing this community to hell in the irc logs. For instance,

"10:10:24 <RodgerTheGreat> another HN gem: "I must say, that the syntax and semantics are not intuitive at all. I've spent half an hour trying to learn how does this thing work, and have not been able to.""

It's horrible to watch when a language community has this sort of elitist attitude, especially when it's doubtful that the language has any future.

Source: http://bespin.org/~nef/logs/concatenative/13.04.21

If a person says that something is a gem, they can be sincerely saying that they think it has positive value. Your interpretation was that the phrase was being used in a sarcastic manner. That may well have been the case here, but it doesn't have to be so.

I think the "half an hour" bit is significant. If someone complained that they tried to learn Prolog and gave up after half an hour they wouldn't get much sympathy from me.

That's not the case at all. We are anything but elitist...
