Maybe a few examples of "data munging" tasks which the authors view as poor fits for [language X] and how their stuff solves the problem better.
Maybe something like "why is our language better than regexps in whatever language environment you already know?"
TXR has regexps if you need them. The regex engine is geared in a different direction from the mainstream: it doesn't have anchoring, register capture, or Perl features like lookbehind assertions. On the other hand, it has intersection and negation (without backtracking).
TXR translations of Clojure, Common Lisp and Racket solutions to the same problem:
If it's supposed to be obvious by inspection, well... I guess I'm too unenlightened.
As an example, I was doing some kernel work and needed patches to conform to the kernel's "checkpatch.pl" script. Unfortunately, this thing outputs diagnostics in a way that Vim's quickfix doesn't understand; I wanted to be able to navigate among the numerous sources of errors in the editor.
First I looked at the checkpatch.pl script hoping that of course they would have the diagnostic output in one place, right? Nope: formatting of messages is scattered throughout the script by cut-and-paste coding.
TXR to the rescue:
WARNING: line over 80 characters
#279: FILE: arch/arm/common/knllog.c:1519:
+static void knllog_dump_backtrace_entry(unsigned long where, unsigned long from
WARNING: line over 80 characters
#321: FILE: arch/arm/include/asm/unwind.h:50:
+extern void unwind_backtrace_callback(struct pt_regs *regs, struct task_struct
WARNING: line over 80 characters
#322: FILE: arch/arm/include/asm/unwind.h:51:
+ void dump_backtrace_entry_fn(unsigned long where,
WARNING: line over 80 characters
#323: FILE: arch/arm/include/asm/unwind.h:52:
+ unsigned long from,
The key line of the TXR query is the pattern that matches the location header:

#@code: FILE: @path:@lineno:

which turns the diagnostics into quickfix-friendly output:
arch/arm/common/knllog.c:1519:WARNING (#279):line over 80 characters
arch/arm/include/asm/unwind.h:50:WARNING (#321):line over 80 characters
arch/arm/include/asm/unwind.h:51:WARNING (#322):line over 80 characters
arch/arm/include/asm/unwind.h:52:WARNING (#323):line over 80 characters
arch/arm/include/asm/unwind.h:53:WARNING (#324):line over 80 characters
arch/arm/kernel/unwind.c:352:ERROR (#337):inline keyword should sit between storage class and type
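Not from the original comment, but for comparison: the same checkpatch-to-quickfix reshaping can be sketched in Python, with the regexes inferred from the sample output above. The point of the comparison is how much scaffolding the explicit two-line state tracking takes versus TXR's pattern.

```python
import re

# checkpatch emits two-line diagnostic headers:
#   WARNING: line over 80 characters
#   #279: FILE: arch/arm/common/knllog.c:1519:
HEADER = re.compile(r'^(WARNING|ERROR): (.*)$')
LOCATION = re.compile(r'^#(\d+): FILE: (.*?):(\d+):$')

def to_quickfix(lines):
    """Yield Vim-quickfix-friendly lines: path:lineno:kind (#code):message"""
    kind = message = None
    for line in lines:
        m = HEADER.match(line)
        if m:
            # Remember the diagnostic until its location line arrives.
            kind, message = m.group(1), m.group(2)
            continue
        m = LOCATION.match(line)
        if m and kind:
            code, path, lineno = m.groups()
            yield f"{path}:{lineno}:{kind} (#{code}):{message}"
            kind = message = None
```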
That being said, it has some intriguing features that I'm not going to dismiss. I work with COBOL on a daily basis, so I'm not going to say no to a new language just because it's ugly. There seems to be a lot of utility here.
One part that seems nice is that it handles multi-line constructs in a way that isn't horrible. Perl and awk have a big complexity jump once you go past one-line records, and most of the traditional Unix utilities just don't handle them at all (stuff like cut/join/sort only works on single-line, delimited records). Since constructs like Perl's while(<INPUT>) stop automatically doing the Right Thing once you get multi-line records, the usual next stop is that you're manually maintaining a state machine.
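To make the complexity jump concrete (my sketch, not from the thread): even in Python, just grouping a line stream into blank-line-separated multi-line records means reaching for itertools, and anything less regular than blank-line separators tends to end up as exactly the hand-rolled state machine described above.

```python
from itertools import groupby

def records(lines):
    """Split a line stream into blank-line-separated multi-line records --
    the case where one-line-at-a-time tools like cut/sort start to hurt."""
    for is_blank, group in groupby(lines, key=lambda l: l.strip() == ""):
        if not is_blank:
            yield list(group)
```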
But I'd say that data munging is inherently ugly. I don't really see myself using this as the next tool to write clever algorithms that will stand the test of time, but if you offer me this as a stand-in for the usual shell-script/awk/sed/perl/printf/regexp mess you need for ad-hoc file transformations, I'm suddenly listening.
This is hard to show in small examples, so small examples become dense with the notation. Just like, say, tiny examples of HTML become a dense soup of tags.
Note that TXR Lisp doesn't have the at signs. You can write a pure TXR Lisp program by wrapping the whole file with @(do ... ).
TXR looks a lot better with syntax highlighting; unfortunately, this only exists for Vim. On the other hand, the syntax highlighting definition file for Vim is quite good.
Kaz: How come you didn't write TXR in lisp?
But seriously, TXR is built on its own Lisp: an infrastructure which provides the managed environment and data representations which also support the TXR Lisp dialect.
This is no different from any Lisp implementation based on a C kernel, like CLISP, GNU Emacs, ...
If you do it from scratch, you lose a lot: you don't have a mature, optimized dynamic language implementation. But, by the same token, you can experiment in ways that you normally wouldn't. You get to dictate things like, oh, what a cons cell is. I have lazy conses that look like ordinary conses: they satisfy consp, and work with car, cdr, rplaca and rplacd. You can invent new evaluation rules. I came up with a way to have Lisp-1 and Lisp-2 in a single dialect, seamlessly, with the conveniences of both.

I have Python-like array access. I made traditional Lisp list operations work with vectors and strings: you can mapcar through a string and so on. Sequences and hashes are functions, and they combine with combinators: for instance, orf combines functions analogously to the Lisp or operator. If hash1 and hash2 are hash tables, [orf hash1 hash2 func] creates a one-argument function that looks its argument up in hash1; if that returns nil, it tries hash2, and if that also returns nil, it passes the key to func and returns whatever that returns. Or ["abc" 1] returns the character #\b. [mapcar "abc" '(2 0 1)] yields "cab": the numeric indices are mapped through "abc", as if it were an index-to-character function. Fun things like this are good reasons to experiment with your own dialect.
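A rough Python rendering of the [orf hash1 hash2 func] example described above. The names are TXR's; the Python is my approximation, with None standing in for nil and dict.get / string indexing standing in for TXR's "hashes and sequences are functions" behavior.

```python
def orf(*funcs):
    # Combine callables like TXR's orf: try each in turn and
    # return the first non-None ("non-nil") result.
    def combined(x):
        for f in funcs:
            result = f(x)
            if result is not None:
                return result
        return None
    return combined

# In TXR Lisp, hash tables are directly callable; dict.get
# approximates that here (hypothetical sample data).
hash1 = {"a": 1}
hash2 = {"b": 2}
lookup = orf(hash1.get, hash2.get, lambda k: f"missing:{k}")
```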
I believe TXR is a great companion if you're a Lisper working in ... one of those other environments.
Ah, one more thing. Well, two, or maybe three. Part of why I used C was to create a project whose tidy, clean internals stand in stark contrast to some of the popular written-in-C scripting languages. You know, to sock it to them! See, there is a hidden agenda: the call of "I can do this better". If you use C, then a more direct comparison is possible. Secondly, people widely understand C. Give them a cleanly written project in C, and maybe they will hack on it, and from there understand something about Lisp too. Thirdly, C means low dependencies from the point of view of packaging: easy porting with just a basic shell environment, make, and a C compiler. Cross-compiling for ARM or whatever is a piece of cake. Easy work for package maintainers, ...
TXR is not built "on its own Lisp", it's built on C. If you believe that lisp is so great, then why didn't you just use ANSI Common Lisp? Why is TXR even necessary when I can do all the same data processing stuff in Perl, which is far more versatile and ubiquitous?
And all this nonsense about writing TXR in C because it's "more widely understood", "low dependencies", "easily packaged" - after 15-some years of advocacy in comp.lang.lisp, it's laughable that defsystem, asdf, and SBCL/CLISP/CMUCL aren't good enough for you.
Lisp is either as good as all the Naggums, Tiltons, and Pitmans of c.l.l. proclaim, or it's not. By writing TXR in C, you've just proved that it's not.
SBCL's runtime contains traces of C.
CLISP is written on top of C.
CMUCL's runtime contains traces of C.
Now we are fucked...
I'm so glad that at least my Lisp Machine has no C. Oh wait, it has a C compiler...
I find this hypocrisy to be quite intriguing.
That's possible. There are many Lisp dialects and implementations which have few applications. That's true for a lot of other language implementations, too. There are literally thousands of implementations of various programming languages with very few actual applications. Maybe it is fun to implement your own language from the ground up. Not something that interests me, but it does not bother me.
If he wants to implement a small new Lisp dialect, it's perfectly fine to implement it in C or similar.
> They always seem to fall back on C, or some other language that's more "widely available" or "has minimal dependencies" or "has more potential contributors" or "can be more easily compared with other similar programs".
Some new dialect is written with the help of C? That bothers you?
Actually, 95% of all Lisp systems contain traces of C, and some are deeply integrated with C or built on top of it (CLISP, ECL, GCL, CLICC, MOCL, dozens of Scheme implementations, and various other Lisp dialects). There are various books about implementing Lisp in C.
Really nobody in the Lisp community loses any sleep that somebody implements parts of Lisp in C.
> I find this hypocrisy to be quite intriguing.
Because some random guys implemented their own language in C? Why do we have Python, Ruby, Rebol? There was already Perl or AWK or ... Somebody decided to write their own scripting language. So what?
When a Python advocate wants to do some data processing, do they first write their own Python implementation in C? No. When a Ruby advocate wants to make a Rails website, do they first write their own implementation of Ruby in C? No.
Several fine implementations of lisp already exist that compile down to machine code and, if the lisp community is to be believed, have performance "close to C". So why does a lisp advocate feel the need to re-write lisp in C for a project that didn't actually need it? The lisp community would have us all believe that lisp is the "programmable programming language", and all the other rhetoric about how every other language has just stolen ideas from lisp, etc., etc. They all truly seem to believe that lisp is something special. That's why I find it laughable that someone like Kaz Kylheku, a 15-year veteran of comp.lang.lisp, decided not to implement TXR by using a pre-existing lisp implementation.
They write it in C. Check out the Python world sometime.
* CrossTwine Linker - a combination of CPython and an add-on library offering improved performance (currently proprietary)
* unladen-swallow - "an optimization branch of CPython, intended to be fully compatible and significantly faster", originally considered for merging with CPython
* IronPython - Python in C# for the Common Language Runtime (CLR/.NET) and the FePy project's IronPython Community Edition
* 2c-python - a static Python-to-C compiler, apparently translating CPython bytecode to C
* Nuitka - a Python-to-C++ compiler using libpython at run-time, attempting some compile-time and run-time optimisations. Interacts with CPython runtime.
* Shed Skin - a Python-to-C++ compiler, restricted to an implicitly statically typed subset of the language for which it can automatically infer efficient types through whole program analysis
* unPython - a Python to C compiler using type annotations
* Nimrod - statically typed, compiles to C, features parameterised types, macros, and so on
and so on...
> So why does a lisp advocate feel the need to re-write lisp in C for a project that didn't actually need it? The lisp community would have us all believe that lisp is the "programmable programming language"
Why don't you understand the difference between 'a lisp advocate' and 'the lisp community'?
> and all the other rhetoric about how every other language has just stolen ideas from lisp, etc., etc.
> That's why I find it laughable that someone like Kaz Kylheku, a 15 year veteran of comp.lang.lisp, decided not to implement TXR by using a pre-existing lisp implementation.
I find it laughable that you find it laughable...
TXR didn't need its own dialect of lisp. So, the question remains: why didn't Kaz use SBCL or CLISP? They're good enough for c.l.l. kooks like him to recommend to everyone else, but why're they not good enough for him to use?
TXR does need its own dialect of Lisp because Common Lisp isn't suitable for slick data munging: not "out of the box", without layering your own tools on top of it.
This is a separate question from what TXR is written in. Even if TXR were written using SBCL, it would still have that dialect; it wouldn't just expose Common Lisp.
That dialect is sufficiently incompatible that it would still require writing a reader and printer from scratch, and a complete code walker to implement the evaluation rules of the dialect. Not to mention a reimplementation of most of the library. The dialect has two kinds of cons cells, so we couldn't use the host implementation's functions, which understand only one kind of cons cell. So, whereas some things in TXR Lisp could be syntactic sugar on top of Common Lisp, others could not.
Using SBCL would have many advantages in spite of all this, but it would also take away many opportunities for me to do various low-level things from scratch. I don't have to justify to anyone that I feel like making a garbage collector or regex engine from scratch.
So, the reasons for not using "SBCL" have nothing to do with "good enough". It's simply about "not mine".
TXR is a form of Lisp advocacy.
TXR is also (modest) Lisp research; for instance I discovered a clean, workable way to have Lisp-1 and Lisp-2 in the same dialect, so any Lispers who are paying attention can stop squabbling over that once and for all.
It pays to read this:
Why we have Lisp today with all the features we take for granted is that there was a golden era of experimentation involving different groups working in different locations on their own dialects. For example, the MacLisp people hacked on MacLisp, and it wasn't because Interlisp wasn't good enough for them. Or vice versa.
That experimentation should continue.
Kaz, the C programming language isn't yours either. My point is that Common Lisp is supposed to be a general purpose programming language with power far greater than a primitive language like C, but you chose to implement TXR in C simply because C makes it much easier for you to accomplish your goal than Common Lisp. I'm just trying to point out the obvious, which nobody from c.l.l. seems willing to admit.
> why didn't Kaz use SBCL or CLISP?
Why should he? He can do whatever he wants. I personally don't care at all about what he does. Why do you? Kind of strange obsession with comp.lang.lisp. Are you one of the trolls posting there?
> They're good enough for c.l.l. kooks like him to recommend to everyone else, but why're they not good enough for him to use?
Probably he did it to annoy real programmers like you?
> Why should he?
Kaz invested a bunch of time implementing a whole new backquote implementation for CLISP, but it's still not good enough for him to use CLISP to implement TXR? It doesn't make any sense!
Any right-thinking programmer should care about inconsistencies such as this. If I'm evaluating a programming language, and I see someone in its community writing their own language implementation to support an application that could've easily been written using one of the standard language implementations, then it looks to me like the standard implementations aren't mature enough or trustworthy enough for me to use for my application. Not only that, but it suggests that maybe this particular language isn't as good as its advocates claim, especially if I have to drop back down to C in order to meet certain requirements (e.g., portability, speed, wider understanding, etc.).
But any right-thinking programmer already knows that lisp is not worth wasting any time on. It's dead, and people like Kaz, and projects like TXR, are going to make sure it stays that way.
CLISP's licensing is somewhat confusing and appears to dictate the license to the application. So, for example, I probably wouldn't use it for a commercial, closed-source application. For the same reasons, it cannot be used for a BSD-licensed application.
(However, I did use CLISP for the licensing back-end of such an application: that back-end runs on a server and isn't redistributed. Things you don't distribute to others cannot run afoul of the GPL.)
CLISP's license lets you make compiled .fasl files, and these are not covered by its copyright (unless they rely on CLISP internal symbols). However, that is where it ends. Memory images saved with CLISP are under the GPL. (Memory images are the key to creating a stand-alone executable with CLISP!) If you have to add libraries to CLISP itself, you also run into the GPL. I believe that this would cause issues for the users of TXR which they do not have today. For a user to be able to run the .fasl files, they need CLISP, and of course that has to be distributed to them under the GPL terms, and you can't add C libraries to that CLISP without tainting them with the GPL.
You can wrap TXR entirely in a proprietary application, including all of its internals: the whole image, basically. This wouldn't be possible if some of its internals were the CLISP image.
Regarding the GPL, I do not believe in that any more. I will not use this license for any new project. It is not a free software license in my eyes. Free really means you can do anything you want; any restriction logically means "not entirely free". Proprietary products that use free code do not take away anyone's ability to use the original. The problem with the FSF people is that they regard the mere existence of something as offensive. "It's not enough that there is a free program; we must litigate to death the non-free program which is based on the same code before we can be happy."
My own take on easy data transformations, if you'll allow me the plug: https://github.com/stdbrouw/refract