There were so many before (SILE, Patoline, Rinohtype, ...)
Why not just use LuaTeX, either directly with Lua as a typesetting language, or with a custom language transpiling to Lua or compiling to Lua or LuaJIT bytecode?
The idea being that, much as Java relieves some of the pain of writing object-oriented code in C++, finl relieves some of the pain of writing documents in LaTeX. It helps, of course, to have a deep understanding of TeX and LaTeX in doing so. Looking at SILE, for example, my initial impression is that the creators didn't fully understand LaTeX or what makes LaTeX worth re-imagining. At the risk of being insulting, it feels less like C++ to Java and more like Perl to PHP, where the creators of PHP were clearly influenced by Perl but didn't understand why Perl works the way it does.
I've decided on Rust. I'm in the midst of a "short story" exploration of Rust and PDF typesetting with a replacement for gftodvi (gftopdf) which, while it will have a potential userbase in the high single digits, gives me a low-risk way to experiment with typesetting to PDF. One of the sublibraries of finl (the charsub module) will be part of this code in the 1.0 release, and other sublibraries may be incorporated as development continues.
I have a bit. I really want to avoid writing code as much as possible. The flip side is that I've noticed that a lot of people in the Rust community are too quick to reach for a library. For gftopdf, someone suggested that I use nom to parse the GF byte code, but there's enough impedance mismatch and missing functionality (I don't know that 24-bit integers exist outside of Knuthian binary formats), and what I needed was simple enough (four short functions to read values of u8, u16, u24 and i32 into an i32 container, plus reading a string with a 1–3 byte length parameter at the beginning), that the overhead of an external dependency didn't seem appropriate. On the other hand, I was more than happy to use anyhow, thiserror and structopt for their functionality.
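For the curious, here's a minimal sketch of what those four readers might look like (my guess at the shape, not the actual gftopdf code; Knuth's binary formats, GF included, are big-endian):

```rust
use std::io::{self, Read};

/// Read one byte as an unsigned value into an i32.
fn read_u8<R: Read>(r: &mut R) -> io::Result<i32> {
    let mut buf = [0u8; 1];
    r.read_exact(&mut buf)?;
    Ok(buf[0] as i32)
}

/// Read a big-endian unsigned 16-bit value into an i32.
fn read_u16<R: Read>(r: &mut R) -> io::Result<i32> {
    let mut buf = [0u8; 2];
    r.read_exact(&mut buf)?;
    Ok(u16::from_be_bytes(buf) as i32)
}

/// Read a big-endian unsigned 24-bit value (the Knuthian oddity) into an i32.
fn read_u24<R: Read>(r: &mut R) -> io::Result<i32> {
    let mut buf = [0u8; 3];
    r.read_exact(&mut buf)?;
    Ok(((buf[0] as i32) << 16) | ((buf[1] as i32) << 8) | buf[2] as i32)
}

/// Read a big-endian signed 32-bit value.
fn read_i32<R: Read>(r: &mut R) -> io::Result<i32> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?;
    Ok(i32::from_be_bytes(buf))
}
```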
I suspect, though, given how input will get tokenized, that I may not be able to use a generic library.
I agree: as much as I like LaTeX and what it produces, it is dated. It's missing modern features, and lots of stuff is clunky. The content part should be declarative, and functional under the hood. ggplot2 is maybe an example of declarative visualization creation: you just declare what is in the document, with a very clear structure, and perhaps there is some styling/rendering component you can tweak. Seems like a job for Haskell. It needs a simple DSL of sorts, though.
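To make that concrete, here's a hypothetical sketch of a document as plain declarative data, separate from its rendering (all names are made up; Rust stands in for whatever language such a tool would actually use):

```rust
// Hypothetical sketch: the document is pure data declaring *what* it
// contains; a separate renderer decides *how* it looks, much as
// ggplot2 separates the data/structure from the presentation.
#[derive(Debug)]
enum Block {
    Heading(u8, String), // level, text
    Paragraph(String),
    BulletList(Vec<String>),
}

fn main() {
    let doc = vec![
        Block::Heading(1, "Introduction".into()),
        Block::Paragraph("A document is just structured data.".into()),
        Block::BulletList(vec![
            "clear structure".into(),
            "tweakable styling".into(),
        ]),
    ];
    // A real backend would typeset; here we just dump the structure.
    for block in &doc {
        println!("{block:?}");
    }
}
```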
At least you can use Pandoc to turn old LaTeX documents into HTML/etc. I kind of like using TeX-flavoured markdown and just generating HTML with pandoc.
My goal with finl is not to re-create TeX but to reimagine it. The macro-oriented programming language is a big pain point, as is the whole concept of category codes. In many instances, LaTeX is still hemmed in by the constraints of ca. 1980 computing technology. If you look through tex.web, you'll see things like weird string concatenations that were done to save a handful of bytes of memory. It was a radical change to TeX in the 80s to move from 16-bit to 32-bit addressing for the main memory array. And Unicode support is occasionally weird, particularly in the incompatibilities between how ^^ literals are managed in pdfTeX vs XeTeX/LuaTeX (in the former, ^^d7 is a byte; in the latter, ^^d7 is a code point and turns into a two-byte UTF-8 sequence).
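A quick illustration of that last point, in Rust rather than TeX, just to make the bytes visible:

```rust
// pdfTeX reads ^^d7 as the single byte 0xD7; XeTeX/LuaTeX read it as
// the code point U+00D7 ('×'), which UTF-8 encodes as two bytes, C3 97.
fn main() {
    let pdftex_bytes: Vec<u8> = vec![0xD7];
    let code_point = char::from_u32(0xD7).unwrap();
    let xetex_bytes = code_point.to_string().into_bytes();
    assert_eq!(xetex_bytes, vec![0xC3, 0x97]);
    println!("pdfTeX: {:02X?}  XeTeX/LuaTeX: {:02X?}", pdftex_bytes, xetex_bytes);
}
```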
I understand your motivation. The thing is, there have been many attempts to reimagine TeX, and none of them succeeded. Not because the ideas were bad, far from it. It's a huge project requiring a significant development team.
The idea is that a LaTeX document and a finl document would look nearly identical between \begin{document} and \end{document}. As for packages, they fall into two categories:
* document classes. Creating document classes is too difficult in LaTeX, and a new document class in finl will be a relatively trivial thing to create,
* extensions to LaTeX to remedy deficiencies. Some, like babel/polyglossia and the six million improvements on tabular, would be baked into the core format of finl. The most notable third-party ecosystem to be translated would be the whole TikZ environment; I'm aiming to allow TikZ illustrations to be cut and pasted into finl and to replicate at least the most common extensions. The other big one is the whole beamer ecosystem. The finl equivalent will keep similar if not identical syntax but will be superior in that it will allow non-PDF backends for output (so, for example, one could run finl-ppt mypresentation.finl and get a PowerPoint presentation with all the capabilities that PowerPoint gives for things like animations and transitions that are difficult or impossible to achieve in a PDF-based presentation).
Recently I have also been thinking about TeX's internals, partly because I feel the pain points you describe at https://www.finl.xyz/. I too have thought about experimenting with TeX rewrites/alternatives, particularly in Rust.
I don't know if you are aware, but finl has also been submitted to Lobsters (https://lobste.rs/s/udu5oe/finl_is_not_latex_reinventing_lat...). One of the comments mentions SILE (https://sile-typesetter.org/). I looked into it (https://www.youtube.com/watch?v=5BIP_N9qQm4 was a nice introduction, although I don't know how up to date it is). It seems to me that SILE addresses many pain points in TeX while staying somewhat close to LuaTeX (and TeX in general), due to its use of similar components (Lua, algorithms, hyphenation patterns, libraries). It even supports "LaTeX"-like input syntax, but frankly I don't think that is the part of LaTeX I would keep :).
I too have recently been interested in TeX and Rust. Apart from Tectonic (wrapper around XeTeX and dvipdfmx) and the attempt to rewrite it in Rust (your first link), I also found [1], an attempt to rewrite TeX itself in Rust. I also understand that you are a supporter of the Tectonic in Rust effort, so hopefully you can fill me in on the current progress.
At first I thought that rewriting XeTeX/dvipdfmx in Rust just for the sake of being written in Rust was foolish, because of TeX's atypical memory model, and also because of the manual translation using c2rust (on C code that was itself generated from Pascal code that uses a lot of macros).
But looking at it now, it seems that the rewrite has progressed and parts of the result look very Rusty, which is nice.
I too had ideas about TeX in Rust, but I think that starting with LuaTeX would be much more beneficial. Apart from LuaTeX having obvious support for scripting in Lua and being very extensible, XeTeX has other disadvantages. See for example [2], which in my opinion still misses many of the internal differences where LuaTeX is much superior.
Do you have any tips on how to join the Tectonic / TeX in Rust community? Where can I potentially discuss my foolish TeX/Rust ideas, being very new to Rust?
They have a forum but are now discussing the switch to GitHub Discussions[1]. The issue about oxidizing everything is here[2]. If you want to help the project, there are a number of issues[3] that can be worked on, in particular synchronization with the Tectonic mainline and the XeTeX code update[4].
While I think that Tectonic can have benefits for those coming to "TeX", ConTeXt is by all means superior to all other TeXs and even Tectonic.
While at the core there is always a TeX engine (original TeX, pdfTeX, XeTeX, LuaTeX) that provides typesetting primitives, almost everyone uses a macro package on top of that (plain TeX, LaTeX, ConTeXt), and there are usually many extension packages, each targeting one format or several (e.g. TikZ works in all three, while beamer is LaTeX-specific). For many documents there are also other external programs involved, e.g. BibTeX for bibliographies.
The idea of LaTeX is that it is a core that provides basic facilities and a programmer interface. A lot of essential functionality is then implemented by external packages. Hence, one doesn't usually get away with only installing an engine (e.g. XeTeX) and a format (LaTeX) to get e.g. XeLaTeX; one needs many other packages, like fontspec (for the macro interface to XeTeX's OpenType support). That is why people install distributions, like TeX Live or MikTeX, which provide a way to install all engines, formats and packages (and other software like BibTeX and specialized TeX editors). Also, because of the single-pass nature of TeX, one needs to process the document multiple times to get things like forward references right; scripts like latexmk can do this automatically (latexmk can even run BibTeX and other external software, the right number of times and in the right order).
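The rerun logic is essentially a fixed-point iteration: run the engine until the cross-reference data stops changing. A toy sketch of the idea in Rust (not latexmk's actual algorithm, which also watches log messages and other generated files; assumes pdflatex is on the PATH and a main.tex exists):

```rust
use std::fs;
use std::process::Command;

fn main() -> std::io::Result<()> {
    let mut previous_aux = Vec::new();
    for pass in 1..=5 {
        // Run the engine; cross-reference data lands in main.aux.
        Command::new("pdflatex")
            .args(["-interaction=nonstopmode", "main.tex"])
            .status()?;
        let aux = fs::read("main.aux")?;
        if aux == previous_aux {
            println!("references stable after {pass} pass(es)");
            return Ok(());
        }
        previous_aux = aux;
    }
    eprintln!("gave up after 5 passes; references may still be wrong");
    Ok(())
}
```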
In ConTeXt the situation is different. It is not a "minimal" format like LaTeX; instead it has all of the interesting functionality built in and nicely thought out. It also has its own "distribution" that contains everything needed: the engine, the format and even some fonts. While even ConTeXt has to be run multiple times to get references right, normally the user doesn't even know about it, because "context" is a symlink to a script that does this automatically. Apart from that, ConTeXt is vastly superior to any other TeX (in both engine and format).
Tectonic solves many pain points of XeLaTeX specifically:
- no need to install a distribution, all necessary files are downloaded when needed (from standard TeX Live),
- runs LaTeX / external programs the right number of times,
- nicer command line interface,
- AFAIK it can be used programmatically, to some extent.
But these problems are not present in ConTeXt. And even when comparing Tectonic to full TeX distributions, it obviously misses other engines/formats.
All in all: comparing Tectonic and ConTeXt is not straightforward. Both make it easy for users to compile (different kinds of) TeX documents. My opinion: if you can use ConTeXt, do it. If you have to use LaTeX and can use a modern variant of it, Tectonic is perhaps an easy way to get going without knowing too much about the internals of TeX that you don't want to know anyway. (Installing "LaTeX", and the size of it, is one of the biggest problems I see people have with it, so this easy way of getting started is not to be neglected.)
The thing with ConTeXt is that it is changing way too fast and its documentation is poor, although they are working on that.
For example, with the recent LMTX implementation, lettrine module support was lost, and things like the file naming convention for external typescripts changed all of a sudden. LMTX introduced cool features, but not all of them are documented, except maybe in the source. And not all people (like me) know how to read plain TeX, or they have a hard time finding anything in particular in that rather weird mess of source files with a weird naming convention.
What does [1] do over [2]? I looked at the readme of [1], but it's just a fork of [2], so I'm not sure why [2] needs a rewrite. Is [1] just a rewrite of [2] in Rust?
Tectonic[2] is the Rust wrapper around the C and C++ code of XeTeX. The fork rewrote everything in Rust; it also aims to split the code into separate crates that might be reused elsewhere, or to use external crates instead of its own reimplementations.
Shudder. Haha. XSLT was actually conceptually a good idea. Just quite hard to read once written in lines and lines of XML.
In the 2000s I wrote a web site in mod_perl/Axkit that used XSLT to translate pure XML files into readable XHTML 1.0 for web browsers and WAP (!) for phones.
My personal website was generated using XSLT and some Perl scripts to produce mostly static HTML files. I lost the master scripts during a computer upgrade somewhere along the way, but the only parts I update these days are backed by a database, and that's the only stuff I need to update.
This reminds me of something I have been wondering. If my goal is to programmatically create beautifully typeset documents (on par with modern high quality books or magazines), are there any great options? I don't care about mathematical notation, which is a big focus in LaTeX. Every time I've started going down the LaTeX path, it has felt like I'm swimming upstream by not wanting it to look like a standard CS or math academic paper. There is a very distinct "LaTeX look" that I don't care for.
I want to be able to use any OpenType or PostScript font, tweak leading, kerning, and so on: basically all of the controls that I could get in InDesign or Quark XPress, but generated by code. Is there a good way to do this?
This is so much what I am thinking every time I'm forced to write in LaTeX. But then again, there are already a ton of alternatives out there: HTML + CSS + JS for screens; Word, InDesign, Scribus for PDF.
The one thing that forces me back to LaTeX is BibTeX. There is simply nothing that can replace it. Some tools come close, like Mendeley for Word, but they have issues too (like being controlled by a giant company).
I'll switch to anything that lets me mark up documents, render equations, and use BibTeX. At present LaTeX is the most comfortable option for all three, in spite of the countless issues I have with it.
As far as I understand, LaTeX3 is basically ready; it's just that the developers decided to make it available as an optional environment rather than a replacement for LaTeX2e, which would have rendered it incompatible with everything else. It's also not comprehensively documented, though there are introductions. Core development work, again as far as I understand, has now shifted to automatically annotated PDFs (might be the wrong term).
Not exactly. Originally LaTeX3 was going to be a complete rewrite of LaTeX2e. In the last year or so, there was essentially a declaration of surrender: LaTeX2e+expl3/xparse is now LaTeX3, but it's still called LaTeX2e. There will never be a LaTeX3 as such. Some issues with LaTeX development are essentially intractable: doable in theory (because TeX is Turing complete), but not in practice.
It's ultimately still macro expansion. Don't get me wrong, I think the work of Frank Mittelbach et al. is nothing short of miraculous (and has been for the roughly thirty years since Frank burst upon the scene). But if you look under the hood, it becomes clear just how tough programming in TeX macros is. I have a half-finished extension to expl3 to create a zip function that takes two lists and a macro and zips the two lists into a single list (half-finished because the motivating factor was a tex.stackexchange question, and there's a similar function, underdocumented and not a proper zip, that I was able to use to provide an answer).
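For contrast, the same zip-with operation in a general-purpose language is a few lines. A sketch in Rust (names made up; the point is the gap between this and doing it with macro expansion):

```rust
// Zip two slices through a combining function. In TeX macros this is
// a half-finished project; here the standard library does the work.
fn zip_with<A, B, C>(xs: &[A], ys: &[B], f: impl Fn(&A, &B) -> C) -> Vec<C> {
    xs.iter().zip(ys.iter()).map(|(x, y)| f(x, y)).collect()
}

fn main() {
    let names = ["one", "two", "three"];
    let values = [1, 2, 3];
    let zipped = zip_with(&names, &values, |n, v| format!("{n}={v}"));
    assert_eq!(zipped, ["one=1", "two=2", "three=3"]);
}
```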
Almost all of these projects are doomed to fail (sadly). TeX is clunky, etc., but it makes really beautiful documents and the math support is ridiculously good. It's just a very high bar, so people are willing to live with crummy stuff. Even MS Word (a product with many billions of dollars of investment, likely) doesn't make documents that look as good as TeX.
Here is a solution based on LuaTeX: https://www.speedata.de/en/
Compiling to LuaJIT bytecode is not that difficult (see e.g. https://github.com/rochus-keller/Oberon and https://github.com/rochus-keller/LjTools).
If it's primarily about a better typesetting language based on TeX, then we already have e.g. https://en.wikipedia.org/wiki/ConTeXt.