KeenType: Pure Java typesetting system (github.com/davejarvis)
99 points by todsacerdoti on Jan 23, 2023 | 45 comments



For some history/context on what this is based on:

• During 1977–78, Knuth wrote the first version of TeX (now called TeX78) in SAIL and gave a talk[1] about it. Everyone was excited, and lots of people at other universities/labs started writing their own ports of TeX.

• During 1980–82, Knuth set out to rewrite a “portable” TeX using Pascal, the most widely available language at that time (actually he wrote his own macros/wrapper on top of Pascal; the resulting language is called WEB). This (TeX82 aka TeX) is the program still in use today if you invoke "tex". (Continues to be updated once every few years, but basically stable.)

• During 1998–2000, there was a Java rewrite of TeX called the New Typesetting System (NTS)[2], which had an alpha release and then was basically abandoned: it had all the problems of TeX and was several times slower, so it was regarded by many as a failure. Meanwhile extensions of TeX like pdfTeX came into existence, which are the programs most people use today.

• The code still exists though, and I imagine people must have tried it on and off. But since I can't find any earlier record, I'll provisionally take credit for the idea of resurrecting/resurfacing it, when I tried it in 2017: [3]

• Recently (in the last few months), Dave Jarvis (user `thangalin` here) has forked/modernized this Java implementation and is trying it out. (I noticed when he left his comment on my TeX.SE question; I guess his post https://tex.stackexchange.com/a/672614 on another question is what led to it being posted here.) I'm excited about this; looking forward to seeing how it gets used.

(Apologies if this long comment takes attention away from the current state and future possibilities, but I felt the history is useful context…)

————

I'll also add some history about the license, as I see another thread about it. When Knuth wrote TeX, it was to solve a specific problem he had encountered in publishing: the second edition of TAOCP Vol 2 was turning out worse in appearance than the first, because the publisher was moving from hot-metal typesetting (with Monotype machines) to phototypesetting, and could not achieve exactly the same appearance. While being bothered by this, Knuth discovered that digital typesetting machines were starting to come into existence: now it was just 0s and 1s, and as a programmer he felt he could handle it. That was his motivation for creating TeX and METAFONT: the appearance of each page would remain reproducible forever into the future.

Except: when he publicized TeX and various clones sprang up, the situation was heading towards merely re-introducing the problem: if different computer systems had different programs all called “tex” that made even subtly different choices about line-breaking or whatever, then someone could write a paper/book on one computer, fine-tune the typesetting, then send the .tex file to their colleague / editor / publisher who used another computer, and no longer be confident that they would get the same result. This was one of the reasons he used the rewrite to create a portable TeX: he went to great pains to use only a common subset of Pascal (Pascal compilers diverged widely in their language features), wrote a very demanding conformance test suite called the TRIP test[4], and insisted that any program would need to pass this test to be called "tex". That way, everyone using "tex" could be sure they were writing something that would always produce the same result everywhere. If you wanted to make different choices in the program you could, as long as you changed the name.

When he announced in 1990[5] that TeX would be stable (no more features), this is what he wrote:

> My work on developing TeX, METAFONT, and Computer Modern has come to an end. I will make no further changes except to correct extremely serious bugs. I have put these systems into the public domain so that people everywhere can use the ideas freely if they wish. I have also spent thousands of hours trying to ensure that the systems produce essentially identical results on all computers. […]

> […] anybody can make use of my programs in whatever way they wish, as long as they do not use the names TeX, METAFONT, or Computer Modern. In particular, any person or group who wants to produce a program superior to mine is free to do so. However, nobody is allowed to call a system TeX or METAFONT unless that system conforms 100% to my own programs […]

> I welcome continued research that will lead to alternative systems that can typeset documents better than TeX is able to do. But the authors of such systems must think of another name. That is all I ask, after devoting a substantial portion of my life to the creation of these systems and making them available to everybody in the world. I sincerely hope that the members of TUG will help me to enforce these wishes…

etc. NTS adopted a similar license, and this project (being forked from NTS) simply carries that license over.

Of course such a license is unusual today, and thousands of pages' worth of ink have been spilled over whether or not this counts as “free software” (it predates the GPL, the MIT license, etc.); in the end it was grudgingly accepted as such. You can read about the history of the LaTeX project's similar license (LPPL) and all the debates on Debian/FSF mailing lists etc if you're morbidly interested[6]; the discussion usually proceeds on similar lines and I hope it doesn't get repeated here. :-)

[1]: Gibbs lecture, “Mathematical Typography”: https://doi.org/10.1090/S0273-0979-1979-14598-1

[2]: https://en.wikipedia.org/w/index.php?title=New_Typesetting_S...

[3]: https://tex.stackexchange.com/questions/385645/is-nts-new-ty...

[4]: http://mirrors.ctan.org/info/knuth-pdf/tex/tripman.pdf

[5]: https://www.tug.org/TUGboat/tb11-4/tb30knut.pdf

[6]: https://www.tug.org/TUGboat/tb32-1/tb100mitt.pdf


Aren’t pretty much all modern TeX distros using LuaTeX?


Not really. The LuaTeX engine is the slowest, and some of its defaults differ from the behavior of pdfTeX and XeTeX. I generally recommend people use XeTeX (which is also slower than pdfTeX but faster than LuaTeX), unless either they specifically need some LuaTeX facility or they are required to use pdfTeX (some publishers insist on pdfTeX, although their number is declining). I think that arXiv used to be (and maybe still is) one site that only supported pdfTeX, but I could be wrong about that.


There are two main TeX distros: TeX Live and MiKTeX. Both of them ship the binaries for pdfTeX, XeTeX, and LuaTeX. All three binaries (“engines”) have their users. There are also some other binaries (very few users use “plain”/“Knuth” TeX, and mostly Japanese users use euptex etc).

BTW, unlike the sibling comment I've never found LuaTeX's speed to be a problem: AFAIK it's only a tiny bit slower when loading fonts, but for typical real-life documents the bulk of the time is spent in LaTeX/package macro expansion, where, in principle with enough effort, LuaTeX features could actually considerably speed up compilation, e.g. eliminate the need for multiple passes writing to an aux file for things like a table of contents that could just as well be inserted later. (Personally I think it would be nice if everyone used LuaTeX or any other approach that reduced macro hell, but it will be a while before that happens.)


Hello. Old Compugraphic 8400 typesetting operator here [welcome to the 1980s]. Also old hand at PageMaker and Quark Xpress.

If it doesn't support kerning, it's not really a typesetting system. I would call this a "mathematical formula editor" but not a "typesetting system."

To be more conformant to my expectations for a typesetting system, I'd suggest adding more information to the docs so you can understand how to specify things like font face, font size, horizontal and vertical element positioning and line spacing, etc.


It would be interesting to build this with TeaVM to allow high-quality typesetting for online papers and interactive web content.

TeaVM transpiles Java to JavaScript. I used it to convert a Java interactive fiction project to run natively in the browser:

https://frequal.com/java/RestoringA19YearOldGameWithTeaVm.ht...

More background on TeaVM:

https://teavm.org/

https://blogs.oracle.com/javamagazine/post/java-in-the-brows...


There is also Google’s J2CL, which transpiles Java to JavaScript and is designed to work with the Closure Compiler.


What a fascinating license.

This is a particularly compelling line from it:

  3. You must not distribute the modified file with the filename of the
     original file.
Just in isolation, outside of the other aspects of the license, this is pretty much completely crippling for a Java program.

Simply, if you change a file, you can't use the same name. Well, in Java, the name of the file is the name of the class. You can't have the file name differ from the class inside.

That means you can't change a Java file in this codebase.

Fascinating.


I've been looking high and low for a way to rename Java classes en masse without breaking the build.

https://stackoverflow.com/questions/75154524/bulk-rename-of-...

A number of classes have been renamed to have the Kt prefix. There are other changes that don't yet have the new prefix applied because changing those classes would ripple into, essentially, the entire code base. Hence the search for a mass rename procedure that works. Suggestions welcome.

Donald Knuth set that rule up so that "plain.tex" would be the same, world-wide. I don't know why the authors applied the same license to the entire Java application, rather than limit it to Knuth's work.

I've also added a comment "// Originally" that indicates the original name of the class, rather than the name of the file. This minor license violation keeps its spirit if not its letter. Otherwise there'd be no way to extract an inner class.


Ignoring the case of the TFA license that actually requires you to rename the files, renaming packages is normally enough, and is rather straightforward, which is probably why there isn’t much existing tooling for batch-renaming classes.


Wouldn't something like the following shell one-liner rename get you quite close?

   for old in $(grep -hrPo --include='*.java' 'class (?!Kt)\K[A-Z]\w+' .); do echo "$old"; javas=$(find . -name '*.java'); sed -i "s/\<$old\>/Kt$old/g" $javas; rename "s/\b$old\.java\$/Kt$old.java/" $javas; done

Untested, unoptimized, needs GNU grep and Perl's rename utility (variously packaged as prename or rename-files, as opposed to the crappy util-linux rename). The file list is re-read on every pass because earlier passes rename files.


Here you are, "gradle build" works with this cleaned up and faster version (and the diff looks plausible on first sight):

   javas=$(find . -regex '.*\.java$')
   sed -i -E "$(printf 's/\\<(%s)\\>/Kt\\1/g;' $(grep -hrPo '\b(class|interface|record|enum) (?!Kt)(?!List\b)(?!Entry\b)\K[A-Z]\w+'))" $(echo $javas); 
   rename 's;\b(?!Kt)(\w+[.]java)$;Kt$1;' $(echo $javas)
(The $(echo $javas) is so it works in both bash and zsh).


If you're fine using intellij then you can probably use its scripting to do this. It has access to most of the same API as plugins and can access intellij features such as its refactor rename.

The (meager) docs on scripting: https://www.jetbrains.com/help/idea/ide-scripting-console.ht.... The plugin docs & intellij source code (especially the unit tests) are a better source of information.

And here's a script that will use intellij's refactoring to rename all classes recursively in a directory to add the prefix Renamed, correctly dealing with references & renaming files if necessary:

  @file:Suppress("NAME_SHADOWING")

  import com.intellij.notification.Notification
  import com.intellij.notification.NotificationType
  import com.intellij.notification.Notifications
  import com.intellij.openapi.actionSystem.*
  import com.intellij.openapi.keymap.KeymapManager
  import com.intellij.openapi.command.WriteCommandAction
  import com.intellij.psi.*
  import com.intellij.psi.search.*
  import com.intellij.refactoring.rename.RenameProcessor
  import com.intellij.usageView.UsageInfo
  import com.intellij.util.ThrowableConsumer
  import java.io.PrintWriter
  import java.io.StringWriter
  import javax.swing.KeyStroke

  // Usage: In IDEA: Tools -> IDE Scripting Console -> Kotlin
  // Ctrl+A, Ctrl+Enter to run the script
  // Select folder containing target classes, Ctrl+Shift+A to open action menu, search for Bulk refactor

  //<editor-fold desc="Boilerplate">
  val b = bindings as Map<*, *>
  val IDE = b["IDE"] as com.intellij.ide.script.IDE

  fun registerAction(
    name: String,
    keyBind: String? = null,
    consumer: ThrowableConsumer<AnActionEvent, Throwable>
  ) {
    registerAction(name, keyBind, object : AnAction() {
      override fun actionPerformed(event: AnActionEvent) {
        try {
          consumer.consume(event);
        } catch (t: Throwable) {
          val sw = StringWriter()
          t.printStackTrace(PrintWriter(sw))
          log("Exception in action $name: $t\n\n\n$sw", NotificationType.ERROR)
          throw t
        }
      }
    });
  }

  fun registerAction(name: String, keyBind: String? = null, action: AnAction) {
    action.templatePresentation.text = name;
    action.templatePresentation.description = name;

    KeymapManager.getInstance().activeKeymap.removeAllActionShortcuts(name);
    ActionManager.getInstance().unregisterAction(name);
    ActionManager.getInstance().registerAction(name, action);

    if (keyBind != null) {
      KeymapManager.getInstance().activeKeymap.addShortcut(
        name,
        KeyboardShortcut(KeyStroke.getKeyStroke(keyBind), null)
      );
    }
  }

  fun log(msg: String, notificationType: NotificationType = NotificationType.INFORMATION) {
    log("Scripted Action", msg, notificationType)
  }

  fun log(
    title: String,
    msg: String,
    notificationType: NotificationType = NotificationType.INFORMATION
  ) {
    Notifications.Bus.notify(
      Notification(
        "scriptedAction",
        title,
        msg,
        notificationType
      )
    )
  }
  //</editor-fold>

  registerAction("Bulk refactor") lambda@{ event ->
    val project = event.project ?: return@lambda;
    val psiElement = event.getData(LangDataKeys.PSI_ELEMENT) ?: return@lambda

    log("Bulk refactor for: $psiElement")

    WriteCommandAction.writeCommandAction(event.project).withGlobalUndo().run<Throwable> {
      psiElement.accept(object : PsiRecursiveElementWalkingVisitor() {
        override fun visitElement(element: PsiElement) {
          super.visitElement(element);
          if (element !is PsiClass) {
            return
          }

          if(element.name?.startsWith("Renamed") == false) {
            log("Renaming $element")

            // arg4 = isSearchInComments
            // arg5 = isSearchTextOccurrences
            val processor = object : RenameProcessor(project, element, "Renamed" + element.name, false, false) {
              override fun isPreviewUsages(usages: Array<out UsageInfo>): Boolean {
                return false
              }
            }
  
            processor.run()
          }
        }
      })
    }
  }

If your project is large, this will temporarily freeze IntelliJ, since it doesn't run on a background thread; just give it a minute.

By the way, this should work on all languages that IntelliJ has a language plugin for, not just Java.

Edit: Modified script to override isPreviewUsages to prevent intellij from opening confirmation dialogs


This looks rather promising. Doesn't work in my version of IDEA (Build #IC-223.8214.52, built on December 20, 2022, Community Edition), due to the following errors when running the script:

> Argument for @NotNull parameter 'module' of org/jetbrains/.../...GroovyRunner must not be null.

The project doesn't have any references to the NotNull annotation, and there are a number of what appear to be compile errors:

https://i.ibb.co/dfxbV4p/rename-script.png

Running with Ctrl+Enter produces:

    > @file:Suppress("NAME_SHADOWING")
    MultipleCompilationErrorsException: startup failed:
    Script1.groovy: 1: Unexpected input: '@file:' @ line 1, column 6.
       @file:Suppress("NAME_SHADOWING")
            ^

    1 error
I removed the line and re-ran, which resulted in:

    > import com.intellij.notification.Notification
    [22 ms]=> null
Then I selected the entire text and pressed Ctrl+Enter again, which returned:

    MultipleCompilationErrorsException: startup failed:
    Script4.groovy: 20: Unexpected input: '<*' @ line 20, column 25.
       val b = bindings as Map<*, *>
                               ^

    1 error
After changing the asterisks to question marks, the script failed to run, indicating:

    MultipleCompilationErrorsException: startup failed:
    Script5.groovy: 25: Unexpected input: 'registerAction(\n    name: String,\n    keyBind: String? =' @ line 25, column 22.
           keyBind: String? = null,
                            ^

    1 error
I appreciate the effort, though!


Make sure you pick Kotlin for the scripting language, not Groovy.

Then run the script like this:

1. In IDEA: Tools -> IDE Scripting Console -> Kotlin

2. Paste in the script

3. Ctrl+A, Ctrl+Enter to load the script. This should show a green "Loaded!" notification

4. Select a folder containing the classes in the Project window

5. Press Ctrl+Shift+A to open action menu

6. Search for Bulk refactor and select it

Also please note the first version of the script had a bug that would open confirmation dialogues for some refactorings, so see the edited script.


> 1. In IDEA: Tools -> IDE Scripting Console -> Kotlin

No such sub-menu of Tools exists, but that's fine because I've mapped Ctrl+Shift+A to bring up the menu. I installed the Kotlin plugin and tried again. There was a missing import. After fixing the missing import:

    [732 ms]=> null

I switched "Renamed" to "Kt", of course, and re-ran the script.

> 4. Select a folder containing the classes in the Project window

> 5. Press Ctrl+Shift+A to open action menu

> 6. Search for Bulk refactor and select it

Opening the menu with Ctrl+Shift+A doesn't show the bulk refactor script, but that could be because I remapped Ctrl+Shift+A and am using the NetBeans keyboard mappings, rather than the IntelliJ map.

https://i.ibb.co/5LTsffj/bulk-refactor.png


> There was a missing import

Oops :)

When pressing Ctrl+Enter, did you see the green "Loaded!" notification in the bottom right? If so, then I guess installing the Kotlin plugin worked.

> Opening the menu with Ctrl+Shift+A doesn't show the bulk refactor script

The Ctrl+Shift+A menu from the default keymap that I'm talking about is the "Actions" dialog. You can also get there via "Navigate -> Search Everywhere" and then selecting the "Actions" tab (or just searching in All).

Also you only have to select a single folder/package and it will be processed recursively. I did not test what happens when you select multiple folders. Might be fine, might explode.

edit: Just realized the script I pasted here does not emit the "Loaded!" log message on load. So disregard those comments. If you want to verify it loads correctly, add this as the last line:

    log("Loaded!")
edit: HN rate limited my account again, so I'm not allowed to post any new comments for the next few hours. Good luck!

If you do wind up using this script and want to modify it, it is helpful to read https://plugins.jetbrains.com/docs/intellij/psi.html & to install the PsiViewer plugin from the marketplace.


There are some minor clean-up issues, but overall, it did the trick.

https://i.ibb.co/MP2JNRM/classes-renamed.png

Thank you!


Looks like it was inherited from an upstream project that has a long history: https://github.com/jamespfennell/new-typesetting-system#lice...


Huh? Refactor the class name in virtually any IDE that supports Java, and the file name and invocations will be changed along with it.

The license term seems petty, and would be annoying, but the workaround is trivial.


> but the workaround is trivial.

If you have a trivial solution to rename hundreds of Java classes and their references en masse without breaking the build and without renaming each file one-at-a-time, do tell. You can find more details about potential solutions I've tried here:

https://stackoverflow.com/questions/75154524/bulk-rename-of-...

FYI, the following plug-in for IDEA supports regular expressions (as of Jan 20, 2023), so it may be a viable solution:

https://plugins.jetbrains.com/plugin/17455-rename-files-refa...


Edit: Never mind, whartung enlightened me in a sibling comment. Leaving the egg on my face for posterity below.

---

Why do you want to change all the files? The license only says you have to change the names of files that you modify. It's a pretty weird term.

If for some reason you were determined to change the names of all files then you ought to be able to create a plugin that walks the tree of classes and invokes the refactoring capabilities of the IDE to make the change. It's a strange use-case though so I'm not really surprised there isn't a ready tool for it.


> Why do you want to change all the files?

Ever pull on a thread to unravel an entire sweater?

Imagine wanting to add an SVG typesetter. To do this, the interface implemented by the DVI typesetter needs to be extracted and generalized so that it works for both SVG and DVI (to maintain existing behaviour). The `Typesetter` interface is used by, for example, `BaseNode`, so any changes to that interface could touch `BaseNode`. Changes to `BaseNode` can affect all `*Node` classes, which then cascade over to `*Noad` and `*Prim` classes, and from there the rest of the source base.


I bet Open Rewrite can handle that quite easily: https://docs.openrewrite.org/


I wasn't able to find a recipe that performs a regular expression rename. That is, there doesn't seem to be anything like:

    ---
    type: specs.openrewrite.org/v1beta/recipe
    name: com.yourorg.ChangeTypeExample
    displayName: Change type example
    recipeList:
      - org.openrewrite.java.ChangeType:
          oldFullyQualifiedTypeName: (.+)
          newFullyQualifiedTypeName: Kt$1


I would do that the hacker way:

- build the current TeX.jar

- jar tf TeX.jar | grep whatever to get a list of to-be-renamed classes

- use sed or manual editing to generate a large list of recipes

- run those recipes

Instead of jar tf, you could grep for “^package foo.bar.baz” and “class (complex regex)” in the source code to get a list of package and class names, or use the directory structure. Maybe do all of them, and compare the results for robustness.

That list will be large and it may not go perfectly on the first try, but it’s a one-off job, and you’ll get there.
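
A rough sketch of the "generate a large list of recipes" step, for what it's worth. It takes a list of fully qualified class names (however you collected them) and emits one `org.openrewrite.java.ChangeType` entry per class, using the same YAML keys as the recipe quoted upthread. The class `RecipeGenerator`, the recipe name, and the `com.example.*` class names are made up for illustration; I have not run the resulting recipe against this code base.

```java
import java.util.List;

/** Sketch: build an OpenRewrite recipe that adds a prefix to listed classes. */
final class RecipeGenerator {
  /** "com.example.node.BaseNode" -> "com.example.node.KtBaseNode". */
  static String renamed(String fqn, String prefix) {
    int dot = fqn.lastIndexOf('.');
    String pkg = fqn.substring(0, dot + 1); // empty for the default package
    String simple = fqn.substring(dot + 1);
    // Leave names alone if they already carry the prefix.
    return simple.startsWith(prefix) ? fqn : pkg + prefix + simple;
  }

  /** Emits one ChangeType entry per class that still needs the prefix. */
  static String recipeYaml(String recipeName, List<String> fqns, String prefix) {
    StringBuilder yaml = new StringBuilder()
        .append("---\n")
        .append("type: specs.openrewrite.org/v1beta/recipe\n")
        .append("name: ").append(recipeName).append('\n')
        .append("displayName: Add ").append(prefix).append(" prefix\n")
        .append("recipeList:\n");
    for (String fqn : fqns) {
      String target = renamed(fqn, prefix);
      if (target.equals(fqn)) {
        continue; // already prefixed
      }
      yaml.append("  - org.openrewrite.java.ChangeType:\n")
          .append("      oldFullyQualifiedTypeName: ").append(fqn).append('\n')
          .append("      newFullyQualifiedTypeName: ").append(target).append('\n');
    }
    return yaml.toString();
  }

  public static void main(String[] args) {
    // Hypothetical class names; in practice read them from `jar tf` or grep.
    List<String> classes = List.of(
        "com.example.node.BaseNode",
        "com.example.node.KtTypesetter");
    System.out.print(recipeYaml("com.yourorg.AddKtPrefix", classes, "Kt"));
  }
}
```

Note that the `startsWith` check is naive: a class legitimately named something like `Ktable` would be skipped, so review the generated list before running it.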


This feels like something where the right solution is to drop down to the terminal and use the old standbys - sed/grep/find, etc. Could probably wrap it all up in `make` or `just` if this is something you do regularly.


It's not trivial. Some class names are substrings of other classes. I'm pretty sure this requires loading and parsing the source code's AST to accomplish without breaking the build. The SO thread goes into more gnarly details.
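
To illustrate the substring problem, here is a minimal text-based sketch (not an AST-aware rename). It prefixes only whole-word occurrences of the listed names, so `Node` inside `BaseNode` or `NodeList` is left alone. `PrefixRenamer` is a made-up name, and the class names are just examples. Being purely textual, it will still rewrite matches inside comments and string literals, and it cannot tell your `List` from `java.util.List`, which is why a parser-based tool is the safer route.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Sketch: prefix whole-word occurrences of known class names. */
final class PrefixRenamer {
  static String addPrefix(String source, List<String> classNames, String prefix) {
    if (classNames.isEmpty()) {
      return source;
    }
    String alternation = classNames.stream()
        .map(Pattern::quote)
        .collect(Collectors.joining("|"));
    // \b anchors the match at word boundaries, so "Node" does not match
    // inside "BaseNode", "NodeList", or an already prefixed "KtNode".
    Pattern wholeWord = Pattern.compile("\\b(" + alternation + ")\\b");
    return wholeWord.matcher(source)
        .replaceAll(Matcher.quoteReplacement(prefix) + "$1");
  }

  public static void main(String[] args) {
    String before = "class BaseNode extends Node { Node next; NodeList kids; }";
    System.out.println(addPrefix(before, List.of("Node", "BaseNode"), "Kt"));
    // prints: class KtBaseNode extends KtNode { KtNode next; NodeList kids; }
  }
}
```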


Sorry, I didn't mean to imply it was trivial, by any means. Just that there are powerful tools for "manipulating many line-delimited text files at once" and those tools seem likely to fit your goal.

It's possible you do in fact need to parse the AST for your goals - I haven't spent the requisite time to deeply understand the problem.


Posted a solution on your other comment: https://news.ycombinator.com/item?id=34493760


If you do that, you have to rename every class file it impacts as well, because now those files are changed. A lot of refactoring systems don't take into account the ancillary files (like build files, start scripts, etc.).

So, the workaround can be quite impactful in the end. Renaming the entirety of the code base, potentially.


In principle, you could rename the classes on the class-file level after compilation, back to the original names. That wouldn’t violate the license.

Moreover, under the following definition in the license:

  `Modification' of a file
  means any procedure that produces a derivative file under any
  applicable law -- that is, a file containing the original file or
  a significant portion of it, either verbatim or with modifications
  and/or translated into another language.
…compilation would count as modification (“translated into another language”), and compilation also changes the file name (from .java to .class), so distributing the compilation output should be fine. Furthermore, any changes to the sources can be distributed as a diff patch, which also has a different file name.


Oh, duh, now I get it. That's very annoying.


As a workaround, you could distribute the modified Foo.java as Foo2.java, and have your build system rename Foo2.java to Foo.java at build time. It might be all it takes (but IANAL).
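
A sketch of what that build step might look like, assuming the modified files are distributed with a marker before the extension and copied into a separate build directory under the names javac expects. `StageSources` and the `_mod` marker in `main` are hypothetical, and whether this satisfies the license is a separate question.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch: copy distributed sources into a build dir under compilable names. */
final class StageSources {
  /** With marker "2": "Foo2.java" -> "Foo.java"; other names pass through. */
  static String buildName(String distributedName, String marker) {
    String suffix = marker + ".java";
    boolean marked = distributedName.endsWith(suffix)
        && distributedName.length() > suffix.length();
    return marked
        ? distributedName.substring(0, distributedName.length() - suffix.length()) + ".java"
        : distributedName;
  }

  /** Copies every file, preserving directories; returns the number copied. */
  static int stage(Path distributedDir, Path buildDir, String marker) {
    try (Stream<Path> walk = Files.walk(distributedDir)) {
      List<Path> sources = walk.filter(Files::isRegularFile).collect(Collectors.toList());
      for (Path source : sources) {
        Path relative = distributedDir.relativize(source);
        String name = buildName(source.getFileName().toString(), marker);
        Path target = buildDir.resolve(relative).resolveSibling(name);
        if (target.getParent() != null) {
          Files.createDirectories(target.getParent());
        }
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
      }
      return sources.size();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Usage: java StageSources <distributed-dir> <build-dir>
    // The build dir must not be inside the distributed dir.
    System.out.println(stage(Path.of(args[0]), Path.of(args[1]), "_mod"));
  }
}
```

A distinctive marker is safer than a bare `2`, since a class that is genuinely named `Vector2` would otherwise be renamed by mistake.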


Which would completely cripple IDEs, or at the very least make them difficult to work with.


Or just use a shading tool (which comes by default with most Maven/Gradle builds). It is not a barrier for any experienced Java developer.


Now I want to make a fork, DumbKeenType, where the first thing I do is prepend "Dumb" to every class/filename.

https://www.newyorker.com/tech/annals-of-technology/dumb-sta...


> in Java, the name of the file is the name of the class. You can't have the file name differ from the class inside.

Actually, I'm fairly sure you can. You just need a custom classloader.


Class files don't actually care about file names, so ClassLoader has nothing to do with this. The OP's problem is the source. The Java compiler won't compile a source file whose public top-level class (there can be at most one per file) doesn't match the name of the file.


> Class files don't actually care about file names

But URLClassLoader does, which definitely has something to do with this.

True, before you get there, you'd first need it to compile - but modern JDKs have a compiler API that allows customization via the StandardJavaFileManager interface, and that looks like it should be able to do the job to make source file names follow a different convention.
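
A minimal sketch of the compiler-API idea, under a slightly different angle than StandardJavaFileManager: as far as I can tell, javac decides whether a public class matches its file by calling `JavaFileObject.isNameCompatible`, so a custom file object that always answers yes lets `public class Foo` compile from a source called Bar.java. `LenientNameDemo` and its helpers are illustrative names; this needs a JDK (not just a JRE) at runtime and writes the class file to a temp directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Sketch: compile a public class from a source with a non-matching name. */
final class LenientNameDemo {
  /** In-memory source that can waive the file-name check. */
  static final class Source extends SimpleJavaFileObject {
    private final String code;
    private final boolean lenient;

    Source(String fileName, String code, boolean lenient) {
      super(URI.create("string:///" + fileName), Kind.SOURCE);
      this.code = code;
      this.lenient = lenient;
    }

    @Override
    public CharSequence getCharContent(boolean ignoreEncodingErrors) {
      return code;
    }

    @Override
    public boolean isNameCompatible(String simpleName, Kind kind) {
      // Answering "yes" tells the compiler that any class name is
      // acceptable for this source, whatever the file is called.
      return lenient || super.isNameCompatible(simpleName, kind);
    }
  }

  /** Returns true if the source compiles without errors. */
  static boolean compiles(String fileName, String code, boolean lenient) {
    JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
    if (javac == null) {
      throw new IllegalStateException("No system Java compiler; run on a JDK");
    }
    try {
      Path out = Files.createTempDirectory("lenient-name-demo");
      // Collect diagnostics instead of printing them to stderr.
      DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
      return javac.getTask(null, null, diagnostics,
          List.of("-d", out.toString()), null,
          List.of(new Source(fileName, code, lenient))).call();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String code = "public class Foo {}";
    System.out.println("strict:  " + compiles("Bar.java", code, false));
    System.out.println("lenient: " + compiles("Bar.java", code, true));
  }
}
```

This only covers in-memory sources handed to the compiler directly; IDEs and ordinary build tools would still expect the conventional file names.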


Yes, commonly used ClassLoaders do care about file names and directories. The Java compiler javac does not. The javac compiler compiles compilation units which are generally files, but name does not matter.


No, the file names of compilation units do matter to javac. To quote the Java Language Specification, section 7.6:

> If and only if packages are stored in a file system (§7.2), the host system may choose to enforce the restriction that it is a compile-time error if a class or interface is not found in a file under a name composed of the class or interface name plus an extension (such as .java or .jav) if either of the following is true:

> * The class or interface is referred to by code in other ordinary compilation units of the package in which the class or interface is declared.

> * The class or interface is declared public (and therefore is potentially accessible from code in other packages).

and javac does indeed enforce this restriction, at least for public classes. It's an error to declare a public class called Foo in a file that's called anything except Foo.java.


I stand corrected. However, I don't believe that the class source file needs to be in a directory (or hierarchy of directories) that matches the package name. Many IDEs enforce this as well, which I believe is not correct.


Every class file contains the full information about the class name, package and everything needed to instantiate the class... it even includes the source class file name usually and other attributes like statement line numbers.

However, for performance reasons, the JVM loads classes lazily and will only look for a class where it is expected to be (either in the directories on the classpath or in jars). Because you can write your own class loader, you can implement something different (as OSGi does, for example: it has much more metadata, so it can even load multiple classes with the same name from different jars), but this is the default behaviour.


Such an easy statement to check and quickly prove completely wrong.

    echo "public class Foo {}" > Bar.java
    javac Bar.java

    Bar.java:1: error: class Foo is public, should be declared in a file named Foo.java



