Hacker News new | past | comments | ask | show | jobs | submit login

I’ve never understood the impetus for not using full LaTeX in an academic contex, given that the boiler plate is so minimal and presumably one has a built up a personal template over time.

For blog posts and notes I see the appeal, since the boilerplate can be a hindrance to spontaneous writing.




Latex can't produce web output, which is increasingly a target I want.

Also, Latex can't produce any output which is accessible to blind people (other than giving them the raw LaTeX). The PDFs latex produces are probably the least accessible format available (much worse than a word proeuced pdf, or some html). This matters to me, and should matter more to other people (in my opinion).


BUT that is what makes Pandoc powerful. You convert your latex or your whatever into: (Can we please add Racket's Scribble? It is by far the reason why Racket has the best documentation of any language. https://docs.racket-lang.org/scribble/)

Markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, TWiki markup, TikiWiki markup, Creole 1.0, Vimwiki markup, OPML, Emacs Org-Mode, Emacs Muse, txt2tags, Microsoft Word docx, LibreOffice ODT, EPUB, or Haddock markup to

HTML formats

    XHTML, HTML5, and HTML slide shows using Slidy, reveal.js, Slideous, S5, or DZSlides
Word processor formats

    Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML, Microsoft PowerPoint.
Ebooks

    EPUB version 2 or 3, FictionBook2
Documentation formats

    DocBook version 4 or 5, TEI Simple, GNU TexInfo, Groff man, Groff ms, Haddock markup
Archival formats

    JATS
Page layout formats

    InDesign ICML
Outline formats

    OPML
TeX formats

    LaTeX, ConTeXt, LaTeX Beamer slides
PDF

    via pdflatex, xelatex, lualatex, pdfroff, wkhtml2pdf, prince, or weasyprint.
Lightweight markup formats

    Markdown (including CommonMark and GitHub-flavored Markdown), reStructuredText, AsciiDoc, Emacs Org-Mode, Emacs Muse, Textile, txt2tags, MediaWiki markup, DokuWiki markup, TikiWiki markup, TWiki markup, Vimwiki markup, and ZimWiki markup.
Custom formats

    custom writers can be written in lua.
https://pandoc.org/


Well, except LaTeX probably isn't the best base format to write in -- Pandoc's LaTeX parser isn't very good, it doesn't parse (from a quick check) any of the papers I've written. They've tried hard, but I think it's a losing battle, particularly once people start using a large range of packages.

That's not surprising -- it's basically impossible to "parse" LaTeX, as it's defined by execution.


iirc pandoc's markdown provides the set of functionality that one is capable of transforming back and forth. So as long as you stay within those formatting confines, you are set.

This works for everything except table notes a la ```threeparttable```


What about htlatex? It is quite powerful. In most of the cases, it produces nice HTML pages out of the box, with automatic rendering of figures and mathematical equations into PNG. It is part of most LaTeX distributions. On Linux, for example, just type

  $ htlatex mydoc.tex
instead of

  $ pdflatex mydoc.tex


For me, at least, htlatex never works just quite right. There are a lot of edge cases where it's broken. If you want to preserve having non-PDF output, starting in something like Pandoc Markdown is a better idea. And I do most of my documents in regular LaTeX.


>Also, Latex can't produce any output which is accessible to blind people

This sounds like it should definitely be a target of a grant. I guess most government organisations around the world are using Word et al, which isn't too bad these days accessibility wise (AFAIK).

Can you provide a small example of a LateX document that produces an inaccessible PDF?


If you grab any academic paper (particularly two columns) there is a good chance getting the text out will be hard, and any part of the paper with maths or tables will be unusable. Sorry. I'm away from a computer now, to make a smaller example.


The paper "GADTs Meet Their Match" (first I had in my list) seems to work fine, but I don't know what it was generated with.


cairo 1.13.1 is listed as the generator.

http://checkers.eiii.eu/en/pdfcheck/?url=https%3A%2F%2Fwww.m...

The ACM template fails more! http://checkers.eiii.eu/en/pdfcheck/?url=https://www.acm.org..., and it's generated by pdfTex-1.40.15


I'll pick on one of my own random papers:

https://www.cs.york.ac.uk/aig/projects/implied/docs/cp03.pdf

Try extracting "Theorem 2" on page 5, or any text really. I just get random noise through either a PDF reader, or something like pdf2ascii / ps2ascii.

We just made this with standard latex.


Any chance you could post the source code for this? It's using bitmaps for characters instead of proper fonts, which shouldn't happen nowadays. Maybe you should put "\usepackage{lmodern}" at the start? See for example https://tex.stackexchange.com/questions/1291/why-are-bitmap-...

I work with course materials made in Latex, and students sometimes need/want to copy and paste from them, so I try to avoid these kinds of problems.


That’s interesting. Did you \usepackage[T1]{fontenc}?


Thanks paper is from 2003, so I'm not sure.

This is just an example. From experience, most PDFs at conferences and journals, generated from pdf, are not accessible to varying degrees.


Accessibility is a big current push from the TeX Users Group. The president, Boris Veytsman, has made moving it forward a big goal. I know that a lot of people are working on aspects of that, but the name I hear the most is Ross Moore, who I have heard talk on making the output be PDF/A-3a compliant. I understood that it is a long way there.


I hope so, because honestly, Tex generated PDFs are the single biggest problem with being a blind researcher (I'm not blind, but I know a blind researcher).


>I’ve never understood the impetus for not using full LaTeX in an academic contex, given that the boiler plate is so minimal and presumably one has a built up a personal template over time.

I don't find the boilerplate minimal at all. Contrast the following:

    \begin{itemize}
     \item First
     \item Second
     \item Third
    \end{itemize}
with

     - First
     - Second
     - Third
I won't even get into the hell that is tables.

I loved LaTeX until I discovered Org Mode. Pandoc also scratches the same itch.


I agree. If one is going to use LaTeX directly or indirectly via Pandoc, eventually one would have to build up a personal template to fine-tune the look and feel of the documents.

If one is going to write LaTeX code anyway, it seems easier and cleaner to use LaTeX all the way, move all the boilerplate along with the personal template to say, a file named preamble.tex, and \input{preamble.tex} in the documents.

However, there are situations where Pandoc can be convenient. For example, I wanted a document[1] to be written primarily as README.md (CommonMark format), so that GitHub could render it as the project README. At the same time I wanted to render a PDF output from a customized form of the content. Pandoc is convenient for cases like this although it takes a bit of work to fine-tune the formatting and customize the content for each output format.

[1]: https://github.com/susam/gitpr

[2]: https://github.com/susam/gitpr/blob/master/Makefile


>If one is going to write LaTeX code anyway, it seems easier and cleaner to use LaTeX all the way, move all the boilerplate along with the personal template to say, a file named preamble.tex, and \input{preamble.tex} in the documents.

Not sure why you think it has to be that way. I author LaTeX documents using org mode. Org mode handles most of the boilerplate, and I can still put pretty much any custom LaTeX within the org document, wherever I want it (this includes \newcommand, etc). I lose nothing by going to org mode, and I gain much in terms of reduced boilerplate.


Yup. I’ve got a pandoc template for doing org-latex-pdf conversion, as well as some org templates for common documents that my clients need. Hack away on the document in org (which I’m probably going to be doing anyway, since the rest of my life is in there too), and then when it’s ready to hand off, turn it into a PDF using a shell script.

My absolute favourite moment with that flow was a client who wanted one as a docx instead of a PDF. Pandoc obliged and they commented that I must have spent a lot of time reformatting things for them :)


Why not use Org's built-in org->latex->pdf exporter? AFAIK Pandoc isn't compatible with many of the more interesting Org features, such as Babel.


That's a good question! The flow started out as markdown->latex->pdf via pandoc, and then when I got back into Org, it just slid right into that workflow to replace Markdown.

I'm curious now though... maybe I'm missing out!


It isn't clear to me whether you are saying that Pandoc is necessary or if you are saying that Pandoc is unnecessary and LaTeX alone is sufficient for all purposes.

I think your parent comment was saying that LaTeX alone is sufficient. You also seem to be saying that LaTeX alone is sufficient while using Org mode. Would you please clarify if I am interpreting your comment correctly or not?


>It isn't clear to me whether you are saying that Pandoc is necessary or if you are saying that Pandoc is unnecessary and LaTeX alone is sufficient for all purposes.

I'm not saying either. The parent said it's easier and cleaner to use LaTeX all the way. I was pointing out that it is easier to write in a format like Org mode and export to LaTeX (whether via Pandoc or Org mode's built-in exporter).

Of course LaTeX is "sufficient". It is also, IMO, painful.


Pretty sure they are saying pandoc is unnecessary.


I wrote my dissertation using Pandoc. It might seem that the LaTeX boilerplate is minimal, but Markdown is even more minimal, and it preempts the urge to fuss with your layout. Writing in Markdown means that you can wave your hand at the document and say, "It's a draft, I'll fix the formatting once I'm sure I even want this material." Afterwards, fixing the layout is really easy because you can drop raw LaTeX in wherever you need to, and you haven't wasted countless hours laying out a float you later end up cutting.


Having to use `\textbf{...}` is impetus enough for writing in Markdown instead.


LaTeX editors have simple keybindings for this, like ctl-b., or C-c C-f C-b in emacs, which makes this kind of thing a non-issue for me...


Just reading C-c C-f C-b is an issue for me and I haven't even tried to remember and type it yet.


C-c C-f is a prefix key, you get bold with C-b, italic with C-i and so on. At least it was last time I used AucTex :)


I agree - while pandoc is great it's usually not 'one click' to any format, especially when have html or latex-specific markups.

It's not for everyone, but emacs+auctex really reduces the latex boilerplate (at least writing it) that I don't really feel it's a hindrance.


I didn't use LaTex for years, is it still a hell to make tables? And also very difficult to use templates to generate good looking documents that doesn't like an academic paper?




Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: