Latex can't produce web output, which is increasingly a target I want. Also, Lat...

baldfat · on Aug 28, 2018

BUT that is what makes Pandoc powerful. You convert your LaTeX or whatever else you have into the formats below. (Can we please add Racket's Scribble? It is by far the reason why Racket has the best documentation of any language. https://docs.racket-lang.org/scribble/)

Markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, TWiki markup, TikiWiki markup, Creole 1.0, Vimwiki markup, OPML, Emacs Org-Mode, Emacs Muse, txt2tags, Microsoft Word docx, LibreOffice ODT, EPUB, or Haddock markup to

HTML formats

    XHTML, HTML5, and HTML slide shows using Slidy, reveal.js, Slideous, S5, or DZSlides

Word processor formats

    Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML, Microsoft PowerPoint.

Ebooks

    EPUB version 2 or 3, FictionBook2

Documentation formats

    DocBook version 4 or 5, TEI Simple, GNU TexInfo, Groff man, Groff ms, Haddock markup

Archival formats

    JATS

Page layout formats

    InDesign ICML

Outline formats

    OPML

TeX formats

    LaTeX, ConTeXt, LaTeX Beamer slides

PDF

    via pdflatex, xelatex, lualatex, pdfroff, wkhtmltopdf, prince, or weasyprint.

Lightweight markup formats

    Markdown (including CommonMark and GitHub-flavored Markdown), reStructuredText, AsciiDoc, Emacs Org-Mode, Emacs Muse, Textile, txt2tags, MediaWiki markup, DokuWiki markup, TikiWiki markup, TWiki markup, Vimwiki markup, and ZimWiki markup.

Custom formats

    custom writers can be written in Lua.

https://pandoc.org/

CJefferson · on Aug 28, 2018

Well, except LaTeX probably isn't the best base format to write in -- Pandoc's LaTeX parser isn't very good, it doesn't parse (from a quick check) any of the papers I've written. They've tried hard, but I think it's a losing battle, particularly once people start using a large range of packages.

That's not surprising -- it's basically impossible to "parse" LaTeX, as it's defined by execution.
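
To make "defined by execution" concrete, here is a small sketch of my own (not taken from any of the papers mentioned in this thread). The meaning of the character | changes partway through the file, so a converter can only read the last paragraph correctly if it has effectively run the two lines before it:

```latex
\documentclass{article}
\begin{document}
% Up to this point, "|" is an ordinary character.
\catcode`\|=\active
% From here on, "|" is a macro that sets the text up to the next "|" in bold.
\def|#1|{\textbf{#1}}
Now |this| is bold, but only because the two lines above were executed.
\end{document}
```

Packages do this sort of thing routinely, which is presumably why a static parser runs into trouble as the package list grows.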

babahoyo · on Aug 28, 2018

iirc pandoc's markdown provides the set of functionality that one is capable of transforming back and forth. So as long as you stay within those formatting confines, you are set.

This works for everything except table notes a la ```threeparttable```
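
For anyone who hasn't used the package, this is roughly the kind of table note meant here. A minimal sketch; the table contents are made up for illustration:

```latex
\documentclass{article}
\usepackage{threeparttable}
\begin{document}
\begin{table}
  \centering
  \begin{threeparttable}
    \caption{Example table with notes}
    \begin{tabular}{lr}
      \hline
      Method           & Score\tnote{a} \\
      \hline
      Baseline         & 1.0 \\
      Variant\tnote{b} & 2.0 \\
      \hline
    \end{tabular}
    % The notes block is tied to the \tnote markers in the table body.
    \begin{tablenotes}
      \item[a] Hypothetical values, for illustration only.
      \item[b] A made-up second row.
    \end{tablenotes}
  \end{threeparttable}
\end{table}
\end{document}
```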

lqet · on Aug 28, 2018

What about htlatex? It is quite powerful. In most cases, it produces nice HTML pages out of the box, with automatic rendering of figures and mathematical equations into PNG. It is part of most LaTeX distributions. On Linux, for example, just type

  $ htlatex mydoc.tex

instead of

  $ pdflatex mydoc.tex

michaelhoffman · on Aug 28, 2018

For me, at least, htlatex never works quite right. There are a lot of edge cases where it's broken. If you want to keep the option of non-PDF output, starting in something like Pandoc Markdown is a better idea. And I do most of my documents in regular LaTeX.

voltagex_ · on Aug 28, 2018

>Also, Latex can't produce any output which is accessible to blind people

This sounds like it should definitely be a target of a grant. I guess most government organisations around the world are using Word et al, which isn't too bad these days accessibility wise (AFAIK).

Can you provide a small example of a LaTeX document that produces an inaccessible PDF?

CJefferson · on Aug 28, 2018

If you grab any academic paper (particularly a two-column one), there is a good chance that getting the text out will be hard, and any part of the paper with maths or tables will be unusable. Sorry, I'm away from a computer right now, so I can't make a smaller example.

masklinn · on Aug 28, 2018

The paper "GADTs Meet Their Match" (first I had in my list) seems to work fine, but I don't know what it was generated with.

voltagex_ · on Aug 28, 2018

cairo 1.13.1 is listed as the generator.

http://checkers.eiii.eu/en/pdfcheck/?url=https%3A%2F%2Fwww.m...

The ACM template fails more! http://checkers.eiii.eu/en/pdfcheck/?url=https://www.acm.org..., and it's generated by pdfTeX-1.40.15

CJefferson · on Aug 28, 2018

I'll pick on one of my own random papers:

https://www.cs.york.ac.uk/aig/projects/implied/docs/cp03.pdf

Try extracting "Theorem 2" on page 5, or any text really. I just get random noise through either a PDF reader, or something like pdf2ascii / ps2ascii.

We just made this with standard latex.

mkl · on Aug 28, 2018

Any chance you could post the source code for this? It's using bitmaps for characters instead of proper fonts, which shouldn't happen nowadays. Maybe you should put "\usepackage{lmodern}" at the start? See for example https://tex.stackexchange.com/questions/1291/why-are-bitmap-...

I work with course materials made in Latex, and students sometimes need/want to copy and paste from them, so I try to avoid these kinds of problems.

Gorgor · on Aug 28, 2018

That’s interesting. Did you \usepackage[T1]{fontenc}?
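
Putting the two suggestions above together, a minimal preamble might look like the sketch below. Whether it fixes text extraction for any particular paper is an assumption; it has not been checked against the cp03.pdf example.

```latex
\documentclass{article}
% T1 font encoding, as asked about above.
\usepackage[T1]{fontenc}
% Latin Modern: outline (vector) fonts rather than bitmap fonts.
\usepackage{lmodern}
\begin{document}
Theorem 2. Sample text to try copying out of the resulting PDF.
\end{document}
```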

CJefferson · on Aug 28, 2018

That paper is from 2003, so I'm not sure.

This is just an example. From experience, most PDFs at conferences and journals, generated from LaTeX, are inaccessible to varying degrees.

jimhefferon · on Aug 28, 2018

Accessibility is a big current push from the TeX Users Group. The president, Boris Veytsman, has made moving it forward a big goal. I know that a lot of people are working on aspects of that, but the name I hear the most is Ross Moore, whom I have heard speak about making the output PDF/A-3a compliant. My understanding is that there is still a long way to go.

CJefferson · on Aug 28, 2018

I hope so, because honestly, TeX-generated PDFs are the single biggest problem with being a blind researcher (I'm not blind, but I know a blind researcher).