
The scientific publishing workflow is insane, and this tool seems like it could help.

In the biomedical sciences (or any field that ends up on PubMed), articles have to be converted to JATS XML (http://jats.nlm.nih.gov/), a standard XML dialect for journal articles. It builds in citation metadata, cross-referencing, figure references, etc., and is supposed to be a stable archival format for long-term storage of articles. Individual publishers (PLOS, BMC, etc.) build their entire publication workflows around JATS, so articles can be typeset into PDF or rendered to HTML, or delivered to e-reader apps or whatever. Since it's semantic XML, you can do bibliography mining, automatic reference following, extraction of figures, or whatever you might want to make reading or text-mining easier.
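To make the "semantic XML" point concrete, here is a minimal sketch of bibliography and figure mining. The sample document is a hand-written toy fragment, not a real or complete article; it only uses a few element names from the JATS tag set (`ref`, `element-citation`, `article-title`, `xref`, `fig`). The function name `summarize` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Toy, hand-written JATS-style fragment (an assumption for illustration,
# not a valid or complete article).
SAMPLE = """
<article>
  <body>
    <p>As shown previously <xref ref-type="bibr" rid="r1">[1]</xref>,
       see <xref ref-type="fig" rid="f1">Figure 1</xref>.</p>
    <fig id="f1"><caption><p>Example figure.</p></caption></fig>
  </body>
  <back>
    <ref-list>
      <ref id="r1">
        <element-citation>
          <article-title>An example title</article-title>
          <year>2015</year>
        </element-citation>
      </ref>
    </ref-list>
  </back>
</article>
"""


def summarize(xml_text):
    """Return (reference titles by id, cited reference ids, figure ids)."""
    root = ET.fromstring(xml_text.strip())
    # Map each bibliography entry id to its article title.
    refs = {ref.get("id"): ref.findtext(".//article-title")
            for ref in root.iter("ref")}
    # In-text citations are xref elements whose ref-type is "bibr".
    cited = [x.get("rid") for x in root.iter("xref")
             if x.get("ref-type") == "bibr"]
    # Figures are addressable by id, so they can be extracted directly.
    figures = [f.get("id") for f in root.iter("fig")]
    return refs, cited, figures


refs, cited, figures = summarize(SAMPLE)
print(refs, cited, figures)
```

Real articles are far messier than this (namespaces, mixed citations, nested sections), but the idea is the same: the structure is in the markup, so you query it instead of scraping a PDF.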

But articles are often written in Word, so there's a tremendous amount of work going into manually or semi-manually converting every manuscript to semantic XML from the Word soup it arrives as. Same goes for LaTeX: a few journals just publish LaTeXed PDFs directly, but big publishers like Elsevier and Springer have semi-automated processes for converting LaTeX to in-house formats so they can provide HTML versions of pages.

So, short version: an editor supporting JATS XML can support all the features you need in a scientific document, and can dramatically simplify the publication workflow and hopefully save a bunch of money. And hopefully open-access journals pass those savings on to users.

For users, it could mean better e-reading apps (so you don't have to zoom in on tiny fonts in a PDF on your iPad), better support for cross-referencing and figures than Word has, automated formatting (journals style the XML, so you don't have to do margin and layout crap), and a simpler submission process.




And yet if you try and submit to publishers in JATS, editors tell you they have no idea what it is, and can you please send a Word document like everyone else...


Yeah, that's kind of my point.


Seems like a very domain-specific thing. I've never heard of it in my discipline, and every journal I've submitted to supports direct LaTeX upload.


It is domain-specific, but the domain is rather large. In the vast majority of life science fields, LaTeX is not accepted for journal articles, even where you might expect it. In medical informatics, for example, direct LaTeX upload is uncommon.


Yeah, I realize I live in a bubble where everyone uses LaTeX, and that most of the world's papers are written in Word (the horror!).


In some senses, I share your distaste for Word. But, honestly, after having written my dissertation in LaTeX, I must say that it was a terrible experience.

I really wanted to love LaTeX. In the worst possible way. I wanted to sneer at people who used Word, look down my nose at them, and show them how reproducible research was really done. How much more efficient and beautiful my work would be using this software. I wanted to venerate Knuth and this beautiful example of open source software.

It took me years before I had to admit to myself how terrible it was. I remember seeing a post on HN about how LaTeX was a cargo cult for philosophy majors (the analogy being that they were using the tools of a technical discipline without a clear purpose). For me at least, a better analogy would be an abusive relationship. I would tell my labmates about how great it was, while it was obvious that I was spending hours on getting the most trivial configuration correct.

I am truly glad that some people have a good experience with LaTeX. I just wanted to say that I didn't.


I feel you. See my other comment. I've developed a good workflow with it, but only after using it for multiple years and developing a number of workarounds.


What was the pain point?


LaTeX is actually pretty terrible from a usability standpoint. It has some nice ideas, but it is ripe for a refresh or replacement. Still, I'd rather write anything involving math with it than fight with Office's equation editor, which -- in my experience -- does seemingly arbitrary things.

For starters, the compilation output is impossible to parse if you have an error. The process is also needlessly arcane. You can normally get away with using the `pdflatex` compiler, but there are some (not uncommon) features it doesn't support. I don't see how you can use the thing without a good IDE, but the only one I've enjoyed working with is a non-free OS X one. The list of complaints in this style goes on.

LaTeX does have some good ideas. It attempts to separate style / layout from a semantic description of the text in a similar manner to HTML and CSS, although this is very underutilized. I've found a couple of packages that use this concept for stuff like algorithm syntax, but most people end up using the default document structure or the one or two very common ones for their field (IEEE or ACM for me). It is way more convenient to write equations in it than using Office's editor. Finally, the rendering algorithms are normally quite good at laying out your text (an exception being if you write a long equation and pick the wrong environment for the job, but that goes back to the thing about redundant modes).


> It is way more convenient to write equations in it than using Office's editor.

Really? Unless in the middle of the document you decide to make some change in notation...

But overall, I agree with you. I think the problem is that LaTeX has a really steep learning curve. To understand what is going on you have to read at the very least a couple of thick books, which is definitely too much to ask of somebody who's only using LaTeX for a one-off job.


> Really? Unless in the middle of the document you decide to make some change in notation...

You mean like renaming a variable from c to d? The proper thing to do here is to define an alias for any variables you use a lot.
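A minimal sketch of such an alias, assuming a made-up macro name (`\speed` is hypothetical, not from any package):

```latex
\documentclass{article}
% Hypothetical alias: every use of \speed prints whatever symbol is defined here.
\newcommand{\speed}{c}
\begin{document}
The energy is $E = m\speed^2$.
% To rename the variable everywhere, change the definition above from {c} to {d}.
\end{document}
```

The notation then lives in one place, so a mid-document change of symbol is a one-line edit rather than a search-and-replace through every equation.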


Sorry for the confusion, I totally misread the comment I was replying to. Yes, I agree that LaTeX is more convenient for writing mathematical stuff.


We're solving a lot of these problems with Overleaf[1] by providing a cloud-based collaborative editor with a simpler UI for those new to LaTeX, whilst keeping the power of LaTeX for those that need it.

We're also working with publishers to ease the submission process to journals and repos[2], and include things like git-sync for offline working[3].

Great to see lots of different approaches to solving these problems though, and Texture is certainly an interesting idea.

[1] https://www.overleaf.com

[2] https://www.overleaf.com/publishers#!publisherslist

[3] https://www.overleaf.com/blog/195


I've been using LaTeX for about 15 years, and my biggest problem is still the error messages. At least once a year I just end up deleting a maths equation entirely, and then writing it out again in tiny fragments, to figure out where the error is.

Some concrete small problems:

* Why can't I just write _, < and > in plain text mode? Why does an _ complain I'm not in maths mode?

* It's very hard to cut+paste code samples in for this reason. You can use verbatim, but then that often doesn't nest correctly inside various types of things.


Interesting.

This seems like a tool for journal staff and editors, then, rather than practicing scientists.


It's a tool for everyone in the writing-editing-publishing pipeline. They could all use the same tool, working on the same document & document format, throughout the process.

This will remove a huge amount of reformatting/conversion work between steps in the pipeline.


The rebuttal, guaranteed to come from anyone with tenure but also from distinctly untenured folks like myself, is this: "I have all my manuscripts and biosketches and workflow in LaTeX/Word. All my references are in Mendeley/Zotero/EndNote/BibTeX files I've hand-massaged. Until such time as my home journal requires me to use this format I've never heard about, and gives me the tools to convert everything I've built over the last 2-20 years, I am not interested in it unless you've got something more compelling than what you've shown me so far."

Getting traction is going to be _very hard_.


Of course - that's the most obvious thing in the world. Getting traction against an entrenched workflow component - a people problem - is, almost universally, very very hard. The technical problem - writing your new tool - is trivial by comparison.

But they do seem to be coming at this from the journal/pull end of the workflow, which is a very good place to start.



