
I'm still skimming the documentation so I apologize if the answer is obvious, but does anyone know offhand why this is implemented as a fork of Pandoc? Pandoc already has extended markdown features and the creator of pandoc is very much an academic (http://johnmacfarlane.net/), so is there a reason why these contributions aren't part of pandoc proper?



Note: I'm the maintainer of this project

Basically this is forked because I've been wanting to see how much change to the AST is needed to include most of the academic-specific features. Since Pandoc's AST definition is a separate package from Pandoc itself and could possibly be a dependency of other projects, I thought it would be best to figure most of it out first and end up with just one proposal. Scholdoc supports far fewer input/output syntaxes, so it has much more flexibility when it comes to adding new document element types.

Consider this a self-motivated skunkworks project for Pandoc.


This is a fork because making these changes to pandoc itself needs a lot of consideration.

Internal referencing and attributes on figures are two things that are currently being discussed for pandoc. The discussion has been going on for quite a while though - hence people making forks.

Discussion on internal referencing: https://github.com/jgm/pandoc/issues/813

Discussion on image attributes: https://github.com/jgm/pandoc/issues/261


Thank you for the information.

This raises a couple more questions for me.

First, when searching I was able to find Martin Fenner's very interesting blog posts about ideas for a "Scholarly Markdown" and, as those issues and the first link on the scholarlymarkdown.com site reference, he appears to be associated with a separate "scholmd" project, also called "Scholarly Markdown," which is apparently a related project that itself is a fork of the Python markdown science project:

scholmd:

http://scholmd.org/

https://github.com/scholmd

Markdown Science:

https://github.com/karthik/markdown_science

However, it's unclear what all of the relationships are between all of these projects and forks.

Secondly, since some (or all?) of the changes are being discussed in the Pandoc issue tracker, are these changes intended to be submitted to Pandoc in pull requests? I don't currently see any.


First: I'm not sure of the exact origins of things. The way I see it, academic markdown is more of an ecosystem of tools with a lot of overlap. There is no one single markdown workflow right now when you want to do academic writing. I think this is because no one is sure what the final spec should look like and people are trying things out and seeing what sticks. I feel that there has been some convergence in the last 2/3 years though.

Second: The PR for image attributes is here: https://github.com/jgm/pandoc/pull/1806

There isn't a PR for internal referencing yet because the implementation hasn't been worked out yet and it isn't a simple change (should there be a native representation, should it be a filter, which syntax should we use, what about the existing citation syntax...).


Note: I'm the maintainer of this project

The series of blog posts by Martin (and his efforts with John in getting citations to work in Pandoc) was the impetus for this project. I reached out to Martin several months ago for comments, but I've not heard from him since. I guess he's very busy with his day job at PLOS. If he's willing, I'd very much like to reconcile this project with his efforts. The goal is, after all, better authoring workflows for all academics compared to the status quo, and it's going to take some concerted effort to get us all out of this giant energy well we've been stuck in for a few decades now.


Scholarly Markdown is very much a group of like-minded people, and we had a workshop with lots of good discussions in June 2013 (http://blog.martinfenner.org/2013/06/17/what-is-scholarly-ma...). What it has not been until the recent effort by timtylin is a specific set of tools, or spec.

Everyone seems to have an opinion on how to do this right, and that is part of the reason why the whole concept is pretty fragmented. Some of my thoughts:

Pandoc is the markdown converter that comes closest to what most people need, so I am happy to stick with it. I personally don't think that a fork is viable, things are already hard enough as it is.

Scholarly markdown is a solution for 80% of use cases, people writing math-heavy texts are probably better off sticking with LaTeX.

Scholarly markdown needs to be a community effort, I don't see any other way on how this can succeed


Hi Martin! Thanks for dropping by.

> I personally don't think that a fork is viable, things are already hard enough as it is.

I don't think so either. Scholdoc as a fork was always intended to be a stop-gap measure to quickly test out ideas. Pandoc's use of relatively standard Parsec makes it easier to hack, and lots of other subsystems like citeproc remain crucial. Scholdoc changes Pandoc's AST, so any discussion of re-integration is going to be a non-starter until at least 2.0.

For this kind of workflow to be viable, 95% of the required effort is not going to be on the syntax/converter anyways. The real hard work is still ahead.

> Scholarly markdown is a solution for 80% of use cases, people writing math-heavy texts are probably better off sticking with LaTeX.

I agree, except I also think that there can be an 80% situation for math. I work with a lot of applied mathematicians/electrical engineers, and the math system in Scholdoc is designed with them in mind.

I really think that the ultimate goal is to arrive at many good ways (of which this may be one) to produce a semantically relevant open interchange format such as JATS. I assume this is what PLOS is trying to achieve as well? I do know that several people at PLOS are vehemently opposed to Markdown and what it stands for.

> Scholarly markdown needs to be a community effort, I don't see any other way on how this can succeed

Definitely. The best we can hope for is to stir this pot once in a while and hopefully something will spontaneously nucleate once the time is right.


This was my first thought as well. Pandoc has a rather excellent infrastructure in place to let you extend and build on it, so if you instead choose to fork, there had better be a good reason...


Pandoc filters let you do a lot of things provided you stick to the defined types. If you want to add new types of element, like this does, then the only practical way is a fork.

(except for really simple cases)


However, in most cases you could just use RawBlock (or was it BlockQuote?) with specific attributes, and then intercept those in a filter and convert to LaTeX.
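
For anyone following along, here is a minimal sketch of that filter idea in Python, using only the standard library. Neither RawBlock nor BlockQuote carries attributes in pandoc's types, so the sketch marks blocks with an attributed CodeBlock instead. It assumes the JSON layout that recent pandoc versions emit with `-t json` (a top-level "blocks" list, elements shaped like {"t": ..., "c": ...}); older versions lay the document out differently. The class name `equation` and both function names are made up for illustration.

```python
# Hypothetical class name used to mark blocks for this sketch.
TARGET_CLASS = "equation"


def transform_block(block):
    """Turn a CodeBlock tagged with TARGET_CLASS into a raw LaTeX block."""
    if block.get("t") == "CodeBlock":
        # A CodeBlock's content is [[identifier, classes, key-values], text].
        (identifier, classes, _keyvals), text = block["c"]
        if TARGET_CLASS in classes:
            label = f"\\label{{{identifier}}}\n" if identifier else ""
            latex = f"\\begin{{equation}}\n{label}{text}\n\\end{{equation}}"
            return {"t": "RawBlock", "c": ["latex", latex]}
    return block


def run_filter(doc):
    """Apply transform_block to every top-level block of a pandoc JSON document.

    Nested blocks (inside lists, quotes, and so on) are not visited here.
    """
    doc["blocks"] = [transform_block(b) for b in doc["blocks"]]
    return doc
```

To use it as a real filter, read the document with json.load(sys.stdin), pass it through run_filter, and json.dump the result to sys.stdout; pandoc's --filter option pipes the AST through the script that way.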


This was basically my original approach. However, after working out the math syntax I realized that some things, like the double-backtick inline math, just can't be accomplished without a pre-filter. At that point I decided to just start playing with the parser code.

I also became super-convinced that some level of AST change was necessary to keep things sane, and since I wasn't able to use the existing Math and Image types anyways (they're not attributed), I ultimately just started a new AST type package namespace called "Scholdoc". Everything just evolved from there.
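
To make the pre-filter point above concrete, here is a rough sketch of what such a step could look like: a plain-text rewrite that runs before the markdown reaches the parser and turns double-backtick spans into dollar-delimited math. This is not Scholdoc's implementation, only an illustration; the function name and regex are assumptions. It is also deliberately naive: it would clobber legitimate double-backtick code spans and does not skip fenced code bodies, which is part of why doing this properly belongs in the parser.

```python
import re

# Exactly two backticks on each side; the lookarounds skip ``` fences.
INLINE_MATH = re.compile(r"(?<!`)``(?!`)(.+?)(?<!`)``(?!`)")


def prefilter(markdown_text):
    """Rewrite ``...`` spans as $...$ so a stock reader sees inline math."""
    return INLINE_MATH.sub(
        lambda match: "$" + match.group(1).strip() + "$",
        markdown_text,
    )
```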


Yes, BlockQuote can be used as a generic block container. It doesn't have attributes itself but you can put them on a header at the start. For generic inline elements you can use Code.
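
A rough sketch of both tricks, again in plain Python over pandoc's JSON AST. It assumes the element shapes used by recent pandoc versions (Header content is [level, attr, inlines]; Math content is [{"t": "InlineMath"}, text]). The environment names, the `math` class, and the function names are hypothetical, and the header's title text is simply dropped here.

```python
# Hypothetical set of Header classes that name LaTeX environments.
ENVIRONMENTS = {"theorem", "proof"}


def raw_latex(text):
    """Build a RawBlock that pandoc passes through to LaTeX output."""
    return {"t": "RawBlock", "c": ["latex", text]}


def unwrap_container(block):
    """Return the blocks that should replace `block`.

    A BlockQuote whose first child is a Header with a known class is
    replaced by its remaining children wrapped in a LaTeX environment.
    """
    if block.get("t") != "BlockQuote" or not block["c"]:
        return [block]
    first, *rest = block["c"]
    if first.get("t") != "Header":
        return [block]
    _level, (_ident, classes, _keyvals), _title = first["c"]
    env = next((c for c in classes if c in ENVIRONMENTS), None)
    if env is None:
        return [block]
    return [raw_latex(f"\\begin{{{env}}}"), *rest, raw_latex(f"\\end{{{env}}}")]


def code_to_math(inline):
    """Turn inline Code carrying the `math` class into an inline Math element."""
    if inline.get("t") == "Code":
        (_ident, classes, _keyvals), text = inline["c"]
        if "math" in classes:
            return {"t": "Math", "c": [{"t": "InlineMath"}, text]}
    return inline
```

Because unwrap_container returns a list, a caller would splice the result back into the parent block list rather than assign it in place.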

Markdown syntax for generic containers (Div and Span) hasn't been implemented yet but the discussion is fairly mature now:

https://github.com/jgm/pandoc/issues/168

https://github.com/jgm/pandoc/pull/1791



