
Show HN: Url2cite: Transparently convert links to citations in markdown papers - phiresky
https://phiresky.github.io/blog/2019/pandoc-url2cite/?
======
Schiphol
Pandoc + Zotero + better-bibtex user here. This is indeed nice, thanks. One
possible problem I see (but I am not sure what could be done to solve it) is
that often metadata are incomplete or partially wrong, and I need to fix the
resulting bibtex entry by hand. This would be difficult with your tool, I
think. Have you thought about this?

~~~
phiresky
Each citation is added to a formatted JSON file when it is first seen. After
this it stays the same. This means you can fix the citation data by hand by
just editing the JSON file. You should include the JSON file in your version
control. An example is here [1].

Of course this might not be as comfortable as a GUI editor like Zotero, but I
think I still prefer doing some small tweaks to a JSON file vs. the problems
I've had in the past with Zotero exporting. The auto-export didn't always work
consistently, and Zotero has many features that I don't need at all that make
the whole thing more complex and fragile. Even collections are somewhat
confusing to work with - I've often added citations to the wrong collection
because it always adds to the currently open one, and then had to spend time
to find it again. You also can't know which references are actually used and
which aren't.

And I pretty much always gave up as soon as someone else needs to work on the
same paper, which then leads to managing the .bib file by hand anyways.

[1]: https://github.com/phiresky/pandoc-url2cite/blob/master/example/citation-cache.json
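
For readers who want to see the "fetched once, then left alone" behaviour in code, here is a minimal sketch. It is not url2cite's actual implementation: the function name `get_citation`, the `fetch` callback, and the flat URL-to-entry layout of the JSON file are all assumptions made for illustration, and the real cache file may be structured differently.

```python
import json
from pathlib import Path


def get_citation(url, cache_path, fetch):
    """Return citation data for `url`, calling `fetch` only on the first sighting.

    Hypothetical helper: the cache is assumed to be a flat JSON object
    mapping each URL to its citation entry.
    """
    path = Path(cache_path)
    # Load the existing cache, or start empty if the file does not exist yet.
    cache = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    if url not in cache:
        # First time this URL is seen: fetch the metadata and persist it.
        cache[url] = fetch(url)
        # Indented, key-sorted output keeps the file diff-friendly in version control.
        path.write_text(
            json.dumps(cache, indent=2, sort_keys=True) + "\n", encoding="utf-8"
        )
    # Later calls return whatever is in the file, including manual fixes.
    return cache[url]
```

Because an entry is never re-fetched once it exists, a title or author corrected by hand in the JSON file survives every later run.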

~~~
Schiphol
Thanks for this. Just a quick follow-up: if I use the same reference in
different papers, do I need to make those changes we were discussing on each
per-paper json file, or is there a way to do them once and for all?

~~~
phiresky
Well, you can use the same json file for multiple documents.

By default it will be stored in the directory from which pandoc is invoked,
but you can also change that by setting `url2cite-cache: filename` in the
markdown frontmatter or in the pandoc invocation:
`pandoc -M url2cite-cache=../bla.json --filter ...`.
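
The lookup order described here can be sketched roughly as follows. This is only an illustration, not the filter's real code: `resolve_cache_path` is a hypothetical name, the default file name `citation-cache.json` is assumed from the example linked earlier in the thread, and resolving a relative setting against the invoking directory is also an assumption.

```python
from pathlib import Path

# Assumed default file name, taken from the example repository link.
DEFAULT_CACHE_NAME = "citation-cache.json"


def resolve_cache_path(metadata, invoked_from):
    """Pick the cache file: the `url2cite-cache` metadata value if set,
    otherwise the default file in the directory pandoc was invoked from."""
    base = Path(invoked_from)
    configured = metadata.get("url2cite-cache")
    if configured is None:
        return base / DEFAULT_CACHE_NAME
    # A relative setting such as "../bla.json" is joined onto the invoking
    # directory; an absolute setting replaces it entirely.
    return base / configured
```

Under these assumptions, two papers in sibling directories that both set `url2cite-cache: ../bla.json` end up pointing at the same file, so a manual fix only has to be made once.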

------
hlieberman
What happens if the link content dies? Zotero pulls down a full text copy of
the document and saves it, IIRC, and I no longer trust URLs on the web to stay
up for any period of time (or be immutable at that position).

~~~
rahuldottech
You can always use Wayback Machine links
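
As a rough sketch of what that could look like, the helper below builds a Wayback Machine link from an original URL. The function name `wayback_url` is made up for this example. The URL shape (`web.archive.org/web/<timestamp>/<url>`) is the publicly visible one, and the Wayback Machine redirects to the capture nearest the given timestamp. The code only builds the link; it does not check that a capture exists.

```python
from datetime import datetime, timezone

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url, when=None):
    """Build a Wayback Machine link for `url`.

    With no `when`, the link points at the most recent capture.
    `when` should be a timezone-aware datetime; it is converted to UTC
    and formatted as the 14-digit YYYYMMDDhhmmss timestamp.
    """
    if when is None:
        return f"{WAYBACK_PREFIX}/{url}"
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{url}"
```

Pinning the timestamp to the date the source was consulted gives a link that keeps pointing at roughly the same version of the page.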

~~~
doomrobo
This is not reliable for paywalled links. I can sometimes download paywalled
content from my console because I'm behind a university IP block.

------
app4soft
You should exclude _“... Contribute to <user>/<repo> development by creating
an account on GitHub...”_ from the descriptions of referenced GitHub repos.

~~~
phiresky
That's one problem with relying on automatically extracted information -
sometimes it's not really what you want. In this case that's just what GitHub
puts in the og:description tag for large? repos, probably to make it appear
that way on Google. Of course I could fix it for this instance but then it
wouldn't really accurately represent what you can expect...

The relevant code to extract that is a Zotero Translator [1] so that's what
would have to be changed for this.

[1]: https://github.com/zotero/translators/blob/master/GitHub.js

------
Gehinnn
This is nice, thank you ;) I love the pop-up that appears when hovering on a
reference.

An internet archive integration would be nice!

~~~
gwern
The floating footnotes need to be way more opaque, they're unreadable as is.

Reverse citation would also be an interesting feature to have - on gwern.net, I
use https://ricon.dev/ to provide reverse citation links
which search on the DOI (if specified) or title (if no DOI) of links.

~~~
phiresky
I don't really see the issue, but I've increased the opacity anyways :)

------
bloopernova
sort of related: is there such a thing as an extension for Firefox that saves
a copy of the page when I bookmark it?

