
Hypothes.is: An Open Annotations Platform - karlicoss
https://web.hypothes.is/
======
smarx007
I tried them a while ago. Really liked it but put it aside till they implement
[https://www.w3.org/annotation/](https://www.w3.org/annotation/) as promised.
Just checked
[https://h.readthedocs.io/en/latest/api/](https://h.readthedocs.io/en/latest/api/)
and the v2 API in development has nothing to do with the W3C standard.

UPD: there is not a single issue open on the main repo wrt W3C standard
implementation:
[https://github.com/hypothesis/h/issues?q=is%3Aissue+is%3Aope...](https://github.com/hypothesis/h/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc+w3c).
The post I took as a promise:
[https://web.hypothes.is/blog/annotation-is-now-a-web-standar...](https://web.hypothes.is/blog/annotation-is-now-a-web-standard/)

~~~
woah
If your use of a piece of software is contingent on something like that, it
sounds like you didn’t really have a need for it in the first place

~~~
BiteCode_dev
Rather, OP is probably already using one and doesn't see the point of
migrating to yet another closed system.

~~~
tedmiston
Hypothesis has an open API -
[https://h.readthedocs.io/en/latest/api/](https://h.readthedocs.io/en/latest/api/)

~~~
BiteCode_dev
That's not the point.

If you use a SaaS, unless it's open source and you can self-host, you are at
the mercy of it.

If the service is good enough and the team behind it seems in good faith, it
can give you confidence for the switch.

But promising to adopt a standard then not doing it doesn't inspire
confidence.

Since the system is closed source (no open API can change that), they need to
build trust with users, particularly the tech-savvy ones who have been burnt
in the past and hold data that is precious to them.

~~~
the_duke
Both the frontend and server seem to be open source.

[https://github.com/hypothesis/h](https://github.com/hypothesis/h)

[https://github.com/hypothesis/client](https://github.com/hypothesis/client)

~~~
BiteCode_dev
Then I stand corrected.

If it's open source, and it's just a matter of following a standard, I don't
think it's a reason for not giving it a try.

~~~
antman
I tried the Docker setup some time back and it was broken. Has anyone ever
managed to set it up, or are there any tutorials or blog posts by anyone
outside the org running their own setup?

------
tgbugs
My PhD work would not have been possible without Hypothes.is. My group uses it
for a variety of information extraction and curation tasks, and I'm fairly
certain that one of my colleagues has the highest number of human made
annotations in the system (though in private groups).

Hypothes.is is great for prototyping curation workflows to explore what is
possible before spending time implementing yet another tool. It also hits the
sweet spot for having a UI that is just accessible enough for non-technical
users that can display annotations made by bots. We have been using
hypothes.is as the backend to crawl [0] research resource identifiers from the
scientific literature for 4 years, and it has been solid the whole time.

I see some questions/concerns about the W3C compliance of the API. I'm pretty
sure that there is just a switch that needs to be flipped, and that it has been
implemented. Yep, it has [1]. Also, as others have suggested, it is fairly
straightforward to set up your own annotation store, though it is not simple
to replicate the Hypothes.is setup.

I hope that the pivot to education continues to go well so that it can fund
the infrastructure, because Hypothes.is provides a very valuable (if hard to
capture the value of) service.

0\. [https://github.com/SciCrunch/scibot](https://github.com/SciCrunch/scibot)
1\.
[https://github.com/hypothesis/h/blob/master/h/presenters/ann...](https://github.com/hypothesis/h/blob/master/h/presenters/annotation_jsonld.py)
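
For readers who have not seen the W3C format discussed here, a minimal sketch of an annotation in that shape follows. The field names (`@context`, `type`, `body`, `target`, `TextQuoteSelector`) come from the W3C Web Annotation Data Model; the helper name `to_w3c_annotation`, the example URL, and the sample text are made up for illustration, and the output is not claimed to match what the Hypothes.is presenter linked in [1] emits.

```python
import json

# JSON-LD context defined by the W3C Web Annotation Data Model.
W3C_ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def to_w3c_annotation(page_url, quote, comment, prefix="", suffix=""):
    """Build a dict shaped like a W3C Web Annotation (hypothetical helper)."""
    return {
        "@context": W3C_ANNO_CONTEXT,
        "type": "Annotation",
        # The note the annotator wrote.
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        # What the note is attached to: a page plus a quoted span on it.
        "target": {
            "source": page_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": quote,
                "prefix": prefix,
                "suffix": suffix,
            },
        },
    }


# Illustrative values only.
example = to_w3c_annotation(
    "https://example.org/paper",
    quote="research resource identifiers",
    comment="Candidate identifier for curation.",
)
print(json.dumps(example, indent=2))
```

Because the result is plain JSON, a local copy of annotations kept in this shape can be read by any tool that understands the standard, independent of which service produced it.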

~~~
antman
Did you guys use your own setup?

~~~
tgbugs
No, we retain a local copy of the annotations, but that's about it. Last time
I checked the hypothes.is client didn't have the ability to use an arbitrary
backend. We did at one point create a fork of the web extension that could
point to an arbitrary backend but the maintenance overhead ended up not being
worth it.

------
vonnik
Quick background. Hypothes.is was founded by Dan Whaley:

[https://www.linkedin.com/in/danwhaley](https://www.linkedin.com/in/danwhaley)

Dan built and sold an Internet travel app in the 90s, and then became involved
in climate work. He's been committed to it for many years.

Hypothes.is was created in order to address the problem of disinformation on
the Internet, especially related to climate change; e.g. the Wall Street
Journal publishes an Op-Ed from Charles Koch denying that human activity is
contributing to global warming, and an annotation channel run by climate
scientists can dissect it point by point.

I respect and admire that vision, and I think it's trying to get to one of the
root causes of climate change denialism.

(Obviously, Hypothes.is has many other great uses!)

~~~
mjayhn
Wow, that's interesting. That would explain why you can't change URLs (which
my use case, posted elsewhere in this thread, needs): I suppose it's to
maintain provenance on the original documentation. I hadn't considered that.

------
vannevar
The idea of collaborative annotation has been around since the beginning of
the web. I tried to launch an annotation startup myself back in 2000, and
there were several funded competitors back then. It never catches on. It would
be interesting to analyze why people just don't use these systems when given
the opportunity. Is this a solution in search of a problem?

~~~
jerf
I think generalized annotation has the following problem:

Draw a graph of all web pages ordered by how many times they are visited. You
get a power law distribution. On the left are things like CNN's front page, on
the right things like a text file containing a cheesecake recipe someone
posted in 1995 that is technically still online but nobody has visited in
years.

On the left side are pages that are visited too often, and public annotations
inevitably descend into madness and chaos, even worse than comment sections
which have the courtesy to at least be isolated on the page to a certain area.
On these pages, trying to use the annotation software is worse than not using
it at all.

On the right are the pages that nobody visits, so when you visit them, there
is no chance of any annotations being on the page, nor any chance that if you
leave any annotations anybody will ever see them, or care if they do (i.e.,
just giving people a directory of "rarely annotated pages" doesn't solve the
problem, the problem is that nobody cares about these pages anyhow).
Annotation software doesn't harm these pages technically, but the user
experience is at best that the user will just forget about their annotation
software, and at worst, they'll feel alone on the page, a concept not
previously in their mental model and one that is not contributing to a
positive assessment of the annotation software.

In between is the sweet spot, where participation is adequate that the
annotation software brings some sort of value, but it isn't just overwhelming
chaos.

I submit to you that viewed through the power law lens, that sweet spot is
actually fairly narrow. Moreover, if you slice through a single user's browser
history looking for when they hit pages in that sweet spot, it'll still be the
minority of pages that they viewed, so basically by statistical necessity, the
pages where the software added value must be a tiny minority of the pages you
visit. The majority of pages they visit, the annotation software experience
ranges from extremely net negative to at best neutral, and it's hard for the
positive experiences to make up for that.

Moreover, this sweet spot is _moving_. As more of the general public tries out
your software, the sweet spot moves to the right, which also has the effect
when viewed from a single user's point of view of making the pages where it is
useful become more rare. When you're just starting out and you've got 100
users, the useful pages are things like "the front page of CNN" and "the hot
wikipedia article about $CURRENT_EVENT", but as your user count increases it
moves down to specific articles on CNN and only the linked content from the
$CURRENT_EVENT, then only to archival content on CNN and random Wikipedia
pages, and just in general into stuff that becomes increasingly difficult for
it to be any significant percentage of the pages you visit.

I think this is why A: They can't get popular B: the ones that I know about
that have been around for a while and are therefore presumably successful
enough for someone to consider them worthwhile can only stay so provided they
don't get more popular and C: there's no chance you can make money in this
space because you get an _anti-_ network effect... the more users you get, the
_less_ valuable the service becomes to every existing user!

I think this is also why it seems like such an appealing idea. You create a
prototype, get a couple of friends on it, it seems like fun and to be useful.
You annotate some popular pages, you have some similar interests and you
annotate, I dunno, the latest game console announcement or something, and it
seems like fun. The problem is, rather than this being the worst the
experience will ever be as it just gets better and better as more people come
online, this is the _best_ it will ever be.

Also, I can tell you from the first couple of times around that if this did
become popular, the content producers of the Internet would fight you tooth
and nail. They did the first couple of times when the math sort of worked to
at least get to the point that these things could get in the news. With the
web so much larger and power-law-y and the anti-network effect correspondingly
so much more powerful, now this sort of thing can't even be successful enough
to so much as get noticed by anyone before it has already collapsed.

(I specifically said "generalized" annotation at the top, because specialized
use cases can get around some of these issues. But it's going to be hard to
make any money on the size of user base you can support, because while you can
mitigate the anti-network effect, it's always going to be looming over you if
you try to get large enough.)

~~~
memexy
Annotation systems don't have to be public. The default should be private with
an option to make parts public as necessary. Comments in some sense are
annotations that are anchored at the wrong place and with the wrong visibility
(public instead of private).

The main value of annotations is as a personal knowledge system anchored on
top of public knowledge. As one's private knowledge grows it eventually
reaches a point where it needs to be summarized. A good annotation system
helps with this summarization process (which mostly comes down to having a
good search and navigation system).

I think annotations can be made to work but the starting point has to be about
personal use and not public use.

~~~
jerf
The person I was responding to specifically mentioned "collaborative
annotation" and startups. Non-collaborative annotation has different
tradeoffs. It may be useful to you, but I can't imagine there's any money
there for a startup. It avoids the anti-network effects that kill annotations,
and it avoids publisher objections too, but it also doesn't have a very
compelling value proposition for very many people. It's like trying to make
money off of people who like mind maps, literate programming, or who use
personal wikis.... it's non-zero amounts of people, often very passionate
people, but it's not much of a market.

~~~
memexy
Why is there no money in a personal annotation system? I'd think any large
organization would be willing to pay a lot of money to use a tool that would
make its members more productive and effective.

In fact, Coda and Notion.so are very successful products and they're basically
personal/private annotation systems (modulo storing everything in the cloud).
I can imagine extending Coda and Notion.so with programmatic capabilities and
adding offline storage and turning them into pretty successful personal
annotation systems with premium features for supporting organizational work
and collaboration.

There is plenty of money in this market when viewed from the right angle.

------
xnx
I've long felt that annotation was one of the greatest missed opportunities of
the web. Every major content website implements commenting differently and
often poorly. I'd much prefer to have comments come from a third party system
where I can: own my own comments, not be subject to removal by the author of
the content, and choose the group of people I want to discuss with. Imagine if
hackernews discussions appeared right along the content instead of being
hidden away where no reader of the content could find them.

~~~
kirubakaran
Hey I'm working on exactly this with
[https://histre.com/](https://histre.com/)

~~~
shigeo
Nice idea! Some cheapshots:

1\. The comments aspect of this doesn't really jump out at me from the landing
page.

2\. The words "knowledge base" scare most people.

3\. "You will be our customer and we promise to treat you the way we’d like to
be treated too… with respect" is much more ominous with ellipsis than it would
be with an em dash or a colon.

~~~
kirubakaran
Thank you. Great points. I'll fix that.

------
hinkley
A little-known feature of NCSA Mosaic for Windows was a sort of annotation
system built into it. So little-known, in fact, that by the time Netscape
Navigator 1.0 shipped, nobody on the Mosaic team claimed any knowledge of how
the damned thing worked. That information left with the original team.

Every so often over the years I have recalled that feature and wished it had
survived, especially when some organization desperately needs a watchdog group
calling them on their bullshit.

Comment systems carved out a piece of that problem domain, but I’ve always
wished for comments curated by an independent party. Which I am sure is how we
got Digg, Reddit, and HN.

------
jborichevskiy
Love this project. I definitely have some long-term concerns around filtering
out spammers and being able to grok a heavily annotated page with
resolved/unresolved conversations but I have faith they'll figure them out.

In the meantime, I've embedded the sidebar into my site [0] and hope more
people do the same! Original idea from this [1] post.

0 - [https://jborichevskiy.com/about-blog](https://jborichevskiy.com/about-blog)

1 - [https://blog.cjeller.site/annotate-this-post](https://blog.cjeller.site/annotate-this-post)

------
mempko
Love this but please for the love of everything holy make a firefox extension.
The bookmarklet works but you only see annotations if I click on the link.
It's sad and clear chrome is the new IE and mozilla is again the underdog.

~~~
fiatjaf
It's so insulting that they've had an open issue and said they're working on
it for more than 2 years, when Firefox's extension framework is 99% compatible
with Chrome extensions. They probably just have to repackage and publish, but
apparently they can't be bothered.

~~~
nikisweeting
It's open source, either do it yourself or quit complaining.

------
nkingsy
I interviewed with this little company back in 2014 trying to land my first
dev gig. Didn’t get the job but I would’ve taken it for less money if they’d
offered.

Impressed they’re still going! I’ve worked for two failed startups since.

------
JohnL4
Hypothes.is is great, but: annotations and living documents don't mix. I
learned angular 3 years ago by annotating angular.io. Mistake. Now most of my
annotations point to deep space because the text is gone or altered beyond
recognition. I should have made my annotations stand by themselves.

------
aeontech
I wonder if anyone else remembers hoodwink.d by _why

For those that don't, it was a locally running proxy service that would inject
its own annotation UI into any site you wanted. Basically your own pocket
comment/discussion board where you could discuss whatever URL you were on with
other users of hoodwink.

[https://github.com/whymirror/hoodwinkd](https://github.com/whymirror/hoodwinkd)

Also, wow, I can't believe that was 14 years ago. I miss _why.

------
mjayhn
Hi, I started using this as a training tool for people I'm trying to help in
meetups and my community (people who have never used AWS, GCP, etc) but a huge
problem I found out after writing a lot of annotations (and clearly having the
wrong expectation) was that you can't edit or have any control over the URLs
that the annotations are based on.

If the devs are interested in my use case (understood if not), which is
basically a poor man's onboarding tool, I'd happily spend some money (not
onboarding tool money, those seem to be $1k+/month) on the ability to edit
URLs, share these, etc.

What I envisioned this letting me do was create versioned annotations that I
could share with anyone, who could then superimpose those annotations on top
of any URL that has the same basic structure behind it, e.g.
amazon.com/$myuser/$mycluster/$myportal. I thought it'd be a really killer
tool for helping new devs learn AWS more easily than constantly digging
through AWS documentation, i.e. a "What is Lambda? _click button_ 10 quick
bullet points on Lambda" sort of thing.

A lot of these users will be using localhost, their IP, digitalocean IPs, etc
instead of a concrete domain name. It'd be great if we could regex the url
strings or something like that.
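
One way to picture the "regex the URL" idea: collapse localhost and raw IP hosts into a single placeholder, so that annotations made against one deployment could be looked up from another with the same path. This is only a sketch of the request; `canonical_key` and the `app.example` placeholder are hypothetical names, and nothing in this thread suggests Hypothes.is supports such matching.

```python
import re
from urllib.parse import urlsplit

# Hosts treated as interchangeable deployments: localhost or a dotted IPv4 address.
LOCAL_HOST = re.compile(r"^(localhost|\d{1,3}(\.\d{1,3}){3})$")


def canonical_key(url, placeholder="app.example"):
    """Return a host-independent lookup key for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if LOCAL_HOST.match(host):
        # Port, scheme and concrete address are dropped on purpose.
        host = placeholder
    path = parts.path.rstrip("/") or "/"
    return f"{host}{path}"


# Two deployments of the same portal map to the same key.
print(canonical_key("http://localhost:8080/admin/cluster/"))
print(canonical_key("http://203.0.113.7/admin/cluster"))
```

Real domain names are left untouched, so annotations on public pages would keep their original identity.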

Thanks for your work!

~~~
tedmiston
Making the annotations portable across the same site hosted in different
places is an interesting idea.

I use Hypothesis a lot to highlight e.g. the few important lines of code,
commonly used config values, etc. buried in longer dev docs.

This mostly works pretty well. But I do lose my highlights when the URLs
change e.g. version 1.0 to 1.1. Sometimes there is a "latest" docs site
available to work around this issue but not always.

I imagine that a script could be made with the Hypothesis API to migrate
annotations across pages and approximately re-anchor them as much as possible.
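
A rough sketch of the planning half of such a script, with no network calls. The annotation shape used here (`uri`, plus `target[].selector[]` entries of type `TextQuoteSelector` carrying an `exact` quote) is an assumption modeled on the JSON the Hypothesis API returns and should be checked against the API docs; `plan_migration` is a hypothetical name, and the exact substring check is a stand-in for real approximate matching.

```python
def plan_migration(annotations, old_prefix, new_prefix, new_page_text):
    """Split annotations into (movable, orphaned) for a docs version bump.

    Assumes each annotation is a dict with a "uri" and, optionally,
    target selectors holding the quoted text (assumed shape, see lead-in).
    """
    movable, orphaned = [], []
    for anno in annotations:
        new_uri = anno["uri"].replace(old_prefix, new_prefix, 1)
        quotes = [
            sel["exact"]
            for target in anno.get("target", [])
            for sel in target.get("selector", [])
            if sel.get("type") == "TextQuoteSelector"
        ]
        # Page-level notes have no quote, so all() is True and they move as-is.
        if all(quote in new_page_text for quote in quotes):
            movable.append({**anno, "uri": new_uri})
        else:
            orphaned.append(anno)
    return movable, orphaned


# Illustrative data: one highlight survives the 1.0 -> 1.1 move, one does not.
old = [
    {"uri": "https://docs.example/1.0/config",
     "target": [{"selector": [{"type": "TextQuoteSelector", "exact": "timeout = 30"}]}]},
    {"uri": "https://docs.example/1.0/config",
     "target": [{"selector": [{"type": "TextQuoteSelector", "exact": "removed option"}]}]},
]
moved, lost = plan_migration(old, "/1.0/", "/1.1/", "Set timeout = 30 in the config file.")
print(len(moved), "movable,", len(lost), "orphaned")
```

The movable list would then be posted back as new annotations on the new URL, and the orphaned list reviewed by hand.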

------
erikig
It seems like annotation is coming into its own. In addition to the W3C
standard, there are a lot of interesting UI/UX ideas in the space, for
instance this one posted by Azlen Elza (on Twitter) that would be perfect for
annotation.

[https://twitter.com/azlenelza/status/1272600877493137408](https://twitter.com/azlenelza/status/1272600877493137408)

------
kontxt
I'm the founder of a new service in this space:
[https://www.kontxt.io](https://www.kontxt.io). Check it out, and let me know
what you think.

Enterprise - It’s Slack, but on top of digital content.

Education - LMS with interactive layer for notes and assignments.

Website - It’s Disqus, but inline with your content.

Personal - It’s Pocket, Genius, and Reddit combined.

------
onyva
My biggest problem with it was the lack of support for frames. Many sites I
use serve documents in frames, making hypothes.is unable to see them. Also,
why not work with NextCloud and create an app to annotate documents there,
over a public or private NextCloud share?

~~~
karlicoss
In a way, Nextcloud is a silo (even though it's open source and can be
selfhosted). If you annotate documents there, you'd have to snapshot the page
and move it into the app which kills the social function?

In contrast, Hypothesis only keeps track of the metadata and uses fuzzy
anchoring to make it resilient to the markup changes.
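
To illustrate what fuzzy anchoring means in general, here is a small sketch that finds the window of a page's text most similar to a stored quote, so a highlight can survive small edits. It uses Python's standard `difflib` and is not Hypothesis's actual algorithm; `fuzzy_anchor` and the `min_ratio` threshold are illustrative choices.

```python
from difflib import SequenceMatcher


def fuzzy_anchor(text, quote, min_ratio=0.8):
    """Return (start, end, ratio) of the best-matching window, or None.

    Slides a window the length of the quote across the text and keeps the
    most similar one. Brute force, fine for a demo but slow on long pages.
    """
    n = len(quote)
    if n == 0 or not text:
        return None
    best = None
    for start in range(max(1, len(text) - n + 1)):
        window = text[start:start + n]
        ratio = SequenceMatcher(None, quote, window, autojunk=False).ratio()
        if best is None or ratio > best[2]:
            best = (start, start + n, ratio)
    # Below the threshold the annotation is treated as orphaned.
    return best if best[2] >= min_ratio else None


page = "Hypothesis uses fuzzy anchorage to survive small edits."
hit = fuzzy_anchor(page, "fuzzy anchoring")
if hit is not None:
    print(repr(page[hit[0]:hit[1]]), round(hit[2], 2))
```

A production implementation would typically also use the text before and after the quote to disambiguate repeated phrases, which this sketch leaves out.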

------
godzillabrennus
These guys have been working on this problem for years and have a cool
solution if you have a budget: [http://www.myire.com](http://www.myire.com)

------
ncr100
Does this fingerprint text to be annotated and collect annotations, allowing
the "top rated annotation" to be the most strongly "trusted" annotation for
this unique slice of text?

And does it ensure the integrity of the text, so that no bad actor may modify
the text to turn it into "fake news"?

I think that ^^ could help diminish corruption of knowledge / the negative
impact of information distribution which the Internet enables.

~~~
tedmiston
I'm not entirely sure that these fully answer the questions you're asking,
but:

\- The annotations get "anchored" to the text in place at a given URL. There's
a video online of Genius [née Rap Genius] discussing fuzzy annotation
anchoring [1]. I would guess Hypothesis does something similar. Also, their
client code is on GitHub.

\- Annotations can be replied to but as far as I know there is no mechanism
for voting or any one particular annotation being more trusted than others.
The site owner could make an official group for their site I suppose.

\- The underlying page can be modified at any time. If something is annotated
and that underlying text is significantly changed or removed, the annotation
becomes "orphaned" and shows in a separate area. If you really want to, you
could archive the page first with e.g. archive.is and then annotate the
archived version.

\- I do think that Hypothesis maintains an archived version of an annotated
page, or at the very least the portion of its text which you've anchored
annotations to so that you can view them outside of the context of the page.

[1]:
[https://www.youtube.com/watch?v=FJyqfRcyYIQ](https://www.youtube.com/watch?v=FJyqfRcyYIQ)

------
hliyan
Privacy question: is this typical wording? "By giving us this information, you
agree to it being collected, used, disclosed, transferred to the USA and
stored by us."

Does "USA" here mean the geographical USA, the legal jurisdiction, citizens of
the USA or the government of the USA?

~~~
throwaway744678
It probably means their servers are hosted in the USA (geographical) under
its jurisdiction. Not sure what "citizens of the USA" means in this context,
but you can assume the US government may access the data if needed.

------
yumraj
Just last year I was looking at annotations in general and potentially using
Hypothesis as the initial MVP implementation to solve a problem in a very
different, well different from research, market.

Unfortunately I got distracted, but I still believe in it and may come back to
it again.

------
jordic
It's so amazing to see them on HN. I learned a ton about python, pyramid,
sqlalchemy and elastic from its repos. So happy to see traction on open source
projects! Thank you!

------
zem
I wonder if they have any concrete plans to prevent abuse of their platform, a
la
[https://slate.com/human-interest/2016/03/news-genius-wants-t...](https://slate.com/human-interest/2016/03/news-genius-wants-to-annotate-the-entire-web-at-what-cost.html)

------
kebman
I tried to see myself out, but all that happened was that the wall behind the
door became blue. In all seriousness, I was just curious if he had some cool
"see-yourself-out" site that he'd send you to, like some people do, but no
such luck this time. So I guess I'm stuck with linear regression, then. It's a
pretty interesting topic anyway. :)

------
robertlagrant
Web Graffiti as a Service!

Seriously though, I always wondered why something like this didn't exist. Very
cool.

------
genericone
Like dissenter but instead of a comments section you can annotate anywhere?

------
holler
how does this compare to Genius?

edit: looks like Genius moved away from aiming to annotate the web, and
focused instead on their core business of annotating music.

~~~
tedmiston
Exactly what you said. Also Hypothesis has more features around privacy
(private annotations) and groups.

For me Genius's annotations were more of a tool for surfacing authoritative
knowledge on a song. I use Hypothesis more for personal notes across the web,
but other people use it collaboratively as well.

------
pivic
I've used Hypothesis for years. Very good blog support (plugin for WordPress)
and built with W3C in mind. Very fast support. Great tool.

------
heyoo
Clever domain!

