I'm not entirely up-to-date on the annotation spec, but if all that's needed is to host the annotations somewhere, then Solid should already support that.
It's just not really clear to me how a user should find out that there are annotations about a page available somewhere.
There is quite a bit of activity happening around the spec, I believe (though I'm not following it too closely), but lots is spread around a number of different repositories [1], and it's not immediately clear which are active. (People are discussing how to improve that [2].)
In terms of servers: there has indeed not been too much activity there recently, which relates to efforts having mostly focused on stabilising the spec (as far as I've seen, at least) rather than new things having been added.
It's undeniable that Inrupt and the project are very intertwined, given how many of the original project members are working for/with Inrupt, and it being founded by TimBL. I think both the community and Inrupt are still figuring out the best way to work together. And I think we've had the reverse problem as well to some extent, where the community might be making themselves a bit too dependent on Inrupt. I think the project can grow in two (complementary) ways: top-down, with large organisations getting the project adopted at scale in big bangs, and bottom-up, where early adopters encourage additional adoption and it spreads from there. Inrupt is probably best positioned to support the former, but we haven't seen too much of the latter yet. That said, there are a number of community projects that aim to make a dent there, and the best way to keep up with those is through the This Month in Solid newsletter [3].
If you use a SaaS, unless it's open source and you can self-host, you are at the mercy of it.
If the service is good enough and the team behind it seems in good faith, it can give you confidence for the switch.
But promising to adopt a standard then not doing it doesn't inspire confidence.
Since the system is closed source (no open API can change that), they need to build trust with some users, particularly the tech-savvy ones who have been burnt in the past and hold data that is precious to them.
I tried the Docker setup some time back and it was broken. Has anyone ever managed to set it up, or are there any tutorials or blogs from anyone outside the org using their own setup?
My PhD work would not have been possible without Hypothes.is. My group uses it for a variety of information extraction and curation tasks, and I'm fairly certain that one of my colleagues has the highest number of human made annotations in the system (though in private groups).
Hypothes.is is great for prototyping curation workflows to explore what is possible before spending time implementing yet another tool. It also hits the sweet spot for having a UI that is just accessible enough for non-technical users that can display annotations made by bots. We have been using hypothes.is as the backend to crawl [0] research resource identifiers from the scientific literature for 4 years, and it has been solid the whole time.
I see some questions/concerns about the W3C compliance of the API. I'm pretty sure that there is just a switch that needs to be flipped, and that it has been implemented. Yep, it has [1]. Also, as others have suggested, it is fairly straightforward to set up your own annotation store, though it is not simple to replicate the Hypothes.is setup.
I hope that the pivot to education continues to go well so that it can fund the infrastructure, because Hypothes.is provides a very valuable (if hard to capture the value of) service.
No, we retain a local copy of the annotations, but that's about it. Last time I checked the hypothes.is client didn't have the ability to use an arbitrary backend. We did at one point create a fork of the web extension that could point to an arbitrary backend but the maintenance overhead ended up not being worth it.
Dan built and sold an Internet travel app in the 90s, and then became involved in climate work. He's been committed to it for many years.
Hypothes.is was created in order to address the problem of disinformation on the Internet, especially related to climate change; e.g. the Wall Street Journal publishes an Op-Ed from Charles Koch denying that human activity is contributing to global warming, and an annotation channel run by climate scientists can dissect it point by point.
I respect and admire that vision, and I think it's trying to get to one of the root causes of climate change denialism.
(Obviously, Hypothes.is has many other great uses!)
Wow that's interesting. This makes total sense as to why you can't change URLs then (which I just posted my use case needs), I suppose to maintain provenance on the original documentation. I hadn't considered that.
The idea of collaborative annotation has been around since the beginning of the web. I tried to launch an annotation startup myself back in 2000, and there were several funded competitors back then. It never catches on. It would be interesting to analyze why people just don't use these systems when given the opportunity. Is this a solution in search of a problem?
I have a pet theory that Twitter is the ultimate annotation platform that’s been right in front of us the whole time. The reason it doesn’t seem like one is because it blurs the line between “annotation” and “conversation”.
I think the dynamic is similar: while a normal annotation is a comment on a fragment of text, a Twitter reply is (or at least can be) a comment on a single portion of a longer thread.
Two notable differences:
- While a typical annotation platform takes a freely-written text and allows any segment to be annotated, Twitter forces the author of a thread to compose their text with the segment boundaries in mind.
- While a normal annotation system doesn’t allow you to annotate the annotations, any given “annotation” on Twitter may itself be a thread which allows for annotation. It’s recursive!
Maybe the above is a stretch, but personally I think it’s an interesting way to think about Twitter.
People's informational bits are not unified as html or pdf. They could be anywhere: html on web (more than one browser), html on mobile (which browser is it?), pdf here, pdf there; tons of apps icons, whatsapps, facebook, bookmarks galore, blah, blah. We are all over the place. Annotation on the other hand is a tool for organized individuals. Life is not necessarily organized for most people, hence, this never got picked up.
Annotation should have been part of the browser from day one. It should have evolved with it when things moved on to the mobile world. It never happened.
I think generalized annotation has the following problem:
Draw a graph of all web pages ordered by how many times they are visited. You get a power law distribution. On the left are things like CNN's front page, on the right things like a text file containing a cheesecake recipe someone posted in 1995 that is technically still online but nobody has visited in years.
On the left side are pages that are visited too often, and public annotations inevitably descend into madness and chaos, even worse than comment sections which have the courtesy to at least be isolated on the page to a certain area. On these pages, trying to use the annotation software is worse than not using it at all.
On the right are the pages that nobody visits, so when you visit them, there is no chance of any annotations being on the page, nor any chance that if you leave any annotations anybody will ever see them, or care if they do (i.e., just giving people a directory of "rarely annotated pages" doesn't solve the problem, the problem is that nobody cares about these pages anyhow). Annotation software doesn't harm these pages technically, but the user experience is at best that the user will just forget about their annotation software, and at worst, they'll feel alone on the page, a concept not previously in their mental model and one that is not contributing to a positive assessment of the annotation software.
In between is the sweet spot, where participation is adequate that the annotation software brings some sort of value, but it isn't just overwhelming chaos.
I submit to you that viewed through the power law lens, that sweet spot is actually fairly narrow. Moreover, if you slice through a single user's browser history looking for when they hit pages in that sweet spot, it'll still be the minority of pages that they viewed, so basically by statistical necessity, the pages where the software added value must be a tiny minority of the pages you visit. The majority of pages they visit, the annotation software experience ranges from extremely net negative to at best neutral, and it's hard for the positive experiences to make up for that.
Moreover, this sweet spot is moving. As more of the general public tries out your software, the sweet spot moves to the right, which, viewed from a single user's point of view, also makes the pages where it is useful rarer. When you're just starting out and you've got 100 users, the useful pages are things like "the front page of CNN" and "the hot wikipedia article about $CURRENT_EVENT", but as your user count increases it moves down to specific articles on CNN and only the linked content from the $CURRENT_EVENT, then only to archival content on CNN and random Wikipedia pages, and in general into stuff that is increasingly unlikely to be any significant percentage of the pages you visit.
I think this is why (A) they can't get popular, (B) the ones that I know about that have been around for a while, and are therefore presumably successful enough for someone to consider them worthwhile, can only stay so provided they don't get more popular, and (C) there's no chance you can make money in this space because you get an anti-network effect... the more users you get, the less valuable the service becomes to every existing user!
I think this is also why it seems like such an appealing idea. You create a prototype, get a couple of friends on it, it seems like fun and to be useful. You annotate some popular pages, you have some similar interests and you annotate, I dunno, the latest game console announcement or something, and it seems like fun. The problem is, rather than this being the worst the experience will ever be as it just gets better and better as more people come online, this is the best it will ever be.
Also, I can tell you from the first couple of times around that if this did become popular, the content producers of the Internet would fight you tooth and nail. They did the first couple of times when the math sort of worked to at least get to the point that these things could get in the news. With the web so much larger and power-law-y and the anti-network effect correspondingly so much more powerful, now this sort of thing can't even be successful enough to so much as get noticed by anyone before it has already collapsed.
(I specifically said "generalized" annotation at the top, because specialized use cases can get around some of these issues. But it's going to be hard to make any money on the size of user base you can support, because while you can mitigate the anti-network effect, it's always going to be looming over you if you try to get large enough.)
Annotation systems don't have to be public. The default should be private with an option to make parts public as necessary. Comments in some sense are annotations that are anchored at the wrong place and with the wrong visibility (public instead of private).
The main value of annotations is as a personal knowledge system anchored on top of public knowledge. As one's private knowledge grows it eventually reaches a point where it needs to be summarized. A good annotation system helps with this summarization process (which mostly comes down to having a good search and navigation system).
I think annotations can be made to work but the starting point has to be about personal use and not public use.
The person I was responding to specifically mentioned "collaborative annotation" and startups. Non-collaborative annotation has different tradeoffs. It may be useful to you, but I can't imagine there's any money there for a startup. It avoids the anti-network effects that kill annotations, and it avoids publisher objections too, but it also doesn't have a very compelling value proposition for very many people. It's like trying to make money off of people who like mind maps, literate programming, or who use personal wikis.... it's non-zero amounts of people, often very passionate people, but it's not much of a market.
Why is there no money in a personal annotation system? I'd think any large organization would be willing to pay a lot of money to use a tool that would make its members more productive and effective.
In fact, Coda and Notion.so are very successful products and they're basically personal/private annotation systems (modulo storing everything in the cloud). I can imagine extending Coda and Notion.so with programmatic capabilities and adding offline storage and turning them into pretty successful personal annotation systems with premium features for supporting organizational work and collaboration.
There is plenty of money in this market when viewed from the right angle.
This is really insightful. I know it's different when you're talking about private annotations, but in the context of the original dream of public annotations where it was hoped that you would benefit from benevolent and knowledgeable strangers who previously annotated the factual errors in the article you happen to be reading, I think you've absolutely identified the key problem.
This is causing me to realize that the value of Wikipedia is not the fact that the pages are editable by the public; it's that there exists a process (as flawed and controversial as it is) for debating over the various contributions and edits that leads to a single, unified version of any given page. More than anything else, that consolidation process is Wikipedia. A successful consolidation process has to be present in order for any system of public contributions (annotations or otherwise) to produce a coherent result.
I agree. I used to work for Hypothesis a very long time ago and this is the conclusion I came to as well. Pivoting to serve the academic community annotating PDFs was a good idea (this happened after I had left).
I'd imagine an issue is that there's just no place to go for it. If a big existing community introduced annotations, I'd bet it could catch on. Imagine clicking through an article from facebook and you automatically get annotations. Hell, on mobile you already have these apps using built in browsers so you don't leave the app. Something like that, where annotations are discoverable, could definitely gain traction. Then there'd be no escaping facebook (or twitter or reddit or HN) even on other sites.
I've long felt that annotation was one of the greatest missed opportunities of the web. Every major content website implements commenting differently and often poorly. I'd much prefer to have comments come from a third party system where I can: own my own comments, not be subject to removal by the author of the content, and choose the group of people I want to discuss with. Imagine if hackernews discussions appeared right along the content instead of being hidden away where no reader of the content could find them.
HTML hyperlinks as they are - unidirectional - are a very long-standing design decision, the last futile revision of which was in 1999 or so with XLink. You have probably seen it in action with SVG-in-HTML (and SVG fragment inclusion in SVG itself), where it kind of rears its ugly head in that, for the longest time, you had to use xlink:href (with xlink bound to the XLink XML namespace). XLink itself is based on SGML/HyTime concepts from the 1990s. I believe Sun back then patent-encumbered XPointer (which is the part of XLink for addressing fragments), and especially claimed IP rights on its representation of range addressing, which didn't help to make it popular I guess.
Before changing HTML link semantics, though, HTML should really consider allowing href on any element rather than only on the dedicated <a href=...> element (with the expectation that this acts equivalently to an anchor link), like MathML does.
In general, I'm not sure hosting discussions on third-party aggregators such as disqus.com is really a win for privacy, though.
> not be subject to removal by the author of the content
That's why everyone is doing it themselves. They don't wanna make it too easy for people to find their dirt, and they also want to make it harder for trolls and spammers to harm them. Protecting yourself from others is a legit problem with this.
But besides that, aren't sites like hackernews and reddit basically like that? All you need is an extension which queries them for the active URL or domain and shows the results on the side.
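For the HN half, here is a minimal sketch of the lookup such an extension could do. It assumes the public Algolia-hosted HN search API and its `restrictSearchableAttributes` parameter; the function names are made up for illustration, and how strictly the URL gets matched in practice is something you'd want to check yourself.

```python
import json
import urllib.parse
import urllib.request

# Public Hacker News search API hosted by Algolia (assumed endpoint).
HN_SEARCH = "https://hn.algolia.com/api/v1/search"


def hn_search_url(page_url):
    """Build a search URL for stories whose submitted link matches page_url."""
    params = {
        "query": page_url,
        "restrictSearchableAttributes": "url",  # only match on the story URL
        "tags": "story",
    }
    return HN_SEARCH + "?" + urllib.parse.urlencode(params)


def hn_discussions(page_url):
    """Return HN discussion links for page_url (performs a network request)."""
    with urllib.request.urlopen(hn_search_url(page_url), timeout=10) as resp:
        hits = json.load(resp)["hits"]
    return [
        "https://news.ycombinator.com/item?id=" + str(hit["objectID"])
        for hit in hits
    ]
```

An extension would call something like `hn_discussions` with the active tab's URL and render the links in a sidebar; a reddit lookup would need its own equivalent.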
1. The comments aspect of this doesn't really jump out at me from the landing page.
2. The words "knowledge base" scare most people.
3. "You will be our customer and we promise to treat you the way we’d like to be treated too… with respect" is much more ominous with ellipsis than it would be with an em dash or a colon.
I understand. Thanks for the feedback. I think about this too.
I'm not an anonymous person. I've been a Hacker News user since the very beginning. I share my contact info in my profile. My reputation is my livelihood. I've never done anything shady and don't intend to start now :-) So I'd like to believe that people see that and judge for themselves whether to trust me or not.
Knowing who is behind a product, at least for me personally, makes a difference. Just a couple of days ago I started a Youtube channel to openly talk about my thoughts: https://www.youtube.com/channel/UCHkgOonAQd5haT8HHJhpg6g
But yeah I hear your concerns, and I respect your decision to not use my app.
The problem is the change that comes with success. Page and Brin were once idealistic hackers and even came up with the catchphrase "Don't be evil"; fast forward and it did not work, and the reality is it probably had very little to do with them. Success and size whittle away at idealistic visions. You need to put together a framework before you are successful for how you are going to ensure that your core values survive success and size, and it has to have teeth. I am confident you are a good person and you are idealistic. The problem is that you won't always be steering the ship.
Yes, that's a great point. Is a strong privacy policy the solution? I do have one in place. I'm happy to put in stronger commitments.
Another thing to highlight here: Histre is a paid product. I'm bootstrapping it. I think that it is the "free" products that go the adtech route. By charging the users, I'm making it clear that it is their money that supports Histre.
Your target audience might not care, but I don't think that's strong enough.
- Companies display a willingness to force agreement to new policies in order to continue using the service, even for paid customers, and often on questionable or void legal footing.
- The fact that it's a paid service right now might not be a strong enough signal. Even large device manufacturers (e.g. HTC) have retroactively patched ads into paid-in-full, physical products. What's to say that you won't have to pivot to make Histre succeed?
- What happens if you are bought out or otherwise personally leave the product? Even if success doesn't affect how you act on your commitments, what's to keep others from malicious actions?
- Adding to the above (re-iterating an ancestor comment), if you're successful enough to need employees you won't have a hand in every decision. How then is privacy still protected?
Privacy aside, if Histre goes under what options do customers have for, e.g., exporting their data?
I would say a legal poison pill: something that builds a framework under which the company is dissolved if the core tenets are violated, or all digital assets are forfeited to a non-profit privacy group.
I know very little about this kind of stuff, but is there anything that would prevent a controlling party from striking the poison pill provisions from the bylaws?
Usually poison pills are meant to trip up parties without a controlling interest, so I'm curious what kind of legal framework would make something like that unchangeable.
A little-known feature of NCSA Mosaic for Windows was a sort of annotation system built into it. So little-known, in fact, that by the time Netscape Navigator 1.0 shipped, nobody on the Mosaic team claimed any knowledge of how the damned thing worked. That information left with the original team.
Every so often over the years I have recalled that feature and wished it had survived, especially when some organization desperately needs a watchdog group calling them on their bullshit.
Comment systems carved out a piece of that problem domain, but I’ve always wished for comments curated by an independent party. Which I am sure is how we got Digg, Reddit, and HN.
Love this project. I definitely have some long-term concerns around filtering out spammers and being able to grok a heavily annotated page with resolved/unresolved conversations but I have faith they'll figure them out.
In the meantime, I've embedded the sidebar into my site [0] and hope more people do the same! Original idea from this [1] post.
Love this but please for the love of everything holy make a firefox extension. The bookmarklet works but you only see annotations if I click on the link. It's sad and clear chrome is the new IE and mozilla is again the underdog.
It's so insulting they have an open issue and said they're working on it for more than 2 years when Firefox extensions framework is 99% compatible with Chrome extensions. Probably just have to repackage and publish, but apparently they can't be bothered.
I interviewed with this little company back in 2014 trying to land my first dev gig. Didn’t get the job but I would’ve taken it for less money if they’d offered.
Impressed they’re still going! I’ve worked for two failed startups since.
Hypothes.is is great, but: annotations and living documents don't mix. I learned angular 3 years ago by annotating angular.io. Mistake. Now most of my annotations point to deep space because the text is gone or altered beyond recognition. I should have made my annotations stand by themselves.
I wonder if anyone else remembers hoodwink.d by _why
For those that don't, it was a locally running proxy service that would inject its own annotation UI into any site you wanted. Basically your own pocket comment/discussion board where you could discuss whatever URL you were on with other users of hoodwink.
Hi, I started using this as a training tool for people I'm trying to help in meetups and my community (people who have never used AWS, GCP, etc) but a huge problem I found out after writing a lot of annotations (and clearly having the wrong expectation) was that you can't edit or have any control over the URLs that the annotations are based on.
If the devs are interested in my use case (understood if not), which is basically a poor man's onboarding tool, I'd love to and would happily spend some money (not onboarding tool money, those seem to be $1k+/month) on the ability to edit URLs, share these, etc.
What I envisioned this letting me do was create versioned annotations that I could share with anyone, who could then superimpose them on top of any URL that has the same basic structure behind it (amazon.com/$myuser/$mycluster/$myportal etc). I thought it'd be a really killer tool for helping new devs learn AWS more easily than constantly digging through AWS documentation, i.e. a "What is Lambda? Click a button, get 10 quick bullet points on Lambda" sort of thing.
A lot of these users will be using localhost, their IP, digitalocean IPs, etc instead of a concrete domain name. It'd be great if we could regex the url strings or something like that.
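To make the regex idea concrete, here is a rough sketch of mapping a URL to a host-independent key, so the same annotation set could be looked up whether the page is served from localhost, a bare IP, or a real domain. This is not something Hypothesis offers as far as I know; `portable_key`, the rule format, and the example paths are all hypothetical.

```python
import re
from urllib.parse import urlsplit


def portable_key(url, rules):
    """Map a URL to a host-independent key using (pattern, template) rules.

    The host and port are ignored; only the path is matched. Returns None
    when no rule matches, meaning the page has no portable annotation set.
    """
    path = urlsplit(url).path or "/"
    for pattern, template in rules:
        match = re.fullmatch(pattern, path)
        if match:
            # Fill the template from the named groups captured in the path.
            return match.expand(template)
    return None


# Hypothetical rule: any cluster name maps to the same portal page key.
RULES = [
    (r"/clusters/[^/]+/portal/(?P<page>[\w-]+)", r"cluster-portal/\g<page>"),
]
```

With these rules, `http://localhost:8080/clusters/dev/portal/settings` and `https://203.0.113.7/clusters/prod-1/portal/settings` both map to the key `cluster-portal/settings`.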
Making the annotations portable across the same site hosted in different places is an interesting idea.
I use Hypothesis a lot to highlight e.g. the few important lines of code, commonly used config values, etc. buried in longer dev docs.
This mostly works pretty well. But I do lose my highlights when the URLs change e.g. version 1.0 to 1.1. Sometimes there is a "latest" docs site available to work around this issue but not always.
I imagine that a script could be made with the Hypothesis API to migrate annotations across pages and approximately re-anchor them as much as possible.
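A rough sketch of the core of such a migration script: take an annotation as JSON and build a new payload aimed at the new URL. I'm assuming the annotation shape follows the W3C-style `target`/`selector` layout with a `TextQuoteSelector`, and that fetching and creating would go through the Hypothesis API's search and create-annotation endpoints with an API token; none of that is shown here, and `retarget` is a made-up helper name.

```python
import copy


def retarget(annotation, new_uri):
    """Return a new annotation payload pointing at new_uri.

    Only quote-based selectors are kept: position- or range-based selectors
    refer to offsets in the old page and would likely be stale on the new
    one, while the quoted text can be re-anchored approximately.
    """
    quote_selectors = [
        selector
        for target in annotation.get("target", [])
        for selector in target.get("selector", [])
        if selector.get("type") == "TextQuoteSelector"
    ]
    return {
        "uri": new_uri,
        "text": annotation.get("text", ""),
        "tags": list(annotation.get("tags", [])),
        "target": [
            {"source": new_uri, "selector": copy.deepcopy(quote_selectors)}
        ],
    }
```

The script would loop over the annotations found for the 1.0 URL, call `retarget` with the 1.1 URL, and post each result as a new annotation, leaving the originals in place.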
It seems like annotation is coming into its own. In addition to the W3C standard, there are a lot of interesting UI/UX ideas in the space, for instance this UI/UX that was posted by Azlen Elza (on Twitter), which would be perfect for annotation.
My biggest problem with it was the lack of support for frames. Many sites I use serve documents in frames, making hypothes.is unable to see them. Also, why not work with Nextcloud and create an app to annotate documents there, over a public or private share?
In a way, Nextcloud is a silo (even though it's open source and can be selfhosted). If you annotate documents there, you'd have to snapshot the page and move it into the app which kills the social function?
In contrast, Hypothesis only keeps track of the metadata and uses fuzzy anchoring to make it resilient to the markup changes.
Does this fingerprint text to be annotated and collect annotations, allowing the "top rated annotation" to be the most strongly "trusted" annotation for this unique slice of text?
And does it ensure the integrity of the text, so that no bad actor may modify the text to turn it into "fake news"?
I think that ^^ could help diminish corruption of knowledge / the negative impact of information distribution which the Internet enables.
I'm not entirely sure that these fully answer the questions you're asking, but:
- The annotations get "anchored" to the text in place at a given URL. There's a video online of Genius [née Rap Genius] discussing fuzzy annotation anchoring [1]. I would guess Hypothesis does something similar. Also, their client code is on GitHub.
- Annotations can be replied to but as far as I know there is no mechanism for voting or any one particular annotation being more trusted than others. The site owner could make an official group for their site I suppose.
- The underlying page can be modified at any time. If something is annotated and that underlying text is significantly changed or removed, the annotation becomes "orphaned" and shows in a separate area. If you really want to, you could archive the page first with e.g. archive.is and then annotate the archived version.
- I do think that Hypothesis maintains an archived version of an annotated page, or at the very least the portion of its text which you've anchored annotations to so that you can view them outside of the context of the page.
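To illustrate the anchoring and orphaning behaviour described above, here is a toy sketch using Python's difflib. It is not Hypothesis's or Genius's actual algorithm, just one simple way to do approximate quote matching; `fuzzy_anchor` and the 0.8 threshold are my own assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_anchor(text, quote, threshold=0.8):
    """Return (start, end) of the best match for quote in text, or None.

    An exact match wins outright. Otherwise every window of the quote's
    length is scored by similarity, and the annotation is treated as
    orphaned (None) if even the best window scores below the threshold.
    """
    exact = text.find(quote)
    if exact != -1:
        return exact, exact + len(quote)

    width = len(quote)
    best_score, best_start = 0.0, None
    # Brute-force scan; fine for a sketch, slow on long pages.
    for start in range(max(1, len(text) - width + 1)):
        window = text[start:start + width]
        score = SequenceMatcher(None, quote, window).ratio()
        if score > best_score:
            best_score, best_start = score, start

    if best_start is None or best_score < threshold:
        return None  # orphaned: the quoted text is gone or changed too much
    return best_start, best_start + width
```

A small edit to the page (say "timeout" becoming "time-out") still anchors, while a rewritten paragraph comes back as `None`. A real client would presumably also use the stored prefix/suffix context and the old position as hints.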
Privacy question: is this typical wording? "By giving us this information, you agree to it being collected, used, disclosed, transferred to the USA and stored by us."
Does "USA" here mean the geographical USA, the legal jurisdiction, citizens of the USA or the government of the USA?
It probably means their servers are hosted in the USA (geographical) under its jurisdiction.
Not sure what "citizens of the USA" means in this context, but you can assume the US government may access the data if needed.
Just last year I was looking at annotations in general and potentially using Hypothesis as the initial MVP implementation to solve a problem in a very different market (well, different from research).
Unfortunately I got distracted, but I still believe in it and may come back to it again.
It's so amazing to see them on HN. I learned a ton about python, pyramid, sqlalchemy and elastic from its repos. So happy to see traction on open source projects!
Thank you!
I tried to see myself out, but all that happened was that the wall behind the door became blue. In all seriousness, I was just curious if he had some cool "see-yourself-out" site that he'd send you to, like some people do, but no such luck this time. So I guess I'm stuck with linear regression, then. It's a pretty interesting topic anyway. :)
Exactly what you said. Also Hypothesis has more features around privacy (private annotations) and groups.
For me Genius's annotations were more of a tool for surfacing authoritative knowledge on a song. I use Hypothesis more for personal notes across the web, but other people use it collaboratively as well.
UPD: there is not a single issue open on the main repo wrt W3C standard implementation: https://github.com/hypothesis/h/issues?q=is%3Aissue+is%3Aope.... The post I took as a promise: https://web.hypothes.is/blog/annotation-is-now-a-web-standar...