Don't get me wrong, I've tried it out and it's awesome. I'm just not seeing how it's going to take off as a knowledge repository with its limitations. The last thing I want is yet another place I have to look for my files, and I sure don't want to have to keep things in sync with a repo that holds the rest of my files.
I'm also concerned with the ability of this thing to scale. There doesn't appear to be any system for grouping items other than tagging. Maybe tagging works for others, but it's not something I trust (if I have hundreds of items and missed a tag, the item is lost).
Definitely don't want this to be a negative comment. It's more "you're 95% of the way to an outstanding product, but I think the last 5% might be an obstacle to adoption."
Key point of order.
We don't actually convert the websites to PDF. We cache the pages offline so you have the original in your archive.
This is massively cool and I need to do a better job of making it clear that this isn't just conversion to PDF.
> Don't get me wrong, I've tried it out and it's awesome. I'm just not seeing how it's going to take off as a knowledge repository with its limitations.
You mean specific limitations to Polar?
> I'm also concerned with the ability of this thing to scale. There doesn't appear to be any system for grouping items other than tagging. Maybe tagging works for others, but it's not something I trust (if I have hundreds of items and missed a tag, the item is lost).
Some users want to have more of a hierarchical folder view and we might implement that.
But I don't get your point about a missing tag.
> Definitely don't want this to be a negative comment. It's more "you're 95% of the way to an outstanding product, but I think the last 5% might be an obstacle to adoption."
Well one thing is that I think the Hacker News crowd is very very very picky in terms of features and functionality vs the normal user base.
The features I've seen from HN just aren't going to be necessary for all users. I'm fine adding them if they fit into the roadmap but some just won't be implemented.
One issue is the ability to use other cloud providers which has been brought up a number of times (which is why I wrote this post).
That's a massively complex feature and while that would be cool in some sense it would just be far far far too expensive to implement.
I think the remaining 5% is doable but we will see..
One valid criticism is that PDF is the only format. We're working on markdown and ePub too...
At least in my experience, interfaces that use tags (Gmail, Google Docs prior to the Drive upgrade) end up being a massive global heap that makes it impossible to find anything. Tags help but only marginally, and only when I'm very selective about applying them. When a tag becomes too big, it's just the soup problem all over again. In my experience with Gmail/Docs I usually find things via search, or not at all (and unfortunately more often than not the latter).
You could argue the same problem occurs in hierarchical folders. But I think one major takeaway I've had is that defaults matter: hierarchical folders normalize a usage where you categorize everything. Therefore, I can find utility bills for an apartment I moved out of some years ago, with about four clicks through my folder tree. As another example, a while back I was asked for my daughter's birth certificate when checking in for a flight. No problem, five clicks and I've got it on my phone. In my experience this is hopeless in a tagging system unless search is very good or the tags are hierarchical and very granular, and at least so far I've never been able to develop the discipline to do this.
Happy to explore feedback, thoughts, use cases, and collaboration ideas.
Plus, with tags you can have multiple categories.
Also, Polar suggests tags for you based on your history, so it gets smarter over time.
We will probably add some sort of hierarchy viewer but I'm torn over whether it should be compiled based on the tags or built manually.
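For what it's worth, compiling the hierarchy from tags could be fairly cheap if a separator in the tag is treated as nesting. A minimal sketch, assuming tags are written like `home/utilities`; the slash convention and the names `build_tag_tree` and `docs` are made up for illustration and are not anything Polar is known to do:

```python
from collections import defaultdict


def build_tag_tree(docs):
    """Build a nested tree from slash-separated tags.

    `docs` maps a document title to a list of tags (hypothetical shape).
    Each node looks like {"children": {...}, "docs": [...]}.
    """
    def new_node():
        return {"children": defaultdict(new_node), "docs": []}

    root = new_node()
    for title, tags in docs.items():
        for tag in tags:
            node = root
            # Walk (and create) one level per path segment of the tag.
            for part in tag.split("/"):
                node = node["children"][part]
            node["docs"].append(title)
    return root
```

A document with several tags simply shows up under several branches, so the folder-style view would not cost you the multiple-categories benefit of tags.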
This is exactly what I'm talking about. Tags may be supported but it's not a tag-first interface, per se.
When I open finder/explorer/nautilus on my computer, what shows up is the top level of my file hierarchy. This would be like an interface where you showed only top-level tags and then you have to click through one of those to get to any actual documents. You can say this increases friction, but the tradeoff is that it promotes a system of organization that becomes faster overall when I have large numbers of documents.
Or again, the issue of uncategorized files: you say it's like saving a document in /, but when I save a file on my local machine it doesn't put it in / first and then as a second step I have to manually put it somewhere else. But that's how most tagging systems work: documents go into the soup of the untagged list by default, I have to manually tag it if that's what I care about. Again, it's normalizing an approach where you dump things in and organize them later except the second step usually doesn't happen.
Like I said before, it's not that I don't think tagging could be used with discipline to maintain organization, it's that systems that rely on tagging tend to design their interfaces in a way where tagging is at most an afterthought and often isn't done at all. Whereas on my desktop I literally can't save a file without putting it somewhere in the file hierarchy, and the UI is designed to put that hierarchy front and center.
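To make the "can't save without putting it somewhere" point concrete, here is a toy sketch of a store that refuses untagged saves. The `Library` class and its methods are hypothetical and only illustrate the default being argued for; they do not reflect how Polar or any other tool behaves:

```python
class Library:
    """Toy document store that refuses to save untagged documents."""

    def __init__(self):
        self._tags_by_doc = {}

    def save(self, title, tags):
        # Normalize tags and drop empty ones before checking.
        cleaned = {t.strip() for t in tags if t.strip()}
        if not cleaned:
            raise ValueError(f"{title!r} needs at least one tag before saving")
        self._tags_by_doc[title] = cleaned

    def with_tag(self, tag):
        """Return the titles carrying `tag`, sorted alphabetically."""
        return sorted(
            title for title, tags in self._tags_by_doc.items() if tag in tags
        )
```

With a rule like this there is no untagged soup to begin with, at the cost of a little friction on every save.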
Tags are now front and center and on by default in the UI.
You can easily filter by tag too:
Polar 2.0 will be web based ... there have been thousands of people checking out the website but not many people downloading the app. Had it been web based I don't think that would have been an issue.
I think they would have started using it.
> it doesn't put it in / first and then as a second step I have to manually put it somewhere else.
No, but it will pick either the user's home, their Documents, or their Downloads. That's not much better.
You may want to be careful with this. It sounds like it will lead to a small number of tags becoming increasingly massive to the point where they are basically useless and it is difficult to find anything.
> But I don't get your point about a missing tag.
I find tags to be very complex because I have to remember exactly which tags go with which types of material. If I don't attach a particular tag to a file, that file is as good as lost, because it will never come up again.
In contrast, if I organize by projects, I dump all my files into the project and I can come back in two years and have everything I need. Tags just aren't in my experience a substitute for proper organization.
With strong reservations I switched over to using a $12/month G Suite account because it offers a uniform ‘cloud search’ to find and access all of my digital artifacts.
After doing this transition I am happy enough having uniform access to everything, but Google has also fallen down flat on a few things: not letting my GSuite email account be a family member for the large number of purchases I have made over the years with my gmail account; the Google iOS app is really not functional with my G Suite account (no calendar interop, and other limitations). So, they are close to providing a very good $12/month service, but not quite there yet.
For Polar: expand the service to allow email with custom domain, cloud file storage, etc., and unified search, then they will have a dynamite product.
At the same time I agree with his idea that a knowledge base + spaced repetition is a good combination; however, outputting spaced repetition into a 3rd party app is not a good idea.
I see that burtonator is trying to push this product here every time a relevant discussion pops up, I don't have an issue with that. But I do think that this product is a long way from being useful for me in a daily setting. Right now it's nothing more than an electron wrapper around pdf.js and knowing how far I read a pdf.
Except for cloud sync, and annotations, and flashcards, and sync to anki, and pagemarks, and archived documents, and flagging, and tagging, and stats on your reading, and a dedicated annotations view, and the ability to export your annotations, and the ability to capture entire web pages for offline use and annotations.
But other than that it's exactly like pdf.js :)
I couldn't parse that. Is this good or bad?
> At least I bumped into small UI bugs the author should have caught himself before a release.
If you can point these out we can fix them. I'm still working on fit and finish in some places and rapidly iterating. I'm pushing 1/2 builds a week now based on user feedback so if something is broken let me know and I'll fix it :)
In a nutshell:
1. I'm working on more document formats including Markdown and ePub.
2. I'm working on some preliminary support for reference management so we can lookup the full reference info via DOI and have reference formatters for you to use for your research. Some of the DOI storage is already done.
3. I'm working on a webapp and mobile support. Probably a month away.
4. The captured web content isn't actually converted to PDF - it's actually a cached web archive. Full HTML is kept in your repo.
5. Polar 2.0 will be more web aware. You can annotate anything without actually having to store the full thing in Polar. We will just store a URL.
6. Pagemarks will be improved here shortly so you can toggle them on and off and possibly even highlight through them directly.
7. There are definitely small bugs and fit and finish issues we're working on. If you have an issue please report it: https://github.com/burtonator/polar-bookshelf/issues
8. Yes. Polar isn't fully self-hosted but that defeats the point. My core hypothesis is that knowledge benefits from being social, which is the main thing I'm trying to test. By the time I'm done I think 95% of you would agree that keeping it private is insane as you don't benefit from the community.
9. If you like Polar please consider donating. https://opencollective.com/polar-bookshelf ... we've received essentially zero Open Source contributions from the community and very little funding.
> 3. I'm working on a webapp and mobile support. Probably a month away.
Okay, that would likely be fine. I'd suggest at least creating an iOS Shortcut to save to your webapp because not being able to save via the normal share sheet would be a drag.
Then why are you calling it a Personal Knowledge Repository? It sounds more like you're trying to make it a social media network.
If you are willing to use the zettelkasten method, then you can use one based on Sublime Text, or one that is a standalone program.
You can use a more free-form wiki, with a simple plugin for Sublime
These Markdown files will work with a slew of different wiki servers, including Gollum, the wiki used for GitHub, GitLab, and Bitbucket, and are only one step away from vim-wiki.
But, since they are simple text, you could even access them on most other platforms.
It depends on whether the data is internally created or external. If you're talking about exposing internal data, there are /Interoperability Problems/.
You don't have to think about other apps creating invalid files. Stored state won't vanish because another app decided it's not needed and ignored it during editing and resaving. The same goes for all the other interoperability problems you may encounter.
If it's external files I may agree.
The OS doesn't provide annotations, highlights, flashcards, doesn't have a tag index, doesn't support offline capture, etc.
Install Polar and take the tour...
I have a haphazard way of managing notes and docs that has too much overhead.
I would like a squirrel program to store and retrieve more easily and will eventually just write one.
I think we need more tools that rely on self-hosting cheap nodes for data where the owner has complete control. Knowledge management systems are really personal and if third parties go away then that is critically bad. I intend to keep this info for 50-100 years so need something that does a good job of just storing to git or whatever and then rely on different front-ends and things that can be run without any third party access to data.
The iOS model is closest to this in terms of data and apps, but is not OSS and I spend so much time in front of a full computer so phone access only is too limited.
I'm sure more sophisticated tools could be created around this concept.
Most of the time I want to attach binary files to notes, so I thought about using a sync solution like NextCloud or FileRun (with syncthing) and just link to my files.
The binary thing is a challenge for me as well as part of my filing includes always making copies of web sites and office and pdf to check for versioning and edits. I frequently try to remember or check if web sites changed info or released a new version so I need a link and a frozen copy. Git does not like this and gets real big, real fast.
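One way around the git bloat, sketched under assumptions: keep the frozen copies outside git in a content-addressed folder, and let git (or anything else) track only a small text index of URL-to-hash links. The function name `store_snapshot`, the `index.tsv` file, and the folder layout are all hypothetical:

```python
import hashlib
from pathlib import Path


def store_snapshot(store_dir, url, content):
    """Save `content` (bytes) under its SHA-256 and log the URL -> hash link.

    Returns (digest, is_new). Identical captures are written only once,
    so re-saving an unchanged page costs one index line and no new blob.
    """
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256(content).hexdigest()
    blob = store / digest
    is_new = not blob.exists()
    if is_new:
        blob.write_bytes(content)

    # Append-only, text-only index: small enough to keep under version control.
    with open(store / "index.tsv", "a", encoding="utf-8") as index:
        index.write(f"{url}\t{digest}\n")
    return digest, is_new
```

If two captures of the same URL produce different hashes, the page changed between them, which covers the "did this site edit its info" check without diffing anything by hand. Every distinct version still takes its full size on disk; this only avoids duplicates and keeps the binaries out of the repository history.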
Even if you scroll down like 5px or if you scrolled multiple pages at once?
How long was the lag?
> overlays for selecting progress are messing with highlighting and re-reading. It could be showed instead simply as rectangle with borders, without this blue background, if that's possible and if not there should be other way to manage progress.
Yes.. I'm planning on evolving this a bit. I didn't realize that people would be so annoyed by this and it isn't part of my normal usage.
I read forward and the pagemarks lag behind my reading and highlights.
I think I can implement a feature to poke through the pagemark to be able to annotate through it.
Therefore I'm using org-mode. It's just a UTF-8 encoded file; even if org-mode disappeared overnight, I could still read the data.
I agree that it's important to be sure you can get your data back out of whatever you're using. And another benefit of something like org mode that uses text files as a backing store is that you can edit/consume them with whatever tools you want.
I like Emacs too and have written literally hundreds of thousands of lines of elisp (seriously)...
I'm trying to make Polar usable for everyone not just Emacs users.
Had similar thoughts and I named it knowledge graph.
The center is you, then you tag books, lectures, wikipages and have a visual representation of all of your knowledge.
I also imagined having your whole education in it and because it's a graph, you could import a path which you can follow. 1+1 needs to be done before you can do 1*1 etc.
I thought about it while learning for a university course and a lot of time was spent on finding good material which explained a topic. And after the exam I started to forget stuff again.
It felt really stupid.
I wanna be in control on what I know, what I want to know and what I no longer need to know.
And all tools out there are not that good. Anki is doing its job but nothing else. Memrise does it more playful but also misses stuff. Duolingo (most of online learning has this problem) feels more like 101 <topic>.
How is it possible that thousands of universities record videos of lectures every year and are often paid for by all of us (Germany, for example), but all those videos have one or all of the following issues:
- bad audio
- bad video
- horrible handwriting
- no slides
Should it not be possible to sit together, create small topics, and build ONE comprehensive learning page?
Edit: such a platform would not only benefit a lot of people, it would also be a good deal financially. In grades 1-6 it could support parents and teachers, then teachers and pupils, and later profs and students.
Lots of knowledge doesn't change every year.
1. deal with various file formats(pdf,markdown,code,flashcards,etc)
2. easy to annotate, tag, highlight
3. works across web and smartphone
4. can be self-hosted and portable
Read the linked article. I talk about this at the bottom.
IMO 80% of the VALUE of this data is that you're collaboratively building your knowledge with others.
My core hypothesis is that building knowledge is social.
This is what spaced repetition gives you.
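For anyone who hasn't seen it, the scheduling idea can be sketched in a few lines. This is a deliberately toy rule (double the gap on success, reset on failure), not the algorithm Anki or SuperMemo actually use, and `next_review` is a made-up name:

```python
from datetime import date, timedelta


def next_review(interval_days, remembered, today):
    """Return (new_interval_days, due_date) for one flashcard.

    Toy rule: double the interval when the card was remembered,
    fall back to a one-day interval when it was forgotten.
    """
    if remembered:
        new_interval = max(1, interval_days * 2)
    else:
        new_interval = 1
    return new_interval, today + timedelta(days=new_interval)
```

Cards you keep getting right drift further and further apart, so review time goes to the material you are actually about to forget.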
According to his theory, the mind has essentially infinite capacity for memories, but we have only a finite amount of time in which to search for them.
The key to a good human memory then becomes the same as the key to a good computer cache: predicting which items are most likely to be wanted in the future. That it’s a perfect tuning of the brain to the world, making available precisely the things most likely to be needed.
In putting the emphasis on time, caching shows us that memory involves unavoidable tradeoffs, and a certain zero-sumness. You can’t have every library book at your desk, every product on display at the front of the store, every headline above the fold, every paper at the top of the pile. And in the same way, you can’t have every fact or face or name at the front of your mind.
“Many people hold the bias that human memory is anything but optimal,” wrote Anderson and Schooler. “They point to the many frustrating failures of memory. However, these criticisms fail to appreciate the task before human memory, which is to try to manage a huge stockpile of memories. In any system responsible for managing a vast data base there must be failures of retrieval. It is just too expensive to maintain access to an unbounded number of items.”
From the book: Algorithms to Live By.
Additionally, we track application errors which helps us find bugs and to prioritize which issues to fix.
This data is sent to 3rd parties which provide the infrastructure necessary to provide the analytics services needed to analyze and store the data.
We avoid sending personally identifiable information at all times."
I am forced to click on "Accept" actually, which I don't like.
Overall it looks like Calibre to me; however, Calibre does not do highlighting.
you can audit the code but we only track what features you use and so forth.
I can't build Polar without the analytics data as I don't know what's breaking and have no ability to optimize it and improve usability.
Would that give you 80% of what you want?
I did some analysis and 40-60% of PDFs have the DOIs embedded.
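Pulling an embedded DOI out of a PDF mostly comes down to a regex over the extracted text. A rough sketch, assuming the text has already been extracted by some other tool; the pattern follows the commonly recommended shape for modern DOIs (`10.` + 4 to 9 digits + `/` + suffix), and `find_dois` is a hypothetical helper, not Polar's code:

```python
import re

# Matches the common modern DOI shape; older or unusual DOIs may be missed.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_dois(text):
    """Return the distinct DOI-looking strings in `text`, in order of appearance."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Trailing sentence punctuation is far more likely prose than DOI.
        doi = match.group(0).rstrip(".,;")
        if doi not in found:
            found.append(doi)
    return found
```

Stripping trailing punctuation is a heuristic and can occasionally clip a DOI that really does end in one of those characters, so anything found this way is best treated as a candidate to confirm against a lookup service.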
However, my main complaint with these tools is that while they are customizable, it is sometimes nice to have WYSIWYG, since some notes I take will only ever be looked at as I am writing them, so having instant feedback is nice, even if the resulting document is imperfect.
It applies to PDFs also. More context: https://web.hypothes.is/blog/annotation-is-now-a-web-standar...
To be clear I was one of the inventors of RSS and Atom so it's not like I'm naive in this area...
I just find that implementing standards can slow down a proof of concept that may or may not be successful.
Plus I need to review the spec to see if it's compatible with our model and I plan on iterating rapidly on our model and it might not match up with the standard.
"The editor of a lifetime"
Yes.. that's literally why I called it Polar.
It's designed to be a permanent vault for your data. This is why it's Open Source.
The long-term vision is a crowd-sourced alternative to search engines like Google and Bing. But the short-term focus is on something targeted and immediately useful.
> If the company that runs your PKR goes out of business you’re entire education might be in jeopardy.
I'm dealing with this problem in part by making the Digraph source available under the MIT license.
 https://digraph.app/, https://digraph.app/about
Edit > Accessibility > Setup Assistant > Set all accessibility options > Next > Next > Next > Next > Reopen documents to the last viewed page
Edit > Preferences > Documents > Restore last view settings when reopening documents
> I believe this is a new class of application
I haven't used SuperMemo (the app that inspired Anki development), but looks like it has all these features plus tons of other stuff (https://help.supermemo.org/wiki/Incremental_learning). It has some flaws (not cross platform, doesn't support PDF directly, complex UI, and of course closed source), but I think it's unfair to claim "a new class of application".
Lack of support for any formats other than PDF (it should at least support EPUB, all other formats if possible: AZW, DJVU, FB2, Markdown, single-file HTML) limits its usage for studying e-books.
And it's annoyingly slow (but I'd consider this tolerable if it was really helpful).
I just hope this is just the beginning and Polar is actually going to improve.
My current approach - which I really like - is vimwiki saving to markdown instead of the default .wiki. It's on a self-hosted NextCloud instance, giving me access across all my computers as well as my phone.
It's not perfect - I'm extending it with some standardized project templates and other things - but everything's ultimately text; it's self-hosted, so not dependent on a third-party service that might disappear; and I can easily extend it to meet my requirements.
I find ideas do tend to be quite hierarchical and you can link between nodes if need be.
What would be cool here would be the ability to reference between documents and notes, like you would with hypertext.
Imagine you write an essay or blog post, and you want to instantly browse the citations and quotes you use. That document could live inside polar, maybe as a simple MarkDown editor.
Either way I've downloaded and installed. I wonder if I can get some fancy PDF scanning gear so I can get book segments into it.
I think this is why google free-text search is so powerful: you don't have to have an organizing principle, you just have to know a few key words, or a semantic construct which can be extracted from (meta) data.
OTOH the chaos implicit in my desk (I use volcano filing) probably perpetuates online in this model.
Any plans for a mobile app for Polarized? Evernote doesn't support annotations usefully, and the incremental reading feature of Polarized also sounds cool.
Part of the problem is actually Anki.
It's kind of janky and then there is the issue of firewalls that some people are running into which isn't fun.
I'm trying to smooth out the UI a bit more but it's an iterative process.
- Copying its link
- Switching over to the app
- Hunting for a way to add a page to the archive
- A browser like window opening, pasting the link into the address bar, clicking an accept button
- It rendering out the page, asking whether it looks okay - it did, so accepting
- Opening the resulting PDF and it being completely fucked up.
I gave up. I like the idea of Wallabag much more, but can't get the FF extension to work.
Edit: Polar does offer a Chrome extension, but research about a FF version (the API is THE SAME) is hindered by me not being able to find any reference to the Chrome extension anywhere outside of the app. Idk, to me it just seemed fishy. I don't hate burtonator, I've just been frustrated. Why would I save my web pages as PDF anyway? That's an even more obscure, hard to process format. I tend to like the WARC format.
Does Wallabag support a full-text search?
Wait. What seems fishy? I don't follow.
I'm planning on revamping this entirely with more of this functionality done within the chrome/FF extension itself where we just capture what's already rendered without any of the above complexity.
It's waiting for Polar 2.0 which is about 1-2 months away.
> Why would I save my web pages as PDF anyway? That's an even more obscure, hard to process format. I tend to like the WARC format.
Big misconception. It's not saved as PDF.. It's saved as a captured HTML format. We might do something like WARC in the future though.
But what we really need is Polar for EVERYTHING!
PDF, docx, jpg, MD, ...
All of them, like a new modern Filesystem. But with Metadata and auto OCR...
That would be revolutionary!
For me having flashcards means having them on my phone.
I was pretty disappointed that Firefox integrated a service that was proprietary and not even one that was GDPR-compliant.
I have been looking for a hackable alternative to Google Keep. In the past, Pinboard.io had served me well, but now I am looking for something more. I want not just to store bookmarks, not just to archive web pages, but really to have all these at my fingertips for analysis. I want to shift the balance I have between consuming new information and munging on the information that's already come my way.
Polar seems to be a great alternative because it's a computer application that's 100% self-hosted. I would like something that could be run in a federated way, like Mastodon, but this is still _very_ nice. Since I own the data, I'm not worried about any limitations. I especially like that various forms of cloud sync are supported (looking at you, Syncthing).
I see Electron bashed on HN from time to time. I have no opinion about it personally. Even if there were some noticeable overhead, the capabilities this offers are probably worth it.
Super exciting. I'll give it a go.