Relatedly, I wish I could automatically freeze and archive every single web page I visit, minus heavy media, possibly with very low quality images. I tried squid and the internet archive’s proxies, but MITM’ing myself is just slightly too annoying. There’s SingleFile[0] which does pretty much exactly what I want, ripping every single page into a self-extracting HTML+zip file, but it runs inside the browser so it adds a little delay after you navigate to a page, again slightly too annoying. Anyone have a recommendation for a seamless way to do this? Otherwise I’ll probably roll my own extension that pipes every URL to a local process that rips in the background with e.g. selenium.
I wish there were a way to run fully privileged extensions in Firefox, i.e. in the browser context instead of the page...
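For what it's worth, the "pipe every URL to a local process that rips in the background" idea could be sketched roughly like this. It is only the queueing half: `BackgroundArchiver` and the `rip` callable are made-up names, and `rip` stands in for whatever actually saves a page (selenium, a headless browser, anything else). The point is just that submitting a URL returns immediately, so navigation is never held up by the rip.

```python
import queue
import threading
from typing import Callable


class BackgroundArchiver:
    """Accept URLs immediately; archive them later on a worker thread."""

    def __init__(self, rip: Callable[[str], None]) -> None:
        self._rip = rip              # hypothetical: whatever saves one page
        self._seen = set()           # URLs that were already queued once
        self._lock = threading.Lock()
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, url: str) -> bool:
        """Queue a URL unless it was submitted before. Never waits for the rip."""
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
        self._queue.put(url)
        return True

    def _run(self) -> None:
        while True:
            url = self._queue.get()
            if url is None:          # sentinel pushed by close()
                break
            try:
                self._rip(url)
            except Exception as exc:  # one bad page should not kill the worker
                print(f"failed to archive {url}: {exc}")

    def close(self) -> None:
        """Let the worker finish everything queued so far, then stop it."""
        self._queue.put(None)
        self._worker.join()
```

An extension would still need some way to hand URLs to this process (native messaging or a local HTTP endpoint, for instance); that part is left out here.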
Bullseye! That's perfect. I should've spent more time playing around with the preferences. After browsing around a bit, the only page I notice a lag on is google search results, it freezes for ~300 ms after load. I'll keep messing around to see if I can eliminate that. Now just have to write a little bash script that sits in the background and moves the pages out of my download folder. (edit 5 seconds later): never mind, you can change the filename to put them in a subfolder. Thanks for your extension!
Saving every single page feels a little overwhelming to me, I open lots of pages looking for a piece of information or an answer to a question and many aren't relevant, many others are outright spam.
That said, I use pinboard to save/bookmark links, and I paid for the archival account type which automatically stores the pages I save. There's a handy bookmarklet so saving a page is a one click operation.
How do you deal with the eventuality of Pinboard going down? (Which will almost certainly happen sooner or later.)
While the civilization doesn't depend on my data, I always like to have a backup, so paying for a service to store my web archive is barely more future-proof than saving links.
One thing to note is that pinboard’s retrieval is not instantaneous. It will save your page “eventually”, which sometimes means “never” because the scraper will get there too late. It happened to me quite a few times, which is part of the reason I’m not a subscriber anymore.
If you really wanted to you could roll your own browser with an Electron core - you'd be able to inject stuff there willy-nilly.
At work we instrument our Puppeteer (scripted headless Chrome) tests so that they basically dump out the DOM and other important rendering state. I then wrote a tool that lets one step through the dump with an interactive timeline showing the current state of the page along with console events, user generated events, etc.
So the power is there if you want to get your hands really, really dirty.
I had to step away from using electron because I encountered segfaults like `Received signal 11 SEGV_MAPERR 000000000060` just on visiting cnn.com and clicking one of their nav links (without much going on otherwise, which I found kinda crazy).
Electron also only recently added support for PDFs, and it's still a little buggy. I was able to get around most of these with pdf.js.
I ended up using a combination of chrome extension for injecting stuff into webpage + rpc to a local node.js server running inside an electron app (for convenience). I only made this for personal use, so I'm ok with <arbitrary_caveat>s.
> because I encountered segfaults like `Received signal 11 SEGV_MAPERR 000000000060` just on visiting cnn.com and clicking one of their nav links (without much going on otherwise, which I found kinda crazy).
Yeah, web browsers have so much compatibility code this does not surprise me...
Yeah, I also looked into qutebrowser and Falkon, which let you run arbitrary Python against the QtWebEngine bindings. That’s probably the best route, but neither of them fully supports WebExtensions or has the ecosystem of Chrome or Firefox.
pinboard.in is a fantastic bookmarking service that will archive your bookmarked sites for you if you pay for the $39/year archival account: http://pinboard.in/tour/
I made an extension to save web pages as eBooks: https://github.com/alexadam/save-as-ebook. You can easily modify it to remove the images (or scale them down) and invoke it automatically on each page visit or refresh.
I'm working on something similar. In chrome you can request an image of the active tab from a background page, write it to a canvas to downscale it, and then dump it into indexeddb as a blob of base64. This only captures tabs that have been active at least once but I haven't noticed any additional latency using it.
As an alternative, there is also Save Page WE which works well for me. It takes some time to save a page but I cannot remember a delay when opening the resulting HTML.
I personally save pages as plain text though so that I can use grep etc. later on. Pages with many essential graphics I also save as PDF or epub.
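If anyone wants to try the plain-text route, here is a minimal sketch using only Python's standard `html.parser`. The names `html_to_text` and `_TextExtractor` are made up for this example, and it makes no attempt to preserve layout: it simply drops `script`, `style` and `noscript` content and emits one line per text run, which is usually enough for grep.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring script/style/noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = " ".join(data.split())   # collapse runs of whitespace
            if text:
                self._chunks.append(text)

    def text(self) -> str:
        return "\n".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one text run per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Because inline tags such as `<b>` start a new text run, a sentence can end up split over several lines; a more careful version would only break on block-level elements.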
I use the "Save as ePub" addon or simply "Print to file". I save the files in a directory with all the thematic subdirectories in place and then run a script to distribute the file to several places.
With earlier versions of firefox I used the shelve addon, which unfortunately isn't supported anymore.
Back when I was lurking on Futaba Channel, the standard protocol for sharing a stale post (they don’t archive posts) was to save it as an MHT file through “Save As...” and upload that. I think even some mobile apps for browsing Futaba had dedicated save/load features. Saves everything, even some ads.
It would be nice if we had a proper proxy that could run as a local daemon and basically caught every file request the browser made and saved the appropriate ones, possibly downsampling the images.
If you wanted to keep this off your local box, you could create a separate pinboard.in account, pay for their archival service, and script your browser to pin every URL you hit.
Obviously not answering for the OP, but: because sites disappear or change and also having a local database that does not depend on internet access is wonderful. (Zotero is also great in this regard).
Given that you have notes for each bookmark (which I usually use to record key data points I find in web pages), and also notes at the topic and board levels (which can be used for similar purposes), is there still any benefit (above the cost) to keeping an exact copy of the webpage you bookmarked?
I guess it all depends on what's the chance of needing to go back to older bookmarks to re-visit something, right?
Another thought: google has become very aggressive at indexing new content, so much so that old stuff becomes hard to find. There’s pages I know I visited that I just can’t find anymore, no matter how hard I use advanced search.
Maybe you could get clever with a service worker injected by an add-on. The worker context doesn’t have DOM access, but maybe you could stream the DOM in chunks through postMessage and do some assembly in the worker. Honestly that is a lot of hackery just to have it happen in the browser, but it does sound like an interesting experiment! Maybe I’ll try it out myself later.
Although 'synthesise' has a broad meaning, I think organising the bookmarks this way helps us fetch our bookmarks when needed. For example, a need gap, 'I forget my web bookmarks quite often'[1], was posted on my problem validation platform, and I think this tool can effectively address it.
Did you have the intention of solving that problem when creating this tool? Then I would suggest improving the copy in the home page from 'Synthesise' to something on the lines of -
'Don't forget your bookmarks again, get them back when you need it'.
Also, an address bar integration which brings results from Klobie first could serve the purpose better IMO.
My specific use case is doing initial, exploratory web research to work on company, industry and market analysis. This initial phase requires reviewing many sources and identifying the "important bits" that will help structure the main project. I take notes for that, at the bookmark, topic and board (theme) level. I get what you say, the wording of the one-liner maybe doesn't reflect all possible use cases, or even properly describe my own! I'll consider options for this.
Not sure if this will fit the bill exactly, but have been looking for something like this to accompany knowledge base note tools like Roam, Obsidian, etc., since those don't work that well with web content IMO. Using Zotero at the moment, which is fine for something not web-based. are.na is maybe another similar service aimed at content beyond bookmarks. klobie looks good though!
A couple of suggestions: I like the 'card view' of the board pages, but assuming one has a lot of tags, and many bookmarks per tag (or even many bookmarks for a single tag), I feel the 'overview' you get with the board page kind of vanishes and the individual tags take over. Something like a fixed height option for the cards that becomes scrollable would be nice. Alternatively, each tag becomes a separate page, with the bookmarks being listed once you enter the tag page instead of the board page, but not sure if you want that kind of hierarchical structure. Oh, and maybe a tighter column view might be nice!
Thanks! I appreciate the feedback. I'm not familiar with all those tools, but yes, the idea here is bookmarks for now.
I get what you say about boards with too many topics/bookmarks. I've been thinking of different ways to display information, so this kind of feedback is very useful!
Good question, I forgot to add that to the help section :)
B = board
t = topic
b = bookmark
bt = bookmarks per topic
The percentage is the "coverage" received by a given topic in each board. If a board has 10 bookmarks and 3 of them are labeled e.g. "topicX", topicX's coverage is 30%. Totals may add up to more than 100% due to multiple topic assignments.
It helps you figure out where you stand in terms of the information you collect. I sometimes have to e.g. research and compare a number of topics within a given theme (e.g. companies in nanotechnology), and I want to make sure I do enough digging on each one.
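To make the coverage figure concrete, here is a small sketch of the calculation as described above. The function name `topic_coverage` and the data shape (one set of topic labels per bookmark) are just assumptions for illustration, not how the app stores anything.

```python
from collections import Counter


def topic_coverage(bookmarks: list[set[str]]) -> dict[str, float]:
    """Percentage of a board's bookmarks that carry each topic.

    `bookmarks` holds one set of topic labels per bookmark on the board.
    """
    total = len(bookmarks)
    if total == 0:
        return {}
    # Count, for every topic, how many bookmarks are labeled with it.
    counts = Counter(topic for topics in bookmarks for topic in topics)
    return {topic: 100.0 * n / total for topic, n in counts.items()}
```

With a board of 10 bookmarks where 3 are labeled "topicX", this gives 30% for topicX. A bookmark with two topics counts toward both, which is why the percentages on one board can sum to more than 100%.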
Thanks! Data import is at the top of the pending features, and I plan to work on it within the next week. I'll look into those two sources and see what I can do.
I like the app. One thing that did not work well for me was the requirement to enter a topic. I want to quickly bookmark a page and move on. Why is topic required?
Great point. I wonder if you use bookmarking to save URLs "just in case"? When you are on mobile, and then return to the website when you have more time for that? Any other use case?
In my use case, it's a matter of organization. But totally get what you say, classifying things into buckets is extra work. I'll consider options to address this but, would adding a topic "later" be a temporary solution to you? Then you can edit bookmarks' topics as you please, but they would initially appear under the same card corresponding to the topic "later".
It actually already has an export function hidden somewhere :) But I had to disable it temporarily. So, yes, it'll definitely be back in a few days.
Question: are CSV exports ok? Any other format you'd prefer?
Thanks for the feedback! Would you rather have the entire board with bookmarks, notes, etc. all in markdown keeping the structure (cards, etc.) or just the list of bookmarks in markdown?
Bookmarking does not work. Saving every page of the slightest interest to me, even if I don't read it immediately, was one of the best things I took as a habit.
Sure, but the value added by klobie is by organizing and presenting the bookmarked sites in a useful way. Take a look at the sample board https://klobie.com/v/8oqy1eg/coronavirus
This looks great! There seems to be a renewed interest in bookmarking services. I've experimented with a few options over the past few years and recently settled on the Notion web clipper. Although Notion doesn't have all the features of a dedicated bookmarking service, the ability to save everything to a table that you can tag, add notes and filter is nice. I will keep Klobie in mind as an option.
Thanks! I'm glad you found a bookmarking tool that works for you. But please don't hesitate to reach out to me directly to discuss what features would make you consider using klobie. Cheers.
Fantastic work, I have been looking for something like this. I have started using it and it works great, just a little feedback: In Chrome when entering a new bookmark I have to click back in the box once a topic has been entered and I wish to add another, which is annoying. It would also be great if the title was auto populated via the URL and then leave it to the user if they want to customise this.
Re: title: it actually is auto populated when you bookmark a webpage with the bookmarklet AND the webpage has a title. Let me know if I'm missing something here.
Re: topics: yes! You are absolutely right, I'm fixing that asap.
> Diigo is a multi-tool for personal knowledge management dramatically improve your workflow and productivity easy and intuitive, yet versatile and powerful
Love it! Thank you for creating this. Is there a way to add the bookmarklet to my toolbar instead of having it on the bookmarks toolbar which I keep hidden.
EDIT: Also, can I get to my home page by clicking on Klobie rather than clicking on my username? And can I tag bookmarks with topics under different boards?
What browser do you use? Once you created the bookmarklet, you can move it to your e.g. "bookmarks bar" in Chrome and it's pretty handy (that's my current set up).
I'll consider your navigation suggestion – any specific reason why you prefer to click on the logo to go back to your home page rather than the username? Any problem with the interface? Visibility?
Each board has its own set of bookmarks and topics. If you want to save a bookmark to multiple boards, you'll need to save it to each one of them. Sorry for that – this is at least for now; a copy bookmark function is an option that I'll consider.
I'll be here if you have questions, or you may want to just send an email to the contact email (address here: https://klobie.com/help)
Great job, looks clean and useful! I think it'll help to put the site's favicon or some kind of visual differentiator next to each link so it's easier to scan when you're looking for a specific link.
Thanks! It's a great idea. I've been considering something along those lines but haven't decided about the ideal way to implement it. But I think I'll have an update on this in a few days. Stay tuned! I appreciate the feedback.
Nice, but I forgot my password during signup and now can't find a way to retrieve it. Where is the forgot password page? I can't find a link for it on the login page either.
Sorry for the inconvenience! I still need to implement that feature (I know, it's important). Would you like to sign up again using the same username? Please contact me at k at klobie.com and I'll make sure you can access/use the tool.
Thanks!
Thanks! A connection with Google search does sound interesting! I'll look into it. Feel free to contact me directly if you try the app and have questions.
2 validation errors detected: Value at 'username' failed to satisfy constraint: Member must have length greater than or equal to 1; Value at 'username' failed to satisfy constraint: Member must satisfy regular expression pattern: [\p{L}\p{M}\p{S}\p{N}\p{P}]+
Hey thanks for asking. That's not an option though, at least for now :) But do let me know if there is any other feature that could help with tagging in the meantime!
Thanks for the feedback and sorry for the inconvenience!
The sign-up page actually does include the password requirements: "The password must be at least 8 characters long and include upper and lowercase letters, numbers and symbols."
In sum:
* 8 chars long minimum
* include all of these: upper and lower case letters, numbers and symbols
I do realize that such a combination of characters is somewhat hard to remember compared to simpler passwords. I'll consider options to simplify all this.
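As a rough illustration of the rules quoted above, a check might look like the sketch below. It is not the sign-up form's actual validator: `password_problems` is a hypothetical name, and treating a "symbol" as any character that is neither alphanumeric nor whitespace is an assumption. It returns every unmet rule at once rather than stopping at the first.

```python
def password_problems(password: str) -> list[str]:
    """Return every rule the password violates (an empty list means it passes)."""
    problems = []
    if len(password) < 8:
        problems.append("must be at least 8 characters long")
    if not any(c.isupper() for c in password):
        problems.append("must include an uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("must include a lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("must include a number")
    # Assumption: a "symbol" is anything that is not a letter, digit or whitespace.
    if not any(not c.isalnum() and not c.isspace() for c in password):
        problems.append("must include a symbol")
    return problems
```

For example, `password_problems("abcdefgh")` reports the missing uppercase letter, number and symbol together, so a user can fix everything in one pass.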
I keep getting an error with the username and finally inspected the validator. I think you could be applying your pwd validation to the username field but I could be wrong, anyway I figured it out eventually.
I feel pretty dumb, sure enough it's right there. Pardon my error I glanced right past that. Positioning the text with the password field would likely help, and/or include all rules in the error message, not just the one that's violated.
I'm late to the party, but do you have tips for adding pages quickly on mobile? The bookmark I created in the desktop Chrome client either isn't syncing or isn't available in mobile Chrome (Android).
Thanks for the feedback! The bookmarklet should actually work on both desktop and mobile. Please consider that bookmarking on Chrome (Android) requires you to type the name you used when creating the bookmarklet into the browser's address bar to bookmark a web page. You may want to reach out directly: k at klobie.com if it still gives you a hard time.
[0] https://addons.mozilla.org/en-US/firefox/addon/single-file/