1. Identity is hard to do on the LAN, so any sharing ends up using the cloud to figure out who has access. Similarly, identity is hard to move around, so representing your Facebook comments feed outside Facebook is difficult to do.
2. Any time you have a "special" server that handles identity, merging, or any other task, it ends up in the effective role of master, even if the rest of the parts of your system are peers. You want all your collaborations in Dropbox to survive their infrastructure being offline? It's tough to do.
3. p2p stalled a bit in the mid-2000s when storage and bandwidth got much cheaper--in a period of just two years (2002-2004), it became 100x cheaper to run a cloud service. But what continued to stall p2p was mobile. Uploading and syncing need to run in the background, and if you're on a bandwidth-limited connection or a battery-limited platform like iOS, sync can effectively diverge for months because applications can't run in the background. So changes you "thought" you made don't show up on other devices.
4. For avoiding mass surveillance, what we are missing from this older time is the ability to make point to point connections (between peers) and encrypt them with forward secrecy, without data at rest in the cloud. Even systems that try to do some encryption for data at rest (e.g., iMessage) keep keys essentially forever, so data can be decrypted if you recover a private key later on. A system that only makes direct connections between peers does not have this issue.
5. Anytime you have multiple peers, you have to worry about old versions (bugs), or even attackers in the network, so it's fundamentally harder than having a shared server that's always up to date and running the latest code.
The apps all currently use one identity per device, but it's not actually hard to use the same identity on two devices, and it's common for multiple apps to share the same identity.
Old versions of software could be a concern, but that's what versioned APIs are for. It's a solved problem.
The only notable problem with local-first development that I'm aware of currently is storage requirements running up against the limited size of the SSDs in most people's computers.
There are a number of startups working on simplified or "passwordless" auth, but it seems that none have substantive traction. I'd love to be proven wrong, here, though!
Worse, people's fears in this area are completely justified. Precious few companies have proven themselves to be conscientious with our personal data. Some go as far as to repackage and outright sell said data for their own gain.
A decentralized blockchain-like system might work for this, but as far as I know it has not been attempted.
Unfortunately this idea didn't gain more traction, and there are only a few references to it online now, such as this research paper and a YouTube video by one of the authors:
Also, it didn't address all the methods by which a malicious identity provider could track the user, so it would probably have to be extended by having support added in the browser.
As I recall, one of Google's many "failed" projects was a means for transferring large numbers of photos person to person called "HELLO". I reckon this existed for a short time post-2004, then AFAIK disappeared from public view (or morphed into something else). I could be remembering the details incorrectly.
The issue I face with PhotoStructure is that people's home network frequently has throttled upstream rates. I'd love to provide a caching CDN, but I want all content encrypted. I don't want my pipes to see any data from my users.
How would you do perfect forward secrecy when only the library owner's key is available at upload time? Is it possible? If not, it seems that every bit of shared content would have to be re-encrypted and re-uploaded when someone new is granted access to that content.
Content is encrypted and published by the owner, but the owner uses a new random key to do the encryption. This new random key is the "envelope key."
The owner then takes the envelope key, encrypts it with the public key of recipient A, and publishes the encrypted message containing the envelope key. (This message with the envelope key should also be signed by the owner using the owner's private key.)
Anyone who obtains the envelope key effectively has read-only access to the content. The owner publishes a separate envelope-key message for each of recipients B, C, and D, each encrypted with that recipient's public key, so as a recipient you need to figure out which message was encrypted with your public key; you can then decrypt it to recover the envelope key.
The owner also needs to sign the content. Signing the content and later verifying the content is done with the owner's public/private keypair, not the envelope key.
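The flow described above can be sketched in a few lines. This is a toy model only: the "ciphers" below are stand-ins built from stdlib hashing so the example runs anywhere, and the signing step is omitted; a real implementation would use authenticated encryption (e.g. AES-GCM or NaCl's SecretBox) for the content and a real public-key scheme (e.g. X25519 sealed boxes) for wrapping the envelope key.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256 counter keystream.
    NOT real crypto -- a stand-in for an authenticated cipher."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

# --- owner side ---
content = b"family photo album, January"
envelope_key = secrets.token_bytes(32)            # fresh random key per item
ciphertext = keystream_xor(envelope_key, content) # published once

# For each recipient, the owner wraps the envelope key with that
# recipient's public key. Here a single shared secret stands in for a
# real public/private keypair, purely to show the data flow.
def wrap_for_recipient(envelope_key: bytes, recipient_key: bytes) -> bytes:
    return keystream_xor(recipient_key, envelope_key)

def unwrap(wrapped: bytes, recipient_key: bytes) -> bytes:
    return keystream_xor(recipient_key, wrapped)

alice_key = secrets.token_bytes(32)
wrapped_for_alice = wrap_for_recipient(envelope_key, alice_key)

# --- recipient side ---
recovered = unwrap(wrapped_for_alice, alice_key)
assert keystream_xor(recovered, ciphertext) == content
```

The key point is that the bulky content is encrypted and published once, while only the small envelope key is re-wrapped per recipient.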
A glaring weakness of envelope keys is that the recipient can share the content freely once they have it. There's no content protection after the content and the envelope key are published.
The recipient can easily just share the envelope key if they want to. But they can also just copy the content (once they have decrypted it).
I've been working on my own personal-use app for my twin and me to collaborate on documents from different continents, so I'm pretty close to the problem.
There's a 'collaboration' mode which allows peer-to-peer sharing of a session via CRDTs over IPFS. My partner and I select our meals for the week, and then when one of us is doing the shopping, we can mark ingredients as found -- the other person's view reflects those updates in near-real-time.
If either of us lose connectivity, we can continue to use the app, and when data access is restored, those changes are synced with the shared session (with automatic conflict resolution). All data in the shared session is encrypted, and the collaboration link contains the keys.
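The conflict-resolution idea can be sketched with a last-writer-wins map, one of the simplest CRDTs: each entry carries a timestamp, and merging two replicas keeps the newer entry, so both sides converge no matter the order of merges. This is a minimal illustration only -- real CRDT libraries (peer-base included) track causality with vector clocks or delta states rather than trusting wall clocks.

```python
def lww_merge(a: dict, b: dict) -> dict:
    """Merge two last-writer-wins maps of item -> (timestamp, value).

    On conflict the entry with the higher timestamp wins; ties are
    broken deterministically by comparing values, so every replica
    converges to the same state regardless of merge order.
    """
    merged = dict(a)
    for key, (ts, val) in b.items():
        if key not in merged or (ts, val) > merged[key]:
            merged[key] = (ts, val)
    return merged

# two replicas edit the shopping list while offline
mine   = {"tomatoes": (1, "needed"), "basil": (1, "needed")}
theirs = {"tomatoes": (2, "found"), "rice": (1, "found")}

# merging in either order yields the same state (commutative)
assert lww_merge(mine, theirs) == lww_merge(theirs, mine)
```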
Much of this functionality is thanks to peer-base, which is an experimental but extremely useful library:
A side-benefit of this approach is that all user data can be stored locally (in browser localStorage) - there are no cookies used in communication between the app and server.
This is really cool!
I found the share link, and just tried it between two Chrome browsers, and it worked great!
Thanks for using peer-base! There's a lot of great work happening with js-libp2p recently that would be awesome to incorporate ... I'm hoping to get active developing on it again in the new year. I've got so many ideas for improvements.
If & when you're looking for any more contributors and/or testing, let me know; I'd be glad to pay it back.
I'm hoping to open source the reciperadar application's stack soon and happy for it to be part of any ecosystem of examples (jbenet's peer-chess was a big help to me getting started).
* Allow "patchcables" with different hooks, so for example a different recipe database, or a different language.
* Allow groups of ingredients, i.e., tag ingredients. For example, if I say I don't want meat, I still get shrimp suggested, which might or might not be what I want.
* In line with that, allow specifying allergens or blacklisting ingredients (in line with dietary preferences). Allergens are more important than dietary preferences.
* Allow weighting of ingredients, such as "80% tomato and 20% bell pepper", or something like "tomato or bell pepper" or "tomato xor bell pepper".
* Some kind of verification reading at the store or fridge (bar codes / tags) to remove entries automatically. I've been dreaming of this for a long time. The way I envisioned it, stores would share an API, with prices, and customers could use it to compare prices and order appropriately (including taking into account S&H). Then the smart fridge (and other smart storage) would read the incoming ingredients. It would also read outgoing ingredients, and it could detect when they don't reenter. It could read weight, checking how full the package still is. I had this idea 20 years ago, but I never saw a viable business case in it. You'd need to sit between all these stores who already claim they have low margins. Plus, if you'd sit between them, how would you earn profit from such software? My take is that if you want to secure privacy, software like this would require an initial investment and then being FOSS, or a subscription and it being partly FOSS.
I really like the idea of being able to patch in an alternative recipe dataset. It strikes me that a default-offline, peer-to-peer search engine (i.e. 'distributed lunr.js') is technically feasible (even if no such codebase exists at the moment, afaik). That would, in my opinion, be the ideal pluggable data source in terms of achieving local-first behaviour, resilience, and privacy, with the option of fully offline & single-device operation.
Shorter-term, once the application's backend code is open source, at least it'll be easier for anyone to run their own instance (it's a containerized set of Kubernetes microservices, FWIW).
You're completely correct about ingredient searching and handling. There's a lot to do here. One of the upcoming 'large' work items on the roadmap is to build a true knowledge graph over recipes and ingredients, with relationships (like nutritional information, substitutions and pairings) between entities. I've started exploring this space, but it's going to take a while, and I want to do it carefully and thoughtfully.
Allergies in particular are a sensitive topic and I don't want to give users any false sense of safety -- that said, it's also a very valid use case which the app should cater for.
Almost every time we do our shopping here we discuss the pricing and stock-keeping problem you mention. In many ways the application would work best if it already knew what you have in the kitchen (and how fresh your ingredients are).
There may be some use for image recognition and OCR of receipts -- or, perhaps better, at point-of-sale in stores. Keeping the application client-first is a goal, and feasible I believe - tesseract.js exists today, for example.
The project roadmap (and changelogs) will be published on the site in the not-too-distant future, and I'd like to use an issue tracker to track bugs and rank feature requests. I'll keep a note to include items from your feedback and you'll get a credit against them once implemented. Cheers!
Let me also echo the idea of a "pantry inventory". Getting something like that working (well) would by itself be a very strong feature. Like others, I have been interested in that for a long time.
I even took a stab at it many years ago, to some success, but abandoned it when we hired a nanny who took care of cooking.
I had a dedicated optical scanner attached to the wall, plugged in via USB to an Arduino microcontroller. It had a single button and a single RGB LED. Pushing the button would "wake up" the system and it would resume its previous mode. Two modes: check-in and check-out. (Pushing the button again toggles the mode. The system reboots by holding the button down for 3 seconds. Auto "sleep" occurred after a set period of inactivity.) The Arduino just sent a POST request via an HTTP API upon a successful SKU scan, and flashed a red or blue light depending on the mode. And then a little CRUD interface I could access from my phone.
There are many SKU datasets out there available, and some amount of management of that data (organization, categorization) would be what could really set you apart.
The setup you had sounds extremely cool - did you hit any particular challenges around the SKU data wrangling?
I've added a naive rules-based approach to categorizing products into supermarket departments (bakery, fruit & veg, ...) - it's really barebones at the moment, but already helps planning walking around the shops.
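A naive rules-based categorizer like that can be surprisingly effective. Here's a minimal sketch under assumed keyword rules (the app's actual rules aren't shown here): match keywords in the product name against a department table, falling back to "other".

```python
# Hypothetical keyword rules mapping departments to trigger words.
DEPARTMENT_RULES = {
    "bakery": ["bread", "roll", "baguette"],
    "fruit & veg": ["apple", "tomato", "onion", "basil"],
    "dairy": ["milk", "cheese", "butter"],
}

def categorize(product: str) -> str:
    """Return the first department whose keyword appears in the name."""
    name = product.lower()
    for department, keywords in DEPARTMENT_RULES.items():
        if any(word in name for word in keywords):
            return department
    return "other"

assert categorize("Cherry Tomatoes 250g") == "fruit & veg"
```

Sorting a shopping list by department then roughly matches the walking order through the shop.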
I'm not sure where you're based but your Arduino setup reminds me a bit of supermarket self-service checkouts in the UK - you hold each item up to a scanner and get a little 'beep' confirmation for each one during checkout.
A typical grocery store here in the US has about like, what, 20k SKUs? But I personally only really deal with maybe 200 of those, so at the time I just managed the data manually. (Plugged that USB scanner into a laptop and scanned every item into Excel --Scan, type. Scan, type. Then imported that into MySQL.)
I looked into data providers mostly because I wanted item photos. (Ultimately I just scraped images from Target.com cause I could search via SKU and it was super easy.)
Unfortunately that's my only experience with SKU data aggregators.
My scanner was attached just inside my pantry to the left, and it worked well enough. Hands free.
Funny you mention it, cause I'm not a big fan of self-checkouts here in the US. They are ergonomically incorrect, technologically unreliable, and impractical when purchasing more than 15 items. With traditional checkout the conversation is "how are you, did you find everything today?" and with self-checkout they make it clear they know you are a potential thief. shrug
It's entirely possible that I'm wrong, but as a retail worker, that's my understanding.
I'd argue there is also a societal effect as well. Those employees are trained on how to spot theft, and not about being helpful to the customer. I'm not interested in being treated like I'm in a prison. :) I'm fine waiting in a checkout queue with my kids. They enjoy shopping with me, but it's a small nightmare trying to do it all by myself.
This will invariably lead to people like me having to pay a premium to have an actual employee of the store prepare my bill of sale. --Which, when that time comes, I'll pay.
Also, I really don't want the liability of performance on me. I scan a wrong barcode or I miss something and then I'm in a small room talking to a police officer having to explain myself. No thank you. :)
We also tried a couple home delivery services, but it was too unreliable.
What is popular around here is pickup at the store. Submit an order and a "picker" does the work. You either get curbside delivery (small grocery chain) or pickup at a dedicated counter inside (Walmart).
I never use them, unless it's one small plain barcoded item.
I still have the components in a box somewhere. I was thinking about installing it in our basement "cold storage"/"emergency food storage" room for longer term items. But I just don't access those items enough to justify drilling a hole in the concrete wall for DC power.
One note, as I have not used it much: the "Most relevant" sorting scheme (or perhaps just the search function itself) needs some love. I searched for "chicken" as the sole ingredient, and of the 10 results on page one, only the ninth result is a recipe containing chicken. Many of the rest are desserts. I'm guessing this works better when you include multiple ingredients, but I would think that if chicken isn't in a recipe, it wouldn't be included in the results at all. I'm curious to know if you have experienced this.
In response I've made some changes to search indexing for 'chicken'-related ingredients; I explicitly want some terms like 'chicken breast' to fall under the category 'chicken', but yep, 'chicken bouillon' and a few others don't seem relevant.
Hope that helps - glad to keep on narrowing down on any extra cases you find too. There's a small 'feedback' button at the bottom-right-hand corner of the page which sends messages directly to my inbox.
How are you parsing the ingredients? I went deep into the rabbit hole on that part and ended up spinning off a separate ingredient parsing SaaS:
I wrote about the experience last year:
It's great that you found this post; as it happens I'm using your fork of nytimes/ingredient-phrase-tagger (thanks a lot for updating and containerizing it!) in combination with 'ingreedypy' for ingredient parsing.
How's zestful doing? It looks great and I did consider using it; the quantity of parsing I was doing led me to choose the container version just to keep my own operational costs down (albeit at the loss of any model retraining you've done).
Zestful's doing okay. It's in maintenance mode at the moment, but when I have spare time, I like to tinker with it to bump up the accuracy. Customers seem to mostly use it in big bursts, so it'll be a few thousand requests one month, then the next month, it'll drop to only a few dozen.
If you ever have patches you want to push back upstream to the open source tagger, I'm happy to review. And if you ever want to do a bulk parse of ingredients on Zestful, I can certainly offer volume discounts.
A Pragmatic Photo Archiving Solution: https://docs.google.com/document/d/1JzqT-DJFlS2e8ZC00HrsQITq...
It's the culmination of software I've written + a workflow that's resulted from it [2, 3, 4, 5].
 Elodie - https://github.com/jmathai/elodie
 Understanding my need for an automated photo workflow - https://medium.com/vantage/understanding-my-need-for-an-auto...
 Introducing Elodie; Your Personal EXIF-based Photo and Video Assistant - https://medium.com/vantage/understanding-my-need-for-an-auto...
 My Automated Photo Workflow using Google Photos and Elodie - https://medium.com/@jmathai/my-automated-photo-workflow-usin...
 One Year of Using an Automated Photo Organization and Archiving Workflow - https://artplusmarketing.com/one-year-of-using-an-automated-...
All this relies heavily on Google Photos, but I have my own local backup of all original files. So if I need to change service, it should just be a one-time effort to migrate.
I'm really the only one who cares about archiving photos so I'll transfer the shared photos from Google Photos to Google Drive (using the "share" functionality from the mobile app).
This kicks off a workflow that simultaneously organizes the shared photo into my library, copies it to Dropbox, and uploads it to my Google Photos library. (I use Google Drive as a transport mechanism to get photos off my phone and onto my Synology.)
Not ideal but once I got it set up it's worked really well.
I thought this was killed recently
(this = google photos appearing in google drive)
Does it still work for you?
If yes, how? :)
Not keeping Drive and Photos in sync really killed it for me. I ended up switching from Google Drive to Dropbox but I still use Google Photos.
I have photos added to my library in Dropbox automatically added to my Google Photos library and this has worked well so far. 
it was just the thing I was thinking of building recently as it was getting really tiring to manually organize photos
one extra idea that I had: there's a cool project that would enable offline geocoding, which would help get rid of API limits while making the reverse geocode queries almost instant
(the included dataset is pretty limited, but it's not hard to extend from an openstreetmap planet dump)
For the time being, elodie does cache responses from the MapQuest API. And to reduce the number of API calls it does some approximation by seeing if an entry exists in the cache within 3 miles of the current photo --- if so, it uses that instead of looking up the location. 
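The proximity-based cache lookup described above can be sketched with a haversine distance check -- scan cached coordinates and reuse any result within 3 miles before falling back to the API. (The cache structure and function names here are illustrative, not elodie's actual code.)

```python
import math

EARTH_RADIUS_MILES = 3959.0

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def lookup(cache, lat, lon, radius_miles=3.0):
    """Return a cached geocode result within radius_miles, if any.

    `cache` maps (lat, lon) -> place name. A linear scan is fine for a
    personal photo library; a spatial index would scale better.
    """
    for (clat, clon), place in cache.items():
        if haversine_miles(lat, lon, clat, clon) <= radius_miles:
            return place
    return None  # caller falls through to the geocoding API
```

For a burst of photos taken around one location, most lookups hit the cache and never touch the API.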
- file formats that won’t lock you in or are even openly hackable (allows you to automate things)
- no clouds that will break the software once it is gone
- local storage with custom syncing or backup options
- strictly no weird data collection or “We own the rights to your data”-Type of terms
So if I get the slightest feeling of a lock in or unnecessary data collection you are scaring me away, because mentally I would then already look at the time after you decide to scrap your cloud or abandon your file formats. The data collection bit shows me your users aren’t front and center but something else is which makes your product even less of a good choice.
If your product runs on the web, allowing for self-hosted solutions is also a big plus.
I am the author of a SaaS app (https://partsbox.io/). I export in open formats (JSON), there is no lock-in, it's easy to get all of your data at any time. But the app is online and will remain so. Why? Economics. Maintaining a self-hosted solution is an enormous cost, which most people forget about. You need to create, document, maintain and support an entirely different version of your software (single-user, no billing/invoicing, different email sending, different rights system, etc). And then every time you change something in your software you have to worry about migrations, not just in your database, but in all your clients databases.
I am not saying it's impossible, it's just expensive, especially for companies which are built to be sustainable in the first place (e.g. not VC-funded). Believe me, if you don't have VC money to burn, you will not be experimenting with CRDTs and synchronizing distributed data from a multitude of versions of your application.
I regularly have to explain why there is no on-premises version of my app. The best part is that many people think that an on-premises version should be less expensive than the online version, and come without a subscription.
A good example is note-taking apps. My notes should be private and I want to be able to read them ten years later. For me, your product would need to add something valuable that a filesystem and a bunch of files I synchronize myself can't provide. As of now there is no note-taking app I've found where the benefits outweigh the perceived loss of privacy and reliability. The online thing can make sense, but syncing my phone with Nextcloud works even better, so I don't really see why I'd need it.
This is potentially different with an app like yours, because the benefits of using your app vs. using e.g. a spreadsheet seem truly tangible. Using JSON and allowing export at any time is a huge plus. Having it web-only kinda makes sense, as your app seems to be geared toward teams (and any serious parts management makes a lot more sense when you are not alone).
While the additional work that would have to go into documentation and programming if you were to offer this as a self-hosted variant is non-trivial, from my standpoint offering the option to self-host can also be seen as an act of communicating: "Don't worry, whatever happens to this project, you will not lose the time invested".
I am not obsessed with this kind of reliability, but I just want to avoid the future hassle of having to deal with this, especially when I put a lot of time into it.
A lot of devs have moved away from that because subscriptions provide better peace of mind and smooth out the income stream, but it's not clear it's really better for users. Certainly they lose some optionality and there's less market pressure on the suppliers to ship big new features that motivate upgrades.
Also, a lot of people seem to forget that even if we pretended that support isn't necessary, just the existence of a standalone version imposes ongoing costs: there is more testing, and the scope of changes that can be made is more limited.
Subscriptions are the only sustainable way of maintaining software in the long term. We can either accept that and move on, or keep pretending we "buy" software that will work forever, and then pay every year for a "major new version", which apart from the fundamentally fictional nature of the deal, results in developers cramming in new and unnecessary features instead of focusing on software quality.
I could be a potential customer, since my company works on the embedded market and we design and manufacture all our PCBs.
There was a time when the company had a moment of big growth, and we looked into solutions like yours. The question was: what happens if this guy (not specifically you, others similar to you) closes tomorrow? The answer generally was: you can export all your data and then feed it into your local database. So the matter ended with: OK, we'll stick with our current database. Goodbye.
At the end of the day we saw such services as a glorified data-entry/store/search database that would hardly undergo many modifications or require updates, especially since pretty much all of our distributors (apart from Mouser, Digikey, Avnet, etc.) don't support data exchange, and POs/offers are negotiated the traditional way (also negotiating price reductions). No need for VC funding or anything else.
We need to be able to open a schematic/BOM/whatever in 10 years and operate as if it was created just yesterday, so much of the software we use is "a relic of the past" (in SaaS vendors' eyes), but effective and "the way it's supposed to be".
Online/cloud or subscription-based CADs? Also kicked out of the door.
With this I want to say that people like the article author seem to live in a little box, thinking about what they use daily as web-service-SaaS-whatever developers, without realizing the world is much bigger than they think it is.
It is pretty much sustainable. Except for a few natural cases, SaaS is just a wet dream of vendors trying to have a constant revenue stream the easy way.
If software needs support/new features, you pay for upgrades. Or not, if the old version works just fine for you. In my case I use tons of software for which I only paid once (sometimes I pay for upgrades as well). If I had to pay a monthly fee for each title I use, I'd be spending an insane amount of money.
I strongly disagree. I have always, up to this day, sold software on that model, and I will probably always do so. There's nothing unsustainable about it.
What is true is that rent-seeking can be much more profitable. But fortunately for those of us who find it very distasteful, it's not necessary.
one-time fee only works if you have a constant stream of new customers into your niche that is sufficient to pay your salary.
with upgrades or subscriptions you can have a group of users who love what you do and are happy to support you without having to hope for new users and constantly market to get them.
If I want to continue active development on it, then I sell upgrades. If not, then I'll continue to do free maintenance releases, but my main business will be from different products.
> one-time fee only works if you have a constant stream of new customers into your niche that is sufficient to pay your salary.
You can continue to get income from existing customers. You just have to provide real value to them in exchange for it.
I think you're not looking at this from a local-first perspective. From that perspective you can have the same app locally as on the server. There's only one version. Yes, it does require more planning and atypical approaches, but it's 100% doable.
I'm in the planning stage of a local first web-app that will have a server side version, and it's literally going to be the exact same code on both.
I can see some arguments for having _slight_ differences between server and client software but nothing that isn't easy for a solo-dev to maintain. Mostly set it up once and never touch it again type things.
thinking local-first has fundamentally changed some decisions i normally make without thinking for web apps, but I think it will absolutely be worth it for my users in the end.
And how do you plan to achieve that? What will be running on a server side? Will it be headless or server-rendered?
Do you have a GitHub repo with that project?
Sorry for lots of questions and thanks in advance!
Which makes sense IMHO, provided they are not expecting any updates to their on-prem installation. It can just be a fork of your current codebase with no new features or warranty. Maybe you can include some terms for critical security updates but that is about it.
I believe that the whole concept of "software without support" is fundamentally flawed.
I have been asserting exactly the opposite in this thread.
None of which are insignificant.
In addition to that you must now support (and test for) a million configurations, rather than just one.
I also have experience developing and maintaining cloud solutions. The total amount of work that goes into large cloud apps, and the number of things that can break or not work properly, is fairly impressive. Definitely not any less than on-premises.
But they will expect it.
In my case for example I have few older products (desktop apps that still sell and bring some money) for which I have source code that has not been touched for years. Not interested in development new features either as those apps are mature and there is not much ROI in developing new features the whole 2.5 customers are willing to pay for.
Long story short, I don't have any critical bugs there. Maybe some exist and are hiding somewhere, but since nobody has ever discovered them I am totally cool. The only occasional customer complaint I get every once in a while is of the RTFM type. Not real bugs.
As a user, I strongly favor software I buy once, download, and run as-is, forever, untethered to the Internet.
However, that doesn't mean it needs to be "hosted" anywhere. It's trivial to make an app run a little web server that users access with their browser instead of a native or Qt UI.
Browser based interfaces are an excellent bet (probably the best bet) for "run as-is, forever" or as close to "forever" as we can reasonably get right now.
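A minimal sketch of that pattern, using only the Python standard library: a desktop app binds a tiny HTTP server to localhost and serves its UI to the user's browser, with no hosting involved. (The page content and function names here are illustrative.)

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>My local-first app</h1></body></html>"

class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the app UI for every path in this toy example.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):  # keep the console quiet
        pass

def serve(port: int = 0) -> HTTPServer:
    """Bind to localhost only; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), AppHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to 127.0.0.1 keeps the app reachable only from the local machine, which is exactly the "untethered, runs forever" property being argued for.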
While you're right that doing this is less convenient, I can say with the personal experience of developing a few such projects on a shoestring that doing this isn't as hard or expensive as you're making out. It's entirely doable. It just requires thoughtful engineering.
We take the local-first concept and p2p to the next level with CRDTs and replication. But what we really do is leverage things like AWDL, mDNS, and/or Bluetooth Low Energy to sync mobile database instances with each other even without internet connectivity. www.ditto.live
Check it out in action!
We found that CRDTs, local-first, and eventual consistency REALLY shine on mobile phones, since they constantly experience network partitions.
Is there a resource on how to leverage that tech for boring stuff like inventory, invoices, etc.? Hopefully without a total change of stack (I use PostgreSQL and SQLite as DBs, and need to integrate with nearly 12+ different DB engines).
Syncthing solves a large part of syncing data between devices using your own VPS, server(s), etc. If your VPS provider goes out of business, you can then just fire up a new VPS and hook it back up to your local machine(s).
Thing is, like Syncthing, it lacks a collaborative feature. Nextcloud has it, but only if you have Nextcloud accessible (I want to host only on the LAN). Something like IPFS (or Tor) is a solution to that problem.
It does not, you can share folders with anyone you want without them even needing an account.
It would unlock scenarios like: you share your files with me and I'll share files with you, and neither of us will know the other's files -- unless your house burns down, in which case you can get your files back just by hooking the sync back up, with the encryption key pulled from your password manager.
With Syncthing I don't (yet) have this problem.
The comment about CouchDB and the "difficulty of getting application-level conflict resolution right" -- I'm not really certain how it applies. You don't have to handle conflicts in Pouch/CouchDB if you don't want to; there is a default model of last write (actually most edits) wins, but you can handle them if needed.
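For context, CouchDB's default pick is deterministic rather than time-based: among conflicting revisions, the one with the longer edit history wins, with ties broken by comparing the revision hashes, so every replica selects the same winner without coordination. A rough sketch of that rule (an illustration of the idea, not CouchDB's actual implementation):

```python
def default_winner(revs):
    """Pick a winner among conflicting CouchDB-style "N-hash" rev ids.

    Higher N (more edits) wins; ties are broken by comparing the hash
    string, so the choice is deterministic on every replica.
    """
    def key(rev):
        n, _, h = rev.partition("-")
        return (int(n), h)
    return max(revs, key=key)

# "3-zzz" wins: same edit count as "3-aaa", higher hash
assert default_winner(["2-abc", "3-aaa", "3-zzz"]) == "3-zzz"
```

The losing revisions aren't discarded; they remain fetchable as conflicts until the application resolves or deletes them.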
I've been down the CouchDB/PouchDB path several times with several different engineering teams. Every time we were hopeful and every time we just couldn't get it to work.
As one example, I worked with a small team of engineers to implement CouchDB syncing for the Clue Android and iOS mobile apps a few years back. Some of my experience is written down here. After investing many months of engineering time, including some onsite help from Jan Lehnardt, we abandoned this architecture and went with a classic HTTPS/REST API.
Other times and with different teams we've tried variations of Couch, including PouchDB with web stack technology including Electron and PWA-ish HTML apps. None of these panned out either. Wish I could give better insights on why--we just couldn't get it to be reliable, or fast, or find a good way to share data between two users (the collaboration thing is kind of the whole point).
I hope this approach avoids most of the pitfalls you mentioned.
Your Git analogy is also spot-on, but I think you don't take it far enough. Creating a repo is cheap, and I believe CouchDB databases are, too (although I'm still very new at this). You seemed hesitant to create too many.
Good point about notifications, though. I think you'll still have to have a server process that manages that kind of thing (and probably inserts notifications into other users' databases).
Like most people here I'm fairly hard-line when it comes to personal data abuses, but I still struggle with the concept of owning data about yourself. It's a confusion I see among less technically literate people: a well-meaning person explains to them the importance of the latest data breach, and they try to understand the concept that they owned this data, that it was theirs, but that it has now been "stolen" or abused in some way.
I would go as far as to say that framing the data as owned by you is a bad approach, but maybe I'm just being pedantic about the language. Company A does have data about me, but I don't own it; they have responsibilities to protect it (or delete it if requested), but I don't see any ownership in the equation, especially when the nature of the data can become quite abstract while still maintaining some reference to you.
Not to take away from the intention or sentiment of framing it that way though, I'm just musing.
I'm not even quite convinced they can't grasp the concept. People think they 'own' their digital copy of The Rescuers Down Under, when technically they've leased it. They generally understand they shouldn't post their credit card numbers online. I think it's more likely that they just don't understand the consequences, and even if they did, they generally feel disempowered to do anything about it. That's a dynamic that has existed since before the internet.
I think most people would agree that if you take a photo of someone who agrees to you taking it, you own the rights to that photo. But if a company collects data about someone voluntarily using that company's product, the person owns that data? I don't understand where the line is drawn.
For example, if I own a book then I can read it, doodle on it, or lend it out at any time. I can't prevent other people from reading/defacing/whatever their copies of the book.
why not treat data the same way ?
yes it will be very disruptive to some businesses. i hope.
Companies should possess your data, not own it.
That's why I'm designing any new apps around a file format that can be accessed even without the app.
I have a "local-first" Kanban/Trello-style app, "Boards" (http://kitestack.com/boards/), that uses zipped HTML files (to support rich text with images). No collaboration and cross-device support just yet, but it works without a network and saves everything locally.
"A new Mac app to boost your productivity in school, at work, and for personal projects.", "For macOS 10.14 Mojave and later"
Cloud apps are not slow only because of moving data; there's also the problem that an average server is fast (16-core CPU + 64 GB RAM), but if it's shared by, say, 100 users, that means each user gets only 0.16 cores + 0.64 GB of memory. So an average laptop (4 cores/4 GB) or phone (4 cores/1 GB) is way faster. Basically, people buy billions of transistors and use them only as a terminal to the cloud. Not to mention the privacy risks.
A week ago, I did a Show HN for skyalt.com. It's a locally accessible database (+ analytics, which is coming soon). I'm still blown away by how fast it is: you can put tens of millions of rows into a single table with many columns on consumer hardware, and you don't pay for scale or attachments.
This is overly simplistic. You're pretending that cores/memory are "allocated" to users, but really, a user might only make a few tens of requests, and the server only needs to spend a second or two servicing each request. On a server with only 100 users, it could very well be the case that a user has all 16 cores + 64 GB available at the time they make a request. Also, as another commenter pointed out, you could use a large chunk of that memory for shared resources, and then each request might only need a few MB of memory to service.
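To make the disagreement concrete, here's the arithmetic both comments are gesturing at, using the round numbers above (a 16-core/64 GB server shared by 100 users; the seconds-of-activity-per-hour figure is an assumption purely for illustration):

```python
# Back-of-the-envelope comparison of the two views, using the round
# numbers from the thread: a 16-core / 64 GB server shared by 100 users.

CORES, RAM_GB, USERS = 16, 64.0, 100

# Naive "static allocation" view: divide resources evenly per user.
static_cores_per_user = CORES / USERS   # 0.16 cores each
static_ram_per_user = RAM_GB / USERS    # 0.64 GB each

# Time-shared view: if each user is active for, say, 2 seconds per hour
# (an illustrative assumption), the expected number of concurrent users
# is well below 1, so a single request can often burst to all 16 cores.
ACTIVE_SECONDS_PER_HOUR = 2
expected_concurrent = USERS * ACTIVE_SECONDS_PER_HOUR / 3600

print(f"static share: {static_cores_per_user:.2f} cores, {static_ram_per_user:.2f} GB")
print(f"expected concurrent users: {expected_concurrent:.3f}")
```

Both numbers are "right"; they just model different things (worst-case sustained load vs. typical bursty load).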
I can't quite get this point. From my perspective, software engineers love/adore Electron applications.
Look at VScode as the example:
- telemetry included
- proprietary build with "open core"
It is literally the most popular code editor right now (p.s. I don't use it). Why, as a tech-savvy user, would you use something you don't like for 5-10 hours each day to do your work?
The only answer I can see is that Electron is not an issue here.
That is, if JetBrains had made JS plugins first class citizens of their products, possibly VS Code wouldn't be as popular as it is.
After taking time to learn my tools I can't use VS Code anymore because of how inefficient and restrictive it is for my workflow. I mainly prefer tools that can last a lifetime; time you invest now can yield much better results across several decades of usage.
Here is a short list of tools I can't live without: magit, org-mode, undo-tree (actually it was a feature request, and the VS Code team said it's too complicated for a broad audience), the ability to hack your code editor as you wish, and the ability to work all day without touching a mouse once.
Also, there's an expectation that an IDE will be somewhat heavyweight. I don't mind if VSCode or IntelliJ grabs a few gigs or RAM, because I live inside those applications and depend on the features they provide.
What I don't want or need is Yet Another Electron Based Markdown Editor that gobbles up half my laptop's memory so that I can edit a hundred line text document.
I stopped using VS Code because of the billions of files it ships with (does anyone remember DLLs these days or what?) and its performance and debugging capabilities were very disappointing.
I am pretty sure that all of these giant companies releasing garbage Electron apps have a bigger budget than the two people (us!) who wrote what we wrote in 2.5 years.
When I see how abysmal apps like Skype and Slack are, I despair. Colossal amounts of RAM just to display text and pictures.
It might be more convenient for the developer to write in the first language they learned, but it produces giant bloated applications: more CPU cycles, more allocations, more power consumed, shorter battery life on mobile devices (and laptops), more charging of devices, more fossil fuels burned.
You care about delivering a high quality and high performance application, big corps just want to sell a product.
Also, just because they have a bigger budget doesn't mean they are all about spending it, they probably want to squeeze it as much as they can.
All using iCloud APIs to synchronize what is essentially local-first software.
You could even count Mail, Contacts, and Calendar, although they rely on more established protocols to sync.
The problem is not inherently technical. The solution must address the fact that software businesses favor cloud solutions and other systems that make it difficult for customers to stop spending money.
For this reason, I avoid using cloud SaaS anywhere I can.
Google - yes, web e-mail obviously is similar to the above; as for their office suite, I recently found a good excuse to justify shelling out for a proper Microsoft Office subscription (though I don't like that it's a subscription), and I stick to using the faster, locally-available, file-using, much more powerful (if still proprietary) software.
"Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious." -- Fred Brooks, The Mythical Man Month (1975)
We went from local, to local infrastructure/IT, to cloud services in which data is mixed with other data and ownership is nebulous.
The business as you say, favours this, because the 'downsides' of the model are more on the side of risk (your data leaking, losing rights to your data, your data being 'sold' etc.) - which we don't like to pay for until it's too late.
Unless there are 'big scares' or regulatory requirements etc. I don't see that much changing.
But a single major leak from Salesforce or say Google Docs could see a massive shift in how we think about such things.
So far, the model seems promising. It's fully peer-to-peer, and supports decentralized identity, configurable conflict resolution, read/write access, asset storage, and currently is running across 3 different transports:
- HTTP (augmented with extensions proposed by the Braid project)
I've included two simple demos, a collaborative document editor (well, it's just a textarea at the moment), and a chat app. Would appreciate any feedback or participation that folks are willing to give.
Thank you OP, your work is wonderful to read and even though I've spent a few months on the idea already I haven't thought of reusing Dropbox or similar. I think exciting things are about to come :)
I'd like to submit a Working Group proposal to the IETF.
Why would we need an RPC for Independent Apps?
Independent Apps are surfacing as a solution to the lack of control over our own data. The OAuth framework has made for a more secure web, but even though it distinguishes between an identity provider and a resource host, it conflates the resource host and the service host.
Independent Apps should NOT be claimed by a lone company, let's make it something that the web owns.
How would it be structured?
I personally believe there should be multiple subjects treated by the IWA Framework: one being the qualities of independent apps, and the second being how data is accessed. Both of these are currently Topics of Interest for the IETF: https://ietf.org/topics/ - However, how this Working Group would proceed should be discussed and decided by its members.
Why not submit a single person draft?
I could propose a draft, but it wouldn't have the same weight as one drafted by a Working Group. As individuals, we are motivated by our own agendas, and the quality of such a draft wouldn't be the same. I'm volunteering, but I'd like other people to join in as well.
You can add your email here: https://forms.gle/igNdd6rH4MnPK8rb8 . On December 6 I will send the Working Group proposal to the IETF with the gathered participants; if accepted, I believe it should remain open for anybody to join.
High quality desktop apps, data saved in discrete documented file formats, optional ability to save in the cloud, the presence of collaborative editing, privacy is protected if you’re using it locally only, etc.
Why couldn't an enterprise run a "device" (a server) which others can easily sync to ("sync.enterprise.com") and which also only allows authorized users to access data which they're allowed to access? Maybe using Macaroons or something and devices can still sync locally via Bluetooth, Wifi, or whatever.
Now you have a full backup of everything on that server, which IT could more easily ensure is backed up, secured, and so on.
Not to mention the same idea could be used by a normal person just running a NAS at home or a server in DO/AWS/GCS/etc.
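On the Macaroons idea: their core trick is an HMAC chain. The server mints a token from a root key, and any holder can append (but never remove) caveats that narrow what the token authorizes. Here's a toy sketch of just that chaining, not a full macaroon implementation (real macaroons also serialize the caveat list into the token and support third-party caveats):

```python
import hashlib
import hmac

def _chain(key: bytes, msg: bytes) -> bytes:
    """One link of the HMAC chain."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def mint(root_key: bytes, identifier: bytes):
    """Server mints a token for an identifier: returns (caveats, signature)."""
    return [], _chain(root_key, identifier)

def attenuate(caveats, sig: bytes, caveat: bytes):
    """Any holder can append a caveat; the new signature chains over it,
    so caveats can be added but never stripped off."""
    return caveats + [caveat], _chain(sig, caveat)

def verify(root_key: bytes, identifier: bytes, caveats, sig: bytes) -> bool:
    """Only the server (holding root_key) can recompute the full chain."""
    expected = _chain(root_key, identifier)
    for c in caveats:
        expected = _chain(expected, c)
    return hmac.compare_digest(expected, sig)

# The sync server mints a broad token; a client attenuates it before
# handing it to a coworker's device.
caveats, sig = mint(b"root-key", b"user=alice")
caveats, sig = attenuate(caveats, sig, b"folder=/reports")
assert verify(b"root-key", b"user=alice", caveats, sig)
assert not verify(b"root-key", b"user=alice", [], sig)  # caveats can't be dropped
```

The nice property for the enterprise-sync scenario is that attenuation happens client-side, offline, without talking to the server.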
I've been doing some work on it over the last few days: https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-b...
As soon as that one company abandons the local-first model, a gap opens, which will (usually, eventually) be filled by a new company offering local-first until that new company does the same.
As long as the companies don’t band together and agree to end it, there should be a company offering that model somewhere somehow.
Solving this problem isn't about being local-first. It's about being local-last. You have to be able to make more money by selling a software license than you make by selling equity and chasing user acquisition and retention.
Then we'll see people waking up to the fact that all this proprietary data is a liability and subscriptions are golden handcuffs and people will finally get back to making real software again
That said, one area undervalued is Partially homomorphic cryptosystems, where the cloud never ever gets to see unencrypted user data.
I hope the future is fast-local compute on cached data, with the cloud holding a much larger, encrypted but permissioned data store, offering utility functions like search over encrypted data
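As a concrete example of a partially homomorphic scheme: Paillier encryption lets anyone multiply two ciphertexts to get an encryption of the sum of the plaintexts, so a server can aggregate data it cannot read. A toy sketch with deliberately tiny primes and caller-supplied randomness (both are assumptions purely for illustration; real use needs large primes and a secure RNG):

```python
import math

# Toy Paillier keypair with tiny fixed primes (illustration only).
p, q = 17, 19
n = p * q                      # public modulus
n2 = n * n
lam = math.lcm(p - 1, q - 1)   # Carmichael's function lambda(n)
g = n + 1                      # standard simple choice of generator
mu = pow(lam, -1, n)           # modular inverse of lam mod n

def encrypt(m: int, r: int) -> int:
    # r must be coprime to n; passed in explicitly here for determinism.
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    # L(x) = (x - 1) // n recovers m from c^lam mod n^2.
    x = pow(c, lam, n2)
    return ((x - 1) // n) * mu % n

# Additive homomorphism: multiplying ciphertexts adds plaintexts,
# so the server can sum values without ever decrypting them.
a, b = 12, 30
c_sum = (encrypt(a, 7) * encrypt(b, 11)) % n2
assert decrypt(c_sum) == (a + b) % n
```

This only gives you addition (that's the "partially" in partially homomorphic); search over encrypted data needs heavier machinery.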
Switching to another cloud provider this way is trivial and usually only involves changing the Terraform configuration to setup a k8s cluster on another cloud. All k8s-specific config/deploy files can be reused on the new cluster.
This of course only works if (as you suggest) you stay away from cloud-specific services (SQS, aurora, ECS, S3) and run everything in-cluster, or use managed services that are available on multiple providers (Postgresql via RDS, or Digital Ocean managed Postgres, Cloud SQL on GCP)
Based on my limited experience, I highly doubt this. Have you actually deployed cross-cloud k8s setups, or is this merely a theoretical statement on your part? Deploying to another cloud provider brings a whole new universe of failure modes and auth quirks, let alone migration and switch-over woes.
Note that most of these were not production clusters so switch-over was just data restore and DNS changes.
I build clusters from the start to not use cloud-specifics where possible and all cloud-specific configuration is on the cluster edges in terraform which you have to rewrite anyway when switching clouds.
Auth things like IAM permissions are not an issue if everything is “in cluster” and auth/permissions are checked there.
Most of these deployments consist of several application servers, PG databases, redis, rabbitmq etc
That seems to still exist with the introduction of Apple Music. So all library data (play counts, skips, file locations etc) are stored locally, but streaming files are hosted remotely.
Although whether this was by accident or design I have no idea.
HackerNoon & the Internet Archive are already using https://github.com/amark/gun (mine).
Local-first is very much the mantra of the whole dWeb community. I'm liking this naming "local-first" as an evolution to "offline-first".
Ink & Switch had a good article on this:
Also, for doing End-to-End Encryption, we've built some really good tooling around this as well: https://gun.eco/docs/SEA , wraps WebCrypto and works across Browsers, NodeJS, and React Native, so you can do some really cool cross-environment/platform apps now.
I really love making apps this way for some reason. I think it’s the focus on just the UI and not worrying about the back end until later.
For this particular app I’d consider “smartwatch first” to have been better as its for fitness!
We see this largely in the manufacturing industry. If not local, we should at least provide private servers and data security.
At the moment the app is a local only service and there aren't any backups. Next year I plan to add a backend. I'll be keeping some of the ideas in this article in mind. Currently I'm using the browser's local storage api to store data locally. It mostly works, but will be bolstered significantly with a cloud backup.
/shameless but on-topic plug
The only thing I'll point out is that CouchDB being rated as only "partially meeting the ideal" seems pretty weak to me. They reference v2.1, but the latest version is 2.3.1, and here's a link to the docs on how conflict resolution is handled:
If finer-grained control is needed, it's up to the developer to implement it, and it really shouldn't be difficult to do.
In my case, I use PouchDB to perform a "live sync" with all connected users so they all get the latest updates to a document. If a conflict arises, it's easy for any one of the users to fix it and push the fix to everyone connected.
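For reference, CouchDB's default winner is chosen deterministically, which is why every replica can pick the same revision without coordinating: among the conflicting leaf revisions, the one with the most edits wins, and ties are broken by comparing revision IDs. A rough Python sketch of that rule (assuming rev strings in CouchDB's "N-hash" format; this mirrors the documented behavior, not CouchDB's actual source):

```python
def winning_rev(leaf_revs: list[str]) -> str:
    """Pick a deterministic winner among conflicting leaf revisions.

    Each rev is 'N-hash' (e.g. '3-a1b2c3'): the higher edit count N wins,
    and ties are broken by the lexicographically greater hash, so every
    replica independently agrees on the same winner.
    """
    def sort_key(rev: str):
        num, _, suffix = rev.partition("-")
        return (int(num), suffix)
    return max(leaf_revs, key=sort_key)

assert winning_rev(["3-aaa", "2-zzz"]) == "3-aaa"   # more edits wins
assert winning_rev(["3-aaa", "3-bbb"]) == "3-bbb"   # tie broken by rev id
```

The losing revisions aren't discarded; they stay available as conflicts for the application to resolve later, which is what makes the "fix it and push" workflow above possible.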
Some of this could also be alleviated by Tim Berners-Lee's pod idea: https://solid.inrupt.com/. But local-first is better. I just want files on my machine.
We’ll be implementing CRDTs soon, but the concept of local control of all data, authenticated and encrypted communications, etc. is implemented.
One fundamental difference between apps that support this and those that cannot: agent-centric vs. data-centric design.
Strangely, many “distributed” applications (eg. Bitcoin) didn’t make the “leap” to agent-centricity, and thus missed out on some key enabling optimizations.
As a result, they are forced to implement "global consensus" (expensively) when they didn't need to in order to achieve their goals: a valid global ledger, in Bitcoin's case.
It turns out that, to implement things like cryptocurrencies, you don’t need everyone, everywhere to agree on a “total order” for every transaction in the entire network!
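For a concrete taste of why total order isn't needed, the simplest CRDT is a grow-only counter: each agent increments only its own slot, and replicas merge by taking the element-wise maximum. Updates can arrive in any order on any replica and all copies converge, with no consensus round. A minimal sketch (illustrative, not from any particular system):

```python
class GCounter:
    """Grow-only counter CRDT: one slot per agent; merge = element-wise max."""

    def __init__(self):
        self.counts: dict[str, int] = {}

    def increment(self, agent: str, n: int = 1) -> None:
        # An agent only ever writes its own slot, so no write conflicts.
        self.counts[agent] = self.counts.get(agent, 0) + n

    def merge(self, other: "GCounter") -> None:
        # Commutative, associative, idempotent: order of syncs is irrelevant.
        for agent, n in other.counts.items():
            self.counts[agent] = max(self.counts.get(agent, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())

# Two replicas diverge offline, then sync in either order; both converge.
a, b = GCounter(), GCounter()
a.increment("alice", 2)
b.increment("bob", 3)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5
```

The agent-centric trick is exactly this: partition the writes so each agent owns its own piece, and "consensus" reduces to a cheap, order-independent merge.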
It also depends on _who_ owns the data. In an enterprise environment the company usually has a vital interest in the data and on-premise deployments are a good way of retaining cloud computing without giving up data ownership. I'm surprised that more SAAS products don't offer on-premise given the privacy and ownership benefits. The tricky part there is making software that is easy to deploy and maintain, which might be the reason that it isn't done more often.
A product like Grammarly that allowed on-premise deployment would side-step a lot of the issues with sending all that data to a third party. I can't imagine a law firm ever being able to (legally) sign up for that.
So, I figured I'd create my own paid one, and am working on https://imgz.org. However, I want to add a free tier for people who are willing to host their own images, and was thinking of writing a daemon that would run on the user's computer and store all their images on a directory there. It would have to be mostly-on, but not always-on, since I'm going to be using a caching CDN.
Is this a good idea? I don't know how many people would know how/want to run this, but it feels empowering from a data ownership perspective. What does everyone here think?
As one entrepreneur/engineer to another: don't underestimate the legal and logistical effort you'll incur from a caching CDN. People post pirated, abusive, and generally bad things, and if it's on your server, it's (increasingly) your responsibility. DMCA takedowns will consume non-trivial time and make simple corporate insurance decidedly not simple (or cheap). It's typical for media hosting companies to hire teams to handle these issues. I was shocked while working at CNET (way back when) to find out that most of a floor (in a large building) was for Webshots' trust and safety team.
That sounds great, thanks! Is there an easy way for me to distinguish photographs from my photography work from snapshots I took with my phone?
By the way, your "Get early access" button does nothing on Firefox beta with uBlock/Privacy Badger.
> People post pirated, abusive, and generally bad things
Oh ugh :( I was hoping this would be curtailed by the fact that this service is paid-only, although I now realize I might have to rethink my "accepting cryptocurrency" idea. Thanks for the heads up!
Yeah! You can browse by camera (and by lens).
Thanks for the heads-up on the get early access button issue! The link just scrolls you down to the bottom of the page where the login form is. I use FF with privacy badger and ublock (and a pihole) on linux and android, and both of those work. What OS are you using?
That's not entirely helpful because I have multiple cameras... Is there something like a smart category where I can specify multiple cameras, or directories, or something like that?
I'm running 71.0b5 (64-bit) Ubuntu, by the way.
Depending on the app, it may or may not be a good fit. The performance you get out of it depends a lot on what features you are using.
There's a whole ecosystem of projects (eg. IPFS Companion, IPFS Desktop, IPFS Cluster) and 3rd-party services which are important to consider when deploying a production-ready app.
There's a lot of work ongoing to solve some of the biggest pain points (eg. content discovery across NATs), so expect the performance profile to improve dramatically in the short term and for it to become an option for many more apps.
I run Eternum.io and IPFS has been such a pain that I am considering just shutting the service down. The node has been consuming so much RAM and CPU (even though it's behind a caching proxy and the gateway should get minimal traffic) that it was disrupting everything else on the server. The memory leaks have been off the charts for ages, so now I just restart the node every day.
I set up IPFS Cluster on that machine and another, with the intention of moving the node to the second machine. I waited for weeks for files to be pinned between the two nodes, but the queued count just kept going up and down.
In the end I set the pinning timeout to ten seconds and it finished faster, but a bunch of files still didn't manage to pin, even though the two nodes were directly connected (`ipfs swarm connect`). I shut the first node down anyway because I couldn't deal with it any more. At least now the rest of the stuff on the server isn't flapping every day.
And this is on top of the atrocious pin handling that requires you to have a connection open to the IPFS daemon when you want to pin a CID for the entire duration of the pin. I opened a ticket years ago to get a sort of download manager in the node so we could have pins happen asynchronously, but there has been no movement on that at all.
I'm glad it works well for you, but it has been nothing but pain for me. Hell, most of the time the gateway doesn't manage to discover files I have pinned on my local computer.
I really want something like IPFS to succeed, because it has immense potential to literally change the world, but I can't even recommend that people run a node locally because I know it's going to eat their battery and slow their computer down. I don't know why these problems haven't gone away after years of work and millions in funding.
I think I've experienced some of the same frustrations at times, and I think most of the bugs/problems are well known to the development team and are being actively worked on. It's a very open process. I know there's been a lot of improvement in the year that I've been working on it. The release train moves along slowly at times, especially as we are actively working on improving our testing processes.
I'm just frustrated with the glacial progress I'm seeing as an end user...
Now browsers consider local files that access other local files suspect and will refuse to load anything unless beaten into submission. So I now use a Python script to run a simple local HTTP server to view my local files from a single "origin". However, HTTP itself is already considered suspect, and many claim it should be deprecated in favor of HTTPS.
In the future I will have to provide a Let's Encrypt-signed HTTPS server with a valid domain so anyone can view those files in their browser without having to mess with about:config settings or their own certificate cache. The cloud is the future; do not dare to build something that runs locally.
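For what it's worth, the "python script" approach needs nothing beyond the standard library; `python3 -m http.server` does it straight from the command line. The programmatic equivalent is a few lines, sketched here binding to localhost only and letting the OS pick a free port:

```python
import functools
import threading
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Serve the current directory over plain HTTP, reachable only from this
# machine (127.0.0.1), on an OS-assigned free port (port 0).
handler = functools.partial(SimpleHTTPRequestHandler, directory=".")
server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Sanity check: fetch the directory listing we are now serving.
status = urllib.request.urlopen(f"http://127.0.0.1:{port}/").status
server.shutdown()
```

Binding to 127.0.0.1 sidesteps most of the HTTPS pressure, since the traffic never leaves the machine; the certificate problem only appears once you want others to reach it.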
Their recent-ish history, when their free tier became limited to syncing only a few devices, illustrates that even if software is fully local and supports open formats, having a functional cloud matters, a lot.
But yeah otherwise Evernote is pretty good for a single user.
My team and I have taken the initiative to offer an email design tool that is first-class software on the OS (https://bairmail.com). The last thing I would say is that developing desktop apps is considerably harder than developing a web app, so most companies are aware of what they are saving by controlling software updates, versioning, etc.
I'm not entirely sure how supporting collaboration in real time belongs on this list. Seems like a nice to have that isn't really related to the rest of the list.
I'm having a hard time understanding the differences between "local-first technology" and something like Blockstack. I'm not saying Blockstack completely solves the issues pointed out in the blog post, but it seems to me it's pretty close.
What do you think?
Here is a list of the current apps available: https://app.co/blockstack
My phone automatically uploads all pictures to my Nextcloud.
Then there are apps. For instance, I use Nextcloud with the Music app to stream my own MP3s from my Nextcloud to my phone running Ampache.
There are also collaborative editing tools, and various options to edit all sort of documents in a web UI, and always the local editing fallback (or the opposite way, as you see fit).
All tracked stocks stay within the app. You only pull information from the servers and store that information locally for offline use.
I think BLOON is closer than Git+GitHub.
And great article & paper! They give us a lot of inspiration for improving BLOON. Thank you!
It's basically a P2P based Dropbox with no accounts, full end-to-end encryption and no folder size limits.
It's not open source, but it can work without a central server if you need it to. It's also amazingly simple to set up, much simpler than Syncthing.
If you pay for Resilio, there is an option to add all your folders in one go, but on some computers I don't want to add all of them anyway, so that's not much use to me.
With the free version you have to manually add folders one by one, but to do that you need the key, which means copying the keys to a text file and adding them on another computer.
With Syncthing, it will detect other Syncthing devices on your home network, so you just have to add the ones you want, then accept the request from the other device.
Once that is done, you select which folders the device has access to, and then a notification will show up on said device asking you to connect. So basically no fiddling with keys or having to store them somewhere secure.
(This is all presuming I was using Resilio correctly; maybe there was an easier way I was not aware of.)
At least some of the motivations are the same.
What was the last app you used on your laptop that wasn't:
That guy must be living in a very limited/imaginary world. I use a boatload of local software.