The most important aspect is security, and you learn it by doing it.
All of my self-hosted apps sit behind a private VPN run with Pritunl, which provides a self-hosted, corporate-style VPN setup where you can manage users and their access to servers.
I host the following apps/products right now:
- Pritunl (corporate-style VPN)
- Superset (Analytics)
- Bitwarden (Password Manager)
- OpenVPN with Pi-hole (Personal VPN with Adblock)
- WireGuard with Pi-hole (Personal VPN with Adblock)
- Drone.io (CI/CD)
- Posthog (Web Traffic Analytics)
- Papercups (Web chat support)
There’s a constant struggle in self-hosted environments between manual processes and aggressive automation built on bespoke or otherwise incomplete tools. What you want is glue code holding together open source tools without too much abstraction over the top; you should always have a hint of what’s going on underneath. I find myself having to spend way too much social capital on this.
While I much prefer self-hosted, there is a clear advantage of third parties inasmuch as you can bond over the stupid things their solutions do, instead of driving wedges between teams by engaging in that kind of catharsis.
My startup focuses on big data projects and is currently building a web-based Bloomberg Terminal alternative (https://quantale.io), which is an infrastructure-heavy project.
Here is why I use 3 VPNs:
1. Pritunl serves as the enterprise-style VPN. Through it, I give different users on my team (and clients) access to different web apps and self-hosted services. For example, the staging version of Quantale is hosted on a server with a ufw rule that only allows connections from the Pritunl VPN IP. I can grant you access by creating a VPN config for you, and you will then be able to reach only the staging server, not any of the other servers behind this VPN.
2. OpenVPN with Pi-hole is my personal VPN. It blocks ads and trackers via Pi-hole, and my self-hosted password manager is only accessible through it.
3. WireGuard with Pi-hole is a backup VPN for reaching my password manager in case I lose access to my OpenVPN server.
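The ufw restriction described in point 1 can be sketched roughly like this; the 10.8.0.0/24 subnet and port 443 are assumptions, since Pritunl lets you pick your own client subnet:

```shell
# Drop everything inbound by default, then allow only VPN clients in.
# 10.8.0.0/24 is an assumed Pritunl client subnet -- substitute your own.
sudo ufw default deny incoming
sudo ufw allow from 10.8.0.0/24 to any port 443 proto tcp
sudo ufw enable
```

With this in place, anyone without a VPN config simply cannot reach the staging server at all.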
The self-hosted Superset instance is used to provide analytics services to a client in the manufacturing sector.
I'd like to explore that. Specifically, the idea of small communities where a group of people maintains the underlying tech, and - kicker here - everyone in the community knows more or less everyone else in the community.
That offers a bit more security/safety/continuity than just self-hosting everything, while still not ceding control to a faceless corporation.
Granted, there will always be other reliances outside of the community - like internet and electricity providers - but a line has to be drawn somewhere.
The basic story, though, is that before the dot-com crash, a lot of SF nerds kept their pet projects on work bandwidth. That became risky during the crash, so I and some pals rented a fractional cabinet in a colo provider and split the costs. I think we ended up using 4 providers over the years and peaked at a full cabinet, almost all 1U servers.
I was glad I did it and at the end I was glad to be done with it. A co-op is hard to wrangle and it's basically impossible to make sure that the workload is evenly spread, so you have to be comfortable with the fact that somebody, probably you, is going to be doing a bunch of unpaid work, even if it's only keeping track of what needs doing and herding people into doing it.
Eventually, I decided running physical hardware was more hassle than it was worth to me. Trying to solve mysteries like, "Why does google sometimes decide my email is spam" was a multi-year effort that I never did solve, even though I knew people at Google. And I grew to dread the chance that something would break and I'd have to rush down to the colo, possibly having to return from vacation (or beg a friend to be remote hands). So eventually I shifted some of the stuff I was hosting off to service providers (yay Fastmail!) and the rest into Terraform-built slices of AWS.
I do sometimes miss the ability to fully run down a problem (e.g., by looking at mail server logs). But mostly it's a relief. I'm happy now to get my hardware kicks on things where uptime doesn't matter.
People in my parents’ generation all have stories of some grandma’s house fire eating the family hoard of photos, including the only copies of Great Grandpa Frank as a child. We don’t want Uncle Steve losing those pictures just because his house is in the 100 year flood plain.
It should be fairly trivial to deploy any of a number of existing FOSS photo hosting solutions to a server in your house, though. Syncthing should make for trivial replication between the local instance and one at a relative's house. It recently gained support for E2EE (i.e., untrusted servers), so you could even throw it up on a cheap VPS as an extra backup and to allow for more reliable distribution between nodes. There's no bandwidth aggregation to be had here with current tooling, though.
Or how about accessing your personal library of data while outside of the house (like while visiting family)?
Backups and professional media work from home? Those need upload too.
Consumers need symmetrical data connections, or at least something much closer to symmetrical than any ISP (in my area, at least) has been willing to provide.
Sometimes "impersonal" is a feature, not a bug. I really don't want community sysadmins with access to logs of information about other community members. That has much more potential for abuse than a more impersonal service with a stricter expectation of privacy.
It’s interesting to see what happens to social connections and expectations when we grow beyond the number of people we can meaningfully connect with.
Are you suggesting that small towns where everyone knows everyone and people gossip a lot (for better or for worse) are equivalent to communities where a sysadmin knows everything everyone does on the internet?
a sysadmin will commoditize the process, make it cheaper, faster, more convenient, transferable (a family moves to a different place), etc. This sysadmin's business would grow, kill the competition, and then we would be back at the current state where we've ceded computing/networking to corporations.
It was very reasonably priced, you had physical access, and it was more of an enthusiast club than a business.
Seriously considered it, but I don't live in Amsterdam and they recommended being able to speak Dutch to participate properly.
Sounds like it might be one of the hackerspaces there: https://wiki.hackerspaces.org/Amsterdam
I don't know anything about ansible, or much about docker, or self hosting. And I was able to set it up and it's working quite well for my family and friends. You don't have to enable federation.
Set federation_domain_whitelist to an empty list, and poof, federation disabled.
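For reference, the underlying Synapse setting looks like this in homeserver.yaml (the Ansible playbook wraps it in its own variable name):

```yaml
# An empty whitelist means no remote homeservers may federate with yours.
federation_domain_whitelist: []
```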
DNS settings are pretty easy too, especially if you can let your instance take control of an entire domain (and don't have to host web services other than what the playbook supports). You don't need the SRV stuff here.
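As a sketch, with a dedicated domain the records can be as simple as a couple of address records; the names and IP below are placeholders, and the playbook's docs list the exact subdomains it expects:

```
example.com.         3600  IN  A  203.0.113.10
matrix.example.com.  3600  IN  A  203.0.113.10
```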
If you just have a private server for < 100 users, 1 vCPU and 2GB RAM is enough. I also use it for bridging to IRC using heisenbridge (which the playbook supports) and it's no problem on the tiny server.
Updates are very easy, pull the latest playbook, and run setup again. Done.
Sounds like you, like me, have had your brain broken by the ease of deploying Go programs :) The current Synapse server is written in Python, so it's a bit of a trial. That said, I run it on a tiny linode instance and it Just Works after maybe an hour of fiddling around (I seem to remember something about DNS records being the fiddliest part to get right).
But also, I am not sure that "parity with Synapse" is necessarily required or desirable. Sure, it needs to be fully functional and handle all the basic federatable communications, but it's not like most small-scale self-hosters need/want all the enterprise features being packed into Synapse. I think it is better for the Matrix ecosystem to have various server implementations that target different segments but can all still communicate.
I'd expect us to ship 1.0 once these numbers hit 100%, which at the current rate should be before end of the year.
As a participant in a number of small, mostly-volunteer tech community groups, I think this might be a difficult endeavour.
I'm not sure where my thoughts are going, as I'm not exactly surprised Nextdoor has more use than a more decentralized system for this purpose, but it's salient to me because I'd think something like SSB or Mastodon would ideally occupy the space Nextdoor occupies. I'm not sure if it highlights the legwork Nextdoor did to build up its userbase (physically mailing people in a community), the general lack of technical sophistication of users, the relative infancy of Mastodon/SSB/etc., something inherent about getting a foot in the door with decentralized tools, or some inherent limitation of decentralization (can you really compel/convince people to use decentralized services, or do people just use whatever is in front of them?).
I'm trying to imagine, for example, local police posting to Mastodon about some local safety issue in the same way as on Nextdoor. With Nextdoor, it's something known nationally, the state probably gives them recommendations, they just post to Nextdoor. Nextdoor might have even reached out to them. With e.g., Mastodon, I suppose I could see it being recognized as a thing if use got up, but where are they posting? The local popular servers? Do they run their own police server? Some kind of city government server?
This isn't a criticism of decentralization -- I'd like to see everything more decentralized. I just think something like Nextdoor is an interesting case to me to think through these issues because Nextdoor is so localized, and it seems like that's kind of the ideal use case for decentralized services.
Ideal use cases for decentralized services are also ideal business opportunities. You want to find collective action problems, charge rent for solving them, then manipulate your users to make you even more money from whatever resource is being collectively managed.
edit: I can easily imagine an app started to organize and coordinate people who wanted to volunteer to pick up and clean public parks becoming, 10 years later, an app that is de facto required in order to visit a public park.
The owners are in for a huge payday.
I don't get the allure of the site. The site seems to attract complainers. That is not my point though.
I am interested in decentralized sites, like Mastodon.
Does anyone know of a good site that would walk a developer through building a rough clone of Mastodon?
I know it uses Ruby on Rails, React, etc., but would like a detailed walkthrough.
I did a rough search, but didn't find much on the programming of a decentralized social website on the technical side.
There are already full featured clones that federate. Pleroma and Pixelfed for starters. There are also multiple alternative frontends for both Mastodon and Pleroma. Granted none of those are walkthroughs but they do provide full featured examples in multiple languages.
Indeed. But what do you think (or suspect, or hope) would be different about a Mastodon instance playing the same role?
My hunch is that the characteristics of ND are caused by its perceived role, not the technology, hosting arrangements, or (lack of) federation/locality.
I think you get the allure of the site perfectly.
Is it something US only? Is it a social network that starts with your address?
We all just split the cost of internet and server upgrades, etc, which may have come out to like $40 a year or something on average. We probably did this for a decade or so until the hardware got too old and there wasn't as much interest in maintaining it all.
While I just have a VPS now, I do miss that old server and all of us working on it, and literally being able to do whatever we wanted with it. All it takes is a few buddies to get together and try it out. Experiment and see what happens, let it grow organically.
The most fractured and worst-run companies I've seen have been concerned with self-hosting, and frankly they sympathized with a lot of the stuff I'm seeing on here.
But I recognize we still need people self-hosting because it drives innovation and competition. I believe there are ways of doing things well while self-hosting, but I'm not sure what those ways are.
For mid-sized companies where you're paying someone to maintain things, whether that's an employee or a 3rd party, I think you need to assess your tools in terms of your negotiating power and how important it is to maintain control of everything. What if GitHub bans you? I don't think I'd try to self host a customer facing app at this scale either.
I think you can be short-term successful by throwing caution to the wind and using every shortcut available, but will the first-to-market advantage be enough to offset the price competition from people who aren't locked in to some proprietary API gateway or WAF? For example, what if I take extra time to build on OpenFaaS and you build on AWS everything. Who wins long term? You're faster, but I have better negotiating power (i.e., lower costs) by threatening to switch vendors.
Or is the idea of switching hosting vendors detached from reality at this point? Is hosting cost so negligible it doesn't matter? All I know is that everything looks crazy expensive from where I am.
We see the "scaling" argument a lot, but honestly it's overrated for most businesses: for 95%+ of the businesses you can run online, the traffic can easily be handled by a single cheap machine running a PHP backend, as long as everything is properly set up (caching, using a CDN for media, etc.). Modern computers are fast!
Same for redundancy: your business is unlikely to fail because your website is down for 2h every month (heck, that's not even much worse than GitHub /s).
Independent developers almost never need to care about scaling and redundancy.
This is one of those things that I was talking about them sympathizing with.
It's unrealistic. If you're getting banned from github or a cloud provider maybe you should reconsider what you're doing.
> You're faster, but I have better negotiating power (i.e., lower costs) by threatening to switch vendors.
Okay, but AWS isn't that expensive as it is. If you build the way these resources are meant to be built, you can seriously minimize your costs.
I've seen plenty of lift and shifted apps get their costs dramatically reduced once redesigned in a "cloud native" architecture. The lift and shifted design was in the thousands of dollars a month. Rebuilding for AWS resources brought our costs to almost literally nothing on this same app. I'm talking $5,000 a month to $5 per month.
So I have a hard time following the premise that you'll do it cheaper with an on-prem or self-managed system.
> If you're getting banned from github or a cloud provider maybe you should reconsider what you're doing.
That’s a sentiment popular among Big Tech types (such as Eric Schmidt: “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place”), and distinctly unpopular among those who consider the cloud providers’ decisionmaking arbitrary, capricious, or dangerous.
There are still some topics that are taboo and that some businesses refuse to touch.
Self-hosted organizations that I think of off the top of my head are Google, Facebook, Amazon, Backblaze, Microsoft, Stack Overflow, and possibly Wal-Mart & CloudFlare.
They moved everything, including the hardware and the data, in house, and the next year they doubled the provided storage (to 1TB) at no additional cost.
Lastly, they doubled the storage again to 2TB and added some features for $20 more.
Because for us it was different. We got rid of Slack and decided to just use the basic Google Chat we already had with Google mail. It was definitely a productivity boost and dramatically decreased the number of times people messaged each other. It was helpful.
So what, in your case, did it have to do with self-hosting? Or did you mean it was just the magical effect of Slack being Slack?
Slack has a lot more product thought and nice features that keep getting added. Our competency is not managing Mattermost; it's doing what the company is made to do.
If you want to reduce communication, well that is a different thing altogether and you can choose if you want chatroom software for your company or not.
The thing, though, is that this argument can be used to outsource everything.
Office space, HR, recruitment, even making coffee.
At some point the externality puts restrictions on your maneuverability, in the case of office space it's because you might not be able to just run a few power cables somewhere, for HR (imagining a scenario where it's outsourced) there's limitations on speed, or in the worst cases limitations on how you can even interface with the HR department.
Making coffee is one of the cheapest things you can do, outsourcing it may mean, though, that you lose choice of what beans you have, or what types of milks you want to stock, and there's a premium on the price because another company wants to make a profit supplying this service.
This is a contrived example, but overall my point is: there are times that in-sourcing your tools actually gives you the time and freedom later. It's a gamble you make.
We'd love to improve, are there one or two things top of mind for us to change? You mentioned occasionally missing notifications?
One recent feature Slack added that I like is scheduled messages, but that is a pretty new thing.
As Chomsky (who I don't agree with on everything by any means) has written, when the government talks about doing things for "security," that usually means security of the government from its own people.
Doing it right means achieving multipliers. Using each thing you do to improve everything else you do.
How do you improve your product when you’ve outsourced customer support? How do you align it with your business philosophy?
You need to know what you are doing.
Example: Dropbox is open to the world. You can share files with anyone. Can you properly secure a Nextcloud instance?
A VPN may not be applicable, because you have to share files with others. Even then, you need a fair amount of knowledge about networking, protocols, security, current software, vulnerabilities, etc. Even with SSH, you need to be careful. And this is only the security part; I am not getting into a dozen other concerns.
Overall, as software complexity grows, self-hosting will be increasingly harder.
Encrypting client-side and using a managed solution is a compelling option.
When you see that large companies get hacked all the time, with your sensitive info and passwords released into the wild, it makes you think twice about "security" when your data is not in your hands. I'd say both are dangerous anyway, and certainly trusting a third party with any kind of data is a big gamble (plus, they may be spying on you as well).
- A FreeBSD firewall (requires continuous patching)
- 6 DNS/NTP servers (don't ask!), most of which are in the cloud
- 2 VMware ESXi hosts
- 3 Ethernet switches (an 8-port 10GbE, a 24-port 1GbE, an 8-port 1GbE)
- 2 WiFi Access Points
- 12TB TrueNAS server
- 2 laptops, 1 desktop
- countless VLANs, countless VMs.
Effectively I run my own AWS. But it comes at a cost: countless evenings & weekends. Endless updates (OS, BIOS, firmware), periodic hardware failures.
Also, as pointed out, security. My unpatched DNS server was compromised, and the intruder managed to get root on my server (this was back in '99, before BIND was heavily re-vamped for security).
Self-hosting is a labor of love, but I'd be hard-pressed to recommend it to anyone who didn't enjoy it.
Nowadays I simplify to the extreme (refraining from running anything I do not need, always using the simplest solution) and it works pretty well for me: https://benou.fr/www/ben/14-years-of-self-hosting.html
Don't forget that the whole DIY thing is also incredibly educational. People tend to forget that when weighing the pros and cons.
It's not always directly teaching useful skills for work, as most companies will just want you to know how to talk to AWS. But general computing and security knowledge is always useful, IMO.
I didn't run into any specific issues, but instead I ended up realizing that I had to monitor the services myself to ensure that they were still functioning properly and that they had security patches applied. That's not a responsibility I want to deal with.
And as strange as it sounds, I also noticed that there actually were privacy advantages to not hosting stuff myself. Maintaining multiple identities when self-hosting is only possible with a domain per identity and not reusing the same machine for services across identities.
I've been self hosting for over a decade with no intrusion to my knowledge, although I'm sure some state-level actor has access. On the flip side I've had many of my login credentials stolen over the years due to a wide range of companies getting hacked- haveibeenpwned currently lists 11 breaches for just one of my emails. It's probable I'll get owned eventually, but I've got some catching up to do.
Not only that, but the reward is a lot smaller for the attacker and the overall damage is smaller for the community. If attackers get into Google Analytics/Tag Manager servers they will be able to find data and sensitive information about most of the websites in the world and be able to control them. If they get into your self-hosted analytics server they would only find out your stats which can't be used for much.
It is one thing to find the name and phone number of one person, and another to find the names and phone numbers of millions of people.
Hackers won't even know your self-hosted server exists. I self-host Bitwarden, and that's how I am able to sleep at night.
Would all that traffic still have to go through the VPN tunnel?
I'm not a sysadmin or a security expert.
I don't keep vital or sensitive stuff on anything I'm hosting, but it's still frightening.
Setting up self-hosting is not easy, except that it can be, as I see in the responses to this comment.
I am not sure I understand what "as software complexity grows" means. My observation is that "as software complexity grows" it eventually (and hopefully) fails, and we go back to simpler software, albeit using a few things we've learned along the way.
"As software complexity grows" is not a desirable trait. I hope that there is no need for such software, but I can't predict the future.
The rest of my stuff is all local/vpn only.
My public server has a couple of ports open to the internet, but SSH, SFTP, etc., are only accessible on the LAN with access by key (no passwords). It does things like XMPP (hashed passwords, no locally-stored chat data), public websites, and the like.
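That kind of key-only, LAN-only SSH setup boils down to a few sshd_config directives; the listen address below is just an example:

```
# /etc/ssh/sshd_config
PasswordAuthentication no
PubkeyAuthentication yes
ListenAddress 192.168.1.10   # bind only to the LAN interface
```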
On top of that, many hosting providers offer to set up popular open source projects for you.
It is INCOMPARABLY more secure in a broad sense just because you control your infrastructure.
Yes, you need to know what you are doing, but that applies to everything, does it not? Of course, mindlessly subscribing to a bazillion services is much simpler, but it's plainly not professional.
On a side note, do you think Dropbox is any more secure than any other service, self-hosted ones included?
After years of seeing how those companies are made from inside I am personally quite free from those illusions.
I run mine in DigitalOcean, but if you want to run it off your home network it's basically just figuring out the VPN bit to safely get on your home network, and everything else is good to go. You can also use something like Tailscale or ZeroTier to skip the VPN part (but I know less about those things).
Hopefully in time even this will get easier with UI that guides you through the process.
> Encrypting client-side and using a managed solution is a compelling option.
You need a similar amount of expert knowledge to properly configure your client-side encryption, ensure the algorithm wasn't cracked, the implementation you're using doesn't have any severe vulnerabilities, etc.
If we're in a situation where we can trust no one, not even ourself, then we have a problem.
You can use a self-hosted app like Pritunl to host a private VPN server and put all the other self-hosted instances behind this VPN.
There are pre-packaged solutions such as the Uniform Server - a complete WAMP stack fully hardened for placement on a public server. This is an EXTREMELY COMMON PROBLEM and PEOPLE HAVE OPEN SOURCE PACKAGED SOLUTIONS.
This constant "it's too hard, waaa!" bullshit is just lies.
What her blog triggered for me is that we can have a better digital life by being conscious and taking control of our assets, control over interactions with people and companies, etc.
I had a bad experience with not self-hosting. I used Weebly for my blog because it was free and convenient. Without warning, they disallowed free access. I can't modify my data and can't export it. That gives me an unpleasant feeling about Weebly and that type of free service.
I now do true self hosting as far as I can. I wouldn't even trust an association.
“Cheap”, only if you don’t value your own time.
Especially since it was originally used in the Linux desktop context.
If you have enough skill (or the willingness to learn) and initial investment of time, then the ROI on these DIY projects can be immense.
I am far more productive with a Linux desktop and self-hosted / managed "solutions" than their commercial alternatives.
For example: My media server setup far outperforms Netflix and Spotify in terms of ROI and /even/ convenience.
Similarly my Linux desktop PC is better for work and play compared to any off the shelf MacOS or Windows experience.
If you have the perseverance and initial time to invest, you end up over time saving so much time and money.
I self host a ton of stuff. Sometimes I feel like I'm wasting time that could be spent writing code, but, ultimately, I think having good sysadmin and network admin abilities makes a difference in the quality of software development.
Sometimes I see developers that barely seem to know how networks and DNS work.
And the whole argument about time spent is getting weaker. My stuff has gotten to the point where it's a bunch of Docker containers that I could auto-update if I wanted. The hardest part is picking containers that are maintained, but all the official ones are nowadays.
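As a sketch of that auto-update option, a tool like Watchtower can watch the Docker socket and pull newer images for you; the image name is the real project's, the rest is a minimal assumed compose file:

```yaml
# docker-compose.yml: Watchtower periodically updates running containers
services:
  watchtower:
    image: containrrr/watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command: --cleanup --interval 86400  # check daily, prune old images
```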
We're coming a full circle. At work, we just installed a couple of massive 64-core Xeon machines. On prem. Like it is 2002.
Building the skill requires an investment of time, which has to be compared against more productive (read: profitable) alternatives. Remember that all endeavors have opportunity costs.
Every time I've done the math, this only comes out ahead financially if you already have a huge library or if you are willing to torrent.
Is there something I'm missing?
Hmmm... let's count: a workday = 8 hours of work + 1.5 hours traveling to work + 1 hour for the noon break. Then I sleep 7 hours and need 1 hour to get ready in the morning. In the evening, it takes about 1.5 hours to cook (don't tell me it's my choice to spend time cooking instead of eating pre-made, full-of-sugar-and-fat food). Total = 20 hours, so 4 hours left. But work is sometimes hard, so I need about an hour of rest; in the end, 3 hours left per weekday. On the weekend, I'll spend 2 hours doing groceries and 2 hours keeping the house clean and doing repairs. Unless you are alone, you'll spend time socializing, which is not exactly a choice either; you need it for your mental health. Same for sports: it's fun, but at some point it's also for your health (i.e., being able to use your non-working time in a useful way). So, well, it's not like there's much left. And I don't even count the kids... (but that was a choice :-) )
A Raspberry Pi is not sufficient for running things like Nextcloud in any kind of performant way.
I think people see those ridiculous rack-mount servers some people run at home that suck down 300+ watts and assume that's just normal!
I went for even lower power usage, with an i3-7100u box that uses about 2W most of the day and cost $75 plus some extra RAM.
These days power usage might be workable with something like a Mac mini server. I did a test and my Ryzen 5 server with 3 HDDs was drawing 75 W minimum, and my area has quite expensive power, so it just didn't make sense to keep running it.
A VPS also comes with a lot of really useful advantages. You aren't tied down to the hardware. As your needs change, you can change the scale of the VPS. Right now I still have the homeserver sitting here waiting to be sold as well as some other previous machines which were not powerful enough.
A VPS is also relatively unaffected by things like power and internet outages. It just keeps working. It's more convenient when you move house since you don't have downtime in the process. It has a dedicated fixed IP address and ipv6 with no fucking around with CGNAT or blocked ports.
Just buying a fixed IP address would cost an extra $5/month.
Once you consider every cost, a VPS can seem pretty good value in many cases.
I'm guessing you're looking at the preowned market?
For those prices, people might consider themselves lucky to get an underpowered Celeron with BYO RAM and storage, brand new.
Servers with plenty of RAM and powerful processors typically cost in the $50-100/month range to rent. Beyond a rock-bottom VPS, it's much cheaper to self-host. Leaving a modern PC on the whole time will not cost that much in a month, and what you invest in hardware will pay for itself with the difference over time.
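A back-of-the-envelope check of the electricity claim, with assumed figures (100 W average draw, $0.15/kWh; substitute your own numbers):

```python
watts = 100                  # assumed average draw of an always-on PC
price_per_kwh = 0.15         # assumed electricity price in USD
hours = 24 * 30              # one month, running around the clock
kwh = watts / 1000 * hours   # monthly energy use: 72 kWh
cost = kwh * price_per_kwh   # ~$10.80/month vs $50-100 to rent a server
print(f"{kwh:.0f} kWh -> ${cost:.2f}/month")
```

Even at double these figures, the power bill stays well under typical rental prices for comparable hardware.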
That's a ridiculous take, because the skills you get through self-hosting are actually marketable afterwards.
As Chaucer would have it: “The lyf so short, the craft so longe to lerne”
That said, the learned skills are only actually valuable if you can use what you learned later on in life. I've done my fair share of fiddling around with Raspberry Pis and kernel compiling when I was younger, but I can't think of a single time in the last few years where I had to use that knowledge in my day job, now that everything is containers+k8s+<some cloud hoster>. Maybe we can argue that it gave me a slight speedup when trying to grok the container execution model or something like that, but I could have gained that knowledge much more efficiently in other ways.
On the backend you need to own the persistent data but not the real-time data, so you distribute your database across 2x or more home-hosted setups, and the regional live servers (Asia (AWS and GCP), central US (GCP and IONOS), and Europe (here anything goes)) connect to those.
You need 1Gb/s up+down fiber on two homes for this.
You also need a software/hardware stack that can saturate that 1Gb/s at very low wattage, so you can run on lead-acid backup power (and make sure your apartment building has a UPS on the switch in the basement).
The real tricky part is the license you apply to all of this so that others are incentivized to fill the demand for you in the case that blows up!
I'm going to go with monthly payments in proportion to your revenues, starting at $20/month.
For end customers I'm thinking $10/year.
I can only speculate that the abysmal state of self-hosted software for the general public is because there is not enough money to be made in terms of recurring subscriptions or constant inflow of data.
The problem with the general public (people at home) is that most of them really don't want to pay for such software anymore, and for the developer it means worrying too much about stuff always getting broken, since it runs on a galore of different configs.
For my stuff, 90% of support issues are just bad permissions, a badly mounted filesystem, somebody forgetting to run apt update, etc. People really think all those issues are our responsibility; just educating the customer is a waste of everybody's time.
That's exactly what it is. Some software charges a subscription for self hosting. You maintain everything like a sysadmin and pay a huge per user per month subscription fee. It's insane.
Look at authentication systems to see how ridiculous the price discrimination / gouging has become. It costs $0.0055 per month for an AWS Cognito user or $0.00325 per month for an Azure AD External Identities user. However, as soon as you use Active Directory for employees it's several dollars per month per user. The P1 plans are $6 per user per month. What makes auth for an employee worth 184,000% more than it is for a customer?
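A quick back-of-the-envelope check of that ratio (using the prices quoted above; a rough sketch that ignores volume tiers), which comes out to roughly 1846x, consistent with the ~184,000% figure:

```shell
# Azure AD P1 at $6.00/user/month vs. External Identities at $0.00325/user/month
ratio=$(awk 'BEGIN { printf "%d", 6.00 / 0.00325 }')
echo "P1 costs ${ratio}x as much per user (about $(( (ratio - 1) * 100 ))% more)"
```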
I think big tech is absolutely scamming everyone, especially small businesses. They're taking "charge what the market will bear" to a whole new level and the only reason it's working is because anti-trust laws aren't being enforced. If we had fair competition the cost for a lot of tech would drop substantially IMO. There's a lot of room in a market with 2000x markup.
upload is aggressively throttled, filtered, sniffed, redirected, and otherwise treated as a hostile act by ISPs, to submit to the demands of the media industry and keep squeezing businesses for exorbitant rates for the same bloody service but with the filters turned off.
your average consumer ISP account where i live can't even run sshd without using complicated workarounds.
the system has de-democratized web hosting and monolithic services have rushed to fill the vacuum left by the death of the ISP hosting era.
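One of the common workarounds alluded to above is a reverse SSH tunnel: the home machine opens an outbound connection to a cheap VPS you control, so inbound traffic never has to get past the ISP's filters. A minimal sketch (the VPS hostname, usernames, and port are all placeholders):

```shell
# From the home box: keep an outbound connection open that publishes
# the local sshd on the VPS's port 2222. vps.example.com is a placeholder.
start_tunnel() {
  ssh -N -R 2222:localhost:22 tunneluser@vps.example.com
}
# From anywhere else, you would then reach the home machine with:
#   ssh -p 2222 homeuser@vps.example.com
```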
A friend who basically was freelance and did tons of self-hosted IT shit for years, has given up and is now doing contract work in fucking Photoshop, because you can't practically run own hardware services anymore.
It's a joke, and it upsets me that it all seems to just have happened quietly under everyone's nose, and no one seems to be worried about it at all.
I only wish people knew what "e.g." meant -- "exempli gratia" or "for the sake of example". Folks on HN frequently use e.g. when they mean i.e. or "id est" or "that is". When you know the Latin, it rankles you every time.
However despite Latin, "data" is stuff just like "hair" in standard English. "The hair _is_ on the floor", not "The hair _are_ on the floor." And thus "the data is collected", not "the data _are_ collected". English isn't a slave to Latin, but some misuses are too egregious to be tolerated.
Just like you, I substitute "e.g." whenever I want to use "for example", and "i.e." for "that is".
I find the difference quite straightforward once I "get it".
I've only ever seen it abbreviated. The only time I've seen it spelled out is when I looked up what it meant.
This means that the administrator must be sufficiently skilled to be able to handle anything that might come at him. Which means that he's expensive. This makes it economically hard to compete against hosting, where the administrators can be cheaper due to having more controlled environments.
One solution is for administrators to insist that the environment conforms to some sort of standard. But no meaningful standardization currently exists for this context.
Making the device/software resilient enough, is also very hard, and suffers the same problems as with human administrators. If you install a device in a network with a faulty router, then what is that device supposed to do? How does it even know the router is the culprit?
I don't see why it can't be both.
I run an eCommerce website and pay $5/month to DigitalOcean for what is basically a VPS running Wordpress and Cyberpanel (a free and good cPanel alternative).
The reason I'm happy to pay that $5 is because it makes a whole host of technical problems disappear. I don't have to worry about maintaining hardware or dealing with an outage if my (consumer) internet is down for maintenance. I don't have to configure my router or set up the CDN, and the bandwidth they have at these data centres is 10x what I have at home.
If these technical problems disappeared, I wouldn't need to outsource the hosting to an external provider and could save myself the hosting fee. On the other hand, if cloud hosting were significantly more expensive (or if I was just running a website as a hobby and didn't care about downtime), I'd definitely spend the time learning to self-host.
Developers should be talking to the Ops folk. It informs your architecture decisions with practical considerations, like physics, and how many NICs you can plug into a homogeneous switch before you have network hops screwing up your pretty but naive designs.
When you stop self hosting, the number and quality of those people goes away when they realize they should find someplace else to be. And when we need fewer of them, we stop making new ones.
I try to push architectures that allow for a degree of heterogeneity, where we have one data center we own, and use others we don’t for geographic redundancy and speed of light concerns.
For read-mostly systems, that may mean for instance that we keep the system of record (I’m doing just this to bootstrap a personal project that has a read-mostly information architecture) but distribute the UI out into the Someone Else’s Computers.
When you use third-party services, the government can go to them. The third party might not fight the request the same way you would. And, you might not even know it happened. The third party might be expressly forbidden from telling you it happened, in fact.
This was why Hillary Clinton wanted to host her personal email in her basement. A physical server that she owned, on property she owned; there was no legal way to request that data without going to her personally. If she had used the State Dept server for her personal email, Congress could have accessed all her personal emails simply by asking State to send them over.
That’s a controversial example, but the same principle is followed by many companies and organizations who have kept some portion of their data self-hosted. It’s often email or some core of file storage that they consider legally sensitive.
This is getting harder to do, though. Look at the recent revelation that the government tried to get newspaper email metadata from Proofpoint, a spam filter provider. Self-hosting a good spam/phishing filter seems almost impossible in 2021, because of the huge amounts of data needed to train filters well.
Generally I really like mailcow. It makes dealing with all the ugly parts of hosting email fairly simple.
Incoming spam is hardly a problem. SpamAssassin, rspamd and the like catch most; greylisting catches the rest. Once a year I see an uptick in spam and spend a few minutes diligently marking everything as spam/not spam, which the server then uses to retrain itself a little.
Spam filtering when self-hosting is hardly more work than on Gmail, Live, Proton and such.
Outgoing mail, in combination with spam filtering, however, is an entirely different and tough problem.
That makes me think what type of contingency I should have in place to stay minimally operational after such event happens to me. A VPS somewhere with my work toolkit installed and files synced via syncthing, for example? Maybe... but what if the police could get to the same VM via the confiscated devices? I don't know...
Secondly, your local machine should encrypt itself if that's your threat model. They can take it while it's still on, but if that's actually a concern for you, you can figure out a way to trigger a lock or a shutdown if things change. If it's a stationary machine, it can be easy to notice your environment changing: maybe you can't find the MAC addresses of your switch any more, maybe all ten of your neighbors' SSIDs are no longer visible. Perhaps lack of internet is good enough.
Phones are a lot harder because their environment changes a lot more, but you can still check things like: has my computer decided to lock itself? In the end, if your threat model involves that kind of risk, you can set your devices up to brick themselves, or at least shut down and encrypt themselves.
Last, you'd probably want a device so that you can do the things. A phone and or old laptop with an OS already installed that you can retrieve.
1. Find some friends or people you trust to not sell you out to the police. Ideally, these people should be in another country.
2. Place a server box on their property. This box will be a replica of your every-day home-server and devices.
3. However, in order to stop law enforcement from technically finding this replica-box, you will need to use Tor. This ensures your home-server does not store the IP address or the physical location of the replica-box.
4. If your home-server is taken by law enforcement, you can buy another home-server and use memorized details (or call your friends on a burner phone) to restore a backup from the remote device.
 Please note that law enforcement can legally compel you with threats of jail time to reveal where these replica boxes are.
 Since you will probably be under surveillance, it's unlikely law enforcement will allow you to freely communicate on the internet with new devices and servers.
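Step 3's property (the replica never sees plaintext) is easiest to guarantee if you encrypt before anything leaves the house. A minimal sketch using tar and openssl (paths and the passphrase file are placeholders; in practice keep the passphrase in a root-only file, and ship the resulting blob to the replica over Tor):

```shell
# Encrypt a directory into a single blob the replica box cannot read.
backup_encrypt() {
  src="$1"; out="$2"; passfile="$3"
  tar -czf - -C "$(dirname "$src")" "$(basename "$src")" \
    | openssl enc -aes-256-cbc -pbkdf2 -pass "file:$passfile" -out "$out"
}

# Restore on a fresh home-server after a seizure.
restore_decrypt() {
  in="$1"; dest="$2"; passfile="$3"
  openssl enc -d -aes-256-cbc -pbkdf2 -pass "file:$passfile" -in "$in" \
    | tar -xzf - -C "$dest"
}
```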
This is usually what passwords are for, something you know that cannot be stolen (short of rubber hose cryptography)
In the first case I would set up an "append only" system where you cannot delete anything, just append information. This could simply be an incremental backup system.
Have it managed by someone outside your country, you would just be a user.
In that case if they grab everything, they cannot delete what you have there, and they cannot access it as administrator either.
If you want to protect against the second case, it gets much more complicated.
You need to encrypt the systems that hold the data and make it so that the encryption key is wiped from the systems if they are in a panic state. This can go as far as you want: no more Internet (the machine was disconnected), or the trigger on the door of your basement starts a countdown of a few seconds you can only stop by logging in - otherwise the system shuts down (or better, cuts the power).
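The countdown described above can be sketched as a tiny shell helper (the grace period, cancel file, and wipe command are all placeholders; `cryptsetup luksSuspend`, which drops the volume key from kernel memory, is one plausible wipe action):

```shell
# Run the given wipe command unless a cancel file appears within GRACE seconds.
panic_countdown() {
  grace="$1"; cancel_file="$2"; shift 2
  sleep "$grace"
  # Logging in and touching the cancel file stops the wipe.
  [ -e "$cancel_file" ] || "$@"
}

# Hypothetical wiring: a basement-door trigger might invoke
#   panic_countdown 10 /run/panic-cancel cryptsetup luksSuspend cryptdata
```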
An extra complication is if you fear that you can be forced to provide decryption keys. In such a case you could go for dynamic keys that are provided to you by someone else outside your country, through a process that ensures that you are safe.
This is my biggest concern. Confiscated devices are never returned to their owners.
I suppose your other issue is making sure that payments still get made in a timely manner if for some reason your home country freezes your assets or you get arrested. I guess it just depends on what you're up to and how paranoid you are. Personally I wouldn't worry about it too much. You could probably just deposit a portable hard drive with a friend or relative periodically if you're that concerned.
I just read about the LinkedIn incident on Darknet Diaries, and it's scary how the FBI managed to get all that information about the Russian hacker.
 though this is before Microsoft acquiring them, so it was probably just the usual startup reckless abandon.
I don’t know how law-enforcement is funded there, but presumably they do use contractors and service providers.
If their alternative is to pay a contractor who has no familiarity with your system, it would be preferable to simply pay the person who knows what they’re doing and be done with it.
Some agencies will pay companies to perform the digital forensic work necessary to hand over data. Even administration of the forensic work, i.e. PM-level assistance, can be compensated.
This includes if a company is served a subpoena and a warrant to provide certain data.
Billing can be good, certainly valley-competitive.
Iirc, this has been well covered as a thing that happens at big tech companies, including fb, but it also can apply to small ones.
Right, and your example of a literal server in a basement supports that, but if you are colocating or using a VPS they will almost definitely go to your provider first and probably won't even tell you.
With VPSes, they can get your data and you might never know. It's an extremely important distinction.
I'm not saying they can't, I just don't see why they would spend their time doing this when they can send the request to the server's owner and then it's no longer their problem to deal with.
The OS may boot up, but one could have the data on a separate volume. Services won't start until that volume is mounted, which could be manual-only. Either LUKS-on-any-FS or encrypted ZFS would work.
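A sketch of that manual-only flow (the device node, mapper name, mount point, and service name are all placeholders):

```shell
# After a cold boot, nothing sensitive is reachable until an operator
# unlocks and mounts the data volume by hand.
unlock_and_start() {
  cryptsetup open /dev/sdb1 datavol   # prompts for the passphrase
  mount /dev/mapper/datavol /srv/data
  systemctl start myservice           # service configured to require the mount
}
```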
With encrypted (Open)ZFS you can actually send encrypted bits remotely: the destination does not need the key to save the bit stream to disk, so you can have a secure cold storage copy of your data.
> There's an even more compelling reason to choose OpenZFS native encryption, though—something called "raw send." ZFS replication is ridiculously fast and efficient—frequently several orders of magnitude faster than filesystem-neutral tools like rsync—and raw send makes it possible not only to replicate encrypted datasets and zvols, but to do so without exposing the key to the remote system.
> This means that you can use ZFS replication to back up your data to an untrusted location, without concerns about your private data being read. With raw send, your data is replicated without ever being decrypted—and without the backup target ever being able to decrypt it at all. This means you can replicate your offsite backups to a friend's house or at a commercial service like rsync.net or zfs.rent without compromising your privacy, even if the service (or friend) is itself compromised.
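The quoted raw-send workflow boils down to something like the following (pool, dataset, and host names are placeholders; `-w` is the raw-send flag):

```shell
# Replicate an encrypted dataset to an untrusted box without ever
# exposing the key: -w streams the on-disk ciphertext verbatim, so
# the receiver stores data it cannot decrypt.
raw_offsite_backup() {
  zfs snapshot tank/secrets@offsite1
  zfs send -w tank/secrets@offsite1 \
    | ssh backup@backup.example.com zfs receive backuppool/secrets
}
```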
No, you don't. Both of those implementations provide hardware attestation via vendor keys securely embedded in the CPU. I have no idea if any providers currently make such features available though.
No, it's very easy to filter spam locally. You don't need huge amounts of data, just your regular email, which makes it much better for your data.
Running my own email infrastructure for a long time, filtering spam is a non-issue.
Please try to avoid using O365 as they literally are the main culprits that make self-hosting email a pain in the butt.
I've set up everything according to best practices (SPF, DKIM, TLS, static IP for almost a year, reverse DNS, blacklist removal, spam checks).
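For illustration, the SPF/DKIM/DMARC pieces of that checklist are just DNS TXT records, roughly like these (the domain, selector, IP address, and public key are all placeholders):

```
example.com.                  IN TXT "v=spf1 ip4:203.0.113.10 -all"
mail._domainkey.example.com.  IN TXT "v=DKIM1; k=rsa; p=<your-public-key>"
_dmarc.example.com.           IN TXT "v=DMARC1; p=quarantine; rua=mailto:postmaster@example.com"
```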
I've also repeatedly contacted Microsoft support to get unblocked. All my requests to whitelist the IP in the last year or so have been ignored.
Microsoft is the sole bad actor I've encountered in more than a decade of self-hosting email.
On principle, I've decided not to use a different provider, and users on Microsoft services will not get emails from me or from my websites.
This will only change if enough people complain. As a paying O365 customer, I'd encourage you to open support tickets that you're not receiving emails from some of the smaller email servers, e.g. those hosted on DigitalOcean.
And better, if you catch wind they're after you, you can format your HD to zeroes, or (if you don't want even the physical drive around) throw it in a fire or something :).
Email is communication with other people, if you are sending an email to a person using Gmail your basement server for email gives you no protection over your email data a such. Govt. can easily request email data from Google of the recipient's account.
It does give you protection on the fact that they then need to know the recipients emails and do multiple warrants to gather them if they are over multiple providers, which may or may not go through. For sure it's easier considering that most people use a few US providers, but it's not always the case (even less so for governments matters, which include foreign countries, thus foreign providers too).
The most notable example of self hosting going right is CNN, they self hosted their emails and were therefore able to fight the court order until it is narrowed and there was a change of leadership in the white house & DoJ.
If you aren't going to self host write it into the contract that you must be informed (Google pushed back on court order because it would have violated the contract with NYT)
Instances of data seizure that went unimpeded: Phone records (both work and personal) for all orgs. Emails for Politico, buzzfeed, the Times, a congressional staffer, and more. iCloud metadata for at least a dozen individuals associated with the House Intelligence Committee, and more.
The fact that I can host as many domains and accounts as I want with all kinds of filters and rules and forward them all to my main account as needed with rules is just gravy.
Note that Hillary Clinton was not prosecuted despite the subsequent administration basically running on a promise to do so.
Official business with classified information is never done via email, even if everyone is using the government email servers. There are separate networks, devices, and protocols for storing and operating with classified information.
It is really hard to see this as anything other than a bugdoor.
My laptop has this TPM chip. I am really glad I never used it, and even went so far as to disable support for it when I built my coreboot image.
Products sold with the buzzword "trusted" are a magnet for this sort of garbage. They've painted a "please bugdoor me" target on their back. The only thing you can hope to trust is general-purpose computing devices, with a large market, that obey their owner. Unfortunately it is increasingly difficult to find those.
Most FDE schemes don't run crypto ops on the TPM itself - key derivation occurs there, then the results are cached in RAM ( or sometimes, protected CPU registers, in which case they may be able to inject privileged code into the kernel address space? ).
LUKS on a colo will probably protect you if you're a fentanyl distributor or movie pirate. Probably not if you're a terrorist or a high-value nation-state target.
Sure, but my ability to stop them is probably substantially smaller than, say, Amazon’s legal departments capabilities.
A few months later, all my personal Gmail accounts were seized and I received an email (that I could read after changing my password) from a police department in god-fuck-knows-where middle-of-nowhere countryside asking me for data on the proxy usage.
Sadly I had cancelled the server subscription since I didn't need it anymore (and probably hadn't kept any logs anyway since I was just playing around with a server) but I really, really wanted to help.
I mean, it's rare the police would call you over legitimate usage or political suppression. They call you about fraud with damages, and it's awful being responsible in small part but unable to help... I was not mad they read all my emails, I was sorry someone lost money because of my mistake.
Maybe I haven't had enough coffee, but I'm failing to connect how leaving a proxy open was a major enabler for fraud. What kind of fraud?
Amazon? Trust? People trust Amazon to exist and to bill. Providing services to those who pay the bills is almost incidental.
Especially when the companies are already happily selling account metadata.
It didn't affect Experian.
It didn't affect Yahoo.
It didn't affect Sony.
It didn't affect AT&T.
You, as a consumer don't really get to choose experian or not.
>It didn't affect Yahoo.
Who says it didn't?
>It didn't affect Sony.
So a bunch of internal business documents got leaked. As a consumer I couldn't care less.
>It didn't affect AT&T.
If every provider was mandated to do this, then I wouldn't call it "poor data security reputation".
Depending on the legal issue at stake, it might also be possible to access additional legal expertise pro bono, or through an organization like the ACLU.
Even Twitter doesn’t like to roll over, and they’ve got a lot less at stake. https://www.latimes.com/politics/story/2021-05-17/twitter-fi...
Also, you can encrypt it with keys that they will NOT use to decrypt.
The data will also NOT leave the region (or country) that you specify.
Shredding the data on your own hard drive gives you a pretty good guarantee. Drilling a big gaping hole through it afterwards gives you an even better one.
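Concretely (a sketch using coreutils `shred`, demonstrated here on a throwaway file; against a whole drive you would point it at the device node instead, after triple-checking the name):

```shell
# Overwrite the contents with random data, then unlink. On a real drive
# you'd run something like `shred -v -n 3 /dev/sdX` (device name is a
# placeholder) before reaching for the drill.
f=$(mktemp)
echo "sensitive bits" > "$f"
shred -u "$f"   # overwrite, then remove
```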
Yes, and rather than sending a letter to the hosting company, they can come to your house and confiscate all electronic equipment. (that's not a joke btw, when local LE comes to your house, you can lose anything electronic from laptop/server down to backup drives and ipod, possibly taking years to recover) For me that doesn't sound like a good potential tradeoff.
> This was why Hillary Clinton wanted to host her personal email in her basement.