Those are the two problems caused by "big email". I've used Hetzner, OVH and Mythic Beasts and had no issue with blacklisted IPs, and if you follow the Mox instructions you will be trusted and shouldn't get put in spam.
I spent some time today buying a new domain and setting up Mox on a Hetzner VM. The IP was on 3 blacklists on first check; after fixing the reverse DNS it's on 2, one of which is apparently fake? DKIM and DMARC seem to be working, and sending a mail to Protonmail passes the checks, yet it lands in spam. However, I'm confident that once the domain is older than "just now" and I've set up DNSSEC (which apparently takes 1-3 days to start working in my country), things will improve.
Worst case, I'll have to request delisting from a blocklist, but we'll see.
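For anyone checking their own setup, plain dig covers both the reverse DNS and the blocklist lookups (the IP and hostname here are placeholders):

    # Reverse DNS should name your mail host, and the forward lookup should match back
    dig -x 203.0.113.10 +short        # expect: mail.example.com.
    dig +short A mail.example.com     # expect: 203.0.113.10

    # DNSBLs are queried by reversing the IP's octets under the list's zone;
    # any 127.0.0.x answer means "listed", NXDOMAIN means clean
    dig +short 10.113.0.203.zen.spamhaus.org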
For your first point, the key is an IP range that isn't on a blocklist. Pick a very reputable hosting provider (not AWS/GCP/Azure) with strict no-spam rules, and check out some spam reports from their ranges. Hetzner I've heard is good, DigitalOcean as well, but your mileage may vary.
For your second point, you live with it. I haven't found a solution, at least. I've never landed in spam for corporate offerings (cloud O365, Google Workspace or whatever they call it now) or (very rare these days) anyone self-hosting with rspamd or equivalent, just regular personal mail (Hotmail, Gmail, iCloud, etc.). That's usually pretty easy to detect and work around ("hey I sent you an email" "oh I didn't get it" "did you check your junk?"). Irritating, but not the end of the world.
I’m going to try hosting from my residential IP sometime this year, now that I have sufficient redundancy in terms of power and networking. I don’t know if I’ll have better or worse luck than with hosting providers’ IP ranges, though.
Bro, I owned a /23 at a colo for over 10 years. Registered my IP space with ARIN, had abuse contacts, set up a mail server on a /27 within a /24 that remained mostly unused outside of dev and test servers (strictly controlled). The mail server was also strictly configured to never emit a single email that wasn't sent by me. So no forwards, no bounces, etc.
Mail server still gets blocked by random domains. Nope. Done with hosting email. Everyone assumes you are spam and won’t accept your mail unless you pay them (to be your mail provider).
How recent is your experience? Did you set up TLS, SPF, DKIM, DMARC, DANE/MTA-STS? That's what makes modern mail secure and deliverable (besides basics like matching reverse DNS). The beauty of Mox is that it tells you what exact DNS records you need to set up, and it takes care of the certificates. Once it's done, I found I had a better internet.nl score than some big companies.
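For reference, the record set usually looks something like this sketch (domain, selector, keys, and hashes are placeholders; Mox prints the real values for your zone):

    mail.example.com.            A     203.0.113.10
    example.com.                 MX    10 mail.example.com.
    example.com.                 TXT   "v=spf1 mx -all"
    sel._domainkey.example.com.  TXT   "v=DKIM1;k=ed25519;p=<base64 public key>"
    _dmarc.example.com.          TXT   "v=DMARC1;p=reject;rua=mailto:dmarc-reports@example.com"
    _mta-sts.example.com.        TXT   "v=STSv1; id=20240101T000000"
    _25._tcp.mail.example.com.   TLSA  3 1 1 <sha256 of the cert's public key>

(The TLSA/DANE record only has an effect once DNSSEC is live on the zone, and MTA-STS also needs the policy file served over HTTPS.)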
It's a damn shame. At this point it's basically in the favour of large providers to randomly block domains, since otherwise hosting your own would be trivial.
Some providers are reputation-based now, so you need to send emails and slowly ramp up the volume over time. Difficult to do for personal mail, though, as you won't generate enough throughput.
If people just want to stick it to the Man by moving out of the cloud, then the solution might be "medium email": hosted by a commercial provider, so you don't have to do all the admin, but not self-hosted.
What solutions are folks using to solve queries like "How many of these 1000 podcast transcripts have a positive view of Hillary Clinton?" Seems like you would need a way to map-reduce and count? And some kind of agent assigner/router on top of it?
But in general we found the best course of action is simply to label everything, because our customers will want those answers, and RAG won't really work at the scale of "all podcasts in the last 6 months: what is the trend of sentiment around Hillary Clinton, and what about the top topics and entities mentioned nearby?". So we take a more "brute force" approach :-)
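A minimal sketch of that brute-force, label-then-count shape, assuming the openai Python package and a placeholder model name; the map step labels each transcript, the reduce step just counts:

    from collections import Counter
    from openai import OpenAI  # assumes the openai package; any LLM client works

    client = OpenAI()

    def label_sentiment(transcript: str, entity: str) -> str:
        """Map step: ask the model for a one-word sentiment label."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; use whatever model you have
            messages=[{
                "role": "user",
                "content": f"Is the view of {entity} in this transcript positive, "
                           f"negative, or neutral? Answer with one word.\n\n{transcript}",
            }],
        )
        return resp.choices[0].message.content.strip().lower()

    def count_sentiment(transcripts: list[str], entity: str) -> Counter:
        """Reduce step: tally the labels across the corpus."""
        return Counter(label_sentiment(t, entity) for t in transcripts)

    # count_sentiment(transcripts, "Hillary Clinton")["positive"]

At scale you'd batch this and cache the labels, but the shape stays map (classify), then reduce (count).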
At the moment this repo is designed to handle more RAG-oriented use cases, i.e. ones that require recalling the "top pieces of information" relevant to a given question/context. In your specific example, right now, FastGraphRAG would select the nodes that represent podcasts connected to Hillary Clinton and feed them to an LLM, which would then select the ones that are positively associated with her. As a next step, we plan to weight the connections between nodes given the query. This way, PageRank will explore only edges which carry the concept "positively associated with", and only the right podcasts would be selected and returned, without having to ask an LLM to classify them. Note that this is basically a fuzzy join, so it will produce only a "best-effort" answer rather than an exact one.
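To illustrate the mechanism (this is not FastGraphRAG's actual code): personalized PageRank restarted from the entity node, with edge weights standing in for how strongly each edge carries the query's concept:

    import networkx as nx

    G = nx.Graph()
    # Weights stand in for "how positively is the podcast linked to the entity"
    G.add_edge("Hillary Clinton", "podcast_1", weight=0.9)  # e.g. "praises"
    G.add_edge("Hillary Clinton", "podcast_2", weight=0.1)  # e.g. "criticizes"
    G.add_edge("podcast_1", "podcast_3", weight=0.5)

    # Restart the random walk from the entity; high-weight edges get explored more
    scores = nx.pagerank(G, personalization={"Hillary Clinton": 1.0}, weight="weight")
    podcasts = sorted((n for n in scores if n.startswith("podcast")),
                      key=scores.get, reverse=True)
    print(podcasts)  # podcast_1 ranks above podcast_2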
I don't have a dev answer, but in case it's relevant, I've seen commercial services that I imagine are doing something similar on the back end; Ground News is one of them. I wish they had monthly subs for their top tier plan rather than only annual, but it seems like a cool product. I haven't actually used it, though.
What feature(s) of the top tier plan do you wish you had? I have no idea how their subs work, but I've seen a few ads for the product, so I have a vague idea that it rates news for bias; I don't see how that would involve many different tiers of subs.
It’s been a while since I looked, but unless they changed it, you needed the top tier plan to get a report analyzing the biases of your reading choices and recommending things to balance it out.
I like the idea of offline LLMs but in practice there's no way I'm wasting battery life on running a Language Model.
On a desktop too, I wonder if it's worth the additional stress and heat on my GPU as opposed to one somewhere in a datacenter which will cost me a few dollars per month, or a few cents per hour if I spin up the infra myself on demand.
Super useful for confidential/secret work, though.
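(For the curious, running one locally is a one-liner these days, e.g. with Ollama; the model name is just whatever you've pulled:

    ollama pull llama3.2
    ollama run llama3.2 "Review this confidential diff for bugs: ..."

Nothing leaves the machine.)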
In my experience, a lot of companies still have rules against using tools like Copilot due to security and copyright concerns, even though many software engineers just ignore them.
This could be a way to satisfy both sides, although it only solves the issue of sending internal data to companies like OpenAI, it doesn't solve the "we might accidentally end up with somebody else's copyrighted code in our code base" issue.
Building a simple CRUD app is often a single code-generation command in Phoenix/Rails/Laravel, and adding common features like Auth, Queues, Emails, File Uploads, etc. is similar.
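A few concrete examples (resource names are placeholders):

    # Rails: model, migration, controller, views, and routes in one shot
    rails generate scaffold Post title:string body:text

    # Phoenix: context, schema, controller, and HTML views
    mix phx.gen.html Blog Post posts title:string body:text

    # Laravel: model plus migration and resource controller
    php artisan make:model Post -mcr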
The downside is that this is a stateful monolithic approach that requires a server running 24x7 and can break without some effort to cache and reduce the load on the database. They are also often memory-hungry frameworks.
The tradeoff for productivity is worth it in my view, for the vast majority of cases where it's just a small team of 1-3 developers.
Yes, and most often we don’t fully understand the problem without partially solving it, which is perfect for these monolithic, batteries included frameworks.
I find building apps in Django, with the prescribed convention to box me in, helps tremendously to stop me from overthinking and just experimenting with the problem space.
I've even started using Django for apps that aren't web apps, just because it has whatever I will eventually need: whether it's the database, auth, caching, the admin portal, or tools for building out CLIs, it's all there.
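A sketch of that CLI pattern, in a hypothetical myapp/management/commands/import_data.py:

    from django.core.management.base import BaseCommand

    class Command(BaseCommand):
        help = "Import records from a CSV file"

        def add_arguments(self, parser):
            # Standard argparse parser under the hood
            parser.add_argument("path")

        def handle(self, *args, **options):
            # The full ORM, cache, and auth machinery is available here
            self.stdout.write(f"Importing {options['path']}...")

It then runs as python manage.py import_data data.csv, with settings, the database, and everything else already wired up.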
> What are people doing where rewriting tens of thousands of dev hours of work from scratch makes more sense than spending money on servers?
1) "spend money on servers" is a boring solution. This is no good if you are a career software engineer and your career, salary and prestige within the company depends on writing code.
2) thanks to the cloud, servers/infrastructure are now priced with ~100x margins, so "spend money on servers" is a significant investment, with these (stupid) moves to serverless etc. being seen as a cost-saving measure (which, just like the cloud itself supposedly being cheaper, will never actually materialize).
I also agree with the sentiment, although two "but..."s did come to mind so I'll play Devil's Advocate:
1: does this contribute to the inefficiency and bloat we see on the web and applications nowadays? Of course everyone complains about Discord and Teams and other Electron apps (which aren't a direct comparison) but one I deal with regularly is the Microsoft Power BI Gateway application, which allows access to on-prem data for reports and automations. I'm sure it does a little more than just establish connections to Azure and send data, but it's a 672MB download! That's larger than the ISO for Windows XP. Throwing more hardware at a problem becomes less effective when the application has fundamental inefficiencies.
2: although server hardware is affordable compared to Western salaries, a lot of the world has far less purchasing power, and server hardware prices aren't as regional as labor costs. So some developers may have more time than money for more silicon. I haven't run any numbers on this, and it doesn't mean that rolling your own "everything" is worthwhile.
Do people still use them for full production systems? We use them a bit for ancillary things, but TBH, if you have k8s or a similar solution, it's maybe not worth deviating from a standard container deployment environment that everyone knows.
I inherited some lambdas on my team and the amount of effort I have to go through to:
- make them testable locally (see the sketch after this list)
- planning/QA-ing YAML configs with the devops team, because they only grant me read-only access on their precious over-engineered Helm chart stack; they don't even offer running my unit tests in their pipeline for me
- painful debugging process because everything that touches a lambda is another aws service config somewhere
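On the first point, a bare handler is still just a function, so the minimal local test looks like this (module name and event shape are made up):

    # test_handler.py
    from my_lambda import handler  # hypothetical module with handler(event, context)

    def test_returns_200_for_valid_event():
        event = {"httpMethod": "GET", "path": "/health"}  # minimal fake API Gateway event
        resp = handler(event, context=None)
        assert resp["statusCode"] == 200

The pain is everything around it: the real event shapes, the IAM, and the wiring only exist in the cloud.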
I honestly don't know why anyone wastes their time. We will be deprecating these AWS Lambdas in favor of a traditional API in the next version of our app. Serverless is a garbage way to deploy code and is designed to tax you/charge you fees at every turn. It is for people who want to deploy poorly thought-out code, rewrite it later, and explain the bill later.
Yes, I inherited a Lambda code-base too, and it is an absolute knot - Lambdas sending requests to other Lambdas, Lambdas sending messages and other Lambdas reading the messages, Lambdas reading and disregarding messages due to poorly thought-out queues etc.
I remember at one job, they were talking about the overall architecture of the code at the company, and when I asked how I could run it on my computer, they said "well, you can just run it". But I pushed: how can I run this whole thing on my computer so it interacts with everything else? That was met with silence.
I can understand having to stub out external calls to vendors and clients but this is ridiculous. There is no local story.
Oh yeah, I have a lambda that needs another lambda to complete before running. The first is on a CloudWatch event interval, so the data in the second can be stale, but the one saving grace of the situation is that no one cares enough to make me fix it, not even me.
I worked on a system built on FaaS for a few months, and it was (already when I arrived) one of the more well-tested systems I've ever worked with, so it is clearly doable.
I took the system from dev to prod and much of it was boring in the very best sense of boring (also helped that I had a good team, although they were new to it as well).
I had one major gripe with it (and everything Azure) afterwards:
On dev there are absolutely no guarantees: components might go down for half a workday and they don't care.
And when you go to production, the prices go absolutely through the roof (IIRC $9,000/year for the most basic message broker I could configure that wouldn't risk being offline half a day, and every component is like this).
So while it was really cool to work on a cloud-native system, if someone ever asks me to design one for them, that design will be presented with a price tag for what it will cost to take it to production.
I wouldn't equate the Lambda UX with "serverless" at large. I work on a serverless system that runs the same code you upload (e.g., Python). You write it as a traditional API, then upload it to the cloud, and you're done.
One thing that makes this possible is that "orchestration" is embedded in your code, using a library (https://github.com/dbos-inc/dbos-transact-py). With Lambdas you need Step Functions, which are not exactly easy to test locally.
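A rough sketch of that shape, from memory of the dbos-transact-py README (decorator names may have drifted; verify against the repo):

    from dbos import DBOS

    DBOS()  # durable workflow state lives in Postgres

    @DBOS.step()
    def fetch_data() -> str:
        return "data"

    @DBOS.workflow()
    def pipeline() -> str:
        # If the process dies here, the workflow resumes from the last completed step
        return fetch_data()

    if __name__ == "__main__":
        DBOS.launch()
        print(pipeline())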
There are a bunch of gotchas outside of the UX that make them just as awful. For example, if you need secrets, you have to find a way to cache them so that aws.secretsmanager doesn't ding your bill on every request. Depending on how your devops team requires you to load secrets, this can make the code worse in operation and review.
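A minimal sketch of that caching workaround, assuming boto3 and a placeholder secret name; the module-level cache survives warm invocations:

    import json
    import boto3

    _client = boto3.client("secretsmanager")
    _cache: dict = {}

    def get_secret(name: str = "prod/db-credentials") -> dict:
        # Only hits the SecretsManager API on a cold start
        if name not in _cache:
            resp = _client.get_secret_value(SecretId=name)
            _cache[name] = json.loads(resp["SecretString"])
        return _cache[name]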
But the main reason serverless is garbage is that the old stack does everything better.
Yes, SST [1] uses Lambdas heavily but makes them more seamless and less visible: just the place your code runs.
I've also found Azure Container Apps to hit the right balance. It's Kubernetes under the hood, which you don't have to mess with at all, except that it can use KEDA [2] scaling rules to scale your containers to zero, then scale up with any of the supported KEDA scalers, like when a message hits a queue.
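The scale-to-zero bit is just a block in the app definition. A hedged sketch (field names follow the Container Apps schema as I remember it; queue name and secret are placeholders, so double-check the docs):

    scale:
      minReplicas: 0            # idle apps scale all the way down
      maxReplicas: 10
      rules:
        - name: queue-rule
          custom:
            type: azure-servicebus    # any supported KEDA scaler type
            metadata:
              queueName: orders
              messageCount: "5"       # target backlog per replica
            auth:
              - secretRef: servicebus-connection
                triggerParameter: connection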
Except when you scale to zero, you get a 23+ second cold start on .NET apps. Google Cloud Run pulls some black magic to get ~3 second cold starts on .NET apps, and ~500ms for Golang/Python/native apps.
I really enjoy Laravel currently. It's just fun that I can focus on my app instead of the tedious stuff. Also, not relying on some third-party auth is a huge bonus for me.
As someone who has written a few experimental apps with Phoenix, with and without LiveView, and later had to deal with many inscrutable errors when attempting to upgrade from Phoenix 1.6 to 1.7 with basically no help whatsoever around the web, putting it next to Rails and Laravel is kinda laughable. HN-darling-that-will-bite-you-in-the-ass alert! Be warned. Use something boring instead.