
I'm a random dude on the Internet, but my partner completed her PhD at MIT. While there I knew and knew of a few PhD grads who worked at MIT in some non-tenure-track role (postdoc, staff researcher, etc). Typically for a couple years and then they get a better-paying or more permanent job. But several remained "affiliated" in some way. They kept their MIT website/email, some in academia continued to collaborate to some extent. Things like that. But AFAIK they weren't getting a paycheck from MIT. And it's somewhere between neat and genuinely professionally valuable to be affiliated w/ a prestigious university, so I don't blame them for claiming affiliation. My best guess is he's "affiliated" in a similar way.

Neat!

> Right now, accessing my apps requires typing in the IP address of my machine (or Tailscale address) together with the app’s port number.

You might try running Nginx as an application and configuring it as a reverse proxy to the other apps. In your router config you can set up foo.home and bar.home to point to the Nginx IP address, and then the Nginx config tells it to route foo.home to IP:8080 and bar.home to IP:9090. That's not a thorough explanation, but I'm sure you can plug this into an LLM and it'll spell it out for you.
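A minimal sketch of that setup (hostnames and ports taken from the example above; paths and IPs are placeholders):

```nginx
# /etc/nginx/conf.d/homelab.conf -- sketch, not a complete config
server {
    listen 80;
    server_name foo.home;
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

server {
    listen 80;
    server_name bar.home;
    location / {
        proxy_pass http://127.0.0.1:9090;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Nginx picks the right `server` block by the Host header, so both names can share one IP and port 80.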


I'd also recommend using a DNS server that points `*.yourdomain` to your reverse proxy's IP. That way requests skip going outside your network, which helps when your router or ISP doesn't support NAT loopback ("hairpinning"), i.e. reaching your own public IP from inside the LAN.

You can then set your DNS in Tailscale to that machine's tailnet IP and access your servers when away without having to open any ports.

And as a bonus, if it's Pi-hole for DNS, you now get network-level ad blocking both inside and outside the home.
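For the wildcard record, Pi-hole reads extra dnsmasq config from `/etc/dnsmasq.d/`, and dnsmasq's `address` directive matches a whole domain and everything under it. A sketch (domain and IP are hypothetical):

```ini
# /etc/dnsmasq.d/02-homelab.conf
# Resolve home.example.com and every *.home.example.com to the reverse proxy:
address=/home.example.com/192.168.1.10
```

Restart the Pi-hole FTL/dnsmasq service after adding the file so it picks up the new zone.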


Personally I'm using HAProxy for this purpose, with Lego generating wildcard SSL certs via DNS validation on a public domain, and CoreDNS, configured as the tailnet's DNS resolver, serving A records for internal names on a subdomain of the public one.

I've found this works quite well. The SSL is somewhat meaningless from a security POV, since the traffic is already encrypted by WireGuard, but it makes the web browser happy, so it's still worthwhile.


This worked for me to get subdomains and TLS certificates working on a similar setup: https://blog.mni.li/posts/internal-tls-with-caddy/

Caddy is increasingly popular these days too. I use both and cannot decide which I prefer.

Caddy's configuration is so simple and straightforward, I love it. Definitely a more comfortable experience for simple setups.
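For comparison with the nginx version upthread, the equivalent Caddyfile is just (hostnames and ports are the same hypothetical ones):

```text
# Caddyfile -- sketch; Caddy also provisions HTTPS automatically
# for publicly resolvable names
foo.home.example.com {
    reverse_proxy 127.0.0.1:8080
}

bar.home.example.com {
    reverse_proxy 127.0.0.1:9090
}
```

Two lines per app versus a full `server` block each is most of the "comfortable for simple setups" argument.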

I like Caddy's integration with Cloudflare for handling SSL. When I originally saw the idea, it was promoted as an easy way to get SSL for a homelab, but I don't use real domains for my internal apps, and that's required with Cloudflare.

Caddy has Tailscale integration too, I think, so your foo.bar.ts.net "just works"

The pain I've had with it is distributed configuration, i.e. multiple projects that each want to contribute their own routing rules. I've been using the JSON API rather than the Caddyfile DSL.

Do you know how I might approach this better?
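Not from the thread, but one pattern that may help here: Caddy's admin API lets you tag config objects with an `@id` so each project can create and later replace only its own route, without touching the rest of the config. A sketch, assuming the default admin endpoint on :2019 and hypothetical route names:

```json
{
  "@id": "project-foo",
  "match": [{ "host": ["foo.example.com"] }],
  "handle": [
    {
      "handler": "reverse_proxy",
      "upstreams": [{ "dial": "127.0.0.1:8080" }]
    }
  ]
}
```

POST that object to `/config/apps/http/servers/srv0/routes` to append it; afterwards the owning project can replace just its route via `PATCH http://localhost:2019/id/project-foo`, so projects never need to read or rewrite each other's rules.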


I think most homelabbers default to Caddy and/or Traefik these days. Nginx is still around with projects like Nginx Proxy Manager (the other NPM), but Caddy and Traefik are far more capable.

DevOpsToolbox did a great video on many of the reasons why Caddy is so great (including performance) [0]. I think the only downside with Caddy right now is still how plugins work. Beyond that, however, it's either Caddy or Traefik depending on my use case. Traefik is easy to plug in and forget about, and Caddy has a ton of flexibility and ease of setup for quick solutions.

[0] https://www.youtube.com/watch?v=Inu5VhrO1rE


"Far more capable" is an exaggeration.

I use both, they are by and large substitutable. Nginx has a much larger knowledge base and ecosystem, the main reason I stick with it.


I agree with you that they're more or less equal. I don't like the idea of my reverse proxy dealing with letsencrypt for me, personally, but that's just a preference.

One tricky thing about nginx though, from the "If is evil" nginx wiki [0]:

> The if directive is part of the rewrite module which evaluates instructions imperatively. On the other hand, NGINX configuration in general is declarative. At some point due to user demand, an attempt was made to enable some non-rewrite directives inside if, and this led to the situation we have now.

I use nginx for homelab things because my use-cases are simple, but I've run into issues at work with nginx in the past because of the above.

[0] https://nginx-wiki.getpagespeed.com/config/if-is-evil
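For a concrete taste of the footgun, this is the minimal example from that same wiki page: only the last matching `if` block's configuration is inherited, so the first header silently disappears.

```nginx
# From the "If is Evil" wiki page: only the X-Second header is
# present in the response, despite both conditions being true.
location /only-one-if {
    set $true 1;

    if ($true) {
        add_header X-First 1;
    }

    if ($true) {
        add_header X-Second 2;
    }

    return 204;
}
```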


I'm not sure why Apache is so unpopular; it can also function as a reverse proxy and doesn't have the weird configuration issues nginx has.

Some people take this way too far. For instance, I've seen places compiling (end-of-life) ModSecurity support into nginx instead of using the web server it was built for.
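For reference, the Apache equivalent of the reverse-proxy setup upthread is similarly short (hostname and port are placeholders; requires mod_proxy and mod_proxy_http):

```apache
# Sketch -- one VirtualHost per app, matched on the Host header
<VirtualHost *:80>
    ServerName foo.home.example.com
    ProxyPreserveHost On
    ProxyPass        / http://127.0.0.1:8080/
    ProxyPassReverse / http://127.0.0.1:8080/
</VirtualHost>
```

`ProxyPassReverse` rewrites Location headers in redirects from the backend so they point at the proxy, not the internal address.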


Just as one small example: if you're deploying in k8s and want the configuration external to Nginx, you want built-in certificate provisioning, and you need to run middleware that can easily be routed in-config...

Traefik is far more capable, for example. If all you're doing is serving pages, sure.


The part you are leaving out is that you also need to set up something like a pihole (which you can just run in a container on the homelab rather than on a pi) to do the local DNS resolution.

IME Android devices don't respect static routes published by the router. I guess self-hosting DNS might be more robust, but I usually just settle for bookmarking the ip:port.

This (reverse proxy) is essentially what "tailscale serve" does.

Or just use Tailscale serve to put the app on a subdomain

Maybe a more accurate take: Half-assed soft deletion definitely isn't worth it.

If you're just going to throw in some deleted bool or deleted_at timestamp without thorough testing, you might as well just skip it. It's virtually certain to go wrong.
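One classic way a bare `deleted_at` goes wrong is its interaction with unique constraints: a soft-deleted row still blocks re-registration, and every read path has to remember the filter. A small self-contained sketch with SQLite (schema and names are hypothetical; in Postgres the partial-index fix is the same idea):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    deleted_at TEXT  -- NULL means "live"
);
-- A plain UNIQUE(email) would let soft-deleted rows block re-registration.
-- A partial unique index only constrains live rows:
CREATE UNIQUE INDEX users_email_live ON users(email)
    WHERE deleted_at IS NULL;
""")

db.execute("INSERT INTO users (email) VALUES ('a@example.com')")
db.execute(
    "UPDATE users SET deleted_at = datetime('now') WHERE email = 'a@example.com'"
)
# Re-registering the same email now works, because the old row is soft-deleted:
db.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# Every read path must remember the filter -- the easy thing to forget:
live = db.execute(
    "SELECT COUNT(*) FROM users WHERE deleted_at IS NULL"
).fetchone()[0]
total = db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(live, total)  # 1 2
```

Any query that omits the `deleted_at IS NULL` predicate silently resurrects "deleted" data, which is the kind of thing thorough testing has to cover.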


I've been using Firefly for over two years now. Data entry has not been a problem:

* Run the data-importer as a separate container. Takes maybe 20 minutes to configure correctly if you already know docker and docker-compose.

* Download the transactions as CSVs from each of my 5 or so credit card and bank accounts that get regular use.

* Upload them through the data-importer. It lets you configure and save settings for each CSV format. I took five minutes to set that up the first time I imported CSVs and just keep using the same settings.

I upload all of my transactions once a month and it takes about an hour to download them, import them, and categorize all of them (I also have a bunch of rules to auto-categorize, but there are inevitably a dozen or so bespoke transactions).

I've found that any of the solutions built on top of syncing backends like Plaid inevitably have issues: duplicates, missing txs, debit/credit mixed up. I even built my own custom Plaid-to-Firefly syncer at one point and found the data quality was very mixed, even when all my accounts are at major US banks. The data-importer takes some more up-front work, but it's more secure, way more predictable, and generally a solved problem.


I just never could bring myself to enter my bank password into Plaid.

Sure “everyone does it”, but most banks have disclaimers in their terms of service that if you lose money because your password was compromised through password sharing they aren’t liable for that loss.

I instead built a series of Playwright scripts to automate signing into my bank accounts and downloading the CSVs, then importing (into Lunch Money, but I might take a look at this later as an option).


Chase is the only US bank I know of that provides a proper API for Plaid to connect to (Plaid redirects you to chase.com for login, you acknowledge, and then Plaid gets an OAuth token for read-only access). I don't have the impression that Chase is particularly tech-savvy compared to other banks, and I wonder why the others can't support this.


The EU has an open banking API regulation that requires banks to make such an API available, without having to do password sharing.

https://www.bloomberg.com/professional/blog/europes-new-api-...

There's no reason (beyond the general dysfunction of our legislature) that the US couldn't pass a similar law. Failing to do so is holding back our banking industry and making us all dramatically less secure.

We desperately need to mandate banks provide this kind of access. It's absolutely ludicrous that I don't have a way to give various services READ-ONLY access to my bank account information. It seems intolerable to me that I would have to give WRITE access to Plaid for them to enable someone to categorize my transactions.


What a nuisance


I appreciate the breakdown but I think you’re out of touch with how easy auto-import has become for most PF software users. They just see whatever new transactions have come in, as often as they care to look at their budgets. Like checking email.


Maybe out of touch. I've used them all (Mint, ynab, pocketsmith, buxfer, some google-sheets-based app, probably tried a dozen others). All of the auto-sync features had obvious bugs. It works for a few months and then you end up with junk data. But if it works for you, carry on!


Contra anecdote: I've been using auto-import with YNAB for years with multiple accounts and never had junk data. At least, not if that means the data is wrong/corrupt.

The two issues I have had:

1. Have to reauthorize bank connection, but that's Plaid's issue I guess.

2. Payee names are all over the place because it's often the raw merchant name. I used to try to fix these and keep a single payee for each entity, but gave up.


Side question, why are merchant names always so cryptic?


My best guess is you have an unusual data source most don't have. I wasn't describing an anecdotal experience but the reliability in aggregate.


this sounds like such a pain in the ass just to do my finances when other tools do it all for me…


I guess it comes down to the tradeoffs you're willing to accept for privacy. I personally find it quite sketchy that Plaid takes your username and password to your literal money and then does some opaque screen-scraping just to grab your transactions. I also spent a lot of time working with their API and it did not inspire confidence.


Not really their fault; the banks' fault.

I have two banks and one CC that use an OAuth type system now.


Sadly he's not some kind of local hero. The area much prefers sports to academics. I studied CS there for undergrad and knew of him but rarely heard his name mentioned or celebrated.


I mean, if you asked random passersby at Stanford (my alma mater) about Donald Knuth, 4 out of 5 wouldn't have heard of him. Or 9 out of 10, more pessimistically. Approximately nowhere are academics hailed as local heroes.


...What? $10 x 1000 = $10k / month. $10k x 12 = $120k. That is a new grad software engineer salary in any US city. You'd pay more than that for a single dev with the devops and security experience to keep GHE running and patched for 1000 devs.


The person was replying to a comment saying they spend more on a SINGLE HOUR of a dev's time than the monthly GH bill, which is not true for an org of more than 20 people or so (depending on hourly rate).


Ah, totally misread it. Thanks.


Just a bone to pick... new grad engineers in my US city started around 60-70k in 2018 when my college cohort graduated. Southern US...


Things have changed considerably over the last four years.


It still isn’t that high except for at a hand few of places and even then they’ll start you off less but give you a total comp that exceeds. Still it isn’t far of the mark (at least in Seattle where I live.)


Yeah, and starting salary for zero-experience developers is not $120k in most places.


Well, considering you'd likely spend an average of 5 minutes per day doing it I wouldn't mind it.


There are a lot of problems with this from the business angle:

(1) An engineer getting paid 120k doesn't "cost" 120k, probably >150k with federal taxes, health insurance, benefits, and so on. Not including the cost to recruit, interview, and train said person.

(2) I don't know of many 1,000 person companies that would trust a new grad software engineer with no experience to manage critical infrastructure.

(3) You need N engineers to manage said service, because what happens when your one engineer gets sick, takes PTO, or quits for some reason? You also need a manager for said engineer(s).

(4) You now need to secure an internal service you never did before, so expect to have to hire external security consultants or re-allocate security engineers, since it's high risk.

(5) Github is FedRAMP compliant, SOC1 and SOC2 compliant and GDPR compliant. If you or your customers need any of those things, expect to hire external auditors on a recurring basis to validate your home-grown solution meets those requirements.

I hate to make these points because I'm a big believer in the scrappy startup mentality, but if you want to do things right, in the context of a large enterprise that is accountable to a lot of people, expect a project like this to cost $1MM per year minimum, and it probably won't reach parity with a cloud offering in terms of reliability, multi-region performance, proper backups, and so on. This is why Github can charge ~$200 per user (or $200k per year for 1,000 seats) and still come away looking like a bargain.


I've personally had some painful experiences with refreshing materialized views in Postgres. In particular, highly variable performance on read replicas that were receiving a refreshed matview every few minutes. Maybe we were just doing it wrong, but I tend to avoid it if I can. Plus the eventual consistency can introduce confusion.

In any case, there's an interesting feature called Incremental View Maintenance that is being worked on by some Postgres developers: https://wiki.postgresql.org/wiki/Incremental_View_Maintenanc...

This would let us define a materialized view that gets automatically updated as the source tables change. When I last checked (late 2021), they were saying it might land in PG15.
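For contrast with what IVM would automate, the manual status quo looks roughly like this (table and column names are hypothetical):

```sql
-- Today: a materialized view must be refreshed explicitly.
CREATE MATERIALIZED VIEW order_totals AS
SELECT customer_id, sum(amount) AS total
FROM orders
GROUP BY customer_id;

-- A unique index enables REFRESH ... CONCURRENTLY, which avoids locking
-- readers out during the refresh (relevant to the replica pain above):
CREATE UNIQUE INDEX ON order_totals (customer_id);

REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals;
```

Even with CONCURRENTLY, each refresh recomputes the whole view and ships the changes to replicas, which is where incremental maintenance would help.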


This is a really interesting area, would love it if Postgres provided a strong option here. In the meantime, Materialize has very good Postgres integration if it might work for you:

https://materialize.com/


Thanks for the feedback -- sorry I missed this a couple days ago! Had no clue it got posted to HN.

FWIW, I've found GIN is a bit faster if you're just looking to filter. IIRC it was maybe 10-15% faster for the particular use-case I was looking at. So, worth a try, but don't expect a 10x improvement.
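A sketch of the GIN variant being described, with a hypothetical schema (the same expression must appear in both the index and the query for the planner to use it):

```sql
-- GIN index over a tsvector expression, for filter-style full text search:
CREATE INDEX posts_body_gin ON posts
USING GIN (to_tsvector('english', body));

SELECT id FROM posts
WHERE to_tsvector('english', body)
      @@ plainto_tsquery('english', 'reverse proxy');
```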


Yeah it's a shame the concat_ws function doesn't quite work here.

I've had success using Slick, an ORM-ish Scala library, to abstract away this tedious concatenation in app code.


Thanks for the feedback -- sorry I missed this a couple days ago! Had no clue it got posted to HN.

I've deployed a solution that uses roughly this same method with multiple tables. I experimented with a materialized view that would centralize all the text columns, but ultimately found that it was much simpler and fast enough to have a single expression index in each of the tables. Then the query either joins the tables and checks each column, or you can run a separate query for each of the tables and stitch together the results in app code.
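A sketch of that per-table approach, with hypothetical tables and columns (`coalesce` guards against NULLs nulling out the whole concatenation):

```sql
-- One expression index per table over its searchable columns:
CREATE INDEX users_search_idx ON users USING GIN (
    to_tsvector('english', coalesce(name, '') || ' ' || coalesce(bio, ''))
);

CREATE INDEX posts_search_idx ON posts USING GIN (
    to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
);

-- Query each table with its matching expression; stitch in app code,
-- or combine in one statement:
SELECT 'user' AS kind, id FROM users
WHERE to_tsvector('english', coalesce(name, '') || ' ' || coalesce(bio, ''))
      @@ websearch_to_tsquery('english', 'search terms')
UNION ALL
SELECT 'post', id FROM posts
WHERE to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
      @@ websearch_to_tsquery('english', 'search terms');
```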

