Hacker News | dbuser99's comments

I don’t get it. This whole thing says a single writer does not scale, so we stopped writing as much and moved reads away from it, so it works OK and we decided that’s enough. I guess that’s great.


This article has very little useful information...

There's nothing novel about optimizing queries, sharding and using read replicas.


It has one piece of useful info: their main data store even for 800M users is a single instance of postgres (for writes) without sharding.


The post tells you there is a single point of failure: if you wanted to DDoS OpenAI, you'd target write-heavy operations.

For that reason, I find it actually bold that they disclosed it, and I appreciate it.

The article reminded me of a similar post about MySQL use for Facebook from the Meta team, which had the same message: big database servers are powerful workhorses that scale and are very cost-effective (and simpler to manage than distributed setups where writes need to be carefully orchestrated, a very hard task).

The two core messages of both articles combined could be read as: 1. big DB servers are your friend and 2. keep it simple, unless you can't avoid the extra complexity any more.


What Facebook post are you referring to? Generally speaking, Facebook's MySQL infra has been heavily sharded for a very long time, and doesn't rely on abnormally-beefy servers. It's basically the complete opposite approach of what OpenAI is describing here.


when I joined twitter in 2011 there was a single mysql master user (not tweets) database and a few dozen read replicas. it was writing about 7000 updates per second and during bursts it would go too high for the single-threaded replication in mysql at the time to keep up with the master which would cause replication lag and all kinds of annoying things in the app. you just have to pick the right time to make the switch before it is an emergency.


Postgres setups are typically based on physical replication, which is not an option on MySQL. My testing shows the limit to be about 177k tps with each transaction consisting of 3 updates and 1 insert.


Be careful. During consulting I ran into a similar magnitude of writes for a mostly CRUD workload.

They had huge problems with VACUUM at high tps. Basically the database never had room to breathe and clean up.


Sometimes it is convenient that there are no backups. Just saying…


So Sam was getting paid, possibly egregious amounts, while lying to Congress?


VC huckster lies to the public, news at 11.


AI has to be the solution to everything to justify the kind of investments going into it right now.

Sam’s a savvy businessman so he obviously understands that and goes a few steps further. He promises exponential returns and addresses any regulatory and societal concerns. This piece is strategically crafted, not for us, but for investors.


Don’t think you will become as rich as Musk, Sam


100% agreed. If you read this piece through the lens of an SV executive or VC it makes perfect sense and serves the interests of all parties. Content? Who cares


What are you on about? They publish their research advancing the field. And Gemini has caught up with OpenAI and anybody else.


I am glad they are advancing the field but I think it's unfortunate that doesn't make them top dog. Gemini is not top tier to me but I admit that confusing naming and spotty worldwide rollout might be a reason why I am not familiar with their best model. But that's a signal on its own.

The launch was faked and I don't think the real thing is here yet https://techcrunch.com/2023/12/07/googles-best-gemini-demo-w...


Based on this comment, decided to try out Gemini.

Total disaster. Doing similar tasks to OpenAI and Claude, it just borks. And it is complaining about my desire to use a gender guesser Python library, and tells me that's inappropriate for non-binary people, and it won't do it.

That's fun.

Edit 1: Also it refuses to print the entire script. I've tried many workarounds, it seems to only want to output a very small number of total lines.

Threw it into ChatGPT and immediately it fixed all the issues with Gemini, and worked on first try.

Edit 2: The only thing better about Gemini as far as I can tell, is that the copy code button is on the bottom. ChatGPT's is at the top, and that's dumb.

Edit 3: I'm being downvoted heavily now, to be clear, I didn't intentionally seek out the gender issue, it's just what I was working on.

I'm currently trying to generate infographics based on wrestlers, and I needed to split the men from the women for championship title rankings.

I have no problem with it in general, it just came up, so I communicated it.

Multiple times Gemini removed the code using the gender guesser library because it felt I shouldn't use it. When trying to determine wrestlers, and their Title Chances, it makes a lot of sense...

But Gemini just refused to allow me to use it, which seems like a ridiculous thing. I want to make the choices here.


I've had the same exact experiences.


Problem with Google summed up. Ethics and pseudoscience folks wanting to impose their opinions on technology. That's akin to a kitchen knife refusing to cut gift wrapping paper because that's an inappropriate use of a knife. The silliness.


The problem with Gemini is the guardrails they've built into it, which make it useless for me. That's a problem that has to do with Google and not any AI smarts.


Financial rewards. Money.

Investors could not take the risk of disturbing the value of their investment.

It is that simple


Man. No wonder OpenAI is nothing without its people


In this particular case of sharding a PostgreSQL solution, in my opinion, the parent is right. Any major cloud provider would give companies of their scale assistance. This is their bread and butter. The post likely omits a requirement to stay on AWS, but we don’t know that they didn’t talk about that. Likewise, Cockroach or Yugabyte were also available options.

