
hehdhdjehehegwv · 2024-06-17T06:57:01 1718607421

The phrase “unsustainable gigantic moral hazard” applies to the current pharmaceutical and health industry. You know, the one that killed millions with opioids.

As far as I am concerned, burn it all down and put the most unhinged cokehead VCs in charge, who cares at this point. Hell, add some crypto AI NFTs in the mix while we’re at it.

Literally cannot get worse than it is now.

hehdhdjehehegwv · 2024-06-17T04:05:40 1718597140

Sell NVIDIA, buy buckets of worms, profit.

bugbuddy · 2024-06-17T04:37:13 1718599033

Your next Apple will come with worms inside.

hehdhdjehehegwv · 2024-06-17T05:27:17 1718602037

I guess that would make AppleCare a bucket of compost.

hehdhdjehehegwv · 2024-06-16T01:10:43 1718500243

It most definitely does and I’m serving several read-only production databases from memory. It is insanely fast if you do it right.

https://www.postgresql.org/docs/current/pgprewarm.html
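
As a rough illustration of the linked extension (not the commenter's actual setup), the sketch below builds the SQL that loads relations into PostgreSQL's buffer cache with `pg_prewarm`. The relation names `accounts` and `accounts_pkey` are hypothetical, and the approach only helps if `shared_buffers` is large enough to hold the data.

```python
# Sketch: generate pg_prewarm statements for a set of relations.
# Relation names passed in are placeholders, not from the thread.

VALID_MODES = {"prefetch", "read", "buffer"}  # modes documented for pg_prewarm


def prewarm_statements(relations, mode="buffer"):
    """Return SQL statements that prewarm each relation (table or index)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown pg_prewarm mode: {mode}")
    statements = ["CREATE EXTENSION IF NOT EXISTS pg_prewarm;"]
    for relation in relations:
        # Escape single quotes so the name is a valid SQL string literal.
        literal = relation.replace("'", "''")
        statements.append(f"SELECT pg_prewarm('{literal}', '{mode}');")
    return statements


if __name__ == "__main__":
    for statement in prewarm_statements(["accounts", "accounts_pkey"]):
        print(statement)
```

The printed statements can be run through `psql` after a restart; indexes need to be prewarmed separately from their tables, which is why both appear in the example.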

hehdhdjehehegwv · 2024-06-16T01:05:22 1718499922

The number of societal-wide problems that occur if your Lightroom database gets corrupted is zero.

The amount of hell that would be unleashed if the financial system's layers upon layers of database transactions got broken is impossible to comprehend.

So if you mean “important” as “necessary for society to function”, then no, your browser bookmark files, contact list, or the other two dozen things your laptop and phone use SQLite for are not important.

tibbar · 2024-06-16T01:09:16 1718500156

This is a false equivalence, as one user’s particular usage of SQLite is not comparable to all financial institutions’ databases.

Better to compare to the prospect of all smartphones being irreversibly corrupted at once.

hehdhdjehehegwv · 2024-06-16T01:54:42 1718502882

Literally all smartphones dying at once is preferable because they are all backed up into heavy duty databases. You can restore every phone from the cloud in this thought experiment.

You cannot restore a data center from a pile of phones.

wavemode · 2024-06-16T01:12:08 1718500328

The goalposts keep moving here. You can just take any arbitrary definition of important and use it to exclude SQLite deployments.

It's all semantics, and also irrelevant to the original article anyway, since nowhere does it argue that SQLite holds the most data (important or otherwise).

hehdhdjehehegwv · 2024-06-16T01:18:11 1718500691

I didn’t comment earlier so I can’t move a goalpost I never set down. I agree with the root comment that while there may be more instances of SQLite, the most important data is not in SQLite.

For what it’s worth, I have used any number of databases over the years and SQLite is very good for a number of things.

None of those things are the core infrastructure that stores your emails, money, and other must-have, shared, high-availability data.

There are different tools for different jobs, that’s fine.

hehdhdjehehegwv · 2024-06-15T22:52:13 1718491933

Yeah, I’m busy doing actual work - Ollama acting as a wrapper is exactly what I want.

rvnx · 2024-06-15T22:55:07 1718492107

Basically someone created a rather well-thought-out .sh script (even with a nice GUI) that is super helpful.

It's free to use, and free to re-use, as it is under MIT licence.

Well, just thanks for their work.

hehdhdjehehegwv · 2024-06-16T01:00:35 1718499635

Yeah and the GitHub for .cpp links directly to ollama - it’s literally advertised by the core maintainer as a good project.

hehdhdjehehegwv · 2024-06-14T19:18:41 1718392721

It’s hard not to think this is just the FAA trying to protect Boeing again by making it look like Airbus is equally bad.

The FAA should just be rehoused under the Department of Commerce, where the job is actually to promote and protect American business interests.

At least then we can admit we have no regulatory oversight of aviation safety. Let’s be honest as a country for once.

Jtsummers · 2024-06-14T19:23:23 1718393003

The false provenance was discovered by an Italian company, and then Spirit did their own investigation and found they had titanium from the same supplier with the same issue of false provenance. Spirit notified both Boeing and Airbus. Spirit produces parts for both Boeing and Airbus. This isn't about the FAA helping Boeing cover their asses, this is a real issue that impacts both Boeing and Airbus since the titanium ended up in planes from both companies.

hehdhdjehehegwv · 2024-06-14T19:13:02 1718392382

At least their extortion is polite?

IshKebab · 2024-06-14T19:25:45 1718393145

How is this extortion? 6 months is plenty of time to sort out an alternative if you need to (assuming an alternative is available).

arccy · 2024-06-14T19:42:08 1718394128

going from $0 to $600/month even with discount with 1 day notice feels like extortion...

hyperhopper · 2024-06-14T19:42:53 1718394173

They don't have 6 months. They have 1 day. People might not even be checking their inboxes on this Friday, leading to unannounced charges in an unreasonable timeframe or the extortion of paying or having downtime.

Disgusting behavior.

IshKebab · 2024-06-14T20:58:00 1718398680

That's not correct.

https://news.ycombinator.com/item?id=40684934

hehdhdjehehegwv · 2024-06-14T19:10:12 1718392212

I dropped $5k on an A6000 and I can run llama3:70b day and night for the price of my electricity bill.

I’ve gone through hundreds of millions, maybe billions, of tokens in the past year.

This article is just “cloud is expensive” 101. Nothing new.

EvgeniyZh · 2024-06-14T20:51:54 1718398314

1B of tokens for Gemini Flash (which is on par with llama3-70b in my experience or even better sometimes) with 2:1 input-output would cost ~600 bucks (ignoring the fact they offer 1M tokens a day for free now). Ignoring electricity you'd break even in >8 years. You can find llama3-70b for ~same prices if you're interested in the specific model.
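
The break-even figure above can be reproduced with a quick calculation. This is a sketch under one added assumption: usage of roughly 1B tokens per year, which is a reading of the parent's "hundreds of millions, maybe billions" and not a number stated in the thread. Electricity is ignored, as in the comment.

```python
# Sketch: years until a one-time hardware purchase matches hosted API spend.
# The 1B tokens/year usage rate is an assumption, not a figure from the thread.

def break_even_years(hardware_cost_usd, api_cost_per_billion_usd,
                     billions_of_tokens_per_year):
    """Hardware cost divided by what the same usage would cost per year via an API."""
    yearly_api_spend = api_cost_per_billion_usd * billions_of_tokens_per_year
    return hardware_cost_usd / yearly_api_spend


if __name__ == "__main__":
    years = break_even_years(5000, 600, 1.0)
    print(f"break-even after about {years:.1f} years")
```

With $5k of hardware and ~$600 per 1B tokens this comes out to about 8.3 years; doubling the yearly token volume halves it.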

hehdhdjehehegwv · 2024-06-15T05:24:26 1718429066

I answered the financial thinking in another reply, but another factor is I need to know if the model today is exactly the same as tomorrow for reliable scientific benchmarking.

I need to tell if a change I made was impactful, but if the model just magically gets smarter or dumber at my tasks with no warning then I can’t tell if I made an improvement or a regression.

Whereas the model on my GPU doesn’t change unless I change it. So it’s one less variable, and LLMs are a black box to start with.

I may be wrong for Gemini, but my impression is all the companies are constantly tweaking the big models. I know GPT on Monday is not always the same GPT on Thursday for example.
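
One way to make the "one less variable" point concrete is to store a fingerprint of the model weights with every benchmark run and refuse to compare runs whose fingerprints differ. The sketch below is illustrative only: the record layout (`model_digest`, `score`) and both function names are hypothetical, not something described in the thread.

```python
# Sketch: only compare benchmark scores when the model is byte-for-byte the same.
# The run record fields and function names here are hypothetical.
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a (possibly very large) weights file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def score_delta(baseline, candidate):
    """Return candidate score minus baseline score, or fail if the model changed."""
    if baseline["model_digest"] != candidate["model_digest"]:
        raise ValueError("model changed between runs; scores are not comparable")
    return candidate["score"] - baseline["score"]
```

With a hosted model there is usually no equivalent of `file_digest` to record, which is the uncertainty the comment is describing.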

hereonout2 · 2024-06-14T20:17:18 1718396238

I've worked professionally over the last 12 months hosting quite a few foundation models and fine-tuned LLMs on our own hardware, AWS + Azure VMs, and also a variety of newer "inference serving" type services that are popping up everywhere.

I don't do any work with the output, I'm just the MLOps guy (ahem, DevOps).

You mention expense, but on a purely financial basis I find any of these hosted solutions really hard to justify against GPT-3.5 Turbo prices, including building your own rig. $5k + electricity is loads of GPT-3.5 Turbo tokens.

Of course none of the data scientists or researchers I work with want to use that though - it's not their job to host these things or worry about the costs.

hehdhdjehehegwv · 2024-06-15T05:16:00 1718428560

So my main motivation is not so much to have the lowest cost, but to have the most predictable cost.

Knowing up front this is my fixed ML budget gives me peace of mind and gives me room to try stupid ideas without worrying about it.

Whereas in the cloud you can a) get slammed with some crazy bill by accident, b) have to think more about what resources testing an idea will take, or conversely c) get GPU FOMO and think “if I just upgrade a level all my problems will be solved”.

It works for me. Everybody's mileage varies, but personally I like to budget, spend, and then totally focus on my goals and not my cloud spend.

I’m also from the pre-cloud era, so doing stuff on my own bare metal is second nature.

logicallee · 2024-06-14T19:53:27 1718394807

Super cool, thanks for sharing. Do you mind sharing what you used the hundreds of millions (or billions) of tokens on?

hehdhdjehehegwv · 2024-06-15T05:04:42 1718427882

Doing really nuanced classification of documents at very large scale. Needle in the haystack type problems.

elorant · 2024-06-14T20:29:08 1718396948

Is this at 4-bit quantization? And how many tokens per second is the output?

hehdhdjehehegwv · 2024-06-15T05:07:06 1718428026

I’m doing non-interactive tasks, but in terms of the A6000 running llama3 70b in chat mode it’s as usable as any of the commercial offerings in terms of speed. I read quickly and it’s faster than I read.

brcmthrowaway · 2024-06-14T19:52:40 1718394760

How's your ROI?

hehdhdjehehegwv · 2024-06-14T20:09:23 1718395763

Absolutely phenomenal.

brcmthrowaway · 2024-06-15T00:33:37 1718411617

Are you using it for trading?

hehdhdjehehegwv · 2024-06-15T05:00:28 1718427628

Nope, powers some low-level infrastructure-ish stuff.

hehdhdjehehegwv · 2024-06-14T19:04:12 1718391852

This is why you need proactive privacy evaluations before you ship.

The standard of the past 25 years of “let’s violate every privacy law and know it won’t catch up with us” is over.

You either ship a privacy-compliant product, which means painful and slow review and adjustment which is an obvious financial cost…OR you go to market out of compliance, get slammed by the press and regulators, and the entire project eats shit.

What seems like short-term cost saving is really just torching the entire investment.

The underlying reason Boeing planes fall out of the sky and these privacy-hostile products fail is the same: speed and greed.

hehdhdjehehegwv · 2024-06-14T06:50:37 1718347837

I read that article earlier, but it didn’t mention Boeing so I wasn’t questioning it.

But where Boeing goes, bad things follow.