rossdavidh's comments

Even if the engagement hours are massive, if they're not growing at the rate they claim, their stock price is not justified. They have not made a profit with their existing "massive" traffic, so the prospect of growing into something that is profitable is key to the case for their current stock price.

I have to agree on the other half of the headline, although if they're already unprofitable then there is a legit question of whether or not they can afford the amount of human oversight that every large platform (let alone a kid-oriented one) is going to require.


So, I think most devs would say it helps them with certain kinds of tasks: for example, getting the general layout of a given kind of function, that sort of thing. This makes it feel like you're faster. The problem is, this is not normally the rate-limiting step.

The rate-limiting step for when a software project completes is usually figuring out the architectural issues. Huge schedule misses usually happen because something about the big picture was not understood, a lot of work was done making the wrong thing, and it then needed to be undone or thrown away once the problem was realized.

It's similar to how being a faster typist is nice and all, but it doesn't usually speed up your coding that much, because most of the time is not spent typing. Typing speed is not your bottleneck.

Having said that, I think it's too early to say this couldn't be a useful tool. For example, I have heard experienced devs say that AI assists can be very helpful when they have to switch languages to one they are less familiar with. Over time, it is plausible that AI assistants will make dev teams less likely to make the mistake of trying to cram the entire project into one language, when really it would go better if (to give one example) the queries were in SQL, the frontend was in vanilla javascript, the machine learning was in python, and the rest of the backend was in Go. Currently, teams are not willing to do that, and are more likely to (mistakenly) attempt it all in javascript, or all in python. It could be the case that in a couple of years, as devs learn how AI coding assistants are best used, they change how they work, and only then see the advantages.

Or, you know, it could all be a bunch of hooey. But, maybe not, and it will take a while to see.


Funny, almost everything he said about htmx made me think, "this htmx sounds interesting, I should check it out".

SEEKING WORK | Austin, TX or remote

10+ years experience in python, both scientific (numpy, pandas, matplotlib, statsmodels, jupyter) and web backend (django, flask, sqlalchemy). Also experienced in R (experiment design, linear and nonlinear analysis, graphic presentation of data). Previous experience in experimental design and analysis when I was a manufacturing engineer in semiconductors. Able to work W2 or 1099, remote or in-person (as long as it's in Austin). Willing to do full or part-time.

https://www.rosshartshorn.net/RossHartshornResume.pdf


Location: Austin, TX, USA

Remote: Either/or

Willing to relocate: No

Technologies: Python (scientific stack or web), R, jupyter, statistical design of experiments (DOE)

Résumé/CV: https://www.rosshartshorn.net/RossHartshornResume.pdf

Email: ross.hartshorn@gmail.com


I have, many, many times, seen a large organization "crash" a project, that is to say, put a bunch of new developers onto it. It almost never works. The first thing that happens is that everything stops while the existing devs tell the new ones what is going on and what needs to be done.

The second reason it doesn't work is that splitting the app into separate parts that can be worked on in parallel is essentially determining the software architecture. If you haven't sorted that out, then the different parts will get built, not work together, and then there will be a sh*%storm of blaming each other for why they don't work together.

There are cases where new devs can help, but if you don't have the overall architecture sorted out yet, then they will not, and if you do then there is a finite pace at which new devs can be integrated and brought up to speed.

The old analogy to pregnancy is, in fact, spot on.


I'm having a hard time figuring out if this is satire, or "for real". Which probably says something about the moment we live in.


They are, but their biggest expense is probably cloud compute, and most of the "investment" that Microsoft made in them was in the form of cloud compute credits. Essentially, Microsoft has spare cloud capacity, doesn't want to admit that to Wall Street, and so covers that up by giving away the extra capacity in the form of an "investment".

Now, in the end, it likely won't amount to much, but if it keeps Microsoft stock up for a while, it may pay off for the executives involved. Apparently not OpenAI execs, though...


"Microsoft has spare cloud capacity" definitely not, if they're actively building out new datacenters to keep up with demand.


> ... doesn't want to admit that to Wall Street, and so covers that up by giving away the extra capacity in the form of an "investment".

Is this just a Wall Street thing? I'm betting that this is an excellent tax shelter as well.


Good point


Well I saw plenty of commercial software failing at this kind of thing even a few years ago, when presented with real-world pictures. Seems to be solved now, though.


"humanity discovered an algorithm that could really, truly learn any distribution of data (or really, the underlying “rules” that produce any distribution of data)..."

This statement is manifestly untrue. Neural networks are useful, many hidden layers are useful, all of these architectures are useful, but the idea that they can learn anything is based less on empirical results and more on what Sam Altman needs to convince people of to get these capital investments.


100%. If they could learn anything, then shouldn't modern ML systems be able to solve the big mysteries in science -- since we have large datasets describing those phenomena in various ways? E.g. dark energy, dark matter, matter-antimatter asymmetry, or even outstanding problems in pure mathematics.

The intention of this sama post is, as you said, to build a narrative so he can raise his trillion from the Arab world or other problematic sources.

In pseudocode, this is Sam Altman:

while(alive) { RaiseCapital() }


Well, you could certainly train a big-ass model to mimic the distribution of all that physics data. That doesn't mean the model could, e.g., formulate interesting new theories which explain why that distribution has its particular structure.
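To make that concrete, here's a toy sketch (a plain Gaussian fit standing in for the big-ass model; the data and names are made up for illustration). It reproduces the shape of the data just fine, while containing zero explanation of the mechanism behind it:

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(loc=2.0, scale=0.5, size=10_000)  # stand-in for "physics data"

    # Maximum-likelihood fit: learns the *shape* of the distribution...
    mu_hat, sigma_hat = data.mean(), data.std()

    # ...and can mimic it by sampling...
    fake_data = rng.normal(mu_hat, sigma_hat, size=10_000)

    # ...but (mu_hat, sigma_hat) say nothing about *why* the data looks this way.
    print(mu_hat, sigma_hat)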


The emperor truly has no clothes here.

Six months ago, he probably could have gotten away with saying this, and there would likely have been enough people still impressed with the trajectory of LLMs to back him on it. But these days, most of us have encountered the all-too-common failure mode where the LLM shows its hand: it doesn't truly understand anything, it's just _very very good_ at prediction. Each new generation gets even better at that prediction, but still hits its weird stumbling points, because it's still the same algorithm, and that algorithm cannot do what he is ascribing to it.

These are the words of a man who has an incredible amount of money sunk into something and as such, is having a really hard time taking an honest accounting of it.


1. What failure mode do LLMs have that proves they don't understand anything at all? And why can't I prove the same with humans (who have an abundance of failure modes)?

2. You genuinely think that a system whose goal is to predict the data it's given and continues to improve is limited in what it can learn? Of all the shortcomings of the Transformer architecture, its objective function is not one of them.


> What failure mode do LLMs have that proves they don't understand anything at all?

Try to get it to write something in a programming language not commonly used on the internet, say Forth or Brainfuck, with only the specifications of said languages. Humans are able to grasp the laws of reality through a model and use it to act upon the real world.

> You genuinely think that a system whose goal is to predict the data it's given and continues to improve is limited in what it can learn?

Not GP, but image generators have ingested more images than I've seen in my life and still can't grasp basic things like perspective or anatomy, things that people can learn from a book or two. And there is already software that models both.


>Try to get it to write something in a programming language not commonly used on the internet, say Forth or Brainfuck, with only the specifications of said languages. Humans are able to grasp the laws of reality through a model and use it to act upon the real world.

My experience with this has been SOTA LLMs generating sensible code at rates much greater than random chance, even if it may not be as good as I'd like. I don't see how that is evidence LLMs don't understand anything at all, especially since there are probably humans who would write less workable code.

>Not GP, but image generators have ingested more images than I've seen in my life and still can't grasp basic things like perspective or anatomy.

The human brain didn't poof from thin air. It's the result of billions of years of evolution tuning it for real-world navigation and vision, amongst other things. You are not a blank slate. All modern NNs are much more of a blank slate than the brain has been for at least millions of years.


You're moving the bar. In fact, this bar is so laughably low, I don't know that we're having the same conversation anymore.

Nobody's saying it can't write "sensible code at rates much greater than random chance." We're not competing with an army of typing monkeys here. We're saying it actually doesn't "know" anything, and regularly demonstrates that quality, despite it seeming very much like something that knows things, most of the time. You're being tricked by a clever algorithm.

> All modern NNs are much more of a blank slate than the brain has been for at least millions of years.

All well and good if we were talking about interesting research and had millions of years to let these algorithms prove themselves out, I suppose. But we're talking about industries that are being created out of whole cloth and/or destroyed, depending on where you stand, and the time frame is in single-digit years, if not less. And these things will still confidently make elementary mistakes and get lost in their own context.

Look, they're obviously not useless, but they're a tool with weaknesses and strengths. And people like pg who act like there ARE no weaknesses, or like a simple application of will and money will erase them, are selling us a bill of goods.


>We're saying it actually doesn't "know" anything

Yeah, and I'm saying this is a nonsense statement if you can't create a test (one that would also not disqualify humans) that demonstrates this. If you are saying what LLMs do is "fake understanding", then "fake understanding" should be testable, unless you're just making stuff up.

>All well and good if we were talking about interesting research and had millions of years to let these algorithms prove themselves out, I suppose

Did you even read what the commenter I replied to was saying? This is irrelevant. We don't need to wait millions of years for anything.


I believe “learn any distribution of data” is his attempt at describing the Universal Approximation Theorem to the layman.


Almost certainly true, and all the people crapping all over his description should really take a step back and consider that. He isn't out on some island all by himself here.


Universal approximation doesn't mean we have (or ever will have) algorithms to learn good-enough models for any problem, or the resources to run them; it just means those models conceptually exist.


If you read what he wrote closely, he doesn't claim what you just claimed. Read it word for word:

"humanity discovered an algorithm that could really, truly learn any distribution of data (or really, the underlying “rules” that produce any distribution of data). To a shocking degree of precision, the more compute and data available, the better it gets at helping people solve hard problems"

He's an optimistic guy, no doubt, but he isn't full of shit.


The entire vocabulary around machine learning is and always has been really weird.

We don’t personify database interactions the same way we personify setting weights in a neural network.


We also personify synapses and axons in human brain tissue, though. My point is, while I agree with your first sentence to a degree, we shouldn’t judge the whole solely by its elementary parts. Clearly an LLM exhibits very different behavior from a conventional database.


LLMs exhibit very similar behavior to a search algorithm.

Text query in -> relevant text out.

I don’t say that search algorithms “learn” or “think” outside of ML.


The algorithm that learns is the training algorithm whose output is the LLM, not the inference algorithm employed when using the finished model.
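A toy sketch of that split (a linear model standing in for the LLM, all names mine): the learning lives entirely in the training loop; inference just applies the frozen weights.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 4))
    true_w = np.array([1.0, -2.0, 0.5, 3.0])
    y = X @ true_w                        # synthetic targets for illustration

    w = np.zeros(4)                       # the "model" before training

    def train_step(w, X, y, lr=0.01):
        # training algorithm: this is the part that learns
        grad = 2 * X.T @ (X @ w - y) / len(X)
        return w - lr * grad

    for _ in range(1000):
        w = train_step(w, X, y)

    def infer(w, x):
        # inference: no learning happens here, just a forward pass
        return x @ w

    print(infer(w, X[0]), y[0])           # close after training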


Sure, but the whole process is referred to as “AI”.

Nobody says the algo that chatgpt was made with learned information which it encoded into weights, even if that’s technically correct.


>Text query in -> relevant text out.

Wait, that's it? I guess humans also exhibit similar behavior to a search algorithm in certain instances. Nothing about LLM inference seems particularly similar to search, even with our limited understanding.

All you're saying here is Input goes in > Output comes out. Well no shit.


I think LLM inference is extremely similar to search.

Killer features of LLMs include summarization (which can be thought of as searching a noisy set of data for the most relevant information) and document QA, which is also a search function.

Even implementation wise, transformers encode information in their weights and include that relevant information in their response.

Image generation models work the same way as google image search. Key word soup in -> relevant image out.

They encode information seen in their training within their weights, then filter those weights down over many layers, and what's left over is the relevant data.

Idk how you’d look at how these models work and what they do and not see search.


It has to be. Most people don't understand the basic math involved, and hence you can't explain it in concrete terms (neither what it's doing nor how it's doing it), so you have to sort of make up analogies. It's an impossible task.


Yeah, maybe it’s just unfortunate that “learn” and “think” are such fuzzy subjects.


> The entire vocabulary around machine learning is and always has been really weird.

I would argue it took a staggeringly weird turn around 2022/23. Machine learning has been around for a long time; it has only truly gone off the rails since OpenAI, with its slavish desire to harness true AI (which, thanks to their horseshit, now has to be called AGI), and Sam Altman in particular's delusional ramblings on a topic he clearly barely understands beyond its ability to get his company fantastical amounts of capital.

I cannot wait to watch this bubble pop.


I don’t think it was just post 2022.

“Neural networks” have “learned” data by being “trained” since they were first described in the late 1900s.

The same language was used in Ian Goodfellow’s (excellent) 2016 textbook “Deep Learning”.


But… he’s saying something here that is academically true: that neural networks can approximate any continuous function, to any arbitrary degree of precision you require (given enough capacity / depth).

https://en.m.wikipedia.org/wiki/Universal_approximation_theo...

I will highlight one thing, which is that the theorem does not say anything about it being practical to learn this function, given available data or any specific optimization technique.
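For reference, the arbitrary-width form of the theorem says roughly this (informally stated; f continuous on a compact set K ⊂ R^n, σ a fixed non-polynomial continuous activation):

    \forall \varepsilon > 0 \;\; \exists N,\ a_i, b_i \in \mathbb{R},\ w_i \in \mathbb{R}^n
    \quad \text{s.t.} \quad
    \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} a_i \, \sigma(w_i^{\top} x + b_i) \Big| < \varepsilon

Note that it is a pure existence statement: it says nothing about how you would find those weights from data.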


'the underlying “rules” that produce any distribution of data...' is clearly meant to convince the reader that it can produce something we would describe as a "rule", that is, a coherent and comprehensible regulating principle. This isn't just because he isn't being precise enough; he quite clearly wants the reader to understand this as neural networks being able to create a mental model of anything, in a manner similar to how a biological neural network would.

It doesn't, it can't, and it won't in our lifetimes.


> the idea that they can learn anything, is based less on empirical results and more on what Sam Altman needs to convince people of to get this capital investments.

Techbros love to pretend that they created digital gods (and by extension are gods themselves). We should all be thankful, worship, and of course surrender unconditionally -- Sam's will be done, amen.


It's a layman's commentary on the Universal Approximation Theorem, so it is true.

The problem is that the UAT has never said anything about how tractable such an exercise would be. But he obviously believes we've stumbled on the architecture to get us there (for the problems we care about, anyway).


Come on now. This description is basically the universal approximation theorem. He isn't just making stuff up. You can take issue with the theorem and have a debate around it, but he isn't wildly off base here.

