Hacker News | FairlyInvolved's comments

I realise no one is infallible but do you not think Daniel Kokotajlo's integrity is now pretty well established with regard to those incentives?


We are going to scale up GPT-4 by a factor of ~10,000 and that will result in getting an accurate summary of your daily schedule?


Unfortunately, with the way scaling laws are working out, each order-of-magnitude increase in compute only makes models a little better.

Meaning nobody will even bother to 10,000x GPT-4.
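A rough sketch of why (my own illustration; the constants below are the published Chinchilla point estimates, quoted from memory and used purely for illustration): compute-optimal fits predict loss falling only as a small power of parameters and data, so each 10x of compute buys a modest loss reduction.

```python
# Chinchilla-style scaling-law fit (illustrative constants):
#   L(N, D) = E + A / N**alpha + B / D**beta
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(n_params, n_tokens):
    """Predicted pretraining loss for N params trained on D tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Compute-optimal scaling grows params and tokens roughly together,
# so ~10,000x compute is ~100x params and ~100x tokens.
base = loss(70e9, 1.4e12)              # Chinchilla-scale run
big = loss(70e9 * 100, 1.4e12 * 100)   # ~10,000x the training compute
print(f"{base:.3f} -> {big:.3f}")      # loss creeps down, it doesn't plummet
```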


If we’re lucky.


There's a pretty good summary here of how well it has held up, broken down by the significance of each claim:

https://www.lesswrong.com/posts/u9Kr97di29CkMvjaj/evaluating...


Outside of a few retail investors, institutional buyers are not so naïve as to take a loss-making company paying out a dividend as a sign of good financial health.

Long Intel but as Dylan (and many others) noted they should have stopped the dividend a year ago.


Writing down a merger/acquisition is probably the main scenario where this could extend to the billions.

I don't have an exact list but I think you'll see differences of that order in the years following big acquisitions like AMD / Xilinx for this sector.


I don't think debris 'going higher' is much of a problem. Whenever this happens the orbit becomes more eccentric - the periapsis stays at or below the original altitude - and the drag there will cause a relatively rapid deorbit.
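A quick back-of-the-envelope check of this (my own sketch; the 400 km starting altitude and 100 m/s kick are illustrative): a prograde velocity change on a circular orbit raises the apoapsis, but the kick point becomes the periapsis, so the debris keeps dipping back to the collision altitude where drag acts every revolution.

```python
import math

MU = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6     # mean Earth radius, m

def boosted_orbit(alt_m, dv):
    """Apoapsis/periapsis altitudes (m) after a prograde kick `dv` (m/s)
    applied on a circular orbit at altitude `alt_m`."""
    r = R_EARTH + alt_m
    v = math.sqrt(MU / r) + dv          # speed after the kick
    a = 1.0 / (2.0 / r - v * v / MU)    # new semi-major axis, from vis-viva
    apoapsis = 2.0 * a - r              # the kick point is the new periapsis
    return apoapsis - R_EARTH, r - R_EARTH

apo, peri = boosted_orbit(400e3, 100.0)
print(f"apoapsis {apo/1e3:.0f} km, periapsis {peri/1e3:.0f} km")
```

The apoapsis jumps by hundreds of km, but the periapsis is still down in the drag regime.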

On the second point, about parabolic orbits, I also think the risk is relatively low: there's only a fraction of an orbit in which a collision could occur, so unless the debris field were massive the chance of another collision is probably still small. Remember that when we model orbital collisions we are normally talking about 25+ years - 100,000+ orbits.

I think the main problem is busy orbits (e.g. sun-synchronous polar orbits at popular altitudes), where most of the debris from an acute collision remains in roughly the same orbit but has a lot of other potential collision targets. Also, as satellites are disabled by collisions they lose the ability to avoid other objects already in the same crowded orbit - i.e. the fraction of objects able to take avoiding action decreases, increasing the chance that a future collision is between two incapacitated satellites, with no possibility of avoidance.
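A toy model of that last effect (my own illustration, not from any debris-modelling literature): if a fraction f of objects in a crowded shell can still manoeuvre and colliding pairs are drawn at random, the share of collisions where *neither* object can dodge is (1 - f)^2, which grows quickly as f falls.

```python
def unavoidable_share(f):
    """Fraction of random collision pairs where neither object can manoeuvre,
    given a manoeuvrable fraction f of the population."""
    return (1.0 - f) ** 2

for f in (0.9, 0.7, 0.5):
    print(f"manoeuvrable fraction {f:.0%} -> unavoidable pairs {unavoidable_share(f):.0%}")
```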


I think that's just a scaling issue; fundamentally there's no reason why a model trained on video couldn't come to create coherent motion, in the same way that image models can now produce coherent lighting/themes.

Smaller image models had the same problems with logical inconsistency simply because they didn't have a sufficient general understanding of visual concepts.

The same is almost certainly true of video - early, smaller models will likely create janky movements/motion, but once they've seen enough video to understand how a person walks, how a scene is framed, etc., there's no reason we couldn't get to the same level of maturity as today's image models.

I think the real issue will come from labelling - most video is only going to be labelled with basic info/captions, without detailed descriptions of camera pans or the movement of subjects. The amount of text required to accurately describe a scene is much larger than for a still image, and I'm not sure how one would go about collecting it.


I kind of get the sentiment about openness but I think it's way more nuanced than you are making out.

There are very good reasons for withholding SOTA models, primarily from the info hazard angle and avoiding escalating the capabilities race which is basically the biggest risk we have right now.

Google / Deepmind have actually made some good decisions to try and slow down the race (such as waiting to publish).


They're not slowing down anything. The cat's out of the bag.

What good does a few months lag do when nobody is bracing for impact?


I'm not saying they are doing a good enough job, but that doesn't mean their approach is entirely without merit.

Even ignoring the infohazard angle if they published everything immediately that would escalate the race. By sitting on their capabilities and waiting for others to publish (e.g. PaLM, Imagen vs GPT-3, DALL-E) they are at least only playing catch up.


Capabilities race, seriously? This is not nuclear warfare my guy. It's mathematics.


Nuclear warfare is much less concerning than misaligned AI.

Take a look into scaling laws and alignment concerns; this is a very real challenge and existential risk, not some crackpot theory.


In the same sense that deep learning is just linear regression with a steroid problem.


Information warfare is pretty dangerous too!


I wouldn't necessarily say a major breakthrough as such, but I do think some architectural change is needed. There are concepts in images that we don't rely on purely visual understanding for - like words: we fall back on a language model when we see text in images. I think we need the same thing in our models to reach the next level of capability, by combining models across different domains. I don't know if this manifests as pre-training with a language model and then expanding and updating the tensors as part of image training, or as some more complicated merger of the models.

To learn logical concepts purely from images seems entirely impractical - we can't rely on having enough images for models to understand words coherently as language. You could draw a picture of a sign that says "children crossing" not because you remember exactly what an image of such a sign looks like, but because you have an understanding of English and the character set that lets you reproduce it. If you tried to create the same sign in Arabic you'd either need to see a huge number of signs to learn from or (more likely) build a language model for Arabic.

The kind of abstract understandings that we know we can train into language models just aren't learned by image transformers at this scale (or likely any practical scale). A language model could easily understand "A red cube is stacked on top of a blue plate, a green pyramid is balanced on the red cube" and infer things like the position of the pyramid relative to the blue plate; image models quickly fall over on such examples.
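That kind of inference is essentially symbolic; a sketch of the reasoning step a language model performs but an image transformer doesn't (my own toy example, using the scene from the comment above):

```python
def resting_on(relations, obj):
    """Return everything `obj` sits above, directly or transitively,
    given a mapping of object -> list of things it rests on."""
    below = set()
    stack = [obj]
    while stack:
        for support in relations.get(stack.pop(), ()):
            if support not in below:
                below.add(support)
                stack.append(support)
    return below

scene = {
    "red cube": ["blue plate"],      # red cube stacked on the blue plate
    "green pyramid": ["red cube"],   # pyramid balanced on the red cube
}

# Infers the pyramid is above the plate, though that's never stated directly.
print(resting_on(scene, "green pyramid"))
```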

An interesting nascent (and hacky) example of the benefits of combining models is people are using language models like GPT-3 to create better prompts for image models.
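The pattern behind that hack looks something like this (a hypothetical interface of my own - real setups call an actual LLM API where the stub sits):

```python
def enrich_prompt(short_prompt, llm):
    """Use a language model to expand a terse image prompt before it
    goes to the image model. `llm` is any callable: instruction -> text."""
    instruction = (
        "Rewrite this image prompt with detailed style, lighting and "
        f"composition notes: {short_prompt}"
    )
    return llm(instruction)

# Stub standing in for a GPT-3-style completion call.
def stub_llm(instruction):
    return instruction.split(": ", 1)[1] + ", golden hour lighting, 35mm, highly detailed"

print(enrich_prompt("a red fox in snow", stub_llm))
```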


And N7 is almost certainly cheaper (holistically) than Intel 7; from an economic perspective, AMD have done more with less.

