Hacker News | 6gvONxR4sf7o's comments

> I think this should be _easier_ than producing a realistic image from scratch

Think of this in terms of constraints. An image from scratch has self-consistency constraints (this part of the image has to be consistent with that part) and it may have semantic constraints (if it has to match a prompt). An animation also has the self-consistency constraints, but also has to be consistent with other entire images! The fact that the images are close in some semantic space helps, but all the tiny details become so important to get precisely correct in a new way.

Like, if a model has some weird gap where it knows how to make an arm at 45 degrees and 60 degrees, but not 47, then that's fine for from-scratch generation. It'll just make one like it knows how (or more precisely, like it models as naturally likely). Same with any other weird quirks of what it thinks is good (naturally likely): It can just adjust to something that still matches the semantics but fits into the model's quirks. No such luck when now you need to get details like "47 degrees" correct. It's just a little harder without some training or modeling insight into how an arm at 45 degrees and 47 degrees are really "basically the same" (or just that much more data, so that you lose the weird bumps in the likelihood).

I wouldn't be surprised if "just that much more data" ends up being the answer, given the volume of video data on the internet, and the wide applicability of video generation (and hence intense research in the area).

There is! Start with a textbook. Most books (and practically all good ones) will include a preface or note from the author at the front saying “we wrote this with the intent of being accessible to people who know XYZ.” The best even say “if you don’t know that, we refer you to these books for those prereqs.” It’s a fantastic set of resources, but people seem to always want to consume a million blog posts instead of a couple books!

Textbooks will get you foundations, and then for more up to date stuff, you need to read academic papers. Start with survey papers and literature reviews on your topic, and when you can’t find one, start with any old (new) paper on the topic and read the related work section (all papers have them). You’ll have to learn to distinguish the “we’re building off of XYZ” from “ABC is kind of like our stuff,” because XYZ is the ‘prereq’ work. Then just go recursively until you hit stuff you understand from the textbook foundations.

Most academic resources contain pointers to their prereqs, so use them!

If you specifically want pointers where to start, this is my field so I’d be happy to point you to a path if you like.

The only reason I've held off on switching to Kobo is that I always switch back and forth between the audiobook and the text for what I'm reading. Kindle books make it seamless, but as far as I can tell, there's no audio-text sync in the Kobo ecosystem. Is that still accurate? Even if you use books bought from Amazon/Audible?

FYI Kindle is in the process of rolling out a new feature that enables read aloud with synchronized text highlighting. It doesn't sound as good as an audiobook, of course, and it's not as good as the Alexa-powered read-aloud. But it's still nice because you don't have to switch between the Kindle app and the Alexa app.

I understand that some users currently have access, and it will be rolled out to everyone in the next month.

Is the license not transitive? Like could your impact report be “I want to remove this part of the license?”

I like the way you think but 2b might prevent that.

The "don't blunder" advice applies to sports as well, where you hear it phrased in non-tautological ways frequently, typically something like "you need strong foundations."

Even at the peak of college level, I remember winning and losing games because someone did something that you mostly stopped doing soon after learning the sport, like passing to a teammate who wasn't looking, or fumbling the ball in a preventable way, or things like that. Some teams focused on fancy plays and corner cases while their fundamentals still failed frequently, leading to more losses due to bad fundamentals than wins due to good advanced plays.

That's how I read the "blunders" stuff. There's some small set of a priori known foundations you need which are fairly simple to keep from going wrong, which nonetheless lead to a large portion of failures/points given up/whatever.

Memorization usually refers to training data. It's often useful to have something that can utilize instructions losslessly, which is the distinction between these models.

If you had to reduce it to one thing, it's probably that language models are capable few-shot and zero-shot learners. In other words, by training a model to simply predict the next word on naturally occurring text, you end up with a tool you can use for generic tasks, roughly speaking.

It turns out a lot of tasks are predictable. Go figure.

> it doesn’t seem more exotic or interesting to me than asking “how does the pseudo inverse of A ‘learn’ to approximate the formula Ax=b?”

Asking things like properties of the pseudoinverse against a dataset on some distribution (or even properties of simple regression) is interesting and useful. If we could understand neural networks as well as we understand linear regression, it would be a massive breakthrough, not a boring "it's just minimizing a loss function" statement.

Hell, even if you just ask about minimizing things, you get a whole theory of M-estimators [0]. This kind of dismissive comment doesn't add anything.

[0] https://en.wikipedia.org/wiki/M-estimator
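For anyone who wants the Ax=b example made concrete, here is a minimal sketch of what the pseudoinverse computes, assuming A has full column rank and exactly two columns (so the pseudoinverse reduces to (AᵀA)⁻¹Aᵀ). The toy data and all helper names are made up for illustration and are not from the thread. It uses exact fractions so the result can be checked by hand.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse_2x2(m):
    """Invert a 2x2 matrix; raises if it is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix: A does not have full column rank")
    return [[d / det, -b / det], [-c / det, a / det]]


def pseudo_inverse(a):
    """Pseudoinverse of an n x 2 matrix with full column rank: (A^T A)^-1 A^T."""
    at = transpose(a)
    return matmul(inverse_2x2(matmul(at, a)), at)


# Hypothetical toy data: y is roughly intercept + slope * t, but not exactly.
ts = [0, 1, 2, 3]
ys = [1, 3, 4, 8]
A = [[Fraction(1), Fraction(t)] for t in ts]
b = [[Fraction(y)] for y in ys]

# x = pinv(A) b is the least-squares solution of the overdetermined system.
x = matmul(pseudo_inverse(A), b)

# The residual A x - b is orthogonal to the columns of A (normal equations).
residual = [[p[0] - y[0]] for p, y in zip(matmul(A, x), b)]
orthogonality = matmul(transpose(A), residual)

print(x)              # intercept and slope as exact fractions
print(orthogonality)  # both entries are zero
```

Here the fit comes out to intercept 7/10 and slope 11/5. Nothing is "learned" in any mysterious sense, yet how the data's properties show up in x is exactly the kind of question being defended above.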

You raise a fair point. I do think that it’s important to understand how the properties of the data manifest in the least-squares solution to Ax=b. Without that, the only insights we have are from analysis, while we would be remiss to overlook the more fundamental theory, which is linear algebra. However, my suspicion is that the answer to these same questions but applied to nonlinear function approximators is probably not much different from the insights we have already gained in more basic systems. Also, the overly broad title of the manuscript doesn’t seem to point toward those kinds of questions (specifically, things like “how do properties of the data manifold manifest in the weight tensors”) and I’m not sure that one should equate those things to “learning”.

I've enjoyed working on teams that use an auto-formatter, like black in python. Using a formatter that's not my favorite is much more enjoyable than not using a formatter. Ditto for as many style issues as you can automate. It sucks to spend code review making humans spend time (re)arguing things machines can say just as well.

So if someone proposes that their language should have stated idiomatic style, I'd go further and say that a language style guide should come with tooling to automatically fix and automatically detect issues in as much of the style guide's scope as reasonably possible.
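To make the "detect and fix" split concrete, here is a toy sketch of what that tooling might look like for two made-up rules (no trailing whitespace, no tabs in indentation). This is not how black or any real formatter works. The rule set and the names `find_style_issues` and `fix_style` are assumptions for illustration only.

```python
def find_style_issues(source: str) -> list[tuple[int, str]]:
    """Detect mode: report (line number, message) pairs without changing anything."""
    issues = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if line != line.rstrip():
            issues.append((lineno, "trailing whitespace"))
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            issues.append((lineno, "tab in indentation"))
    return issues


def fix_style(source: str, indent_width: int = 4) -> str:
    """Fix mode: rewrite the source so that find_style_issues reports nothing."""
    fixed_lines = []
    for line in source.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        # Assumed rule: each tab in the indentation becomes indent_width spaces.
        indent = indent.replace("\t", " " * indent_width)
        fixed_lines.append((indent + stripped).rstrip())
    return "\n".join(fixed_lines) + "\n"


messy = "def f():\n\treturn 1  \n"
print(find_style_issues(messy))
clean = fix_style(messy)
print(find_style_issues(clean))
```

The detector can run in CI and the fixer in a pre-commit hook, so the argument happens once, over the rules, instead of in every review.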

> It sucks to spend code review making humans spend time

For some humans, they don't have to be "made" to do it. They love squabbling over style issues and rearranging deck chairs. I will never, ever understand caring about style issues, so auto-formatters are a gift from the heavens. Now the guy that used to waste time in code review can waste time tweaking the formatter rules, and everyone else can move on with their life.

> Now the guy that used to waste time in code review can waste time tweaking the formatter rules, and everyone else can move on with their life.

I worked with one such person before. Unfortunately, after we introduced formatters he found a new thing to be pedantic about and started annoying everyone with that instead (it was C++, and his second obsession was ensuring that all objects were always moved correctly and no CPU cycle was wasted while our enterprise app was waiting on the network to resolve tons of API calls).

C++ for high-level code tends to be this way. Waste tons of focus on saving a couple of CPU cycles, but the whole thing is waiting on RPCs anyway.

Also you probably used less efficient algorithms because the smarter ones were too annoying to implement in C++, and there's possibly a solid Python lib that does it with hyper optimized C code anyway.

Yeah, I agree. The language lives in a weird niche where intricate low-level details are visible, but it also supports abstractions only visible from outer space.

You always have some low-level detail which can be nitpicked about, even though it does not matter in the slightest.

Admittedly, choosing C++ for such a high-level project wasn't a good idea at all, but that is what the first few devs knew best, so they stuck with it.

Oh and those measures that save 2 extra CPU cycles do things in a way more error-prone way, which some day causes memory corruption.

... which you then spend more CPU cycles debugging than you ever saved

I care about style. If I’m reading the same code day in and day out I want it to look nice. Not nice code is a distraction, maybe it’s an OCD or a ‘visual misophonia’, but inconsistency is one of the first things I notice. I truly don’t understand why some people don’t care how their code looks.

I have noticed that some people will present tidy code during interviews and amazingly drop that habit once they’re employed.

It's not only about it looking 'nice' (although there's that component as well as an added bonus).

Once you've read tons of code formatted with the same formatter, passing the same linter rules and following the same general idiomatic rules, it becomes so much easier to read and review new code that adheres to all the same rules.

This is definitely true. Your eyes tell you that something's off long before your brain figures out what it is.

I passed the style test at work then instantly stopped following any of the rules. My teammates who also don't care approve the changes. We deal with lots of bugs each day, and none of them have ever been because of things like using `auto` in C++ when we shouldn't have. There are far better things to spend time on.

And if someone does care about something in particular, often they will contribute to an auto-cleanup tool that makes it a certain way. So there was no reason for me to do it manually.

Which came first, the bugs or the nonchalance?

The bugs of course

Problem is even with auto-formatters, people will squabble over pointless stuff. I just do whatever comes naturally and assume someone is going to want it a certain other way, then I change it without arguing. That goes for designs too.

I see that it has an inbuilt compiler-compiler, Prolog, and features for DSLs. Is the idea that Shen is good for implementing statically typed, compiled languages? Or for metaprogramming?

Looks like it's been around for a while and I'm curious what folks have used it for.
