Hacker News | ActivePattern's comments

What I still don't understand is why they don't even make an attempt to apply overlayers, when (as the author notes) there is ample secondary evidence that they would have been present. It's not as if there isn't already some element of inference and "filling in the blanks" when reconstructing how something was painted from the scant traces of paint that survive.

This is a somewhat unfounded theory of mine, and I'd welcome any insight: I suspect this is a construction of Western restoration/preservationist theory. A lot of effort seems to go into preserving original material, not taking liberties, etc. While touring temples and museums in Japan, I got the sense that restorations were much more aggressive, with less regard for preserving material (or building "fabric") and a greater focus on using traditional techniques during restoration.

I assume you didn't read the article, since that's their exact point...

"Since underlayers are generally the only element of which traces survive, such doctrines lead to all-underlayer reconstructions, with the overlayers that were obviously originally present excluded for lack of evidence."


Maybe it's the author of the article? :P

> I can only say being against this is either it’s self-interest or not able to grasp it.

So we're just waving away the carbon cost, centralization of power, privacy fallout, fraud amplification, and the erosion of trust in information? These are enormous society-level effects (and there are many more to list).

Dismissing AI criticism as simple ignorance says more about your own.


I've never heard the caveat that it can't be attributable to misinformation in the pre-training corpus. For frontier models, we don't even have access to the enormous training corpus, so we have no way of verifying whether the model is regurgitating misinformation it saw there or inventing something out of whole cloth.

> I've never heard the caveat that it can't be attributable to misinformation in the pre-training corpus.

If the LLM is accurately reflecting the training corpus, it wouldn’t be considered a hallucination. The LLM is operating as designed.

Matters of access to the training corpus are a separate issue.


I believe it was a Super Bowl ad for Gemini last year that had a "hallucination" in the ad itself. One of the screenshots of Gemini being used showed the "hallucination", which made the rounds in the news, as expected.

I want to say it was some fact about cheese that was in fact wrong. However, you could also see the source Gemini cited in the ad, and when you went to that source, it was a local farm's 1998-style HTML homepage, and that page contained the incorrect factoid about the cheese.


> If the LLM is accurately reflecting the training corpus, it wouldn’t be considered a hallucination. The LLM is operating as designed.

That would mean that there is never any hallucination.

The point of the original comment was distinguishing between fact and fiction, which an LLM simply cannot do. (It's an unsolved problem among humans, too, and that spills into the training data.)


> That would mean that there is never any hallucination.

No it wouldn’t. If the LLM produces an output that does not match the training data or claims things that are not in the training data due to pseudorandom statistical processes then that’s a hallucination. If it accurately represents the training data or context content, it’s not a hallucination.

Similarly, if you request that an LLM tells you something false and the information it provided is false, that’s not a hallucination.

> The point of original comment was distinguishing between fact and fiction,

In the context of LLMs, fact means something represented in the training set. Not factual in an absolute, philosophical sense.

If you put a lot of categorically false information into the training corpus and train an LLM on it, those pieces of information are “factual” in the context of the LLM output.

The key part of the parent comment:

> caused by the use of statistical process (the pseudo random number generator


OK, if everyone else agrees with your semantics, then I agree.

The LLM is always operating as designed. All LLM outputs are "hallucinations".

The LLM is always operating as designed, but humans call its outputs "hallucinations" when they don't align with factual reality, regardless of the reason why that happens and whether it should be considered a bug or a feature. (I don't like the term much, by the way, but at this point it's a de facto standard).

Not that the internet contained any misinformation or FUD when the training data was collected...

Also, statements made with certainty about fictitious "honey pot prompts" are a problem; plausible extrapolation from the data should be governed more by internal confidence. Luckily, I believe there are benchmarks for that now.


Yes, as is implied by the word "improvements"


Which, as practice shows, tend to be understood differently by customers and PMs.


On the contrary, farmed fish is among the most sustainable protein sources for those not willing to go full vegetarian [1]

[1] https://ourworldindata.org/grapher/ghg-per-protein-poore


Greenhouse gas emissions shouldn't be the only factor people consider for sustainability of their food. In the case of fish, this very article talks about the issues with farmed fish. Even a plant-based diet can be filled with unsustainable sources, such as plantations that destroy endangered habitats for palm oil, or industrial farming operations that spray lots of pesticides to harm the insect population and allow lots of fertilizer runoff into natural waterways. We're still polluting and depleting resources for many many vegetarian foods in the world.

I'd argue that if we're looking for a full top-to-bottom sustainable food system, animals will play a role. But we need to be cognizant of the whole system, not playing whack-a-mole with issues.


"...among the most..."

According to your source, there are 15 sources of protein that emit less greenhouse gases (GHGs) per 100g of protein than farmed fish, including poultry and eggs, and 16 sources that emit more (including items that are not known for their protein content like coffee, apples, and dark chocolate). Being highly charitable, farmed fish is squarely in the middle.

Additionally, farmed fish emits twice the GHGs of tofu, and almost 22 times that of nuts. So just comparing placements on the list paints a misleading picture.

As for "not willing to go full vegetarian": you may as well say "not willing to stop eating fish", because they are equally unserious limitations when discussing these topics. "Not being willing" is only a slightly more mature version of a child saying "I don't want to".


I don't think it's "unserious" to recognize that >85% of the world's population eats meat.

If you're quibbling about wording, all I meant was: farmed fish and chicken are among the most sustainable meat sources.

I'm not making a statement that people should eat meat, but many people do eat meat, so it's worth comparing which meat sources are better than others. I think it would be great if more people knew that beef produces 10x the greenhouse gases that chicken/fish do.


It's not "quibbling" to correct your mischaracterization of the truth.

If you'll forgive me borrowing your logic: "I'm not saying that people should eat beef, but many people do eat beef, so it's worth comparing which beef sources are better than others."

Plant-based diets are a very good answer to the problems caused by animal agriculture. If someone takes issue with that answer, I'd need a better reason than their personal pleasure to take them seriously in the conversation.


I agree it’s worth comparing beef sources! That was my point about within-category differences and harm reduction. Saying "tofu is cleaner" doesn’t make beef comparisons pointless - just like the existence of bicycles doesn’t make car fuel economy comparisons pointless. We should compare across categories and within them, so people who aren’t switching today still choose the lower-impact option.


I hesitate to use the word "quibbling" now, but it seems like a poor use of time to compare beef when even the most environmentally-friendly beef is multiple times worse than alternatives.

I think this harm-reduction approach might make more sense from a governmental policy perspective, but is otherwise silly for us to take as individuals because we have such comparatively little influence over each other's choices. I wouldn't waste that small influence encouraging someone to make a slightly less bad choice.

The comparison of food to transportation is a bad one. Nutrients are nutrients, and everything else is personal pleasure. In other words, you can easily hit your same macros by replacing animal products with plant products without even having to change grocery stores. You cannot easily transport a mattress on a bicycle instead of a car.


You started this by objecting to my wording ("among the most") when I said fish/chicken are the most sustainable meat options. They are, by a wide margin. Beef’s footprint is roughly 10× higher, so swapping a beef meal for chicken or fish cuts ~90% of those emissions. That’s not a "slightly less bad choice".

Calling harm reduction "silly" because tofu exists just shifts the target. We can hold two thoughts at once: (1) plant-heavy diets are best, and (2) for the vast majority who aren’t going vegan tomorrow, steering from beef to chicken/fish dramatically reduces damage right now. Dismissing that because it’s not maximal purity guarantees we leave real cuts on the table.


Most recently you said, "I agree it’s worth comparing beef sources!"

So is there any hypothetical harm reduction that you believe is too small to be worth your time to encourage?


Farmed seafood is among the worst garbage you can eat: tons of antibiotics, growth hormones, and fish fed utter cheap junk, so e.g. farmed salmon meat has a composition more like pork than like wild salmon; shrimp are even worse. If you ever saw a shrimp "factory" grow pond/cage and its surroundings in a typical third-world country, where most come from, you wouldn't eat it for a long time, if ever again. Literally nothing lives around those places.

Good in theory, horrible in practice.


That take’s outdated. In the US/EU, routine antibiotics in fish farming are banned [1]. Growth hormones aren’t used in edible fish. Farmed salmon’s feed changed (more plant oils), but it still delivers high omega-3s and usually less mercury than wild [2].

[1] FDA “Approved Drugs for Use in Aquaculture” — https://www.fda.gov/media/80297/download

[2] Jensen et al., Nutrients 2020 — https://doi.org/10.3390/nu12123665


OK, that's a good development. But the overall difference in meat quality is visible to the eye: farmed salmon looks like a completely different fish from a wild one (if you can even get one), akin to the difference between, say, a boar and a domesticated pig (lean muscle vs. tons of fatty, wobbly tissue). It doesn't scream "healthy", but that may just be emotions playing an old tune.

Also, what I wrote about shrimp from third-world countries stands: I saw such a place this summer in Indonesia, and from what I've heard, the whole of Southeast Asia is exactly like that, or worse. Getting shrimp from a Western democracy with strong consumer-protection rights isn't possible in many parts of Europe; not sure about other places.


Hah, why don't you try implementing your 3 little functions and see how smart your "AGI" turns out to be.

> not a particularly capable AGI

Maybe the word AGI doesn't mean what you think it means...


There is not strong consensus on the meaning of the term. Some may say “human level performance” but that’s meaningless both in the sense that it’s basically impossible to define and not a useful benchmark for anything in particular.

The path to whatever goalpost you want to set is not going to be more and more intelligence. It’s going to be system frameworks for stateful agents to freely operate in environments in continuous time rather than discrete invocations of a matrix with a big ass context window.


I don't think you've understood the paper.

- There are no experts. The outputs are approximating random samples from the distribution.

- There is no latent diffusion going on. It's using convolutions similar to a GAN.

- At inference time, you select ahead-of-time the sample index, so you don't discard any computations.


I agree with @ActivePattern and thank you for your help in answering.

Supplement for @f_devd:

During training, the K outputs share the stem feature from the NN blocks, so generating the K outputs costs only a small amount of extra computation. After L2-distance sampling, discarding the other K-1 outputs therefore incurs a negligible cost and is not comparable to discarding K-1 MoE experts (which would be very expensive).


You are probably right, although it's not similar to a GAN at all; it's significantly more like diffusion (though maybe not latent diffusion; the main reason I assumed so is that the "features" are passed through, but these could just be the image).

The ahead-of-time sampling doesn't make much sense to me mechanically, and isn't really discussed much. But I will withhold judgement until future versions, since the FID performance of this first iteration is still not that great.


It doesn't play nice with a lot of popular Python libraries. In particular, NumPy, Pandas, TensorFlow, etc. rely on CPython's C API, which PyPy only emulates, and that can cause compatibility and performance issues.


FWIW, PyPy has supported NumPy and Pandas since at least v5.9.

That said, of all the reasons stated here, library support is why I don't primarily use PyPy (lots of libraries are still missing).


But PyPy doesn’t necessarily perform as well, and it can’t JIT-compile the already-compiled C code in NumPy, so any benefits are often lost.


A “sufficiently smart compiler” can’t legally skip Python’s semantics.

In Python, p.x * 2 means dynamic lookup, possible descriptors, big-int overflow checks, etc. A compiler can drop that only if it proves they don’t matter or speculates and adds guards—which is still overhead. That’s why Python is slower on scalar hot loops: not because it’s interpreted, but because its dynamic contract must be honored.
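As a concrete (made-up) illustration of why a compiler can't legally reduce p.x * 2 to a load and a shift: the attribute access may run descriptor code, and the multiplication may overflow a machine word, forcing promotion to a big int.

```python
class Meters:
    # A descriptor: reading the attribute runs code, so "p.x" is not a plain load.
    def __get__(self, obj, objtype=None):
        return obj._x  # could do arbitrary work here

class Point:
    x = Meters()
    def __init__(self, x):
        self._x = x

p = Point(2**62)
# The descriptor protocol fires, then the product exceeds a signed 64-bit
# word, so Python transparently promotes to an arbitrary-precision int.
result = p.x * 2
assert result == 2**63
```

A compiler can emit a fast path only after proving, or guarding at runtime, that neither of these dynamic behaviors is in play.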


In Smalltalk, p x * 2 has that flow as well, and even worse: assume the object returned by the p x message send does not understand the * message. It will break into the debugger; the developer then adds the * method to the object via the code browser, hits save, and exits the debugger with redo, and execution ends in success.

Somehow Smalltalk JIT compilers handle it without major issues.


Smalltalk JITs make p x * 2 fast by speculating on types and inserting guards, not by skipping semantics. Python JITs do the same (e.g. PyPy), but Python’s dynamic features (like __getattribute__, unbounded ints, C-API hooks) make that harder and costlier to optimize away.

You get real speed in Python by narrowing the semantics (e.g. via NumPy, Numba, or Cython) not by hoping the compiler outsmarts the language.
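A toy illustration of that narrowing (my example, not a benchmark): the NumPy version fixes one dtype and dispatches once per array, instead of paying the generic protocol once per element.

```python
import numpy as np

# Pure-Python loop: every iteration pays for dynamic dispatch,
# int boxing, and overflow checks.
total_py = 0
for x in range(1_000_000):
    total_py += x * 2

# Narrowed semantics: one fixed dtype, one dispatch, a C loop inside.
arr = np.arange(1_000_000, dtype=np.int64)
total_np = int((arr * 2).sum())

assert total_py == total_np  # same answer, very different cost model
```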


Python's JIT could do the same: it could check whether __getattribute__() is the default implementation and replace the call with a direct attribute load. This would work only for classes that have not been modified at runtime and that do not implement a custom __getattribute__.
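That guard can be sketched in plain Python (the helper names are made up; a real JIT would do this in machine code and would also have to deoptimize if the class is mutated later):

```python
def can_specialize(cls):
    # Guard: the class still uses the default attribute lookup machinery.
    return cls.__getattribute__ is object.__getattribute__

class Plain:
    def __init__(self):
        self.x = 21

class Tricky:
    def __getattribute__(self, name):
        return 99  # arbitrary code runs on every attribute access

def load_x(p):
    if can_specialize(type(p)):
        return p.__dict__["x"]  # "inlined" fast path: direct dict load
    return p.x  # generic path honors the full semantics

assert load_x(Plain()) == 21
assert load_x(Tricky()) == 99
```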


People keep forgetting about image-based development, the debugger, metaclasses, messages like becomes:, ...

That is to say, every dynamic feature that can be used as an excuse for Python, Smalltalk and Self have it, and doubled up.



Edit-and-continue is available in lots of JIT-runtime languages.


First, we need to add the word "only": "not ONLY because it’s interpreted, but because its dynamic contract must be honored." Interpreted languages are slow by design. That isn't bad; it's just a fact.

Second, at most this describes WHY it is slow, not that it isn't, which is my point. Python is slow. Very slow (especially for computation-heavy workloads). And that is okay, because it does what it needs to do.

