
You will have a long trek to do that. We have a javascript interpreter deployed at the second Sun-Earth Lagrange point.

https://www.theverge.com/2022/8/18/23206110/james-webb-space...


I live happily in the knowledge that in 20000 years when that eventually drifts off into another system and is picked up by aliens that they will reverse engineer it and wonder why the fuck '5'-'4'=1
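For anyone puzzled by the joke: in JavaScript the `-` operator has no string overload, so both string operands are coerced to numbers, while `+` concatenates strings instead. A minimal illustration:

```javascript
// '-' coerces both strings to numbers, so this really is 1
console.log('5' - '4'); // 1

// '+' with strings concatenates instead of adding
console.log('5' + '4'); // "54"
```

Plenty of reverse-engineering head-scratching available from just those two lines.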

What a mess.

> One author of a case report was surprised to learn of the correction — because the case described in her article is true.

So they managed to mess up even the correction of their giant mess.

> correcting the correction "would be difficult."

I bet. That's why they should have got it right in the first place. I would be absolutely ballistic if they were libelling my work like that.


Yeah, they seem to have been quite sloppy with these vignettes.

Though note that in the situation of the mislabeled real case, the formal solution could be a retraction of the entire highlight article, since it is against the (poorly implemented) policy to have a real case study.

I don't know how patient consent for being used in a case study works. Did this author get a perpetual license? Did they just copy something from another article they wrote, or from an article someone else wrote?


You can see the full article here: https://www.cpsp.cps.ca/uploads/publications/pxy155-Teething...

It looks like it has a short intro paragraph that talks about a specific case with no identifying details (beyond "a previously healthy 4-month-old boy"), citing this report by other doctors: https://pubmed.ncbi.nlm.nih.gov/27503268/ followed by further discussions of physician reports and survey data.

The correction is explicitly listed as applying to that article (https://academic.oup.com/pch/article-abstract/24/2/132/51642...), which itself seems false since that article doesn't seem to include a fictional vignette.


It looks like they labelled all of them fiction based on a single instance of one of the authors fabricating their case, a gross overcorrection. I wonder if they flinched at the prospect of actually assessing the validity of all of them and decided it was safer to just disclaim them.

> It looks like they labelled all of them fiction based on a single instance of one of the authors fabricating their case

Does it? That's directly at odds with what the article and editor say.


> The corrections come following a January article in New Yorker magazine that mentioned one of the reports — “Baby boy blue,” ... was made up.

> “Based on the New Yorker article, we made the decision to add a correction notice to all 138 publications..."

Emphasis mine.


Sure, if you emphasize selectively you can make it sound like it says that. Here are some other quotes from the article that clearly refute your interpretation:

> The journal decided when it first started publishing the article type “that the cases should be fictional to protect patient confidentiality,”

> While the instructions for authors for Paediatrics & Child Health has at times indicated the case reports are fictional, that disclosure has never appeared on the journal articles themselves.

> “The editor acknowledged that the editorial team is at fault for overlooking the fact that our case was real during the review process,”

It's pretty clear that the journal always thought of these as fictional vignettes, and either didn't realize or didn't care that they had not made that sufficiently clear to the readers. The New Yorker article clued them into the fact that it was a problem, so they added the correction to all of their case studies to clarify that they were intended to be fictional. In (at least) one case, the author also didn't realize they should be fictional, and submitted a real case study which has now been incorrectly corrected.


> While the instructions for authors for Paediatrics & Child Health has at times indicated the case reports are fictional, that disclosure has never appeared on the journal articles themselves.

Sounds like they were asking authors for fiction, so probably plenty of them are.


They asked the authors for fiction “at times”. Meaning that some are fiction, and some very well might not be. The best they can do is try to contact the authors and see if the case report they wrote is fictional or not. The second best is to admit that they made a mess and say “the case reports might or might not be fictional, we have no way of knowing”.

I suspect you're reading too much into that phrase. It seems more likely to me that the reporter here contacted one or more of the case report authors directly to ask for a copy of what instructions they received from the journal at the time. (This would be good journalistic practice, rather than just take the journal's word for it, when they might have an incentive to lie.) But they obviously couldn't explicitly confirm that every single author received similar instructions, so they used the “at times” phrase to cover their ass.

If they had direct evidence that some author's instructions failed to ask for the case study to be fictionalized, I think they would have specifically said that. It's more definitive, and catches the journal in a lie.

I'm pretty sure what happened here is that:

1) The journal always asked for and thought they received fictionalized case studies.

2) It never occurred to them that they were presenting the case studies in a way that could be misinterpreted. (This is indefensible negligence, but I also understand how it could have happened "innocently".)

3) Once the issue came to light, they issued blanket corrections to every case study to describe them as fiction, because they asked for fiction and edited them all as fiction. (I.e., they didn't do any fact checking or independent confirmation, beyond broad medical strokes.)

4) At least one author didn't read the instructions carefully enough and sent in a real case study, which as the article says, wasn't caught by the editors during the review process. (And really, how would they catch it? If they thought they asked for fiction, they wouldn't be fact checking it.)

I actually think the disclaimer may be appropriate, even on the article that was written as a true story, if it wasn't reviewed as one.


> If they had direct evidence that some author's instructions failed to ask for the case study to be fictionalized, I think they would have specifically said that.

Which they do. They specifically say that. “Neither the instructions for authors from 2010 — when Koren and his coauthor Michael Rieder would have written their article — nor the linked list of article types — state the cases are fictionalized, or fictional.”

“An archived version from September stated, ‘Each highlight is a teaching tool that presents a short clinical example, from one of the studies or one-time surveys,’ with no mention of fiction.”

These are direct quotes from the article. The exact kind you are asking for. With inline links to the archived documents. And yes it is very definitive.

> I'm pretty sure what happened here is that:

No need to speculate. Just read the article.

> 1) The journal always asked for […] fictionalized case studies.

This is false. As evidenced by the article.


> I would be absolutely ballistic if they would be libelling my work like that.

Genuine question, could they sue for this? It seems like a pretty good case.


> maybe consider why Iran spent over $500B developing offensive nuclear weapons.

To protect themselves from the exact scenario happening right now? The reason why Putin is sleeping peacefully in his bed while Khamenei is dead under rubble is that one has a nuclear deterrent while the other didn't have that protection.

> supposed aggressor

I don’t know if there is anything “supposed” about that aggressor given the present situation.


Right. Because Israel is finally fulfilling its aspiration to annex a country 1,000 miles away. Do you think history started two days ago?

Maybe the Israelis are idiots, but it would seem so much more practical to attack closer countries first - Saudi Arabia, Egypt, etc. I wonder why they aren't?


Iraq and Iran have been Netanyahu's goal for four decades. Remember him lying to Congress about WMD as the Iraq invasion started.

No. But there are a few layers to that.

The first no is that the model as is has too few parameters for that. You could train it on Wikipedia but it wouldn't do much good.

But what if you increase the number of parameters? Then you get to the second layer of “no”. The code as is is too naive to train a realistic size LLM for that task in realistic timeframes. As is it would be too slow.

But what if you increase the number of parameters and improve the performance of the code? I would argue that would by that point not be “this” but something entirely different. But even then the answer is still no. If you run that new code with increased parameters and improved efficiency and train it on Wikipedia you would still not get a model which can “generate semi-sensible responses”. For the simple reason that the code as is only does the pre-training. Without the RLHF step the model would not be “responding”. It would just be completing the document.

So for example if you ask it “How long is a bus?” it wouldn't know it is supposed to answer your question. What exactly happens is kinda up to randomness. It might output a Wikipedia-like text about transportation, or it might output a list of questions similar to yours, or it might output broken markup garbage. Quite simply, without this finishing step the base model doesn't know that it is supposed to answer your question and follow your instructions. That is why this last step is sometimes called “instruction tuning”: because it teaches the model to follow instructions.

But if you would increase the parameter count, improve the efficiency, train it on Wikipedia, then do the instruction tuning (which involves curating a database of instruction - response pairs), then yes. After that it would generate semi-sensible responses. But as you can see it would take quite a lot more work and would stretch the definition of “this”.
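To make the distinction concrete, here is a minimal sketch of how instruction tuning data differs from pre-training data. The template and field names are purely illustrative (real chat templates vary by model); the point is only that the curated pair teaches the model the “question, then answer” pattern that raw documents never demonstrate:

```python
# Pre-training data: raw documents. The model only learns "what text comes next",
# so a question might be continued with more questions, markup, or anything else.
pretrain_sample = "A bus is a road vehicle designed to carry many passengers. ..."

# Instruction tuning data: curated (instruction, response) pairs rendered into a
# single training string with a fixed template, so the model learns to answer.
def format_example(instruction: str, response: str) -> str:
    # Hypothetical template; actual formats differ between training pipelines.
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

sample = format_example(
    "How long is a bus?",
    "A typical single-deck city bus is roughly 10 to 12 metres long.",
)
print(sample)
```

After enough pairs like this, the model learns that text following “### Instruction:” should be answered, not merely continued.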

It is a bit like asking if my car could compete in Formula 1. The answer is yes, but first we need to replace all parts of it with different parts, and also add a few new parts. To the point where you might question if it is the same car at all.


Very useful breakdown; thank you!

They were practicing object recognition, movement tracking and prediction, self-localisation, visual odometry fused with proprioception and the vestibular system, and movement controls for 16 years before they even sat behind a steering wheel though.

> Astute readers will note what’s been missed here.

I’m not astute enough to see what was missed here. Could you explain?


If I'm not mistaken, BERT is a classifier (text in, labels out), so it is not a "language model", as it cannot be used for text generation.

The abstract of the original BERT paper starts with these words: "We introduce a new language representation model called BERT, [...]" The paper itself contains the phrase "language model" 24 times.

It might not be considered a language model today, but it was certainly considered one when it was originally published. Or so it would seem to me. Maybe there is a semantic shift which happened here?


> we can ignore that specific example

We are not ignoring it. It is just not an example of a load bearing excel sheet.



Modern landmines do have safety features like what you describe.

For example consider this Department of Defence policy from 2020: https://media.defense.gov/2020/Jan/31/2002242359/-1/-1/1/DOD...

“The Department will continue its commitment not to employ persistent landmines. For the purposes of this policy, ‘persistent landmines’ means landmines that do not incorporate self-destruction mechanisms and self-deactivation features. The Department will only employ, develop, produce, or otherwise acquire landmines that are non-persistent, meaning they must possess self-destruction mechanisms and self-deactivation features.”

“For example, all activated landmines, regardless of whether they are remotely delivered or not, will be designed and constructed to self-destruct in 30 days or less after emplacement and will possess a back-up self-deactivation feature. Some landmines, regardless of whether they are remotely delivered or not, will be designed and constructed to self-destruct in shorter periods of time, such as two hours or forty-eight hours.”

This distinguishes “self-destruct”, where the mine blows itself up, from “self-deactivation”, where the mine disarms itself. The first is safer because it doesn't leave explosive material behind, which could chemically deteriorate and become unstable decades later. The second is used as a failsafe in case the self-destruct fails.

> Or are the reasons technical

They certainly were when the really old mines were made. Some of them are nothing more than spring-loaded pressure plates. But today modern landmines are much more sophisticated. Some of them can distinguish the seismic signature of a truck from that of a tank. There are also radio-controlled minefields where soldiers can remotely activate / deactivate the whole minefield as the threat evolves.


I thought it would be longer than 30 days.

They aren't 100% reliable either, nothing is.

> It's rare for people to buy art just bc oil paints go brrrrrm

It is rare to buy oil paints, period. It is an expensive luxury in more than one way.

That being said, I do buy art hanging from the wall because it looks pretty. In fact that is the only way I ever did. I see it. I feel it. I say “hi, hello, how much? That sounds good, here you go. Yes please package it.” And then I hang it on my wall. I don't care who the artist is and couldn't tell you.


> you can't experience something like a movie without trying to figure out what the actual message behind the movie was

I believe you that your brain works like that, but this is absolutely not how mine works. I care if I enjoy the movie, and if the characters are believable; I absolutely do not care what the message is supposed to be.

