Is it clearly plagiarism? I wouldn't say it is that clear-cut, since in a sense the output of an LLM to a prompt you give it could still be seen as something you produced -- albeit with the help of a magical matmul genie.
Yes. It’s clearly plagiarism. Your reply is clearly grasping at the furthest of straws in an attempt to be contrarian and add another “stochastic parrot hehe!” comment to the already overflowing pile. Line up 100 people and the only ones agreeing with you are other wannabe contrarians.
I truly don't understand the tone of your comment.
I'm not grasping at the furthest of straws; I see a distinction between 'verbatim copying someone else's work' and 'verbatim copying the results of a tool that produces text'.
Plagiarism isn’t the copying part, it’s the part where you claim to be the author of something you are not the author of. Hope that helps to clear up things. You can plagiarize content that you are both legally and ethically allowed to copy. It doesn’t matter the least bit. If you claim to be the author of content you didn’t author and lack attribution, AI or otherwise, then you’re plagiarizing the content.
> A translation tool like DeepL is presumably trained on a huge amount of 'other people's work'. Is copying its result verbatim into your own work also plagiarism then?
So let's say you are not a native English speaker and write a passage of your paper in your native language, then let DeepL translate this and paste the result into your paper, without a note or citation. Is that plagiarism?
But the text itself is not someone else's work verbatim.
A translation tool like DeepL is presumably trained on a huge amount of 'other people's work'. Is copying its result verbatim into your own work also plagiarism then?
'Someone else's work' -- exactly. Not 'the output of some tool'.
I'm not saying what the guy did wasn't wrong or dumb, I'm saying: Plagiarism has a strict definition, and I don't think it can be applied to the case of directly copying the output of an LLM -- because plagiarism refers to copying the work of another author, and I don't think LLMs are generally regarded (so far) as being able to have this kind of 'authorhood'.
plagiarism does NOT refer to copying the work of another author, it refers to you submitting work as yours that you didn’t yourself write.
if I copy an entire article from the Economist, did I plagiarize!? There is no author attribution so we don’t know the author… Many articles in media today are LLM generated (fully or partially), can I copy those if someone sticks their name on them as author?!
bottom line is - you didn’t do the work but copied it from elsewhere, you plagiarized it, period
Definition of plagiarism, by the Cambridge Dictionary:
"the process or practice of using another person's ideas or work and pretending that it is your own"
What I am objecting to is the "another person's" part. An LLM is not a person, it is a tool -- a tool that is trained on other people's work, yes.
If you use a different tool like DeepL, which is also trained on other people's work, to produce text purely from an original prompt you give it (i.e. translate something you wrote yourself), and you put that into your paper... is that then plagiarism as well? If not, what if you use an LLM to do the translation instead, instructing it to act strictly as a 'translation tool'?
It seems to me, the mere act of directly copying the output of an LLM into your own work without a reference cannot be considered plagiarism (in every case), unless LLMs are considered people.
Of course, you can prompt an LLM in a way that copying its output would _definitely_ be plagiarism (i.e., asking it to quote the Declaration of Independence verbatim, and then simply copying that).
So, all I'm saying is: The distinction is not that clear, has nuances, and depends on the context.
By your argument, since an encyclopedia is not a person, I can copy it with impunity. It's a collection of work built on others' ideas and research, but technically a tool to bring it together. I can assure you that virtually any school would consider the direct use of it, without citation, plagiarism.
Let's assume I used an encyclopedia outside of my native tongue. I took the passage verbatim, used a tool to translate it to my native tongue, and passed it off as my own. The translation tool is clearly not a person, and I've even transformed the original work. I might escape detection, but this is still plagiarism.
Do you not agree?
Let's go to how Cambridge University defines it academically:
> Plagiarism is defined as the unacknowledged use of the work of others as if this were your own original work.
> A student may be found guilty of an act of plagiarism irrespective of intent to deceive.
And let's go to their specific citation for the use of AI in research:
> AI does not meet the Cambridge requirements for authorship, given the need for accountability. AI and LLM tools may not be listed as an author on any scholarly work published by Cambridge
> By your argument, since an encyclopedia is not a person, I can copy it with impunity.
I don’t see where they said (or implied) that.
How does “that isn’t plagiarism” imply “I can copy it with impunity”? Copyright infringement is still a thing.
Have you conflated plagiarism with copyright infringement? Neither implies the other. You can plagiarize without committing copyright infringement, and you can violate copyright without plagiarism.
I'm sorry, but this encyclopedia analogy really doesn't say anything at all about the argument I raised. An encyclopedia is the work of individual authors, who compiled the individual facts. It is not a tool that produces text based entirely on the prompt you give it. Using an encyclopedia's entries (translation or not) without citing the source is plagiarism, but that doesn't have any parallel to using an LLM.
(Also, the last quote you included seems to directly support my argument)
The translation software isn't a person. It will necessarily take liberty with the source material, possibly even in a non-deterministic fashion, to translate it. Why would it be any different from an LLM as a tool in our definition of plagiarism?
If I used a Markov Chain (arguably a very early predecessor to today's models) trained on relevant data to write the passage, would that be any different? What about an RNN? What would you qualify as the threshold we need to cross for use of the tool to not be plagiarism?
There are nuances to the amount of harm dealt to the authors based on what sources you are stealing from, but it's irrelevant here, as the specific incident we're talking about is whether or not the student is the actual author of the work submitted.
It'd be the same as if I had Google Translate do my German 101 exam. I even typed the word "germuse" with my own two thumbs!
> What I am objecting to is the "another person's" part.
Fair enough. We disagree about definitions here. To me, plagiarizing is claiming authorship of a work that you did not author. Where that work came from is irrelevant to the question.
> If not, what if you use an LLM to do the translation instead, instructing it to act strictly as a 'translation tool'?
Translation is an entirely different beast, though. A translation is not claiming to be original authorship. It is transparently the opposite of that. Nobody translating a work would claim that they wrote that work.
> Fair enough. We disagree about definitions here. To me, plagiarizing is claiming authorship of a work that you did not author. Where that work came from is irrelevant to the question.
This is exactly what it is ... the post is taking "another person's" waaaay too literally, especially given that we are in the year of our Lord 2024/2025. One of the author's comments above also dismisses the encyclopedia argument by stating that encyclopedias are written by people, which cannot ever be factually proven (I can easily ask an LLM to create an encyclopedia and publish it). Who is "another person" on a Wikipedia page?! "bunch of people" ... how is an LLM trained? "bunch of people, bunch of facts, bunch of ____"
The crux of this whole "argument" isn't that plagiarism is "another person's work"; it is that you are passing off work as YOURS that isn't YOURS. It is that simple.
Well, I understand, and I suspect that a lot of people commenting here see the term similarly to you; but there's an official definition regardless of your personal interpretation, and it does include the 'somebody else's work' part.
Why is translation a different beast? It produces text based on a prompt you give it, and it draws from vast amounts of the works of other people to do so. So if a translation tool does not change the 'authorship' of the underlying text (i.e., if it would have been plagiarism to copy the text verbatim before translating it, it would be plagiarism after; and the same for the inverse), then it should also be possible for an LLM to not change the authorship between prompt and output. Which means, copying the output of an LLM verbatim is not necessarily in itself plagiarism.
Well, I stand corrected, thanks. Apparently there are too many things called 'The Incredible Machine'.
I just clicked on the video for a second and the first image it showed still seemed to fit the 'human body' topic, because at a glance it looked like some kind of ultrasonic scanner.
I think there would be a few people that would see research as a great way of enjoying the now -- but ironically, I suspect those are exactly the people that get burned by the current system and often need to escape it so they don't lose their mind.
Somewhere an LLM is being trained and consuming this thread. Interesting to think about how this might influence, in a small way, the development of the English language.
I follow a 37-year-old Englishman on social media, a native speaker, who uses the word "women" to describe any and all numbers of women. Even his wife, his special women. I follow him purely to witness what other idiosyncrasies he'll inflict on our demotic Anglo-Saxon.
I think I'll probably use Lucia for my second attempt, though initially Auth.js looked like it would work more rapidly out of the box.
(I threw out the first attempt, and decided that my most urgent needs, for standing up the beta site and then showing previews for prospective partners, only needed a mix of Nginx HTTP Basic Auth and then simply non-public URLs. Which I could do in minutes, and would also have lower friction for partners to look at. Once authn for normal users percolates back up the project management urgency-sorted backlog, I'll take another quick look at off-the-shelf options, in case there are changes, and then expect I'll probably go with Lucia. Or maybe this Better Auth will be looking like what I want.)
Assuming a normal cut, this isn't a question about how you define a sandwich, this is a question about the number of servings, and only you can answer that.
I wonder if they tried to correct for 'exercise intensity' for that, because (fun fact), exercising regularly is negatively correlated with regularly taking all kinds of drugs -- except for alcohol, where the correlation is apparently positive [1].
I pretty definitely can't breathe through my nose as well as I should be able to (deviated septum), and have been wondering for a while how much this actually impacts me. What was your experience like?
Have you had a sleep study done? You may have low oxygen saturation when you sleep. I had literally no symptoms of sleep apnea, no major risk factors, but my body was struggling to breathe when I slept due to my anatomy. I have started CPAP treatment and am excited to see if my cognition improves despite not really feeling bad.
Oh wow this is the first time I've read about this. Did you wake up frequently during the night or were there any other signs?
I have somewhat of a deviated septum, and even had surgery done half a year ago, but it didn't really change much unfortunately. I always need >=9 hours of sleep to feel good. I've slept on average 7 hours the past 2 weeks & I feel horrible (completely different reason for the reduced sleep). I did a sleep study a few years ago because I suspected sleep apnea, but because of anxiety or sleeping in a new place, I only ended up sleeping 1-2 hours that night. They didn't find anything, but I wonder if I should try again.
I don’t have memories of waking up frequently, but you tend to forget shorter wakeful periods. No other signs. No snoring, nothing that would indicate sleep issues.
I have headaches but they are related to other sinus issues, though the doctor seems to think the CPAP should help. I’m not so sure but open to the possibility!
So far I haven't, nope, though it might be worthwhile to look into, given that I basically need 8+ hours of sleep to function well.
I do have a fitness watch though that can (supposedly) measure SpO2, and while that showed the occasional dip, it seemed more like a measurement error, and generally the values were in the normal range.
I have a FitBit Sense 2, and as I understand it, that measures variations in SpO2. I'm not convinced of how accurate it is or how quickly it responds; it derives an estimate rather than directly measuring. I think it's also sensitive to any movements of the watch on your wrist.
I also have had a finger pulse oximeter which logs and exports to an app via USB. If I sleep with that on it seems to be very reliable at recording the levels, the data certainly looks good and feels much more reliable than the Sense.
Deviated septum as well, but it's not the only reason for drying mucous membranes, which I never got around to getting checked. In part, it results from nutritional habits, sometimes (still), a bit too much coffee but the amount that causes it varies so it must be something else in one of the underlying metabolisms.
A few weeks ago, I tried Nasal Strips again, and for some reason, they are helping now and the difference is like day and night. I can only assume that it's because I got older and my lifestyle changed, including nutrition, which was really bad when I was young. So either the dry mucous membranes had too much effect for me to notice the benefits of the nasal strips, or it's really just the bigger size of my nose :P
But the biggest change, compared to my young years, is that I am much more resilient to unhealthy stress and not continuously stressed as I was back then, when I was also regularly exposed to stress amplifications without time to recover.
And being able to breathe properly through my nose now amplifies everything for the much much better: mood, psychological resilience, performance --both cognitive and athletic-- time to fall asleep, sleep, with the latter two coming with their own boosts for the next day.
Most notable was the effect on my facial muscles. With the loss of those specific tensions came a direct effect on focus and concentration and self-control, which is all, to a great extent, mediated by the prefrontal cortex.
Just read Breath by James Nestor. You can thank me later. Fellow ran experiments on himself (and monkeys too) on exactly that, mouth breathing vs. nose breathing. The difference is drastic. Catastrophically drastic.
I'd never heard of that fourth-power-law, fascinating. In fact I can't think of any other relationship in 'nature' scaling with the fourth power, and it seems that that is indeed rare, given the fact that the Wikipedia article is literally called 'fourth power law' and only refers to this effect. (Though I sure would have liked more background info on the actual physics involved; it seems they basically just observed cars and trucks driving on roads, measured the damage to the road, and called it a Law.)
This came to my attention when I was teaching Physics 102 at Cornell which was an auto tutorial course for pre-meds that had an unusual focus on fluid mechanics for an intro physics course.
It's more a rule of thumb than anything approaching a law. The exponent varies based on a number of factors like existing road condition, road construction standards, speed, weather conditions, etc.
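To get a feel for what a fourth-power rule of thumb implies, here is a minimal sketch. It assumes damage per axle pass scales as (axle load / reference load) raised to some exponent, with 4 as the default; the function name and the example loads are made up for illustration, and as noted above the real exponent varies with conditions.

```python
def relative_road_damage(axle_load, reference_load, exponent=4.0):
    """Damage from one axle pass relative to a reference axle.

    Assumes the rule-of-thumb scaling (load ratio) ** exponent; the
    exponent of 4 is only an approximation and is exposed as a parameter.
    """
    return (axle_load / reference_load) ** exponent


# Hypothetical example values: a 10-tonne truck axle vs. a 1-tonne car axle.
truck_vs_car = relative_road_damage(10.0, 1.0)
print(truck_vs_car)  # 10000.0
```

Under that assumption, an axle ten times as heavy does roughly ten thousand times the damage per pass, which is why the exact exponent matters less than the fact that it is large.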
There's also radar reflection, "Observe that the received power drops with the fourth power of the range, so radar systems must cope with very large dynamic ranges in the receive signal processing." - https://www.eetimes.com/radar-basics-part-1/
I would also call it a classic as I learned it from this scene in Robert Heinlein's book "Rolling Stones": "The result was eight shiny right-angled corners facing among them in all possible directions —a radar reflector. ... The final result was to step up the effectiveness of radar from an inverse fourth-power law to an inverse square law — in theory, at least. In practice it would be somewhat less than perfectly efficient ...", https://archive.org/details/rollingstones0000robe_q7v9/page/...
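A rough sketch of why the exponent matters for dynamic range, assuming received power is simply proportional to 1/R^n and ignoring everything else (antenna gain, target cross-section, losses). The function name and the example ranges are hypothetical.

```python
import math


def received_power_drop_db(range_near, range_far, exponent=4):
    """Drop in received power, in dB, when a target moves from
    range_near to range_far, assuming power is proportional to 1 / R**exponent."""
    return 10 * exponent * math.log10(range_far / range_near)


# Hypothetical example: the same target at 1 km and at 10 km.
fourth_power_drop = received_power_drop_db(1.0, 10.0, exponent=4)
inverse_square_drop = received_power_drop_db(1.0, 10.0, exponent=2)
print(fourth_power_drop, inverse_square_drop)
```

With these assumptions, a tenfold increase in range costs 40 dB under the inverse fourth-power law but only 20 dB under an inverse square law, which is the improvement the Heinlein passage is describing "in theory, at least".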
A quick search on Google Scholar finds there's a fourth-power law for shock waves:
"A fourth-power law relating the stress jump through a steady structured shock wave and the maximum strain rate within the shock wave has received recognition as a unifying relation over a sensibly wide range of materials and shock compression amplitudes." - https://pubs.aip.org/aip/jap/article-abstract/107/1/013506/2...
"The third‐order nonlinear optical susceptibility χ(3) of this glass was found to be proportional to the fourth power of the radius of the colloid particles or the fourth power of the absorption coefficient at the peak of a plasmon band when the total volume of the colloid particles was constant."
If this does what it says it does I'll be a very happy user. Playing RPGs 'together' with somebody over the internet isn't that much fun if they are a second or more behind in what's going on. I've actually looked for a solution to this problem (low-latency P2P streaming) quite some time ago, and couldn't get it to work with just OBS because of strange bugs and other issues, so I really appreciate you including this use-case :)
Steam has this functionality built in, including streaming keyboard and mouse and controllers over the open internet. It's called "remote play together"