
> If ChatGPT verbatim reproduces

Copyright covers "derivative works." Verbatim is absolutely not a requirement for infringement.

If you take a copyrighted image and modify it, even to the point where it's unrecognizable, and the result is used in the same way (i.e., it isn't a "transformative use"), then it's still a derivative work.

Yes, you are likely to get away with it if you're not caught. But that doesn't mean what you're doing is considered fair use, just that you won't get sued.

Thing is, every piece of text generated by ChatGPT is incrementally using every character of training data. So legally speaking, everything it produces is arguably a derivative work of ALL of the training data.

Generative AI isn't even a legal gray area; under current law, there's no blanket exception for "how much" of a copyrighted work is used. At best there's a fair use _guideline_ that lists, as one of four criteria, the amount and nature of the copyrighted work used. But really it's the entirety of millions of copyrighted works being used to generate the models, and those works _can_ be reproduced verbatim in many cases, proving that the works are encoded into the model.

Generative AI is only permitted because there's big money behind it along with associated lobbyists. And there are many in-flight lawsuits trying to shut down both GPT and various art-generating AIs.

Maybe they'll change the law. Maybe courts will side with the AI companies. But until then, it seems obvious to me that anyone arguing that generative AI based on models built with copyrighted works is completely legal is using motivated reasoning.


I understand OpenAI is a US company, but this is a US-centric view, especially since TFA is about a Brazilian operation.

> under current law, there's no blanket exception for "how much" of a copyrighted work is used

Under fair dealing laws, there are. [1] Though, as always, if commercial fan art is legal, then something that uses only a couple of bytes of information per work should be too, bar overfits.

> But until then, it seems obvious to me that anyone arguing that generative AI based on models built with copyrighted works is completely legal is using motivated reasoning.

It is completely legal in the EU, Japan, South Korea and Singapore. [2]

[1] https://libhelp.ncl.ac.uk/faq/43267

[2] https://www.reedsmith.com/en/perspectives/ai-in-entertainmen...


Your link re: Fair Dealing guidelines does NOT make it 100% legal. For one, the ENTIRE works are encoded into the model--not a part of them. For another, those are just guidelines, not explicit exceptions, just like Fair Use in the US. It's all very hand-wavy, even more so in the UK, apparently, so there's no way you can list those guidelines and say that anything is clearly allowed.

Your second link means it's legal for them to CREATE THE MODEL. This is true in the US as well: The model is a clearly transformative use of the data.

But as soon as the model produces works in the same use category as the original work (code -> model -> code, for instance, or image -> model -> image), it is no longer transformative.

If you understand the law and the technology, it's clearly generating derivative works.


Entire works are encoded in the model in the same way that if I cut a document up into individual words and put them in a bag with a bunch of other documents, I could, if I were a no-life loser, spend a long time "recreating" the document from the individual words. The bag of cut-out words is NOT a copyright violation, though.


> How could they prevent the framework become over bloated with semi baked plug-in?

...not sure how they plan to, but how they COULD do it is by making it easy enough to access native resources directly from the script language (like NativeScript does) or by making it so easy to write native code (Kotlin/Swift are listed as first-class options) that you just write any specific API access code in the appropriate native language.

It doesn't give you the Electron "write once, run everywhere" experience, since you need to write some of the code per platform, but many apps are 95% UI and only 5% platform-specific functionality. So by abstracting the UI as HTML/CSS/JavaScript, you get a "write once, run everywhere" UI, and the minority of the code that needs to differ is all you have to maintain per platform.

If writing a plug-in is a high bar, then you get tons of semi-baked plug-ins as the (seemingly) only way to access native features. If instead you can drop in native code easily and quickly, then you can focus on app development and cut out the middleware. ;)
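
To sketch what that split looks like in practice (all names below are hypothetical, not any real framework's API): the shared UI layer talks to a thin typed bridge, and only the handler behind that bridge is written per platform.

  // Hypothetical bridge -- in a real framework this call would cross into
  // Kotlin (Android) or Swift (iOS/macOS). Stubbed here so the sketch stands alone.
  interface BatteryStatus { level: number; charging: boolean; }

  async function invokeNative<T>(command: string): Promise<T> {
    if (command === "get_battery_status") {
      return { level: 0.8, charging: true } as unknown as T; // fake stand-in data
    }
    throw new Error(`no native handler for ${command}`);
  }

  // Shared UI code: written once, identical on every platform.
  async function showBatteryBadge(el: HTMLElement): Promise<void> {
    const status = await invokeNative<BatteryStatus>("get_battery_status");
    el.textContent =
      `${Math.round(status.level * 100)}%` + (status.charging ? " (charging)" : "");
  }

The per-platform half is then just one small get_battery_status handler in Kotlin and one in Swift; the other 95% of the app ships unchanged.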


This reminds me of the "Parable of the Two Programmers." [1] A story about what happens to a brilliant developer given an identical task to a mediocre developer.

[1] I preserved a copy of it on my (no-advertising or monetization) blog here: https://realmensch.org/2017/08/25/the-parable-of-the-two-pro...


I had an idea once but when I tried to explain it people didn't understand.

I revisited an earlier thought: communication is a two-man job; one person's role is to make no effort to understand while the other explains things poorly. It always manages to never work out.

Periodically I thought about the puzzle, and eventually I was able to explain it such that people thought it was brilliant ~ though much too complex to execute.

I thought about it some more, years went by and I eventually managed to make it easy to understand. The response: "If it was that simple someone else would have thought of it." I still find it hilarious decades later.

It pops to mind often when I rewrite some code and it goes from almost unreadable to something simple and elegant. Ah, this must be how someone else would have done it!


> Ah, this must be how someone else would have done it!

This is a good exclamation :D

And it's a poignant story. Thanks for sharing.


That’s pretty good. It needs an Athena poster :-)


“Give me six hours to chop down a tree and I will spend the first four sharpening the axe.”

― Abraham Lincoln

I have started to follow this 'lately' (for a decade) and it has worked miracles. As for the anxious managers/clients, I keep them updated on the design/documentation/thought process, mentioning the risks of the path not taken, and that maintains their peace of mind. But this depends heavily on the client and the managers.


I can't seem to find it in a google search, maybe I'm just recalling entirely the wrong terms.

In the early computing era there was a competition. Something like: take some input and produce an output. One programmer made a large program in (IIRC) Fortran with complex specifications, documentation, etc. The other used shell pipes, sort, and a small handful or two of other programs in a pipeline to accomplish the same task in like 10 developer minutes.


The Knuth link in the sibling comment is an original, but you're probably thinking of "The Tao of Programming"

http://catb.org/~esr/writings/unix-koans/ten-thousand.html

"""“And who better understands the Unix-nature?” Master Foo asked. “Is it he who writes the ten thousand lines, or he who, perceiving the emptiness of the task, gains merit by not coding?”"""


There was also the "Hadoop vs. unix pipeline running on a laptop"-story a few years back, a more modern take: https://adamdrake.com/command-line-tools-can-be-235x-faster-...



Sounds like "Knuth vs McIlroy", which has been discussed on hn and elsewhere before, and the general take is that it was somewhat unfair to Knuth.

[1] https://homepages.cwi.nl/~storm/teaching/reader/BentleyEtAl8... [2] https://www.google.com/search?q=knuth+vs+mcilroy


This is the competition I was thinking of. I must have read it in a dead-image PDF version some other time on HN. This paper isn't the one I recall but the solution is exactly the sort I vaguely recalled.

I'm trying to copy-in the program as it might have existed, with some obvious updates to work in today's shells ...

  #!/bin/sh
  # Top $1 (default 100) most frequent words in $2 (default stdin).
  # Note: tr reads only stdin, so the input file is redirected in.
  tr -cs A-Za-z '\n' < "${2:-/dev/stdin}" |
  tr A-Z a-z |
  sort |
  uniq -c |
  sort -rn |
  sed "${1:-100}q"
Alternately, as a one-liner: $ tr -cs A-Za-z '\012' < "${INPUTFILEHERE:-/dev/stdin}" | tr A-Z a-z | sort | uniq -c | sort -rn | sed "${MAXWORDSHERE:-100}q"

Edited: Removed some errors likely induced by OCR / me not catching that in the initial transcription from the browser view of the file.


Just to be clear, it was not a competition. For more, please follow the links from some of the previous HN discussions, e.g. https://news.ycombinator.com/item?id=31301777.

[For those who may not follow all the links: Bentley asked Knuth to write a program in Pascal (WEB) to illustrate literate programming—i.e. explaining a long complicated program—and so Knuth wrote a beautiful program with a custom data structure (hash-packed tries). Bentley then asked McIlroy to review the program. In the second half of the review, McIlroy (the inventor of Unix pipes) questioned the problem itself (the idea of writing a program from scratch), and used the opportunity to evangelize Unix and Unix pipes (at the time not widely known or available).]


I was both of those developers at different times, at least metaphorically.

I drank from the OO koolaid at one point. I was really into building things up using OOD and creating extensible, flexible code to accomplish everything.

And when I showed some code I'd written to my brother, he (rightly) scoffed and said that should have been 2-3 lines of shell script.

And I was enlightened. ;)

Like, I seriously rebuilt my programming philosophy practically from the ground up after that one comment. It's cool having a really smart brother, even if he's younger than me. :)


This is unrelated to the excellent story, but it's annoying that the repost has the following "correction":

> The manager of Charles has by now [become] tired of seeing him goof off.

"The manager has tired of Charles" is as correct as "the manager has become tired of Charles". To tire is a verb. The square bracket correction is unnecessary and arguably makes the sentence worse.


Sure enough. Presumably my brain was switched off if I added that "correction" myself.

Or it was already there in whatever source I managed to copy it from. No idea.


Without more backup I can only describe that as being fiction. Righteous fiction, where the good guy gets downtrodden and the bad guy wins to fuel the reader's resentment.


It's practically my life experience.

Sometimes I'm appreciated, and managers actually realize what they have when I create something for them. Frequently I accomplish borderline miracles and a manager will look at me and say, "OK, what about this other thing?"

My first job out of college, I was working for a company run by a guy who said to me, "Programmers are a dime a dozen."

He also said to me, after I quit, after his client refused to give him any more work unless he guaranteed that I was the lead developer on it, "I can't believe you quit." I simply shrugged and thought, "Maybe you shouldn't have treated me like crap, including not even matching the other offer I got."

I've also made quite a lot of money "Rescuing Small Companies From Code Disasters. (TM)" ;) Yes, that's my catch phrase. So I've seen the messes that teams often create.

The "incompetent" team code description in the story is practically prescient. I've seen the results of exactly that kind of management and team a dozen times. Things that, given the same project description, I could have created in 1/100 the code and with much more overall flexibility. I've literally thrown out entire projects like that and replaced them with the much smaller, tighter, and faster code that does more than the original project.

So all I can say is: Find better teams to work with if you think this is fiction. This resonates with me because it contains industry Truth.


To me it is a story about managers who are clueless about the work. You can make all the effort in the world to imagine doing something, but the taste of the soup is in the eating. I do very simple physical grunt work for a living; there it is much more obvious that such guessing is impossible. It's truly hilarious.

They probably deserve more praise when they do guess correctly but would anyone really know when it happens?


Yes: Programmers who start at twelve are often the 10x programmers who can really program faster than the average developer by a lot.

No: It's not because they have 10 more years of experience. Read "The Mythical Man Month." That's the book that popularized the concept that some developers were 5-25x faster than others. One of the takeaways was that the speed of a developer was not correlated with experience. At all.

That said, the kind of person who can learn programming at 12 might just be the kind of person who is really good at programming.

I started learning programming concepts at 11-12. I'm not the best programmer I know, but when I started out in the industry at 22 I was working with developers with 10+ years of (real) experience on me...and I was able to come in and improve on their code to an extreme degree. I was completing my projects faster than other senior developers. With less than two years of experience in the industry I was promoted to "senior" developer and put on a project as lead (and sole) developer and my project was the only one to be completed on time, and with no defects. (This is video game industry, so it wasn't exactly a super-simple project; at the time this meant games written 100% in assembly language with all kinds of memory and performance constraints, and a single bug meant Nintendo would reject the image and make you fix the problem. We got our cartridge approved the first time through.)

Some programmers are just faster and more intuitive with programming than others. This shouldn't be a surprise. Some writers are better and faster than others. Some artists are better and faster than others. Some architects are better and faster than others. Some product designers are better and faster than others. It's not all about the number of hours of practice in any of these cases; yes, the best in a field often practices an insane amount. But the very top in each field, despite having similar numbers of hours of practice and experience, can vary in skill by an insane amount. Even some of the best in each field are vastly different in speed: You can have an artist who takes years to paint a single painting, and another who does several per week, but of similar ultimate quality. Humans have different aptitudes. This shouldn't even be controversial.

I do wonder if the "learned programming at 12" has anything to do with it: Most people will only ever be able to speak a language as fluently as a native speaker if they learn it before they're about 13-14 years old. After that the brain (again, for most people; this isn't universal) apparently becomes less flexible. In MRI studies they can actually detect differences between the parts of the brain used to learn a foreign language as an adult vs. as a tween or early teen. So there's a chance that early exposure to the right concepts actually reshapes the brain. But that's just conjecture mixed with my intuition of the situation: When I observe "normal" developers program, it really feels like I'm a native speaker and they're trying to convert between an alien way of thinking about a problem into a foreign language they're not that familiar with.

AND...there may not be a need to explicitly PROGRAM before you're 15 to be good at it as an adult. There are video games that exercise similar brain regions that could substitute for actual programming experience. AND I may be 100% wrong. Would be good for someone to fund some studies.


That childhood native-fluency analogy is insightful! Your experience matches mine.

I started programming at age 7 and it's true that the way code forms in my head feels similar to the way words form when I'm writing or speaking in English. In the same way that I don't stop and consciously figure out whether to use the past or present tense while I'm talking, I usually don't consciously think about, say, what kind of looping construct I'm about to use; it's just the natural-feeling way to express the idea I'm trying to convey. The idea itself is kind of already in the form of mental code in the same way that my thoughts are kind of already in English if I'm speaking.

But... maybe that's how it is for everyone, even people who learned later? I only know how it is in my own head.


I totally get the same sense that I'm just "communicating" using code. I just write out the code that expresses the concepts I have in my head.

And at least some people clearly don't. I was talking to one guy who said that even for a simple for-each loop it was way faster for him to "Google the code he needs and modify it" than to write it. This boggled me. I couldn't imagine being able to Google and parse results and find the one I wanted and copy and paste it and modify it being faster than just writing the code.

Even famous developers brag about their inability to code. DHH (RoR developer) has a tweet where he brags that he couldn't code a bubble sort without Googling it. A nested loop with a single compare and swap...and he's "proud" of the fact that he needs to Google it?
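
For anyone who hasn't written one since school, that really is the whole algorithm; a minimal sketch in TypeScript (it looks about the same in any language):

  // Bubble sort: repeatedly swap adjacent out-of-order elements until sorted.
  function bubbleSort(a: number[]): number[] {
    for (let i = 0; i < a.length - 1; i++) {
      for (let j = 0; j < a.length - 1 - i; j++) {
        if (a[j] > a[j + 1]) {
          [a[j], a[j + 1]] = [a[j + 1], a[j]]; // the single compare-and-swap
        }
      }
    }
    return a;
  }

  console.log(bubbleSort([5, 1, 4, 2, 8])); // [1, 2, 4, 5, 8]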

I have no words.


The association with video games in your last paragraph makes a lot of sense to me. This is how I feel solving problems.

I always thought that people who start at 12 and keep at it are good because they really love it. I see people who struggle a lot with learning, and it's because they hate it but are doing it for other reasons.


People are also prone to love doing things that they're good at, so it's hard to know which came first. :)


That's true!!!


> Most people will only ever be able to speak a language as fluently as a native speaker if they learn it before they're about 13-14 years old.

Very few people both have a ton of exposure to a language and actually study the grammar and stuff as adults. If you don't learn the grammar you will still speak like a dog after living in a country for 20 years. A lot of people in an average company don't write hard things at their job, didn't read any textbooks etc. and spend loads of time in meetings etc.


> Very few people both have a ton of exposure to a language and actually study the grammar and stuff as adults.

Very few people actually learn to speak a language as a native speaker by "studying the grammar."

I remember people trying to learn what was and what wasn't a run-on sentence in junior high school, and being shocked that they had a hard time telling the difference.

And studying a language explicitly doesn't shift the brain regions used to the same ones that a native speaker uses.

And that's my point. I didn't really "study" programming explicitly as much as understanding it intuitively. When exposed to a new concept, I just immediately internalize it; I don't need to use it a bunch of times and intentionally practice it. I just need to see it and it's obvious and becomes part of my tool-set.


Honestly, most everything listed on the page as an advantage of Zig is a disadvantage from my point of view.

I'm sure Zig has its use cases. For what I write, not only do I not care if there's a hidden function call or hidden error handling, I see those as 100% necessary for a modern language.

Needing to handle errors inline is a huge mess for anything nontrivial. It distracts from the logic that's important at that point in the code. Being able to override an accessor to do something instead of being a raw access is incredibly useful; a tiny change and rebuild is all that's required to track information that you would otherwise need to rewrite an entire app to support.
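
As a made-up illustration of that accessor point (hypothetical names, TypeScript): promoting a raw field to a property means every existing call site now flows through code you control, so tracking can be added with a tiny change and a rebuild.

  class Player {
    private _health = 100;

    // Every existing `player.health` read/write now passes through these
    // accessors, so we can log or track changes without touching callers.
    get health(): number {
      return this._health;
    }
    set health(value: number) {
      console.debug(`health: ${this._health} -> ${value}`);
      this._health = value;
    }
  }

  const p = new Player();
  p.health -= 30; // logs "health: 100 -> 70"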

If you're writing extremely low level code and libraries, especially embedded, then fine, minimizing hidden behavior is important. Being able to operate without a standard library is also important in that case. Outside of that niche, though, there are few places I'd call those "features" of Zig an advantage.


> Needing to handle errors inline is a huge mess for anything nontrivial

Maybe try zig first before you make such a blanket statement?


Zig's error handling is unobtrusive. You can just write `try` if you want to propagate errors instead of handling them in some specific way.


Yeah, Zig just doesn't offer enough advantages over "I'll just use C++ as a better C with RAII, containers, a safe string class, template functions, very simple classes/powerful structs (no inheritance), and a threading standard library."


Another way to say "decide they're against the merger" is "evaluate the situation and make a timely ruling that they oppose the merger as illegal."

Which is exactly what they were supposed to do. Adobe and Figma tried to argue with the regulators or find a compromise, but couldn't come up with a solution that satisfied all parties.

If you try to extract subtle implications from the phrasing of a commenter on HN, you're likely to jump to the wrong conclusion.


But that's the point: it's not timely. It's a massive millstone for Figma, who now have to figure out what to do with their pixel-perfect UI designer collaboration tool in a world of an up-and-coming Adobe Firefly.


The output of LLMs is ... rarely well-designed. Well-documented (with often incorrect documentation), well-formatted for sure, but profoundly not well-designed, unless you're asking for something so small that the design is trivial.

Even with GPT-4, if you ask it for anything interesting, it often produces code that not only won't work, but that couldn't possibly work without a major rewrite.

Not sure what you've been requesting if it's always been good output. Even when asking GPT-4 for docs I've had it hallucinate imaginary APIs and parameters more often than not.

Maybe the questions I ask are not as common? Given my experiences, though, I wouldn't recommend it to anyone for fear it gave them profoundly bad advice.


I’ve come to the conclusion that GPT produces code at the level of a new graduate at best. In actually getting it to solve something more or less on its own, it did OK on simple tasks and failed as soon as requirements became a bit more nuanced or specific. It’s also not very good at thinking outside the box; its solutions are all very clearly tied to its training data, meaning it struggled with anything that strayed too far into the abstract or different.

However it’s been great at being my rubber duck and it’s been great as a tool for helping me eg write complex SQL queries — never without me being a key part of the loop, but as a tool to help me fill in gaps in my own skills or understanding. That is, it amplified my abilities. It was also pretty good at creating interesting metaphors for existing concepts, explaining terminology and even explaining bits of code I gave it.


My experience as well. Heavy GPT-4 use (for a variety of things). Great for boilerplate, great for retrieving well-known examples from documentation, saves a fair amount of time typing and googling, but often completely wrong (majorly and subtly) and anything non-trivial I have to do myself.

Great tool! Saves a ton of time! Not a dev replacement (yet)


Now, I am definitely doing much simpler things than you folks are, I would wager, but I have found that with a bit of back and forth you can get pretty good results that work with only a bit of revision. I have found that reminding it of the purpose or goals of whatever you're working on at the moment tends to make the output a bit more consistent.


> with a bit of back and forth you can get pretty good results that work with only a bit of revision

The problem for me is that the "back and forth" and "a bit of revision" steps very often end up taking more time than writing the code myself would have.


That's because you actually know what you're doing, haha.

In all seriousness I am not a software engineer and GPT has enabled me to build things in a couple weeks that would have taken me months of effort to create otherwise.

I am sure an actual software engineer could have made those same tools in a day or two, but it's still incredible for my use case.


Yeah, also, you tend to need to know the answer before you prompt it.

It's kind of like a Ouija board, in that you get out what you expect.


> LLMs are very, very good at generating code

Ummm.... Awful code that often looks right at first glance, maybe.

Maybe LLMs can generate the kind of code that's really shallow in its complexity, but for literally everything I would call interesting LLMs have produced hot garbage. From "it doesn't quite do what I want" to "it couldn't possibly work and it's extremely far from being sane," though it always looks reasonable.


> Ummm.... Awful code that often looks right at first glance, maybe.

> Maybe LLMs can generate the kind of code that's really shallow in its complexity, but for literally everything I would call interesting LLMs have produced hot garbage. From "it doesn't quite do what I want" to "it couldn't possibly work and it's extremely far from being sane," though it always looks reasonable.

None of this has any bearing.


What would that even accomplish?

And no, even if it were created as part of vaccine or coronavirus research, that would not equate to it being a biological weapon. That's an absolutely unjustified leap. Do you know how virus research works? Clearly not. They still have smallpox viruses sitting around in labs. I don't know if there is any active research related to improving the smallpox vaccine, but if there were, and there were a release, in what way could you possibly classify vaccines as biological weapons?

But back to the original point: WHY? There is ample evidence that this particular lab had lax safety protocols that might have resulted in a leak of the virus. There is also evidence that similar viruses existed in the wild animal population of the area. They may or may not have been studying one of those viruses in the lab, but that alone doesn't prove where it first infected a human.

But say you have incontrovertible proof. What does that change?

People who died will remain dead. People with long COVID will remain ill. We can rattle sabers at China, but they're likely to continue to deny it was their fault. So what does happen? We can extrapolate from the past:

1. People of Chinese (or any east Asian) ancestry will be treated badly or even killed in twisted "revenge" fantasies of various idiots around the world.

2. The saber rattling could escalate to actual hot war, and more people would die.

3. Say I'm wrong and China does admit the lab screwed up. What then? They'll fire and/or execute people who were responsible. And...? It's not like they're going to pay compensation to everyone who lost a loved one around the world. They'll perform some political theater, and after a news cycle it will fade away.

I don't see an upside unless you're hoping for #3 and think that killing or jailing a few more people will somehow even the scales? Killing or jailing people for incompetence seems cruel and unusual to me. Never attribute to malice that which is adequately explained by stupidity.


Web3 morphed to mean "things that use a distributed blockchain." That's why we hate it. (It's not just currency either; it's anything blockchain, including NFT.)

"P2P web" needs a new name.

