Hacker News
An AI Breaks the Writing Barrier (wsj.com)
41 points by jkuria on Sept 5, 2020 | 46 comments



This technology will make a big dent in blithering as an industry. Too much of journalism, and far too much of the web, consists of taking in press releases and speeches and rewriting them. That now looks automatable.

I wonder if GPT-4 will be able to take over the romance novel industry.


I plugged my company's "About Us" page text into a prompt for AI Dungeon's GPT-3-based engine, and it started off with my character using the site, then quickly changed into my character following an Arduino tutorial, and eventually devolved into a college romance drama.


I've heard that even when using the Dragon engine on AI Dungeon, the first prediction in a session comes from GPT-2.


Do you have it somewhere? That sounds hilarious!


https://docs.google.com/document/d/e/2PACX-1vR99DO1AKJrh_GGg...

I gave up on the story when it started to get romantic because I wasn't really in that kind of a mood at the time.


Wait, please explain more. Are the prompts yours? ‘browse Reddit’, etc.


Yes, the lines starting with > are my input, except that I would write something like "browse Reddit" and it transformed it into a second-person description: "You browse Reddit".
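That transformation is, at a guess, simple pre-processing before the text ever reaches the model: prefix the imperative input with "You". A minimal sketch in Python (the function name and details are my assumption, not AI Dungeon's actual code, which presumably also handles pronouns and tense):

```python
def to_second_person(action: str) -> str:
    # Rewrite an imperative command as second-person narration,
    # e.g. "browse Reddit" -> "You browse Reddit."
    # (A guess at the kind of preprocessing AI Dungeon applies.)
    action = action.strip().rstrip(".")
    return f"You {action[0].lower()}{action[1:]}."
```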


I find the reverse approach exciting: I'd love an ML system that analyses articles and shows which parts are just regurgitating the press release, and which parts add anything meaningful.

And to go further: a system that collects articles on the same topics and highlights semantic differences (facts, tone, framing, ...).
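As a rough sketch of the first idea: near-verbatim regurgitation can be flagged with nothing fancier than fuzzy string matching. Real semantic comparison would need actual NLP, but this illustrates the shape of the tool (all names here are hypothetical):

```python
import difflib

def flag_regurgitation(article_sentences, release_sentences, threshold=0.8):
    # For each article sentence, compute its best similarity to any
    # press-release sentence; scores at or above the threshold suggest
    # the sentence was lifted from the release rather than reported.
    flagged = []
    for sentence in article_sentences:
        best = max(
            (difflib.SequenceMatcher(None, sentence.lower(), r.lower()).ratio()
             for r in release_sentences),
            default=0.0,
        )
        flagged.append((sentence, best >= threshold))
    return flagged
```

A real system would compare sentence embeddings rather than surface strings, so paraphrases of the release would also be caught.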


Something to rewrite articles without the spin? That would be awesome.

I'd love a uBlock Origin-style plugin for identifying submarine advertising, astroturfing, etc. It could be fed press releases, advertisements, and the like from major corporations.


Not OP but it sounds like the basis of a tool to validate where the source material comes from, as well as to reveal how much has been elaborated upon in other versions.

Also, elaborating from / interpreting OP, to explore the parts of a given article which have absolute merit, and discovering patterns in effective language.

Obviously this won't be necessary knowledge for the sake of writing, though it's nonetheless interesting for linguists. But with machines busying themselves with that, good analysis and production could have a profound effect on spoken language. You are what you read, and unless you're still reading the classics, for a decent majority(?) that will necessitate an evolution in language. Not to mention the potential for misuse by politicians and public speakers, who will gain a new level of delicate and gripping rhetoric.


People love to repeat this accusation, but it's true only for smallish local papers. And for them, I don't even see what's so terrible about it: agencies came about as a model to cooperate without merging, and that model allowed even small-town newspapers to inform their readers about national and international events.

For papers like the NYT, WSJ, or WP, every story with a byline involves at least a day's work. They read the feeds, sure. But if you read carefully, you will notice (among other things) that almost every story states that the journalist tried to get in contact with the major subjects of the story. It will either include their comment or something like "... declined to answer / did not respond to inquiries / etc."

Then, there are the magazines, like the Atlantic or New Yorker. They make up a good chunk of what's generally considered "journalism", and rewriting wire stories just isn't something they do.


Right. The NYT, WSJ, and Washington Post have large staffs of reporters. After that, not many US papers do any more.


>I wonder if GPT-4 will be able to take over the romance novel industry.

More likely carve a niche for romances made by AI.


The niche will be a multitude of single-reader "filter bubbles". The AI will figure out what the reader wants, and give it to her/him as intensely as possible. Eventually the content produced for particular readers will only be roughly literate, let alone clever or stylish. Other humans might not even be able to read it, but the marketing bots won't have any trouble. One would like to think that literature in general can survive this hyper-personalization, but optimism is cowardice.


>The AI will figure out what the reader wants, and give it to her/him as intensely as possible.

Absolutely this. But the final form of this requires some input from the reader: front-camera access and retina scanning, presumably. Without proper curtailing (unidentifiable ads aside), it may evolve into a pretty powerful tool of motivation / pacification / manipulation.

>Eventually the content produced for particular readers will only be roughly literate, let alone clever or stylish.

Don't know about this; I have slightly more faith in people's capacity for appreciation. Though I could be embarrassingly wrong.

I feel (with a little hyperbolic allowance, if you will) that it's in the cards for an Infinite Jest type of entertainment to arise, be it in Kindle or Netflix form. That is, a piece of entertainment so excruciatingly entertaining that people will literally and completely drop whatever they were doing before (or, you know, just miss a week of work for the sake of the most intimate and powerful writing they've ever experienced) and bask in the glories of their own personal god revealing the truths of the universe to them.


The thing is: what you described can be a genre in itself.

Not everyone, in fact I dare say not the majority, wants to read hyper-personalized content to the point that only we can read it. Most of us don't even know what we want, and sometimes that doesn't even matter.

Hyper-personalization isn't the end game, at least in my opinion. In fact, history has shown that we value precisely the opposite: a collective experience with low enough resolution that we have room to fill in the gaps ourselves. How can you relate to others what only you understand?

Or will we eventually cease to be human?

I didn't understand the "optimism is cowardice" part. The true optimism is believing we will survive long enough to reach that point :P


Or even more likely, carve a niche for interactive romance novels.


I hate it when models are described as "an AI." It's too much of a personification for my taste. "An intelligence" carries a lot of baggage compared to "a model." We can argue til we're blue in the face about whether GPT-3 is intelligent, so writing a headline like this is already too much hype for me. I've gotten the chance to play with GPT-3 quite a bit, and while amazingly impressive, it's not remotely close to what the hype train says it is.


You're right in a way, but the term AI has come to encompass all software of the machine-learning type: systems that imitate the process of intelligent thought. It's shorthand at this point. Unfortunately inaccurate, but ML isn't as well known, every character counts in headlines, and longer descriptions are saved for the body.

I don't like the way some words are used, and I complain about it too: https://coldewey.cc/2020/06/common-bugbears-of-modern-online...

That's our right as fellow users of the language. But unfortunately (as with "begging the question" or "flammable") we're much too late to make a difference. AI is, I think, one of those cases.


That's not the complaint. This is orthogonal to the AI vs ML vs stats debate.

It's "an AI/ML/stats model" vs "an AI." "An artificial intelligence" has baggage that's hard to pin down, like it suggests it has a notion of self or sentience or something.


I've had to (nicely) beat usage of the phrase "an AI" out of my team. It's so inaccurate that it's almost dishonest, to your point of the additional anthropomorphic baggage it carries.


Q: Which is heavier, a toaster or a pencil? A: A pencil is heavier than a toaster.

Seems legit: https://www.guinnessworldrecords.com/world-records/102759-la...


There's that word again, heavy. Why are things so heavy in the future? Is there a problem with the Earth's gravitational pull?


Ugh, WSJ - no, it's not "shocking experts". It's shocking people who have absolutely no conception of how GPT-3 works. Those of us who do are mildly impressed, but recognize that GPT-3 is only an incremental improvement, and much of what comes out of it is nothing more than moderately coherent nonsense.


The “shock” is not about the difference between GPT-3 against developments on its predecessors.

It’s that GPT-3 itself is starting to cross the “Writing Barrier”.

The incremental gain is of little public interest. The social impact of having an AI that can reliably produce content that is often indistinguishable from human writing is what matters: we’re potentially facing a profound change in how we write and in whose writing we consume.


This is a fair & interesting perspective - I'll admit, I hadn't considered the "shock" factor in this way.


> whose writing we consume

Maybe it’ll be an improvement


Smart writing tools +

Speaking for myself: having a smart Grammarly-style tool where I can write a page of dot points, provide a few web links, and then run a browser extension that turns it all into a coherent, logical article. That would be brilliant.

Even more fake content -

The more SEO-optimised, computer-generated, algorithm-ready content is produced, the faster we head into a control-based dystopian context.

You spend a month and write up your best distilled, researched, coherent new thoughts on where we need to shift as a society.

Other parties jam together 1,000 response articles drawing on established thought, with the intent to deceive, and your message is lost in the noise. “It’s all fake anyway.”

An acceleration of our information downfall.


You’re right that WSJ is overdoing it. But you’re definitely underdoing it.


I don't think I'm underdoing it at all. I'm assigning a reasonable level of impressiveness to it. It's cool, it's got a lot of parameters, and it's flexible in very interesting ways. But... am I "shocked" when I look at the output? Not even close. It's about what I would have expected if I threw a few million dollars at training a 175 billion parameter model.

Edit: after reading a few more comments, I concede that I may be ever so slightly underdoing it :)


Next step: integrate language model and reinforcement learning. https://arxiv.org/abs/2009.01719


Meh, I think you are over-correcting a bit here. GPT-3 is not just your standard annual incremental progress in AI.


I don't think I am. It is an incremental improvement, albeit a fairly impressive one, and yes, it can do some interesting, mildly new-ish things, but at the end of the day it is still very often just a nonsense factory.

IMO, GANs were a much bigger and far more important step forward than GPT-2 ---> GPT-3.


Nowadays when reading articles I usually have a "did a bot write this" meter running in my head. When the author has no point and is just spewing related words that form sentences I usually click away.

I wonder at what point it'll switch to "did a human write this" and I will click away when I realize it's inefficient monkey words.


Soon we'll have that cognitive overhead on HN threads and there will be a larger volume to parse.

Will the comments still be worth reading at that point?

[ this comment may or may not be GPT-3 generated ]


That is a silly distinction.

Of course machines are capable of valuable discourse, because humans are biological machines, and humans are capable of valuable discourse.

Intelligent contributions are welcome whether they come from silicon-based or carbon-based life.


The other way to look at it is that HN comments are indistinguishable from GPT-3 generated sentences. Hell, even a standard Markov chain would suffice.
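For the curious, a "standard Markov chain" text generator really is tiny. A minimal order-1 sketch (the function names are mine, not from any particular library):

```python
import random
from collections import defaultdict

def build_chain(text):
    # Map each word to the list of words that follow it in the corpus.
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=10, seed=42):
    # Walk the chain from a start word, sampling a successor each step;
    # stop early if a word has no recorded successor.
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        successors = chain.get(out[-1])
        if not successors:
            break
        out.append(rng.choice(successors))
    return " ".join(out)
```

Train it on enough HN comments and the output is locally plausible and globally meaningless, which is roughly the accusation being made here.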


That is as irrelevant as the people second-guessing messages: “was this written by a woman?”

Judge the content, not the source.


I think a key issue is that in any text we use incomplete indicators to determine legitimacy.

Deciding on the legitimacy of an “opinion” comes down to meta ways of assessing, including the length of the text (HN rewards longer responses if they can maintain coherence and interest), the language style used (don’t make a technical mistake and use a technical term in the wrong context), the logic with which the argument is laid out, and the emotional tone of the writer (no jokes on HN, please).

All of these are now gameable, leading to the risk, for example, of state actors shifting our collective opinions on a huge range of topics.

I regularly make spelling mistakes in my posts, and when I don’t correct them I lose points, as I think the coherence and ease of reading goes down. GPT-3 is unlikely to have the same issues.


There is a very insightful XKCD comic about this but I can't find it.

Please anyone?



Is this a specific suggestion that you believe a significant fraction of news articles are written by bots? Is it a jokey way to repackage that tired everything-used-to-be-better cynicism? I really can't tell!

And, by the way, journalism never was as matter-of-fact-only as everyone somehow assumes. Even your parents probably grew up in a Hunter S. Thompson world. Your grandparents had Hemingway, who, despite the short sentences, didn't exactly crack the Shannon limit in his war dispatches.

There have always been different styles. Just read the headlines only and you're almost back to the 19th century, when news tickers actually ticked, every tick cost a dollar, and a dollar was an hourly wage.

But I'd suggest actually understanding the purpose of narrative journalism and occasionally indulging in it. I've seen that complaint so often, and from a certain type of person almost exclusively, and I'm starting to wonder if it isn't some sort of psychological phenomenon, with guys (it's always men) fearing that anything less dry than a phone book rubs up on them, and their masculinity and rationality might suffer irreparable harm from exposure to yucky "feelings".


>Is this a specific suggestion that you believe a significant fraction of news articles are written by bots?

Financial news on lesser known public companies seems to be.

It's easy to generate a story based on some numbers about the stock to parameterize it.

Example (showing absolutely no understanding):

"Sorrento Therapeutics, Inc., belongs to Healthcare sector and Biotechnology industry. The company’s Market capitalization is $1.68B with the total Outstanding Shares of 32. On 03-09-2020 (Thursday), SRNE stock construct a change of -2.89 in a total of its share price and finished its trading at 7.06.

Profitability Ratios (ROE, ROA, ROI):

Looking into the profitability ratios of SRNE stock, an investor will find its ROE, ROA, ROI standing at -297.9%, -49.4% and -91.3%, respectively."
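The quoted snippet reads like straightforward template substitution. A hedged sketch of how such a "story" could be parameterized (the field names and values are invented, modeled loosely on the quote, not taken from any real generator):

```python
STOCK_TEMPLATE = (
    "{name} belongs to the {sector} sector and {industry} industry. "
    "The company's market capitalization is {mcap} with total outstanding "
    "shares of {shares}. On {date}, {ticker} stock recorded a change of "
    "{change} and finished its trading at {close}."
)

def fill_story(fields: dict) -> str:
    # Drop a row of scraped numbers into boilerplate prose; no understanding
    # of the company is required, which is exactly why the output reads
    # the way the quoted example does.
    return STOCK_TEMPLATE.format(**fields)
```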


I think OP is just saying that articles written by bots are currently worse than articles written by people, while pondering the time in the future where articles written by bots are better than those written by people. I'm pretty sure OP is commenting on the quality of writing in general, not journalism...

So...who's cynical here?


Your first statement is correct - a significant fraction of news articles are written by bots.


Perhaps the polarization and politicization here in the US will bring about your scenario #2 that much faster...in the same way we’ve found some people can trust a therapist AI more than a human one, the same will happen for writers.



