To Keep Track of Reddit Conversations Around NYT Articles, We Built a Slack Bot (nytimes.com)
111 points by jprob 33 days ago | 43 comments

Good on them! Being able to view responses to their content is ideal for both the publisher and the readers; this seems like the internet working at its best.

I wonder how viable this might be as a service?

Side note: The current title ("NYT Tracks Reddit Conversations Around nytimes.com Content") is quite editorialized, and somewhat misleading.

I would wager they are doing the same thing for HN. Also of note, ever since I started coming back here for the first time in several years, I couldn’t help but notice the high percentage of NYT articles posted here. I’m not saying they don’t fit the HN mold/interest bucket, but I was just surprised to see so many headlines from them here, on HN, so regularly. I think NYT is playing a strong grassroots/AstroTurf/social media game.

> ever since I started coming back here for the first time in several years, I couldn’t help but notice the high percentage of NYT articles posted here

The Times has had a dedicated Science section for as long as I can remember (going back at least to the 80's), even pre-dating the existence of its Sports section by a decade or more. And it's well-known for being on top of business news and current events. So it just seems natural that its articles align with the interests of many in the HN crowd.

Over the last ten years, especially the most recent five, the NYT has done a lot of leading-edge things in the arenas of technology, transparency, and disclosure. It seems to be more and more a tech-friendly publication, in a good way. (See also: The Guardian.)

It's always done the hard data mining, but more frequently it's showing people how things are done behind the scenes, and it has done some very interesting Apple-quality experiments in presenting its articles online.

And if you're into design, the printed version of the Sunday Times is full of different interesting delights each week.

/Not an employee, just a print/online subscriber.

FWIW, nytimes.com accounts for ~4% of all posts that appear on the HN homepage: https://toddwschneider.com/dashboards/hacker-news-trends/?q=...

That's quite a lot! One of every 25, so probably at least one on the front page at any time? Are there domains that account for considerably more than 4%? Would be interesting to see a top ten list.
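A rough back-of-the-envelope check of the "at least one on the front page" guess. This is only a sketch: it assumes the front page holds 30 stories and treats each slot as an independent 4% draw, neither of which the dashboard states.

```python
# Back-of-the-envelope estimate. The 30-slot front page and the
# independence between slots are assumptions, not measured values.
NYT_SHARE = 0.04        # ~4% of front-page posts, per the comment above
FRONT_PAGE_SLOTS = 30   # assumed number of stories on the HN front page


def expected_count(share: float, slots: int) -> float:
    """Expected number of stories from one domain among `slots` stories."""
    return share * slots


def prob_at_least_one(share: float, slots: int) -> float:
    """Chance that at least one of `slots` independent stories is from the domain."""
    return 1.0 - (1.0 - share) ** slots


if __name__ == "__main__":
    print(f"expected NYT stories: {expected_count(NYT_SHARE, FRONT_PAGE_SLOTS):.2f}")
    print(f"P(at least one):      {prob_at_least_one(NYT_SHARE, FRONT_PAGE_SLOTS):.2f}")
```

Under those assumptions the expected count is about 1.2 and the chance of seeing at least one NYT story is roughly 70%, so "probably at least one" looks about right.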

nytimes.com is the #3 domain of late, and really #2 behind only github.com if you exclude ycombinator.com (all of the Shows, Asks, etc.)

Top 10 domains by # of items on front page since 1/1/18:

   rank |     domain      | count 
      1 | github.com      |  2041
      2 | ycombinator.com |  1911
      3 | nytimes.com     |  1818
      4 | bloomberg.com   |  1028
      5 | medium.com      |   826
      6 | techcrunch.com  |   735
      7 | theguardian.com |   666
      8 | github.io       |   615
      9 | bbc.com         |   558
     10 | arstechnica.com |   493

While that is true, NYT posts are more highly upvoted than most other frontpage items

It's kind of a pedantic point, but this is not really a "New York Times article" the same way a news story by Maggie Haberman is.

This is a post on the blog that the NYT tech and design teams use to talk about their work.

NY Times has a long history of fostering technical innovation. I believe the D3 project originally started as a library at the NY Times for powering their interactive graphics.

Whether or not they use a Slack bot for it, I bet some NYT folks lurk or even participate here on HN.

Or, just maybe, the New York Times is the world's leading newspaper and therefore happens to publish a large number of articles, some of which overlap with HN's collective interests?

They probably have a bunch of sites they do this for. Each time someone discovers a new one they probably create another config for that site, and then whenever a link is posted the bot spams the channel with "hey, $author's article about $title is being discussed on $site at $link2comments".
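A minimal sketch of what that guessed-at setup might look like: one small config entry per site plus a shared message template. Everything here is hypothetical (`SiteConfig`, `SITES`, `format_notification`, and the URL patterns are illustrative names chosen for this example); nothing is taken from the NYT bot itself.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class SiteConfig:
    """Hypothetical per-site config: display name plus a comments-URL pattern."""
    name: str
    comments_url: Template


# One entry per tracked site. Adding a site means adding one line here.
SITES = {
    "reddit": SiteConfig("Reddit", Template("https://www.reddit.com/comments/$thread_id")),
    "hn": SiteConfig("Hacker News", Template("https://news.ycombinator.com/item?id=$thread_id")),
}

# Message wording follows the comment above; $link stands in for $link2comments.
MESSAGE = Template("hey, $author's article about $title is being discussed on $site at $link")


def format_notification(site_key: str, author: str, title: str, thread_id: str) -> str:
    """Build the channel message for one discussion thread."""
    site = SITES[site_key]
    link = site.comments_url.substitute(thread_id=thread_id)
    return MESSAGE.substitute(author=author, title=title, site=site.name, link=link)


if __name__ == "__main__":
    print(format_notification("hn", "Jane Doe", "Slack bots", "123"))
```

Keeping the per-site details in data rather than code is what would make "another config for that site" a one-line change.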

It's funny to see this, because I've developed sentiment systems that would parse forums in (somewhat) real time, with the output handed over to govt and big business paying for lucrative contracts. You'd be surprised how obscure a forum we'd be mining and selling off.

Some people seem to get that it's happening, but I think very few know the extent.

There would probably have been more WSJ submissions if it weren't for the paywall. NYT's paywall at least offers a few free articles each month.

The other reason NYT links appear often is because there just aren't that many outlets doing journalism (and followup articles) on areas that are of interest to HN's audience.

HN punches above its weight class when it comes to influence so it's unsurprising the NYT wants to keep their visibility here high.

About 90% of the time when some group of people claim they have influence they have fuck-all influence. Reminds me of the time some 4chan users tried to have ReCaptcha label things 'penis' because they had this inflated sense of influence.

ReCaptcha said that it was trivial to handle the increased traffic and that it was not only auto-filtered out but also a drop in the bucket.

Everyone always thinks they're hot shit. But the world is a big place.

I remember when folks felt stupid spats on slashdot mattered.

They didn't.

The high percentage of NYT on the internet and especially social media is due to the NYT spending the last few years bullying social media platforms into giving it special treatment.

They aren't playing a "game". They are using their power as the establishment media to force tech companies to serve nytimes' (and the elite's) interests.

As the nytimes and other "authoritative" sources get special treatment online, it's going to crush local news and smaller news.

Funny how a few years ago, it was local news and smaller news upstarts (vice, vox, huffpo, etc.) dominating online and "authoritative" sources struggling to compete. Why compete when you can change the rules to favor yourself? The NYT isn't playing a strong grassroots/AstroTurf/social media game. They tried for the past 15 years and they failed. So they decided to change the rules in their favor. Why compete when you can cheat?

Also, it was pressure from the NYT and other authoritative sources that forced reddit to start censoring. It was NYT and other authoritative sources that forced twitter to censor criticism of privileged journalists. The push for censorship and control online has been spearheaded by the elitists at the "authoritative" source companies.

I'm also interested in what you consider to be bullying social media platforms into giving them special treatment / using their power to force others to serve their interests.

When I hear this, I think of the equivalent of large beer distributors trying to push retailers to not have tap handles of independent breweries. Or big chip companies paying for shelf space and also to dictate the placement of their competitors. I'm wondering how this works in the news space.

Do you have any sources for this bullying? I like the NY Times. I have a feeling watching nearly every newspaper around them collapse or struggle has spurred them to act in creative ways to stay relevant. I'm not saying they're not throwing their weight around, but I'm just not certain what you're referring to and you seem to really really dislike the NY Times.

And personally, I think it's worked, and they've brought some great stories to light as of late.

Plus, as a New Yorker, I enjoy the New York section, and probably have some sense of pride in having such an established and premier news outlet so close.

Sangria is an excellent GraphQL library for Scala! Very sadly Oleg Ilyenko, the maintainer, passed away recently[0].

[0] https://twitter.com/etorreborre/status/1126921902386184195?s...

Understanding the response to your work seems logical, in every field.

Also hello NYT slack!

Wait, Google doesn't support Perl?

Considering they said 'managed services' you can guess that they use Cloud Functions, App Engine or something like that. With Google Cloud Run you could easily do what they wanted, but then you have to have the Docker tooling.

Of course if they were using GCE, GKE, or any other way to deploy compute they'd be fine. In this case, because they're using GAE, they have limited language support without doing lots of (to them) complex stuff.

It's the right decision to port here.

I used to work at the Times; they definitely have the Docker tooling. Most production stuff runs on GKE. The author of the article has a long-standing affinity for GAE, though, plus it's just a quick maker week project, so GAE still makes more sense.

The relevant part of the article, for those like myself who read the comments first:

> When I looked at the code, I realized I was not going to be able to reuse anything because it was written in Perl. My plan was to rely on many of Google’s managed services, which don’t support Perl.

I didn't know Google did shared hosting these days (like PHP or whatever they do support; I knew about VPSes and hosted databases).

Who'd voluntarily do so in 2019?

The NYT, surely - as they helped produce one of the best Perl profilers out there: https://metacpan.org/pod/Devel::NYTProf

I'm told that Amazon and Booking both have huge production Perl code bases so some major orgs still care about it

Good point. There's probably good money in the "host all the stuff that's old and scary" business.

My org is still writing new code in perl5. The degree to which it's unappealing is, in my opinion, greatly overstated.

Most folks who like complaining about perl5 are doing so because they've worked with some terrible old codebase written in it. Terrible code is terrible code. But well-written perl5 code can be fun to work with.

I'm not much of a Perl user, but [its user community](https://techbeacon.com/app-dev-testing/perl-not-dead-it-was-...) is surprisingly strong and enthusiastic (even if they do talk about Python the way the TF2 community talks about Overwatch).

The article's actual title is "To Keep Track of Reddit Conversations Around New York Times Articles, We Built a Slack Bot."

There's some strong editorializing in this post's title.

Thanks, we've updated the title from “NYT Tracks Reddit Conversations Around nytimes.com Content” to that of the article with a more mechanical shortening.

The actual title is too long for the HN submission. But I agree that the editorializing is poor; a better title would have been: NY Times tracks Reddit conversations around their articles using a Slack bot

There's always a shorter way to get key elements into a headline. How about:

To Track Reddit's Take on NYT Articles, We Built a Slack Bot

At that point, what's the point of the Yoda sentence split-and-reorder?

How about: {We/NYTimes} Built a Slack Bot to Track Reddit's Take on NYT Articles

I'm not privy to the technical reasoning, but the Yoda split-and-reorder is part of their brand identity, so to speak.

It's an in-joke in journalism that the NYT routinely starts headlines with prepositions (followed by "location/context, noun/subject verb noun...")

My take would be "In the NYT Slack, a Bot to Track Reddit Conversations".


It's interesting how "NYT tracks Reddit conversations" sounds so much less malevolent when you know exactly what they are doing and why, i.e., posting Slack links to public Reddit threads so the article authors can see and participate in the resulting discussion.

Even the editorialized title doesn't sound all that negative. Editorial teams reading public forums to see how their articles are being interpreted and discussed sounds pretty normal.

You can already simply click a domain on reddit and it shows you all the posts that linked it.
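For anyone who wants that domain view programmatically, here is a hedged sketch. It assumes Reddit's public `/domain/<domain>/` listing also answers with JSON when `.json` is appended, and that posts carry `title`, `subreddit`, `permalink`, and `num_comments` fields. The function names and the User-Agent string are made up for this example.

```python
import json
import urllib.request


def domain_listing_url(domain: str, sort: str = "new") -> str:
    """URL of the per-domain listing (assumed public JSON variant)."""
    return f"https://www.reddit.com/domain/{domain}/{sort}.json"


def parse_listing(listing: dict) -> list:
    """Pull the fields a notifier would need out of a listing payload."""
    posts = []
    for child in listing.get("data", {}).get("children", []):
        post = child.get("data", {})
        posts.append({
            "title": post.get("title", ""),
            "subreddit": post.get("subreddit", ""),
            "comments_url": "https://www.reddit.com" + post.get("permalink", ""),
            "num_comments": post.get("num_comments", 0),
        })
    return posts


def fetch_domain_posts(domain: str) -> list:
    """Fetch and parse the listing. Needs network access; may be rate limited."""
    request = urllib.request.Request(
        domain_listing_url(domain),
        headers={"User-Agent": "domain-listing-sketch/0.1"},  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_listing(json.load(response))


if __name__ == "__main__":
    for post in fetch_domain_posts("nytimes.com"):
        print(post["num_comments"], post["subreddit"], post["title"])
```

Parsing is kept separate from fetching so the field handling can be checked against a saved payload without hitting the network.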

...I've been on reddit over 10 years and never noticed that.
