Hacker News
Tell HN: t.co is adding a five-second delay to some domains
749 points by xslowzone on Aug 15, 2023 | 421 comments
Go to Twitter and click on a link to any URL on "NYTimes.com" or "threads.net" and you'll see a ~5 second delay before t.co forwards you to the right address.

Twitter won't ban domains they don't like but will waste your time if you visit them.

I've been tracking the NYT delay ever since it was added (8/4, roughly noon Pacific time), and the delay is so consistent it's obviously deliberate.




This is what I have come to expect from every person that calls themselves a "free speech absolutist." What they actually believe is that they should be able to say whatever they want and do whatever they want, personally, without any consequences for themselves. There is no grander principle than "my ability to do what I want and exert power over others however I want, without critique or criticism."

I really wish the term hadn't been polluted this way.


Update: hours after being exposed and publicized in the Washington Post, the behavior has stopped:

> On Tuesday afternoon, hours after this story was first published, X began reversing the throttling on some of the sites, dropping the delay times back to zero. It was unknown if all the throttled websites had normal service restored.

https://archive.is/2023.08.15-210250/https://www.washingtonp...


I still see a roughly 2 second delay on first grab. The second is immediate.


Disclaimer: I am not comparing Twitter to warlords, dictators or genocide. But this quote (from Lord of War) really encapsulates a lot of what you say:

> Yuri Orlov: [Narrating] Every faction in Africa calls themselves by these noble names - Liberation this, Patriotic that, Democratic Republic of something-or-other... I guess they can't own up to what they usually are: the Federation of Worse Oppressors Than the Last Bunch of Oppressors. Often, the most barbaric atrocities occur when both combatants proclaim themselves Freedom Fighters.


History truly is a circle


Time for news orgs to boycott Twitter, just like NPR did.


Even worse, it's had this 5-second delay for Threads for a month. https://www.threads.net/@jank0/post/CuV_5fprO3z/?igshid=NTc4...


I call myself a free speech absolutist (or advocate at least; absolutist is more of a slur). False compromises belong in the past. What X is doing isn't free speech at all: they have stated that advertisers will dictate what content will be seen, so there is no commitment to freedom of speech at all.

But at least I can hold them responsible for violating their own stated values. The former Twitter leadership just hid content that didn't fit their own or third parties' sensibilities and told me they were doing me a favor.

Restricting speech is always in the interests of those that have the power to shape discussions, so limiting speech is always counterproductive.


> The former Twitter leadership just hid content that didn't fit their own or third parties' sensibilities and told me they were doing me a favor.

The former Twitter leadership was very clear about what sort of content would be hidden. And it was based entirely on the type of content, stated ahead of time. Critiquing this sort of content policy is like saying that newspapers should not be allowed to have clear standards for what is publishable in classified ads.

All claims of "I'm being oppressed" by Twitter policies have been absolutely ridiculous, and discrediting to supposed free speech advocate/absolutist positions.

Similarly discrediting is the silence on Musk's attacks on the free web and attempts at censorship of specific dispreferred news outlets.

We all see what gets fought against and what is not fought against, and the answer is clear: the right to attack and intimidate groups with threatening behavior is defended, but actual censorship of reasonable discourse is tolerated.


> Restricting speech is always in the interests of those that have the power to shape discussions, so limiting speech is always counterproductive.

This is not true. Restricting hate speech is an obvious counterexample.


It isn't obvious at all. It doesn't help, for that matter, but that is secondary. On the contrary, it is just a popular excuse to restrict speech, because nothing about hate speech is objective. We see bad legislation around the globe, and it will never protect any minority.

It is a bad idea and damaging and there is ample empirical evidence for that.


Please, link to the ample evidence. Quality research only, natch.


I love free speech, but you're going to have to convince me with a lot of evidence that Germany restricting pro-Nazi speech after WW2 was bad.


How about this: restricting pro-Nazi speech before WW2 prevented nothing.


> advocate at least, absolutist is more of a slur

Those two are enormously different, though. I'd consider myself an advocate, just as anyone who believes in a fair and free democracy should. But I am very far from being an absolutist — and I have a secret suspicion that nobody actually is. Musk certainly isn't.


Spam is an intractable problem for any so-called free speech absolutist. One person's spam could be another person's desired message. But if a platform is overrun with spam, it becomes unusable for genuine discourse.

Maybe the biggest challenge is defining what constitutes "spam." While some cases seem clear-cut (e.g., repeated identical messages from bots, malware, phishing), others are quite subjective. Subtle marketing? Aggressive marketing? Repetitive but sincere advocacy for a cause? Repetitive but insincere trolling? Repetitive but sincere trolling?

All this seems rather obvious, so I was kind of surprised to see how many people bought into Elon's vision for Twitter; it was never workable.


Of course spam is also tractable. There might be difficulties for governments to regulate it because of the legal context, but the solution is to hand the decision of filtering it to the user, as long as that user doesn't decide for others.


A few are, and understand that most of the time you are defending scoundrels. But there is a sizeable, and probably larger, group that very easily wants to suppress speech they do not like. There has never been a case where too much freedom of speech was a significant problem, contrary to the other way around.

Next is misinformation, and tomorrow you wonder why you cannot state your opinion anymore. A cycle that has been repeated ad nauseam. It just isn't a smart solution and causes more problems than it solves.


Do you get mad at Google for automatically detecting and removing spam from your email inbox? For a lot of people, probably the majority, speech by scoundrels falls somewhere in that realm... there is simply no debate to be had about the basic humanity of certain classes of people. Capitalistic companies respond to this demand.

That said, I agree the government probably shouldn't be involved here for the most part (slippery slope, government is a blunt tool, etc.), as long as your "speech" isn't actually harming someone (harassment, revenge porn, incitement, etc.).

As long as we're defending scoundrels it's worth remembering we already lack so many protections for non-scoundrels. In a lot of states you can be fired if your boss hears a whiff of collective bargaining. But I digress.


We are not talking about spam, if that wasn't a rhetorical question. Advertisers not wanting any controversy attached to their product placement is no solution and isn't desirable. This isn't done in the name of users.

That some countries have limited worker protections is a different problem, but it is certainly not caused by too much speech; quite the contrary, restricting speech would worsen the situation further. Civil liberties never suffered because too much speech was allowed, so the perspective to err on the side of freedom is only logical.

> there is simply no debate to be had about the basic humanity of certain classes of people

That is just an invalid generalization.


>We are not talking about spam

Why not?


When a company provides a coherent speech product, its editorial decisions are made according to how they will affect the goal of user growth. The obvious result of a “free speech absolutist” social media coupled with the rules of network effects is one enormous, undifferentiated social network.

It probably goes without saying that this would be an extremely unpleasant place, but there would be nowhere else to go once the last platform won.

What we have today is a number of smaller social networks, each with a different strategy to shape the conversation. It may very well be true that the creators of a platform choose editorial methods and goals that resonate with them personally, but what’s important to the dynamic of the platforms and free speech is that, until we are all on that one terrible platform, the methods used to moderate your speech are nothing more than a company’s efforts to differentiate its product from others.

Restricting speech is in the interest of product differentiation. This, of course, is in the interest of the owner of the product, but it is always also in the interest of the consumer who wants a rich speech market to choose from, and who loathes the idea of a global 4chan style megasite to the exclusion of all other social media. This is why failure to limit speech in the context of a coherent speech product is always counterproductive.


Worth pointing out that t.co has always engaged in an annoying and seemingly unjustified practice I named the "nonsemantic redirect". Rather than legitimately redirecting with an HTTP Location header, it serves an HTML page with a META refresh tag on it.

You don't see this with curl/wget because t.co uses user-agent sniffing. If it doesn't think you're a browser, it _will_ give you a Location header. To see it, capture a request in Firefox developer tools, right-click on the request, and copy as cURL. (You may need to remove the Accept-Encoding header and add -i to see the headers.)
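As a concrete illustration, the redirect target can be pulled back out of such a nonsemantic redirect page with standard tools. This is a sketch: the HTML shape mirrors the t.co body captured elsewhere in this thread, but the URL here is a placeholder, not a real t.co response:

```shell
# A meta-refresh body of the shape t.co serves to browser user agents
# (placeholder URL, for illustration only).
body='<head><noscript><META http-equiv="refresh" content="0;URL=https://example.com/target"></noscript><title>https://example.com/target</title></head>'

# Extract the redirect target from the META refresh attribute.
target=$(printf '%s' "$body" | sed -n 's/.*content="0;URL=\([^"]*\)".*/\1/p')
echo "$target"   # https://example.com/target
```

A server-side Location header needs none of this parsing; the client just follows it.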


Could you explain what the intended/expected outcome is for this? What is accomplished by doing that?


The purpose is so that Twitter is seen as the source of the traffic. A lot of Twitter-sourced traffic comes from native apps, so when people click links from tweets, they usually don’t send referrer information.

If the redirects were server side (setting the Location header), a blank referrer would remain blank. Client-side redirects will set the referrer value.

From Twitter’s POV, there’s value in more fully conveying how much traffic they send to sites, even if it minorly inconveniences users.


How does this inconvenience users? It sounds like you’re saying site owners will be able to distinguish between users with a blank referer and users whose “referer” was the desktop app. Ignoring the privacy angle, isn’t that a good thing?


Other than the privacy angle, a meta redirect is always a bit slower than a Location header. You need to send an HTML page (more bytes) that the browser needs to render and then act on (more work).

A location header is nearly unnoticeable, a meta refresh page gives you a flash of a blank interstitial screen.

(Not that I share the same annoyance; just explaining the difference between the two approaches for the end user.)


With the amount of bloat we have on the modern web, I think sending an HTML meta tag rather than a Location header should be the least of our concerns, when it comes to performance.

If the whole purpose of it is to have browsers send a Referer header, I don't think it's that bad. Even from a privacy perspective, you can configure browsers to not send that header anyway.


Crawlers and tools will get the right location http header but browsers and users will get the delay.


Cookies?


No, in fact now t.co even returns an empty body with its 301 response:

  % curl -vgsSw'< HTTP/size %{size_download}\n' https://t.co/DzIiCFp7Ti 2>&1 | grep '^< \(HTTP/\)\|\(location: \)'
  < HTTP/2 301 
  < location: https://www.threads.net/@chaco_mmm_room
  < HTTP/size 0


You didn't read the second paragraph of the comment you replied to, which explained this exact issue before you replied "no":

> You don't see this with curl/wget because they use user agent sniffing. If they don't think you're a browser they _will_ give you a Location header. To see it, capture a request in Firefox developer tools, right click on the request, copy as CURL.


Indeed, sorry everyone


Firefox:

    <head><noscript><META http-equiv="refresh" content="0;URL=https://www.threads.net/@chaco_mmm_room"></noscript><title>https://www.threads.net/@chaco_mmm_room</title></head><script>window.opener = null; location.replace("https:\/\/www.threads.net\/@chaco_mmm_room")</script>


Turns out it depends on the User-Agent: https://news.ycombinator.com/item?id=37139425


I can confirm. NYT shows a five-second redirect delay: "wget https://t.co/4fs609qwWt". It redirects to gov.uk immediately: "wget https://t.co/iigzas6QBx"


Oddly enough the delay is reduced to 1 second by using curl's useragent string (wget --user-agent='curl/8.2.1' https://t.co/4fs609qwWt)


Seeing this makes me wonder if it's some sort of server-side header bidding ad server gone haywire, rather than something nefarious. Why would they only delay browser agents otherwise?


Browsers are generally tolerant of long TTFB. Automation, on the other hand, is sometimes quite brittle.


Perhaps their own internal tooling also relies on the t.co redirector?


Probably a phishing/malware scan gone wrong then. NYTimes has Twitterbot in its robots.txt which might be related?

Even if it's deliberate, I don't see how people can complain. Google has outright blocked Breitbart for years. They prevent results from that domain from appearing at all unless you specifically force it with site: and apparently HN does the same. Politically motivated censorship and restricting "reach" is just how Silicon Valley rolls. Pre-Musk Twitter did freeze the New York Post's account and many other much worse things. It'd be a shame for Musk to be doing this deliberately, even though it seems unlikely. But that's the problem with creating a culture where that sort of behavior is tolerated, isn't it? One day it might be turned around on you.


Does the value Breitbart adds to the internet outweigh the negatives of turning people into dangerous fascists by weaponizing misinformation? No.

Does the value added by sources like the NYT outweigh the negatives of being occasionally biased or outright wrong? Yes.


There is no objective, public, or shared "value" at play here.

The only "values" that matter are the personal whims of whoever happens to own Twitter, or Google or Facebook.


Just because this is a very difficult question doesn't mean we can throw our hands up and pretend it doesn't exist. Many things in life are very difficult and yet worth solving anyway.


I didn't say the concept cannot exist: I said it's not at play here.

What gets a website censored, in the modern corporation-dominated Internet, is going against the interests and preferences of Big Tech owners - and nothing else. Nobody with any power is bound to look out for the public interest, however defined; ICANN is perhaps the only exception that comes to mind.

We can waste our time and attention debating over which targets were more or less deserving of censorship, based on our personal ideas of public interest. But as long as Big Tech is allowed to exist in its current form, we're like powerless peasants arguing about the decisions of kings.


“And nothing else”

That’s not true and you know it. Don’t ignore facts man.


You’re not making any sense, you’re just trying to sound contrarian.


Does the value the NYT adds to the internet outweigh the negatives of turning people into dangerous communists by weaponizing misinformation? No. Does the value added by sources like the Breitbart outweigh the negatives of being occasionally biased or outright wrong? Yes.


I can tell you all the ways in which you're wrong, but something tells me you won't trust anything anyone says unless it confirms your biases.


NYT has arguably done far more damage to the world (Iraq, etc.) than Breitbart has.


> NYT has arguably done far more damage to the world (Iraq, etc.) than Breitbart has.

NYT may have more reach and definitely isn't neutral, but it's a far cry from the nonsense that Breitbart, which is nakedly partisan, publishes.


Could this be explained by the UA derived redirect behaviour described in this other comment on the thread? https://news.ycombinator.com/item?id=37130982


Agree/confirmed - just recorded a number of different NYTimes URLs that pass through t.co, all 4.7s+. Various CNBC and Google articles through t.co were ~130-200ms response time from t.co specifically (not total redirect->page load).


I almost didn't believe OP, because it's so comically inept and petty. But, I can also confirm in some private testing there is a deliberate delay.


Considering how common it is to deliberately break websites for non-logged-in mobile users (not just with a "you log in now" modal, but by not loading content, spinning forever, breaking layouts, etc), that's exactly how petty I imagine them to be. Twitter and Reddit do it, and Imgur comes and goes, so I can't decide if for them it's deliberate or just incompetence.


You can add Instagram and YouTube to the list of websites that are painful to use from a mobile browser


It's no doubt intentional if the site has an app. They can scoop up significantly more data about you from their app.


"because it's so comically inept and petty"

This is precisely why I did believe OP. This is Elon Musk we're talking about.


I'm not getting the same time delay with curl:

- `time wget https://t.co/4fs609qwWt` -> `0m5.389s`

- `time curl -L https://t.co/4fs609qwWt` -> `0m1.158s`


And now add browser user-agent to the curl request and watch how slow it gets.

- `time curl -A "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/81.0" -L https://t.co/4fs609qwW` -> 4.730 total

- `time curl -L https://t.co/4fs609qwWt` -> 1.313 total

Same request, the only difference is user-agent.


your URLs are different.


Only because I copied the first one incorrectly to put it here. I hadn't selected the full command, so there is a missing "t" at the end of the first link.


Is there some cache going on? On my first attempt, there is a 5 second delay. When I try it second time immediately it works without the 5 second delay. But if I try again after an hour, 5 second delay again!


Safari seems to be caching it for me, but I can reproduce the delay every time with curl - so long as the user agent doesn't include the string "curl".


I tested some substack.com links and there's a delay on those too.


Could it just be rotting infrastructure? I.e. there is some logic on most-visited domains to allow ease of moderation; that logic is read-heavy and is now buckling under skew.

Or even like some junior dev removed an index


A few years ago I remember their URL shortener on android app directing somewhere that my hostfile adblocker would catch (like an analytics domain or something). This made it so first click on certain twitter links would fail, but if I clicked it again it would go successfully. Ultimately I never researched it deeply enough but my guess is they had some sort of handler that would log whether loading their analytics service failed and serve up the direct link on the second attempt.


Seeing that right now with NYT links. First click takes a few seconds to redirect, second click is almost instant


I've experienced similar behavior on iOS, not sure if it still does that though.


It’s about 4.5 seconds for me

https://imgur.com/a/qege0O9


4521ms according to curl

  curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/117.0" -I "https://t.co/4fs609qwWt"
  x-response-time: 4521


Do any DNS resolver libraries have a 4.5 second timeout? Maybe their infrastructure is just rotting.


It'd be weird that it's limited to specific outbound domains in that case.


domains load balanced across different servers in some overly complicated distributed topology with only one of them busted?

although seems unlikely it just happens to be the NYT.


I don't think it's a problem with rotting infrastructure. Using curl those requests are quick, unless you pass user-agent from browser. 4.7s with firefox user-agent and 1.3s without it. Using twitter link to the same NYT article.


glibc defaults to 5 sec, but the server wouldn't need to resolve the redirect domain -- that'd be the client's job. Unless it's doing so for some proprietary reason, of course.

Or did you mean failing to resolve some internal service's hostname?
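For reference, glibc's per-query resolver timeout does default to 5 seconds (RES_TIMEOUT), and it is tunable via /etc/resolv.conf. A sketch, with a placeholder nameserver address:

```
# /etc/resolv.conf (sketch; nameserver address is a placeholder)
nameserver 192.0.2.53
# timeout: seconds to wait per query (glibc default 5)
# attempts: queries to send before giving up (glibc default 2)
options timeout:2 attempts:2
```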


> Or did you mean failing to resolve some internal service's hostname?

Yeah, something more like that where the internal service is somehow 'sharded' due to some overly complicated distributed database nonsense, and there's a DNS lookup that is failing. Of course that'd mean the DNS lookup wasn't cached, so you're taking that normal latency on every single hit, which would be terrible architecture. The curl-vs-wget performance isn't explained by that though (although that's a bit weird in and of itself, and might suggest that they had to allow that for some internal tool that they didn't want to punish).

> glibc defaults to 5 sec,

The timeout being close to 5 seconds is what made me wonder about it. It's just off though.


Now do additional testing by setting the HTTP referer to t.co or twitter.com. Is it Twitter, or is it NYTimes doing this?


You can do that if you want; I don't take orders.


I don't see it:

  % curl -gsSIw'foo %{time_total}\n' -- https://t.co/4fs609qwWt https://t.co/iigzas6QBx | grep '^\(HTTP/\)\|\(location: \)\|\(foo \)'
  HTTP/2 301 
  location: https://nyti.ms/453cLzc
  foo 0.119295
  HTTP/2 301 
  location: https://www.gov.uk/government/news/uk-acknowledges-acts-of-genocide-committed-by-daesh-against-yazidis
  foo 0.037376


I think Twitter, err, X, just turned off the delay now that it's getting big media attention. I could reproduce it over and over again a little earlier, but now I can't anymore: https://news.ycombinator.com/item?id=37138161

[Edit:] I'm still seeing it with threads.net:

  curl -v -A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15' https://t.co/DzIiCFp7Ti


I don't see it with your URL either:

  % curl -gsSIw'foo %{time_total}\n' https://t.co/DzIiCFp7Ti | grep '^\(HTTP/\)\|\(location: \)\|\(foo \)'
  HTTP/2 301 
  location: https://www.threads.net/@chaco_mmm_room
  foo 0.123137
Doesn't matter if I do an HTTP/2 HEAD or GET:

  % curl -gsSw'%{time_total}\n' https://t.co/DzIiCFp7Ti 
  0.121503
HTTP/1.1 also shows no delay:

  % curl -gsSw'%{time_total}\n' --http1.1 https://t.co/DzIiCFp7Ti
  0.120044
I chalk this up to rot at X/twitter that is being fixed now that it was noticed.


> I don't see it with your URL either

That's because you're not spoofing the User-Agent to be a browser rather than curl.


Oh, that's it, thanks! In fact it returns a 200, not a 301, then:

  % curl -gsSw'%{time_total}\n' -A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15' https://t.co/DzIiCFp7Ti
  <head><noscript><META http-equiv="refresh" content="0;URL=https://www.threads.net/@chaco_mmm_room"></noscript><title>https://www.threads.net/@chaco_mmm_room</title></head><script>window.opener = null; location.replace("https:\/\/www.threads.net\/@chaco_mmm_room")</script>4.690000
  % curl -gsSIw'%{time_total}\n' -A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15' https://t.co/DzIiCFp7Ti
  HTTP/2 200 
  ...
  content-length: 272
  ...
  x-response-time: 4524
  ...
  
  4.660211
The delay is not there for nyti.ms (anymore), but once you use the Safari UA it's handled as a 200 response:

  % curl -gsSIw'foo %{time_total}\n' -A 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.6 Safari/605.1.15' https://t.co/4fs609qwWt https://t.co/iigzas6QBx | grep '^\(HTTP/\)\|\(location: \)\|\(foo \)'
  HTTP/2 200 
  foo 0.126043
  HTTP/2 200 
  foo 0.037255
It really does seem that twitter is adding a 4.5s delay to some sites from web browsers. Could be malicious, could be rot...


The specific logic with user agents is that it happened (I think they've ended it now?) whenever the word "curl" was not in your user agent string. If the substring "curl" was contained anywhere in your user agent string, it did not have a delay. I cannot imagine how it could rot in that specific way non-maliciously.
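Purely as a sketch of the suspected server-side rule (this is speculation from observed behavior, not recovered code), the check would be as trivial as a substring match on the User-Agent:

```shell
# Hypothetical reconstruction of the observed rule: delay every
# request unless "curl" appears anywhere in the User-Agent string.
throttle_decision() {
  case "$1" in
    *curl*) echo "pass" ;;   # any UA containing "curl" skips the throttle
    *)      echo "delay" ;;  # browsers, wget, etc. get the ~4.5s delay
  esac
}

throttle_decision "curl/8.2.1"                     # pass
throttle_decision "Wget/1.21"                      # delay
throttle_decision "Mozilla/5.0 ... Firefox/117.0"  # delay
```

This would be consistent with the wget and browser-UA timings reported elsewhere in the thread, and it is hard to see how infrastructure could rot into exactly that shape by accident.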


They both load instantly for me.


It's already been reverted.


The solution to X (Twitter) sucking is to stop using it. It will either get fixed, or go out of business and be replaced.

It seems we've become a society that rewards bad practices with attention, which is all any company on the web is trying to get: your attention.


> It seems we've become a society that rewards bad practices with attention

I have a very different way of looking at this. It's not us that gives attention. It is them that take it via exploiting our evolved inflexible cognitive systems for attention/reward/desire/anger/lust. We are moths to a flame. The moth's free will isn't to blame for its inability to avoid it. Our cognitive systems are fixed, we can't just turn them off. If a sufficiently powerful dopamine-inducing technology is made, you can't just "opt out". It is not as simple as that. Any individual variation in the ability to opt out likely comes down to variation in genetics or other extraneous factors not inside one's immediate control.

This is where regulation needs to come in. Once you accept the reality that opting out is a comforting yet false illusion, you can then do something about it.


Tim Wu makes a similar point in his book The Attention Merchants. Humans are interested in things and throughout time various people and media (which tends to be controlled by a small number of people) have been working to capture our attention. It is very hard to totally opt out of something that is so pervasive, like fish trying to ignore water.


Mastodon has been a breath of fresh air and you can get a really interesting feed going when you follow the right people and hashtags



> And it can handle Twitter scale now

*in hosting costs.


Which people and hashtags? Trying to check it out but struggling to find relevant content. Is there a tech community somewhere? The ones I found appeared to be dead.


Fosstodon is a good place to start maybe? Plenty of tech on Mastodon, hopefully other interests will follow.


I think that HN itself also shadow flags submissions from a list of domains it doesn't like.

Try submitting a URL from the following domains, and it will be automatically flagged (but you can't see it's flagged unless you log out):

  - archive.is
  - watcher.guru
  - stacker.news
  - zerohedge.com
  - freebeacon.com
  - thefederalist.com
  - breitbart.com


Well, yes, many sites are banned on HN. Others are penalized (see e.g. https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...). None of this is secret, though we don't publish the lists themselves.

Edit: about 67k sites are banned on HN. Here's a random selection of 10 of them:

  vodlockertv.com
  biggboss.org
  infoocode.com
  newyorkpersonalinjuryattorneyblog.com
  moringajuice.wordpress.com
  surrogacymumbai.com
  maximizedlivingdrlabrecque.com
  radio.com
  gossipcare.com
  tecteem.com


It is a secret if the system does not inform the poster it's been penalized.


HN operates, for a number of reasons, on numerous dynamics of friction and nudges. Mostly for the better. I've had my disagreements about things in the past, though as I watch the site and have studied it (particularly over the past few months, see: <https://news.ycombinator.com/item?id=36843900>) I mostly agree with it.

The parts that don't work especially well, most particularly discussion of difficult-but-important topics (in my view) ... have also been acknowledged by its creator pg (Paul Graham) and mods (publicly, dang, though there are a few others).

In general: if you submit a story and it doesn't go well, drop a note to the moderators: hn@ycombinator.com. They typically reply within a few hours; it can take a day or two if things are busy or the issue is complex.

You can verify that a submission did or didn't go through by checking on the link from an unauthenticated (logged-out) session.


> if you submit a story and it doesn't go well, drop a note to the moderators

> You can verify that a submission did or didn't go through by checking on the link from an unauthenticated (logged-out) session.

Trustful users do not think to do this, and it would not be necessary if the system did not keep the mod action secret.


Trustful souls may not.

Those who have been advised to do so, through the Guidelines, FAQ, comments, or moderator notes, do, to their advantage.

(I'd had a submission shadowbanned as it came from the notoriously flameworthy site LinkedIn a month or few back. I noticed this, emailed the mods, and got that post un-banned. Just to note that the process is in place, and does work.)


You don't see the harm of elbowing out trustful people from the public square?


What we do is try to educate them and loop them back in.

I've done this on multiple occasions, e.g.: <https://news.ycombinator.com/item?id=36191005>

As I commented above, HN operates through indirect and oblique means. Ultimately it is a social site managed through culture. And the way that this culture is expressed and communicated is largely through various communications --- the site FAQ and guidelines, and dang's very, very, very many moderation comments. Searching for his comments with "please" is a good way to find those, though you can simply browse his comment history:

- "please" by dang: <https://hn.algolia.com/?dateRange=pastYear&page=0&prefix=tru...>

- dang's comment history: <https://news.ycombinator.com/threads?id=dang>

Yes, it means that people's feelings get hurt. I started off here (a dozen years ago) feeling somewhat the outsider. I've come to understand and appreciate the site. It's maintained both operation and quality for some sixteen years, which is an amazing run. If you go back through history, say, a decade ago, quality and topicality of both posts and discussions are remarkably stable: <https://news.ycombinator.com/front?day=2013-08-14>.

If you do have further concerns, raise them with dang via email: <hn@ycombinator.com> He does respond, he's quite patient, might take a day or two for a more complex issue, but it will happen.

And yes, it's slow, inefficient, and lossy. But, again as the site's history shows, it mostly just works, and changing that would be a glaring case of Chesterton's Fence: <https://hn.algolia.com/?q=chesterton%27s+fence>.


> What we do is try to educate them and loop them back in.

But that's selective education. You don't do it for every shadow moderated comment. The trend is still that shadow moderation more often disadvantages trustful users. Will you acknowledge that harm?

Over 50% of Reddit users have a removed comment in their recent history that they likely were not told about. When shadow moderation is in play, abuse runs rampant among both mods and users. Both find more and more reasons to distrust each other.


What alternative(s) do you propose?

How do you think spammers and abusers will exploit those options?

Again: HN works in general, and the historical record strongly confirms this, especially as compared with alternative platforms, Reddit included, which seems to be suffering its own failure modes presently.


> What alternative(s) do you propose?

A forum should not do things that elbow out trustful people.

That means, don't lie to authors about their actioned content. Forums should show authors the same view that moderators get. If a post has been removed, de-amplified, or otherwise altered in the view for other users, then the forum should indicate that to the post's author.
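That distinction can be sketched in a few lines of hypothetical code (not any real forum's implementation) contrasting the two render paths:

```python
from dataclasses import dataclass

# Hypothetical sketch contrasting shadow moderation with
# transparent-to-the-author moderation.
@dataclass
class Post:
    author: str
    body: str
    removed: bool = False

def shadow_view(post: Post, viewer: str):
    """Shadow moderation: the author keeps seeing their removed post
    as if nothing happened; everyone else sees nothing."""
    if post.removed and viewer != post.author:
        return None
    return post.body

def transparent_view(post: Post, viewer: str):
    """Transparent moderation: the author sees the same state a mod
    sees -- the post is marked removed, not silently left in place."""
    if post.removed:
        return "[removed]" if viewer == post.author else None
    return post.body
```

Under the shadow path the author's view and everyone else's view silently diverge; under the transparent path the author at least learns that an action was taken.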

> How do you think spammers and abusers will exploit those options?

Spammers already get around and exploit all of Reddit's secretive measures. Mods regularly post to r/ModSupport about how users have circumvented bans. Now they're asking forums to require ID [1].

Once shadow moderation exists on a forum, spammers can then create their own popular groups that remove truthful content.

Forums that implement shadow moderation are not belling cats. They sharpen cats' claws.

[1] https://twitter.com/rhaksw/status/1689887293002379264


Your first three points are blind assertions without supporting justification or basis. All have been 1) identified as known issues with spammers (e.g., HN used to publish its block list but no longer does, based on observed abuse, which mirrors experience at many other sites), and 2) given workarounds. You don't accept either the fact of the first or the utility of the second; however, you're on parlous ground in doing so.

The fact that some spammers overcome some countermeasures in no way demonstrates that:

- All spammers overcome all countermeasures.

- That spam wouldn't be far worse without those countermeasures.[1]

- That removing such blocks and practices would improve overall site quality.

I've long experience online (going on 40 years), I've designed content moderation systems, served in ops roles on multi-million-member social networks, and done analysis of several extant networks (Google+, Ello, and Hacker News, amongst them), as well as observed what happens, and does and doesn't work, across many others.

Your quest may be well-intentioned, but it's exceedingly poorly conceived.

________________________________

Notes:

1. This is the eternal conflict of preventive measures and demonstrating efficacy. Proving that adverse circumstances would have occurred in the absence of prophylactic action is of necessity proving a counterfactual. Absent some testing regime (and even then) there's little evidence to provide. The fire that didn't happen, the deaths that didn't occur, the thefts that weren't realised, etc. HN could publish information on total submissions and automated rejections. There's the inherent problem as well of classifying submitters. Even long-lived accounts get banned (search: <https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...>). Content moderation isn't a comic-book superhero saga where the orientation of the good guys and bad guys is obvious. (Great comment on this: <https://news.ycombinator.com/item?id=26619006>).

Real life is complicated. People are shades of grey, not black or white. They change over time: "Die a hero or live long enough to become a villain." Credentials get co-opted. And for most accounts, courtesy of long-tail distributions, data are exceedingly thin: about half of all HN front-page stories come from accounts with only one submission in the Front Page archive, based on my own analysis of same. They may have a broader submission history, yes, but the same distribution applies there, where many, and almost always most, submissions come from people with painfully thin history on which to judge them. And that's assuming that the tools for doing said judging are developed.


> Your first three points are blind assertions without supporting justification or basis.

You asked me for an alternative and I gave one.

You yourself have expressed concern over HN silently re-weighting topics [1].

You don't see transparent moderation as a solution to that?

> The fact that some spammers overcome some countermeasures in no way demonstrates that...

Once a spammer knows the system he can create infinite amounts of content. When a forum keeps mod actions secret, that benefits a handful of people.

We already established that secrecy elbows out trustful people, right? Or, do you dispute that? I've answered many of your questions. Please answer this one of mine.

> That removing such blocks and practices would improve overall site quality.

To clarify my own shade of grey, I do not support shadow moderation. I support transparent-to-the-author content moderation. I also support the legal right for forums to implement shadow moderation.

[1] https://news.ycombinator.com/item?id=36435312


What is it if the information is freely available, to anyone asking, for a single domain they are trying to post at that time?

It’s not secret, because they’ll be provided an answer if they email the mod team.

It’s not free as in open source, because it isn’t available for anyone to download and study in full.

So, since it’s not secret, is it public, or private? Since it’s not published in full but any query of LIMIT 1 is answered, is that open, closed, or other?

Restrictions to publication don’t necessarily equate to secrecy, but the best I’ve got is “available upon request”, which isn’t quite right either. Suggestions welcome.


Content moderation systems often hide mod actions from the content author [1]. That's a secret.

The opposite would be to show the author of the content some indicator that it's been removed, and I would call that transparent or disclosed moderation.

Interestingly, your comment first appeared to me as "* * *" with no author [2]. I wonder if that is some kind of ban.

[1] https://www.youtube.com/watch?v=8e6BIkKBZpg

[2] https://i.imgur.com/oGnXc6W.png

edit I know you commented again but it's got that "* * *" thing again:

https://news.ycombinator.com/item?id=37130675

https://archive.is/Eov7z


It's not a ban. It appears when the user has 'delay' in their profile set to N minutes and N minutes haven't elapsed yet. We should probably make this more explicit.

Re the 'delay' setting see https://news.ycombinator.com/newsfaq.html.


Ah, that makes sense. Thanks!


There’s a protection system in place that can result in that; I don’t have the details at hand (since I’m not associated with HN/YC) but I remember seeing it once before on a highly contentious post, and an email to the mods helped explain/correct whatever was up.


> Suggestions welcome

"This domain is not allowed on HN" as an error message upon submission.


That’s not going to work, because now that’s an API for spammers to bulk process against a domain list. The only available API must be human communication to the mod team, or the spammers will overcome it with automation.


Spammers can already do that API call and see if the domain shows up. This only puts human users at the same level of consideration as spammer automation.


Seriously dedicated spammers can, yes! But antispam is about reducing the noise threshold, and eliminating low-effort spam opportunities that can be done to a single HTTP endpoint with a bash script is a big win. Simply having to access two pages is already too much to bother with for the vast majority.


That's trivial to figure out.

It's quite possible the reason the list isn't public is because it would give away information about what thought is allowed and what thought isn't.


> That’s trivial to figure out.

Elaborate.


That encourages switching to another domain for spammy submissions.


There's a lot of user hostile moderation practices that occur on this site, manual and automatic. They're not often, or really at all, discussed. Some of them don't work well, and haven't for as long as they've existed.


I don't want us to be user hostile. Can you link to some examples?


I would consider any moderation action that isn't visible to users to be user hostile.

If you're going to censor someone, you owe it to them to be honest about what you're doing to them.


You possibly haven't experienced how devious and determined and dishonest and unpleasant some bad actors are, including SPAMmers.

(Even when doing the RightThing(TM) would probably be easier...)

And, BTW, I occasionally get blocked by the mechanisms here, even though not doing anything bad, but understand that there is a trade-off.


I am fully aware of the issue.

That's one of the costs with having a public website.


The HN moderation policies are clearly effective, because the site is mostly full of useful information that attracts a wide audience of readers.

I really like this take on moderation:

"The essential truth of every social network is that the product is content moderation, and everyone hates the people who decide how content moderation works. Content moderation is what Twitter makes — it is the thing that defines the user experience."

From Nilay Patel in https://www.theverge.com/2022/10/28/23428132/elon-musk-twitt...


We may not all be fans of Musk at the moment, but one of his observations about PayPal was that its job was not especially about payments, because that bit was easy; it was about preventing fraud. And as the ex-director of a small payments system (e-money issuer), I agree. The bit which everyone outside the system doesn't realise is the hard bit is dealing with all the bad actors.


And that cost is so high that over the ~25Y+* that I have been running my own sites I have not had UGC on any of them, other than a very brief experiment, which showed me what utter relentless turds the bad actors can be.

Operators of public sites should NOT have to pay that tax. So you at best are not fully aware of the actual cost, IMHO.

Congrats to HN for striking a reasonable pragmatic balance.

*I had some of the first live (non-academic) Internet connectivity in the UK, and the very very first packets were hacking attempts...


I agree with you both. The only thing I'd add is that it's a tradeoff - if we do it this way, it's only because the alternative would be even more user-hostile.


Does HN ever show a user that their comment was submitted, but the comment is not visible to anyone else? Or not visible to most people, without having the flagged tag?


Indirectly: <https://news.ycombinator.com/item?id=37137757>

I suppose a sufficiently motivated spammer might incorporate that as a submission workflow check.


>If you're going to censor someone

unless HN is suddenly the government, what you've mislabeled is moderation, not censorship. Calling it censorship just exaggerates your opinion and makes you look unhinged. It's a private website, not national news.


Censorship is not limited to who does it.


Is deleting spam censorship?


So please explain the difference between censorship and moderation.


I like Scott Alexander's definition[1]. Quoting directly:

> Moderation is the normal business activity of ensuring that your customers like using your product. If a customer doesn’t want to receive harassing messages, or to be exposed to disinformation, then a business can provide them the service of a harassment-and-disinformation-free platform.

> Censorship is the abnormal activity of ensuring that people in power approve of the information on your platform, regardless of what your customers want. If the sender wants to send a message and the receiver wants to receive it, but some third party bans the exchange of information, that’s censorship.

Censorship is somewhat subjective: something that you might find offensive and want moderated might not be considered so by others. Therefore, Alexander argues that the simplest mechanism that turns censorship into moderation is a switch that, when enabled, lets you see the banned content, which is exactly what HN does. He also argues that there are kinds of censorship that aren't necessarily bad: by this definition, disallowing pedophiles from sharing child porn with each other is censorship, but it's something that we should still do.

[1] https://astralcodexten.substack.com/p/moderation-is-differen...


In my head, censorship is the removal of an idea that is offensive to a particular ideology but isn’t objectively harmful.

Moderation is the removal of content that objectively doesn’t belong in context, eg spam

Obviously that moderation definition is nuanced bc some could argue that Marxist ideas don’t belong in the context of a site with a foundation in startups. And indeed Marxist ideas often get flagged here


public versus privately owned forum.


Shadow banning is one of the most effective ways to fight spam and harassment. Not being "honest" with spammers and harassers can often be a good thing.


If it's visible, it can be worked around.

Blame the trolls that prevent us from having nice things.


I've twice had some "user hostile moderation practice" used against me on HN. Both times an email to the right person cleared it up - and one of those times in fact I had crossed a boundary that I shouldn't have crossed. Any long-time community member here knows what to do.


Understandable, but I think there should be some discriminating system for another class of sites, the "you can submit but not discuss" ones.

For example, a recent submission (of mine):

"Luis Buñuel: The Master of Film Surrealism"

it had no discussion space because (I guess) it comes from fairobserver.com. Now, I understand that fairobserver.com may have been a hive of dubious publishing historically, but it makes little sense that we cannot discuss Buñuel...

Maybe a rough discriminator (function approximator, Bayesian, etc.) could try to decide (based at least on the title) whether a submission from a "weak editorial board" site seems to be material worth allowing posts on or not.


Oh I agree - https://news.ycombinator.com/item?id=36924205 was a fine submission. Can you please email hn@ycombinator.com so I can send you a repost invite for it?

That domain is a borderline case. Sometimes the leopard really changes its spots, i.e. a site goes from offtopic or spam to one that at least occasionally produces good-for-HN articles. In such cases we simply unban it. Other times, the general content is still so bad for HN that we have to rely on users to vouch for the occasional good submission, or to email us and get us to restore it. I can't quite tell where fairobserver.com is on this spectrum because the most recent submission (yours) is good, the previous one (from 7 months ago) is borderline, and before that it was definitely not good. But I've unbanned it now and moved it into the downweighted category, i.e. one notch less penalized.


I have just spent a little time checking the Fair Observer, most recent articles.

I would say that it contains chiefly a political part and a cultural part. Some of the pieces in the political part can be apparently well done, informative and interesting, while some others are determined in just blurting out partisan views - arguments not included.

Incidentally: such "polarized literature" seems abundant in today's "globalized" world (where, owing to "strong differences", the sieve of acceptability can have very large gaps). It is also occasionally found here in posts on HN (one of the latest instances just a few browsed pages ago): the occasional post that just states "A is B" with no justification, no foundation for the statement, without realizing that were we interested in personal opinions there are ten billion sources available. And if we had to check them, unranked in filing, an image like Borges' La Biblioteca de Babel could appear: any opinion could be found in some point of the library.

Yes, I have (now) noticed a few contributors (some very prolific) in the Fair Observer are substantially propaganda writers.

But the cultural part, https://www.fairobserver.com/category/culture/ , seems to more consistently contain quality material, with some articles potentially especially interesting. In this area, I have probably seen more bias on some mainstream news outlets.

I think the revolution under way in journalism today includes this magazine: the model of The Economist, of having a strong, prestigious, and selective editorial board (hence its traditional anonymity of contributors), is now the exception, so today you do not read the Magazine but the Journalist. The Magazine will often publish articles from just about anyone; the Reader now has the burden of selecting Journalists and following them.

--

I will write you in a few hours for the repost, thank you.


Incidentally: I just saw a new piece, on a publication that at least in its UK version (which the USA version partially embeds) offers, in spite of frequent expression of biased views, a few remarkable articles. (Some say, "at least they are on average excellently written".)

It was an article about Eileen O’Shaughnessy - George Orwell's wife (I suppose this could raise interest, possibly also yours).

I have seen in that text unneeded references to Orwell's most private matters - as if spying in Mr. Blair's rooms.

And this should tell us how hints ("Well, it was published there"), while valuable for at least some tentative initial ranking, are unfortunately not useful for reliable discrimination.


pg posted an early version of the list back in March 2009 when it include only 2096 sites:

<https://news.ycombinator.com/item?id=498910>

That grew fairly rapidly, it was at 38,719 by 30 Dec 2012:

<https://news.ycombinator.com/item?id=4984095> (a random 50 are listed).

I suspect that overwhelmingly the list continues to reflect the characteristics of its early incarnations.


hosts file aggregator last updated: August 17 2023 : https://github.com/StevenBlack/hosts

  current 'unique porn domains' = 53,644

  current adware, malware, tracking, etc. = 210,425 unique domains


Do you have get-out-of-jail or N-strikes-and-you're-out policies? What if someone's legitimate website gets caught in this? I've also long wondered about user specific shadow bans. Can you please shed light on this?


There's no automatic unban. That would require writing code that knows how to tell a good (for HN) site apart from a bad one, and if we could write such code, we wouldn't need to keep a list of banned sites in the first place. However, we're always happy to unban a site when we notice that it's actually fine for HN, or when someone points this out to us.

Re shadowbanning (i.e. banning a user without telling them), see the past explanations at https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... and let me know if you still have questions. The short version is that when an account has an established history, we tell them we're banning them and why. We only shadowban when it's a spammer or a new account that we have reason to guess is a serial abuser.


You forgot to mention that you are also shadowbanning the ability of users to upvote or downvote things when you dislike their upvotes or downvotes—instances that you perceive as not contributing to the discussion or that are escalating the conversation.


I didn't forget to mention that - it's simply not what the word shadowban means, as I've always understood and used it.

This is a big problem with trying to explain these things - people mean very different things by the same words, and it leads to misunderstanding.


Which other word do you think would be suitable here? In my view, 'shadowban' aligns with the definition in this context, as you aren't notifying people about it (hence 'shadow') and their actions of upvoting or downvoting have no impact (so same as shadowbanning comments or submissions etc).


I would call it either a penalty or a loss of voting privileges, depending on the specific case. It's not a ban because the account is not excluded from participating in other ways. In the same way, downweighted or penalized sites aren't the same as banned sites.


Well, it seems Wikipedia has a different definition than yours; it matches what I wrote before.

https://en.wikipedia.org/wiki/Shadow_banning

> Shadow banning, also called stealth banning, hellbanning, ghost banning, and comment ghosting, is the practice of blocking or partially blocking a user or the user's content from some areas of an online community in such a way that the ban is not readily apparent to the user, regardless of whether the action is taken by an individual or an algorithm. For example, shadow-banned comments posted to a blog or media website would be visible to the sender, but not to other users accessing the site.

This part matches shadow-banning voting and is basically the same as what I wrote in my previous comment, just using different words:

> partially blocking a user or the user's content from some areas of an online community in such a way that the ban is not readily apparent to the user

And this part, which contradicts what you wrote in your last comment:

> More recently, the term has come to apply to alternative measures, particularly visibility measures like delisting and downranking.


I use the term "shadow moderation".


Thanks for the thoughtful reply. Is it also true that users with a certain karma count or special permissions have more significant - and potentially lasting - downvoting weight that impacts the downvoted party's long-term reputation?


I'm afraid I don't understand your question but here are the basics: HN has downvotes (on comments, not submissions). The ability to downvote requires > 500 karma. When a comment gets downvoted, both its point score and the commenter's karma go down (in most cases - it's more complicated than that but this is the principle). Does that help?


I'm interpreting your question as "are there privileged HN members with supervotes", excluding moderators, who can single-handedly kill submissions or comments.

So far as I'm aware, no, and there are comments from dang and pg going back through the site history which argue strongly against distinguishing groups of profiles in any way.

The one possible exception is that YC founders' handles appeared orange to one another at one point in time (pg discusses this in January 2013: <https://news.ycombinator.com/item?id=5025168>). The feature was disabled for performance reasons.

Dang mentions the feature still being active as of a year ago: <https://news.ycombinator.com/item?id=31727636>

I seem to recall a pg or dang discussion where showing this publicly created a social tension on the site, as in, one set of people distinguished from another.

dang discusses the (general lack of) secret superpowers here: <https://news.ycombinator.com/item?id=22767204>, which reiterates what's in the FAQ:

HN gives three features to YC: job ads (see above) and startup launches get placed on the front page, and YC founder names are displayed to other YC alumni in orange.

<https://news.ycombinator.com/newsfaq.html>

Top-100 karma lands you on the leaderboard: <https://news.ycombinator.com/leaders>. That's currently 41,815+ karma. There are also no special privileges here other than occasionally being contacted by someone. (I've had inquiries about dealing with the head-trip of being on the leaderboard, and a couple of requests to boost submissions, which I forward to the moderation team).


Thank you, dredmorbius, for this very helpful response. dang has mentioned a procedure or system that involves making guesses for the purpose of shadowbanning. I wonder if downvotes (edit: or post flags) from special users like the ones you mention are used as strong signals in that guess-making?


I don't know about that, but emails to mods count a fair bit.

(I'll occasionally note an egregiously-behaving account that doesn't seem to have been already banned.)


Well that explains why all those links I posted to maximizedlivingdrlabrecque.com never got any traction…


That's a lot of domains! Did you source that from some other list, or is that a result of 67k individual entries? Either way, I appreciate it.

Out of curiosity, what's the rationale for blocking archive.is? Legal reasons I assume?


It's unavailable a lot ("Tell HN: Archive.* Is Unavailable" https://news.ycombinator.com/item?id=35749833 "Ask HN: Archive.is Captcha Problems Lately?" https://news.ycombinator.com/item?id=37077049) so discussions tend to end up being about archive.is instead of the content.


It's not sourced from any other list, it's just what mod actions and software filters have accumulated over the years.

Re archive.is - see https://news.ycombinator.com/item?id=37130177


> That's a lot of domains!

Not really. 67k/350m=0.02%


And 0.02% is really a lot.


Is it?


0.02% of a big number ... is still a big number

0.02% of 10,000 is 2 - pretty small

0.02% of 1,000,000,000 is 200,000 ... kinda big :)
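For what it's worth, a quick check of the arithmetic in this subthread (taking the 67k and 350M figures from above as given):

```python
# Quick check of the percentages discussed above.
banned = 67_000            # approximate banned-domain count
total = 350_000_000        # approximate registered-domain count
pct = 100 * banned / total
print(round(pct, 2))       # 0.02

# 0.02% = 2/10,000; use integer arithmetic to avoid float noise.
print(10_000 * 2 // 10_000)          # 2
print(1_000_000_000 * 2 // 10_000)   # 200000
```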


In my opinion, the absolute value is not important here; what matters is the fraction. I consider 0.02% to be large by itself for the given context.


What would be small?


This is subjective, but for me a small fraction here would be a few orders of magnitude less than that: a few ten-millionths or less.


Exactly, it is subjective, 0.02% is small in my opinion. Yet I'm getting downvoted and told that isn't small. ¯\_(ツ)_/¯


What's important here is the negative impact of the blacklist on communication. Probably some downvoters mean that the blacklist is big enough that the impact is important, and disagree with your supposed implication that it isn't.


Agreed, you're probably right about that.


Maybe "major media" should include tech media like The Register, Ars Technica, Tech Dirt, etc.. Unlike with media like the NYT, Bloomberg or Reuters, I've never seen a story for which these sites were the best source and much of what they publish is blogspam summarizing stories that have already been posted on HN, usually with a votebait title.


Yes, those sites are all downweighted. Whether they count as "major media" or not, they're classified the same way by HN's software, for more or less the same reason: they produce a lot of derivative and/or sensational and/or otherwise not-great-for-HN content, and they also produce substantive articles that are good for HN.


This is a mad take. Also, putting the junk that the NYT writes together with Reuters is just wrong.


SEO optimized domains, so 2010 :-)


radio.com looks legit, what is wrong with it?


We probably banned it because https://web.archive.org/web/20201027012245/https://kroq.radi... (posted to HN here: https://news.ycombinator.com/item?id=18253701) was spam.

I haven't dug into the logs, but most probably we saw that https://news.ycombinator.com/submitted?id=thebottomline was spamming HN and banned the sites that they were spamming.

Edit: if you (i.e. anyone) click on those links and don't see anything, it's because we killed the posts. You can turn on 'showdead' in your profile to see killed posts. (This is in the FAQ: https://news.ycombinator.com/newsfaq.html.) Just please don't forget that you turned it on, because it's basically signing up to see the worst that the internet has to offer, and sometimes people forget that they turned it on and then email us complaining about what they see on HN.


I just get 'This site isn't currently available in the EU'


I have the same question


So, is there an algorithm for being featured on the front page, other than upvotes? If a site can be banned, can another one be promoted?


I could be wrong, but I was always under the impression that companies that are in Y Combinator get an initial boost in the jobs posts but also quickly fall off, as such links don't allow comments.


Sorry, but I don't understand your question.


Would be nice if the lists were published though with a link to the list from the submission form.


The problem is that if you publish the lists it leads to more abuses. For example if spammers find out which sites are banned then they just post other ones.


I think there are two different types of sites you are blocking: (1) those which are just pure spam; (2) news/opinion/etc websites that you’ve decided are not suitable for HN for various reasons (such as being low quality and tending to produce more ideological flame-wars than curiosity), for example Breitbart

I agree that publishing case (1) causes harm (spammers will just use a different domain if they know you’ve blocked theirs.) But case (2) is rather different. I don’t think the same justification for lack of transparency exists in this case. And I think shadow-banning the submission in case (2) is not very user-friendly. It would be better to just display an error, e.g. “submissions from this site are blocked because we do not believe it is suitable for HN” (or whatever). A new user might post stuff like (2) out of misunderstanding what the site is about rather than malevolence, so better to directly educate them than potentially leave them ignorant. Also, while Breitbart is rather obviously garbage, since we don’t know everything in category (2) on the list, maybe there are some sites on it whose suitability is more debatable or mixed, and its inappropriateness may be less obvious to someone than Breitbart’s (hopefully) is


That's a good argument and subtle enough that I'm not sure whether I agree or disagree.


> For example if spammers find out which sites are banned then they just post other ones.

I don't think that makes sense. The supposed spammers can just try looking up whether their submissions show up or not when not logged in.


That also requires additional effort on the spammers’ part. Increasing cost of attacks is an effective defense strategy.


Increasing cost of attacks is effective against good faith people, not spammers.

Even Cory Doctorow made this case in "Como is Infosec" [1].

The only problem with Cory's argument is, he points people to the SC Principles [2]. The SCP contain exceptions for not notifying about "spam, phishing or malware." But anything can be considered spam, and transparency-with-exceptions has always been platforms' position. They've always argued they can secretly remove content when it amounts to "spam." Nobody has challenged them on that point. The reality is, platforms that use secretive moderation lend themselves to spammers.

[1] https://doctorow.medium.com/como-is-infosec-307f87004563

[2] https://santaclaraprinciples.org/


In my experience, increasing cost or delay even a little bit cuts out a disproportionate amount of bad stuff.

I once had the domain 'moronsinahurry' registered, though not with this group in mind...


In your experience where?

No research has been done about whether shadow moderation is good or bad for discourse. It was simply adopted by the entire internet because it's perceived as "easier." Indeed, for platforms and advertisers, it certainly is an easier way to control messaging. It fools good-faith users all the time. I've shared examples of that elsewhere in this thread.


I think that you are reading this too narrowly. SPAMers etc. are often in a hurry. For example, simply not responding for a second or two to an inbound SMTP connection drops a whole group of bad email attempts on the floor while no one else even notices.[0] Another example: manually delaying admitting new users to a forum (and in the process checking for bad activity from their IP/email etc.) seems to shed another bunch of unwanteds, as does raising the cost a little with some simple questions on the way in. This point about small extra delay and effort deterring a disproportionate amount of bad behaviour is quite broad.

[0] https://deer-run.com/users/hal/sysadmin/greet_pause.html


In your cost/benefit analysis, you overlook the harms created by secretive actions. That's why I asked for details about your experience.

The internet has run on secrets for 40 years. That doesn't make it right. Now that everyone and their mother is online, it's time to consider the harms that secrets create.


There are bad actors, and many of them are lazy/stupid. Their activity imposes a tax / harms on the rest of us. One way to minimise that harm to the good actors includes some mildly covert measures. The sendmail GreetPause is hardly a secret for example: it catches a common deliberate malicious protocol violation and is publicly documented. This is not unique to the Internet nor new; see also banking and personal security and so on.


This subthread started with a discussion about how "HN itself also shadow flags submissions" [1]. That's a slightly different form of moderation than the t.co delays.

Another commenter argued "Increasing cost of attacks is an effective defense strategy."

I argued it is not, and you said adding a delay can cut out bad stuff. Delays are certainly relevant to the main post, but that's not what I was referring to. And I certainly don't argue against using secrets for personal security! Securitizing public discourse, however, is another matter.

Can you elaborate on GreetPause? Was it to prevent a DDOS? I don't understand why bad requests couldn't just be rejected.

[1] https://news.ycombinator.com/item?id=37130143


Here's another reasonable summary:

https://www.revsys.com/tidbits/greet_pause-a-new-anti-spam-f...

I get several thousand SPAM attempts per day: I estimate that this one technique kills a large fraction of them. And look how old the feature is...


Okay, so the requests do get rejected, it just uses a delay to make that decision.

I don't consider GreetPause to be a form of shadow moderation because the sender knows the commands were rejected. The issue with shadow moderation on platforms is that the system shows you one thing while showing others something else.

Legally speaking, I have no problem with shadow moderation. I only argue it's morally wrong and bad for discourse. It discourages trust and encourages the growth of echo chambers and black-and-white thinking.


How do you view the rest of typical SPAM filtering, where the mail is apparently accepted for delivery but then silently thrown away? For simplicity assume a system such as mine where I control the MTA and client, so no one is making decisions hidden from me as the end user who wants to get the ham and see no SPAM. (I get tens of ham per day and many many thousands of SPAM attempts.)


With spam email, the recipient has a chance to recover the mail by looking in their spam folder.

No such spam folder is provided to the public on social media.


Note that in the GreetPause case the SPAMmer will not see the rejection errors since they don't look at the response to their hit and run (ie no one gets to see any error, neither sender nor target), and a legitimate sender should never get the error, so even this may be messy by your criteria I think!


> even this may be messy by your criteria I think!

Only if the recipient sent a false response.

If the response were misrepresented then I would object to the technique. But it doesn't sound like that's what happens.


OK, thanks!


platforms that use secretive moderation lend themselves to spammers

how is that? i can understand it not being useful, but how would it help spammers?


Spammers game the system while good-faith users get edged out. Spammers are determined actors who perceive threats everywhere, whereas good-faith users never imagine that a platform would secretly remove their content. Today, you see low quality content on social media, not because the world is dumb, but because the people who get their message out know the secret tricks.

Secret suppression is extremely common [1].

Many of today's content moderators say exceptions for shadowbans are needed [2]. They think lying to users promotes reality. That's bologna.

[1] https://www.removednews.com/p/hate-online-censorship-its-way...

[2] https://twitter.com/rhaksw/status/1689887293002379264


so to spammers shadowbanning makes no difference, but good-faith users somehow get discouraged even if they don't know they are shadowbanned just because they get no reaction to their posts? how is an explicit ban any less discouraging?

i can't see how shadowbanning makes things worse for good-faith users. and evidently it does work against spammers here on HN (though we don't know if it is the shadow or the banning that makes it effective, but i'll believe dang when he says that it does help)


> how is an explicit ban any less discouraging?

It's about whose messages are sidelined, not who gets discouraged.

With shadow removals, good-faith users' content is elbowed out without their knowledge. Since they don't know about it, they don't adjust behavior and do not bring their comments elsewhere.

Over 50% of Reddit users have had content removed that they don't know about. Just look at what people say when they find out [1].

> and evidently it does work against spammers here on HN

It doesn't. It benefits people who know how to work the system. The more secret it is, the more special knowledge you need.

[1] https://www.reveddit.com/#say


It has made sense since the internet was invented: spammers need everything thrown at them, because they will abuse every nook and cranny of your system to get paid one cent more.


You're correct again. Spammers and bots are the most determined actors, so these secretive measures don't impact them.

In fact, such secrecy benefits spammers. Good-faith users never imagine that platforms would secretly action content. So when you look at overall trends, bots, spammers and trolls are winning while genuine users are being pushed aside.

I argued that secrecy benefits trolls in a blog post, but I don't want to spam links to my posts in the comments.


Most spammers aren’t that competent. Hiding their posts without telling them used to be very effective on Reddit (now Reddit tells them). I guess it’s the same on HN.


Spammers are more competent than genuine users. They are advertisers, so they are more likely to be tracking metrics.


If that were right, then HN would be overrun by spam.


So you think secretive measures more often defeat spammers than trusting users? I'd argue HN's content could be a lot better than it currently is.

Content curation is necessary, but shadow moderation is not helping. When a forum removes visible consequences, it does not prepare its users to learn from their mistakes.

I'll admit, I find HN to be more transparently moderated than Reddit and Twitter, but let's not pretend people have stopped trying to game the system. The more secret the rules (and how they are applied), the more a system serves a handful of people who have learned the secret tricks.

Meanwhile, regular users who are not platform experts trust these systems to be transparent. Those trusting users spend their time innovating elsewhere, and they are blindsided by unexpected secretive tricks.


> So you think secretive measures more often defeat spammers than trusting users?

Yes. And it's really not a close question.

"Regular users" don't have to be platform experts and learn tricks and stuff. They just post normal links and comments and never run into moderation at all.


> They just post normal links and comments and never run into moderation at all.

On the contrary, secret suppression is extremely common. Every social media user has probably been moderated at some point without their knowledge.

Look up a random reddit user. Chances are they have a removed comment in their recent history, e.g. [1].

All comment removals on Reddit are shadow removals. If you use Reddit with any frequency, you'll know that mods almost never go out of their way to notify users about comment removals.

[1] https://www.reveddit.com/y/Sariel007/

archive: https://archive.is/GNudB


I'm talking specifically about HN, not reddit.


Why not publish the list? Users would know what not to submit in that case. Except maybe you’re worried about the list being heavily curated a certain way…


> Why not publish the list? Users would know what not to submit in that case. Except maybe you’re worried about the list being heavily curated a certain way…

The "certain way" is the experience of moderating HN. Publishing the list would help spammers know how to better circumvent it.


I can't believe you all fell for the whataboutism.


I don't see a single left-wing news source in there.


By that logic, the fact that penispowerworldwide.com is banned on HN* means we're biased against your politics.

Of the 67k sites banned on HN I would guess that fewer than 0.1% are "news sources", left- or right- or any wing. Why would you expect them to show up in a random sample of 10?

* which it is! I've unkilled https://news.ycombinator.com/item?id=1236054 for the occasion.


Well, unless surrogacymumbai.com is where we post right-wing news from, that's hardly a problem, eh?


I posted https://news.ycombinator.com/item?id=37131212 before noticing that you'd already made the same argument - sorry! But I have to leave it up because penispowerworldwide.com makes me laugh.


Finally pens are no longer constrained to just one island.


I like to think this is the parent company for Prestige Worldwide.


You don’t have to apply both sides logic to everything in life.


That's what "random 10 out of 67k sites" gives you. Your set was cherry-picked, dang's wasn't.

That said, dailykos.com seems to be banned. Happy now?


> Your set was cherry-picked

Not exactly cherry-picked, these were from things I submitted myself and noticed that were shadow flagged.

> That said, dailykos.com seems to be banned. Happy now?

No, I'd be happy when archive.is, Federalist and the rest of the non-spammy ones are unbanned. (Also, even if "balanced" censorship was the desired goal, having a single unreliable left-wing source banned vs a ton of right-wing ones doesn't really achieve that.)


Turn on show dead, browse https://news.ycombinator.com/newest and take the affirmative community moderation steps of vouching for those that are good.

Archive.is shouldn't ever need to be the primary site. Post a link to the original, then add a comment linking to the archive if there's a possibility of takedown or paywall issues.

It is likely that people were using archive.is to avoid posting the original domain, masking the content it presented.


> Not exactly cherry-picked, these were from things I submitted myself and noticed that were shadow flagged.

Definitely not random, in any case.

> Also, even if "balanced" censorship was the desired goal,

Nobody claimed that. You merely stated that "I don't see a single left-wing new source in there." and I offered a counter-point.

> having one left-wing source vs a ton of right-wing one doesn't achieve that

I didn't do an exhaustive search for "left-wing domains" that are banned to present you a complete list, this was attempt 1 of 1.

Following your model, I could claim that 100% of left-wing domains are banned, but I won't.


Have you considered that this may speak more to your biases than the site's? How many far-left news sites do you frequent?


And you don't see that as censorship?


The word censorship has so many meanings that I have to ask what you mean by it before I can say whether I see it that way.

Is it censorship that the rules of chess say you can't poke someone's queen off the board? We're trying to play a particular game here.


that is a very interesting way of communicating that point. thank you, filing that away :)


> The word censorship has so many meanings that I have to ask what you mean by it before I can say whether I see it that way.

Perhaps it's one of those things that are hard to define. [1] But that doesn't mean clear cases don't exist.

> Is it censorship that the rules of chess say you can't poke someone's queen off the board? We're trying to play a particular game here.

No, but it is clearly political censorship if you only apply the unwritten and secret "rules" of the game to a particular political faction. Also, banning entire domain names is definitely heavy-handed.

[1]: https://en.wikipedia.org/wiki/I_know_it_when_I_see_it


> political faction

I remember some words that succinctly express something I often observe. To paraphrase:

> Left-wing and Right-wing are terms which make a lot of people falsely believe that they disagree with each other.

It is worth trying to find common ground with people “on the other side”.


> censorship if you only apply the unwritten and secret "rules"

I mostly agree. I argued in an article [1] that it's only censorship if the author of the content is not told about the action taken against the content.

These days though, mods and platforms will generally argue that they're being transparent by telling you that it happens. When it happens is another story altogether that is often not shared.

[1] https://www.removednews.com/p/twitters-throttling-of-what-is...


Censorship has a very defined meaning. To take the chess analogy, it would be like allowing one side to poke the queen off the board, and not allow the other. This is very much like what happens at HN today.

You're dang right, trying to play a particular [rigged] game here.


What side do you feel is not allowed to play here?


What is the "very defined meaning" that you're thinking of?

The one that I think makes the most clear sense is "censorship" by a state power. But you must be thinking of something different, because HN is not a state power.


This forum never said they’re a free speech haven.


That's true, but it's a bit of an interesting question because "free speech" has different meanings. The thing to understand about HN is that we're trying to optimize for one thing: intellectual curiosity (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...). Given that, we're not "free speech" in the sense of "post anything about anything" - we have to moderate spam, flamewar, lame comments like "ok boomer", etc., because those detract from curious discussion.

On the other hand, no single political or ideological position has a monopoly on intellectual curiosity either—so by the same principle, HN can't be moderated for political or ideological position.

It's tricky because working this way conflicts with how everyone's mind works. When people see a politically charged post X that they don't like, or when they see a politically charged post Y that they do like, but which we've moderated, it's basically irresistible to jump to the conclusion "the mods are biased". This is because what we see in the first place is conditioned by our preferences - we're more likely to notice and to put weight on things we dislike (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...). People with opposite preferences notice opposite data points and therefore "see" opposite biases. It's the same mechanism either way.

In reality, we're just trying to solve an optimization problem: how can you operate a public internet forum to maximize intellectual curiosity? That's basically it. It's not so easy to solve though.


I personally think you guys have it mostly figured out. Kudos.


The difference is that HN is explicitly heavily moderated while Twitter pretends to be an equitable free speech platform.


Unless you disagree with Elon



Good.

Hacker News isn't an open-ended political site for people to post weird propaganda.


How's archive.is "weird propaganda"?


It isn't banned in comments - https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que..., https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que..., etc.

We probably banned it for submissions because we want original sources at the top level.


> We probably banned it for submissions because we want original sources at the top level.

Then why web.archive.org isn't also banned? [1] And what about things which aren't available from the original source anymore?

[1]: https://news.ycombinator.com/item?id=37130420


That's a good question. See https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... and dredmorbius's comment at https://news.ycombinator.com/item?id=37138346 re archive.org.

As for "why archive.org and not archive.is" - that's a bit of a borderline call, but gouggoug pointed out some of it at https://news.ycombinator.com/item?id=37130890. The set of articles which (a) are no longer on the web, (b) are not on archive.org, but (c) are on archive.is, isn't that big. Paywall workarounds are a different thing, because the original URLs are still on the web (albeit paywalled). For those, we want the original URL at the top level, because it's important for the domain to appear beside the title.


You're apparently in the middle of editing this as I'm replying, but I suspect I'm close to the mark here: <https://news.ycombinator.com/item?id=37138346>


Yup!


> Then why web.archive.org isn't also banned?

Because web.archive.org is generally used for...

... things which aren't available from the original source anymore.

While archive.is is generally used to bypass paywalls. These 2 websites have 2 very distinct missions and use-cases.


Whilst I agree with your characterisation as regards usage on HN, I will note that Archive Today actually is a quite useful archival tool, and often works on sites which the Internet Archive behaves poorly on.

I'd run across an instance of this when the Diaspora* pod I was on (the original public node, as it happens) ceased operations. I found myself wanting to archive my own posts, and was caught in something of a dilemma:

- The Internet Archive's Wayback Machine has a highly-scriptable method for submitting sites, in the form of a URL (see below). Once you have a list of pages you want to archive, you can chunk through those using your scripting tool of choice (for me, bash, and curl or wget typically). But it doesn't capture the comments on Diaspora* discussions.... E.g., <https://web.archive.org/web/20220111031247/https://joindiasp...>

- Archive.Today does not have a mass-submission tool, and somewhat aggressively imposes CAPTCHAs at times. So the remaining option is manual submissions, though those can be run off a pre-generated list of URLs which somewhat streamlines the process. And it does capture the comments. E.g., <https://archive.is/9t61g>

So, if you are looking to archive material, Archive Today is useful, if somewhat tedious at bulk.

(Which is probably why the Internet Archive is the far more comprehensive Web archive.)


The Internet Archive is permitted when the original site or content is unavailable.

Otherwise, HN's rule is to "submit the original source": <https://news.ycombinator.com/newsguidelines.html>

I suppose that might be clarified as "most original or canonical", but Because Reasons HN's guidelines are written loosely and interpreted according to HN's Prime Directive: "anything that gratifies one's intellectual curiosity" <https://news.ycombinator.com/item?id=508153>.


And how was the decision made to ban Federalist, but not say Guardian or The Daily Beast? Do you have any process in place to ensure that your political biases don't influence the list, or you don't care about that?


We don't have "processes" at HN. The idea makes my skin crawl.

Plenty of both left- and right-wing sites are banned and/or downweighted on HN. When a site is primarily about political battle, we either ban it or downweight it. Which of the two we choose depends on how likely the site is to produce the occasional interesting article (in HN's sense of the word "interesting"). That's why The Federalist and World Workers Daily (or whatever it's called) are banned, while National Review and Jacobin are merely downweighted. Both the Guardian and Daily Beast are downweighted, btw, as are most major media sites.

If you or anyone thinks that HN moderation is unfairly ideologically biased, I'm open to the critique, but you guys need to first look at the site as it actually is, and not just look at your own pre-existing perceptions. Every data point becomes a Convincing Proof when you do the latter.

People think that when their team gets moderated, the mods are OMG obviously on the other side. The Other Side feels exactly the same way. This "they're against me" perception is the most reliable phenomenon I've observed on HN. Leftists feel it, rightists feel it, Go programmers feel it, even Rust programmers feel it. Literally the very-most-popular topic on HN at any moment is perceived by someone as Viciously Suppressed because of this perception. Stop and think about that—it's kind of amazing. Someone should write a PhD thesis.


>If you or anyone thinks that HN moderation is unfairly ideologically biased, I'm open to the critique, but you guys need to first look at the site as it actually is, and not just look at your own pre-existing perceptions.

Since when have moderation actions and the relevant data been made available to the lay public here? We cannot look at the site as it actually is. We either have to trust you or pound sand.

>Stop and think about that—it's kind of amazing. Someone should write a PhD thesis about it.

Just because (you think) everyone feels persecuted doesn't mean you're doing a good job keeping things level. It's a common joke to make, but it's just a joke. Similarly, if both a rampant nazi and a fierce tankie hate you, that doesn't make you a bastion of democracy. "Fairness" doesn't mean pissing off everyone equally, and that is neither a necessary nor a sufficient condition.

These are just minor notes, don't take them too seriously


Re your first paragraph: there's more than enough information in the public data. Usually it only takes a little time with HN Search to find information that clarifies such questions.

We don't publish a moderation log for reasons I've explained over the years - if you or anyone wants to know more, see the past explanations at https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que... and let me know if you still have questions.

Not publishing a mod log doesn't mean that we don't want to be transparent, it means that there's a tradeoff between transparency and other concerns. Our resolution of the tradeoff is to answer questions when we get asked. That's not absolute transparency but it's not nothing. Sometimes people say "well but why should we trust that", but they would say that about a moderation log as well.

Re your second paragraph: I agree! and I don't think I've claimed otherwise. In fact, the lazy centrist argument is a pet peeve (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...).

It's true that the way I post about these things ("both sides hate us") gets mistaken for the obvious bad argument ("therefore we must be in the sweet spot", or as Scott Thompson put it years ago, "we're the porridge that Goldilocks ate!"), but that's because the actual argument is harder to lay out and I'm not sure that anybody cares.


> *even* Rust programmers feel it

:D

> Someone should write a PhD thesis about it

From one perspective, it could be related to Multi-Agent Systems (perhaps with reference also to Minsky and H. Simon), as a consequence of each agent's narrow view, and/or an intrinsic fault of resource optimization.


> you guys need to first look at the site as it actually is, and not just look at your own pre-existing perceptions.

How can one see the site "as it actually is" when the decisions are kept secret from submitters?

> People think that when their team gets moderated, the mods are OMG obviously on the other side. The Other Side feels exactly the same way.

This will always be a thing. But it's also true that society is more divided now than it was 20 years ago. We find ourselves unable to communicate across ideological divides and we resort to shouting or in some cases violence. Some effort must be made to improve communication, and transparency for content authors is a minimal step towards that.


What do you mean, "even" Rust programmers feel it? Rust programmers feel it the most. This site is hopelessly biased against Go programmers and towards Rust programmers!


The important thing is to read it to the tune/voice of Peter Tosh doing this bit in Legalize it -

Doctors smoke it

Nurses smoke it

Judges smoke it

Even lawyer too


Would you be open to publishing the down-weighting values if they're based on politics not spam?

This is the murkiest part to me since it's not just a binary flag.


It's really not that interesting and I don't remember exactly. It's just a penalty that means the post has to get moderately more upvotes (maybe 30%? I'm not sure because that code is a bit rube-goldbergery) to rise to the same level vis-a-vis an unpenalized site.

For sites in this category (i.e. not banned, but downweighted) we don't distinguish between political sites, major media sites, sensational bloggy sites and so on. They're all in the same bucket.
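
Purely as a hypothetical sketch (the domain names, the 0.30 value, and the formula below are invented for illustration; as noted above, the real code is private and more convoluted), a per-site downweight can be modeled as a divisor on a post's effective score, so a penalized post needs roughly 30% more upvotes to rank level with an unpenalized one:

```python
# Hypothetical illustration only - not HN's actual ranking code.
PENALTY = {"downweighted.example": 0.30}  # placeholder domain and penalty

def effective_score(upvotes: int, domain: str) -> float:
    """Score used for ranking, after applying the per-site downweight."""
    return upvotes / (1.0 + PENALTY.get(domain, 0.0))

print(effective_score(130, "downweighted.example"))  # roughly 100
print(effective_score(100, "normal.example"))        # 100.0
```

The point of modeling it as a multiplier rather than a binary flag is that downweighted sites can still reach the front page when a story is strong enough.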


Why do you assume the Federalist is blocked because of its political leanings?


I'm guessing it's reactive, and Federalist links tended to be garbage often enough to convince someone they should hit the ban button, whereas the others didn't rise up with trash often enough to matter?


[flagged]


> It's under no obligation to be politically balanced.

And obviously, I'm under no obligation to not voice my concern about that.


> And obviously, I'm under no obligation to not voice my concern about that.

Except... You are...

The guidelines [1] of the site outline that most discussion of politics is off-topic as it most often doesn't gratify one's intellectual curiosity. It's rather explicit, too: "Please don't use Hacker News for political or ideological battle.".

1- https://news.ycombinator.com/newsguidelines.html


I came to HN because it seemed to be a place of free dialogue.

> It's certainly under no obligation to promote right-wing propaganda and hate.

It's certainly under no obligation to promote any propaganda or hate. That's what makes the site great. The HN crowd is not your average layman - it's largely well read and educated people. Why not let them decide what is propaganda and what's not (as opposed to some arbitrary authority deciding this)?


> The HN crowd is not your average layman - it's largely well read and educated people. Why not let them decide what is propaganda and what's not (as opposed to some arbitrary authority deciding this)?

That is at best a very naïve take. This is a public forum, not some secret society of like-minded intelligentsia.


> …some arbitrary authority deciding…

something tells me the team here is anything but arbitrary. i can assure you, keeping a community this healthy at this scale is pretty incredible.

what you’re suggesting is to raise the noise and for the users to wade through noise hunting for signal.

what rises to the top here is so much better than any of the noisy sites. on those sites, by the time you find a signal, the information is usually amateur-hour tier. to use an idiom i heard yesterday, sites with little moderation are like searching for a hymen in a whorehouse.


Because it's a service to us that makes the site more valuable. I have exactly zero interest in spending my time here deciding what is or isn't propaganda. I deeply appreciate "some arbitrary authority" providing that hard and mostly thankless service to me. Even if I miss a few things that I wouldn't consider to be spam / abuse / propaganda but which they did, that is a tiny price to pay for having a place to come read things that is mostly free of all that crap.


>The HN crowd is not your average layman - it's largely well read and educated people.

People need to stop thinking this. There's no filter in front of the HN sign up. The community here is the same as the community anywhere else on the internet. We aren't special


That's a very valid point! I simply assumed due to the subject matter (coding discussion, in depth tech industry discussions), the average HN reader would be more well read as opposed to your average X user.


Being deeply informed on a handful of subjects is not the same as "more well read"

It's "more well read about those subjects" :)


Why? Why do you believe that someone familiar with coding discussion and tech industry discussions to be more likely to be well read than an average person? I am explicitly asking you to reconsider these assumptions, think about where they come from, and think if they have any real data to back them up at all


because someone familiar with coding discussion will likely have a basic grasp on logic and maths.

this alone makes them more ‘well read’ than the average X user, especially considering all the conspiracy theories (and groupthink) floating around on the site.


> I came to HN because it seemed to be a place of free dialogue.

It certainly does feel like a series of monologues at times.


> The HN crowd is not your average layman - it's largely well read and educated people.

It most certainly is the exact mixture of people as you’d find elsewhere. The only difference is they’ll (we’ll) foam at the mouth over something vaguely tangential to tech vs. some other topic.

Here's a recent example of the type of take you might find here: https://news.ycombinator.com/item?id=37113312


> no obligation to promote right-wing propaganda and hate

Your poor choice of words (hopefully unintentional) makes it sound like left-wing propaganda and hate is something worthy of being promoted.

Do hope you don't mean it this way.


Did you read the rest of your own post?


> Try submitting a URL from the following domains, and it will be automatically flagged (but you can't see it's flagged unless you log out): archive.is

I can assure you that is not the case with HN when posting archive.is URLs. Proof?

Look at my comment postings : https://news.ycombinator.com/threads?id=archo

Is it possible you have been shadow-banned for poor compliance with the Guidelines [1] & FAQs [2]?

[1] : https://news.ycombinator.com/newsguidelines.html

[2] : https://news.ycombinator.com/newsfaq.html


> I can assure you that is not the case with HN when posting archive.is URLs. Proof?

It's not banned in comments, but it is banned in submissions. @dang (HN's moderator) confirms that here: https://news.ycombinator.com/item?id=37130177


Covered in the Guidelines;

>Please submit the original source. If a post reports on something found on another site, submit the latter.

And explained on numerous occasions by dang.


I must admit that I've never really delved into the rules on what/how to post on HN.

For example, I've linked to my work, but it never occurred to me to use "Show HN".

Maybe this is no big deal? Or perhaps for new signups, it would be good to “soft force” them to read the FAQ?


The assertion is about submissions, not comments.


Isn't blocking Stacker.news a petty move?

It's basically HN, but you can earn small tips for submissions and comments.


> It's basically HN, but you can earn small tips for submissions and comments.

*Guesses it's crypto bullshit*

*goes to website*

Yep, exactly as expected. Karma alone can mess with incentives, I cannot imagine that adding monetary incentive does anything but make it worse. Also crypto has the reverse-midas-touch from everything I've experienced first-hand or read so adding that into the mix is just another black mark.


It could be because they saw they were getting low quality links from there. In any case, since HN prefers original sources, it’s less likely that a news aggregator would be a good source (the occasional Reddit comment notwithstanding)


Unless the link is to a "show SN" article, it is indeed simply an aggregator, and not a link to the source.

What I mean to say is: I do see the logic of downgrading links to SN, because it is not usually an original source.


Yeah. HN bans your favorite white supremacist blogs. I don’t see a problem with that.


the wise man bowed his head solemnly and spoke: "theres actually zero difference between good & bad things." -- @dril


There are other domains that, while not algorithmically banned, have an army of obsessive people who will flag any story from them if they see it.


Additional details I wrangled for this rabbit hole. I don't think it's t.co doing this intentionally, but rather poor handling of 'do you have our cookies or not'. Everyone in this thread _proving things_ without taking into account the complexity of the modern web.

   man curl
       -b, --cookie <data|filename>
              (HTTP) Pass the data to the HTTP server in the Cookie header. It is supposedly the data previously received from the server in a "Set-Cookie:" line.
----

Add that option to your curl tests.

    ---
    $ time curl -s -b -A "curl/8.2.1" -e ";auto" -L https://t.co/4fs609qwWt -o /dev/null | sha256sum 
    eb9996199e81c3b966fa3d2e98e126516dfdd31f214410317f5bdcc3b241b6a2  -

    real    0m1.245s
    user    0m0.087s
    sys     0m0.034s
    ---

    $ time curl -s -b -e ";auto" -L https://t.co/4fs609qwWt -o /dev/null | sha256sum 
    eb9996199e81c3b966fa3d2e98e126516dfdd31f214410317f5bdcc3b241b6a2  -

    real    0m1.265s
    user    0m0.103s
    sys     0m0.023s
    ---

    $ time curl -s -b -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/4fs609qwWt -o /dev/null | sha256sum 
    eb9996199e81c3b966fa3d2e98e126516dfdd31f214410317f5bdcc3b241b6a2  -

    real    0m1.254s
    user    0m0.100s
    sys     0m0.018s
    ---


Amazing that this poor handling of 'do you have our cookies or not' only affects news papers and social media sites that Elon doesn't like! What a coincidence.


If it's not intentional, why are people observing different behavior (no delay) for other domains, but a delay for NYT, bsky etc then?


oh boy... -b takes an argument, which in your examples ends up being "-A" (and "-e"); then what follows is interpreted as a URL, and you throw away the warnings:

  % curl -vgsSIw'> %{time_total}\n' -b -A "curl/8.2.1" https://t.co/DzIiCFp7Ti 2>&1 | grep '^\(* WARNING: \)\|\(Could not resolve host: \)\|>' 
  * WARNING: failed to open cookie file "-A"
  * Could not resolve host: curl
  curl: (6) Could not resolve host: curl
  * WARNING: failed to open cookie file "-A"
  > HEAD /DzIiCFp7Ti HTTP/2
  > Host: t.co
  > User-Agent: curl/8.1.2
  > Accept: */*
  > 
  > 0.013309
  > 0.112494


Alright, thanks for explaining that. Here's what I see explicitly setting the cookie jar:

    $ time curl -s -b cookies.txt -c cookies.txt -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/DzIiCFp7Ti

    [t.co meta refresh page src]

    real    0m4.635s
    user    0m0.004s
    sys     0m0.008s

    $ time curl -b cookies.txt -c cookies.txt -A "wget/1.23" -e ";auto" -L https://t.co/DzIiCFp7Ti
    curl: (7) Failed to connect to www.threads.net port 443: Connection refused

    real    0m4.635s
    user    0m0.011s
    sys     0m0.005s

    $ time curl -b cookies.txt -c cookies.txt -e ";auto" -L https://t.co/DzIiCFp7Ti
    curl: (7) Failed to connect to www.threads.net port 443: Connection refused

    real    0m0.129s
    user    0m0.000s
    sys     0m0.013s

The "Failed to connect" errors are likely threads.net blocking those user agents, but the timing is still there, and it differs from the first UA attempt.


I can replicate this behavior fairly easily in a browser.

  1. Open incognito window in Chrome
  2. Visit https://t.co/4fs609qwWt -> 5s delay
  3. Open a second tab in the same window -> no delay
  4. Close window, start a new incognito session
  5. Visit https://t.co/4fs609qwWt -> 5s delay returns


The reason there isn't a delay the second click is because the redirect is cached locally in your browser.

Your humble anonymous tipster would appreciate if you do a little legwork.


> The reason there isn't a delay the second click is because the redirect is cached locally in your browser.

No, because it’s not an HTTP redirect. It’s an HTML page that redirects you using a meta tag, something that the browser doesn’t cache.
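For anyone curious what that looks like mechanically, here's a minimal Python sketch of pulling the destination out of such a meta-refresh page. The sample HTML is illustrative, written in the style t.co serves, not a verbatim capture of its response:

```python
import re

def meta_refresh_target(html):
    """Extract the destination URL from a <meta http-equiv="refresh"> tag,
    or return None if the page contains no meta refresh."""
    m = re.search(
        r'<meta[^>]*http-equiv=["\']?refresh["\']?[^>]*'
        r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
        html,
        re.IGNORECASE,
    )
    return m.group(1) if m else None

# Illustrative interstitial in the style t.co serves (not a verbatim capture):
sample = '<head><meta http-equiv="refresh" content="0;URL=https://www.nytimes.com/"></head>'
print(meta_refresh_target(sample))  # https://www.nytimes.com/
```

Because the redirect happens at the HTML level, it is invisible to the browser's HTTP-redirect handling; the browser just parses the tag and navigates again.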


Your humble anonymous tipster notes to their skeptical audience that browsers are capable of caching all sorts of things, even something as peculiar as an HTML page.


> browsers are capable of caching all sorts of things, even something as peculiar as an HTML page.

Yes, and this is irrelevant to your previous comment: caching the HTML doesn’t cache the redirect itself.


You can lead a horse to water, but you can't make him drink. The delay was not on the HTML page.


> The delay was not on the HTML page.

Nobody is saying that.


What is that attempting to prove or replicate?

Here's a simpler test I think replicates what I am indicating in GP comment, with regards to cookie handling:

Not passing a cookie to the next stage; pure GET request:

    $ time curl -s -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/4fs609qwWt > nocookie.html

    real    0m4.916s
    user    0m0.016s
    sys     0m0.018s

Using `-b` to pass the cookies _(same command as above, just adding `-b`)_

    $ time curl -s -b -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/4fs609qwWt > withcookie.html

    real    0m1.995s
    user    0m0.083s
    sys     0m0.026s
Look at the differences in the resulting files for 'with' and 'no' cookie. One redirect works in a timely manner. The other takes the ~4-5 seconds to redirect.


You're completely missing the point, which is that the 5 second delay doesn't exist at all for most t.co links, even without cookies. The delay only exists for a few Musk-hated domains.


In your second example you are passing the cookie file named ./-A then trying to GET the URL "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" followed by https://t.co/4fs609qwWt


Amen


Good work Penguin. I believe in you


Remember when people were excoriating Google AMP for encouraging walled gardens? If true, this seems in so much worse faith than that.


Not worse. They are both as evil as it gets. Typical: take a public resource and use it for exclusive profit.

What happened to net neutrality? Could it be applied to this case?


Net neutrality has been dead since 2017.


The speed at which enshittification is being unleashed surprises me each and every day.


Enshittification is different. It’s when companies destroy a product with hundreds of changes that prioritise internal politics above what end users want.

This is something else - just the ego of one rich guy petulantly satisfying his inner demons.


"Porque no los dos?"

A five-second delay may be enough to cause a measurable increase in the "stickiness" of Twitter if some people wait <5 seconds before clicking or scrolling onwards to something else.

Then they spend more time generating ad-revenue for Twitter than if they had gone off to the New York Times or something and started browsing over there.


It’s not doing it on foxnews.com. Take everything else Musk is saying and doing, and it’s hard to believe it’s not some grudgelist.


> just the ego of one rich guy petulantly satisfying his inner demons.

As that rich guy happens to be the CEO, how is this not the prime example of "prioritising internal politics above what end users want"?


> that prioritise internal politics

I thought it was about increasing short-term revenue.


I just took a benchmark using hyperfine.

Not threads.net, cURL User-Agent: 224.3 ms

Not threads.net, Firefox User-Agent: 227.4 ms

threads.net, cURL User-Agent: 223.9 ms

threads.net, Firefox User-Agent: 2743 ms

Is Twitter trying to hide this? (They don't add the delay for non-browser User-Agents.)

(Full log: https://gist.github.com/sevenc-nanashi/c77d18df6a5f326b0d292...)


thanks for sharing this!


Why not try a Nitter instance? It offers RSS, and none of the URLs are shortened with "t.co".


They just broke nitter. Hopefully it will be fixed soon but in the long term Musk can cut nitter off if he wants.


It's working for me.


Which instance? None of the ones I'm trying are working. Maybe it's regional


You can refer to the full list on the github page[0] (which I find to be pretty accurate). I personally use La Contre-Voie[1] for no particular reason.

nitter.net was historically a little less reliable for me due to rate limiting, which is why I initially switched. They worked around the rate limiting issue now, so that may no longer be the case.

[0] https://github.com/zedeus/nitter/wiki/Instances

[1] https://nitter.lacontrevoie.fr/


Thanks. Unfortunately, not having any luck with any of them, including lacontrevoie. Nitter.net gives a nim error, the others don't find any tweets


Ahah! I see now you are correct, it was just broken after a good run of stability. Prior to this it had been working well for a month or so.

If the past is any indication, Nitter will be back again eventually, but every time Nitter breaks I drift further and further from caring about Twitter/X at all.


If you give us a Twitter username as an example, I'll find an instance that's working. The tweets and the non-t.co URLs are in the RSS.


Thanks - try holly, for example.


Got a message saying Twitter is blocking all access without login. You were right ajb. I was not getting this yesterday though. This also happened back in June but it was only temporary.


Great - at least it's not me personally being blocked somehow. I hope they do unblock it, but it does show that nitter is still at the mercy of Twitter.


And Twitter is at the mercy of Musk. If he blocks access to the public that chooses not to join then the problem is not Nitter. If X is going to be members-only like the Meta clone then I hope folks will stop submitting tweets to HN.


Nitter is working again.

Try https://nitter.cz/holly/rss

No t.co URLs.


I'd like to invoke Hanlon's razor here to at least cut down on the potential for over-the-top reactions that can be somewhat embarassing in retrospect:

"Never attribute to malice that which is adequately explained by stupidity."


Absolutely not applicable to corporations. Bad faith actions are rampant, and I am unwilling to give the benefit of the doubt to billion dollar companies.


How about incompetence instead of stupidity, would that apply to many billion dollar companies?


I believe the two are interchangeable within the Hanlon's Razor definition anyway.


Oh yes, there is definitely a lot of that around!


This is a case where Occam's Razor trumps Hanlon's Razor.


Not applicable when dealing with people who have narcissistic personality disorder like musk. Would be naive to assume stupidity.


Hanlon's razor needs to be inverted when the person you're talking about has NPD. The assumption should be that there are sinister intentions until proven otherwise. You'll make better predictions about reality if you do that inversion.


The best trick the devil ever played...


if I had to guess: ugly synchronous code for statistics that hits a dead shard for the target domain and times out.


Or both


Funny they haven't copied the full Chinese model yet. In China, the forwarding system can ban any website they don't like, and you have to copy the URL manually to visit it. 5 second delay is a mercy. :-)


Another great reason to get off Xitter.


Is that pronounced like "Shitter"? Huh.


The X is pronounced as “Shi”


Yup. It's super-easy to test and, yes, any nytimes.com link takes exactly a five-second count to open.


-- Related --

X has started reversing the throttling on some of the sites, including NYTimes

Discussions on HN: (61-comments - 2023-08-16) : https://news.ycombinator.com/item?id=37141478

Twitter post archive: https://archive.is/PW3eG


I'm outing myself as an ignoramus* here, but could the title be edited to indicate that this belongs to twitter?

I glossed past this on first read because "some url shortener has shitty behavior" wasn't interesting to me. Hearing about Twitter's throttling someplace else made me come back here, because I was surprised not to have heard about it here first.

*: stalwart radical who doesn't use twitter


Many words about site moderation have been posted in the thread. Who here has actually moderate(s|d) a forum/chat site?


I have, for about 20 years. I empathise with @dang's methods and explanations here. e.g., don't disclose a blocklist, use shadowbanning, don't get dragged down by pedantic responses to explanations. You're often one person against many trolls finding grey areas for sport, or spammers (automated or otherwise) probing defences.


I am in total agreement with you: keeping out the noise/damage of nefarious scammers, spammers, and hackers/pranksters attempting to penetrate the site requires ALL the tools in the toolbox.


Me also! Though I have tried to avoid it, somehow I get to be admin for many many years!

And I'm feeling HN's position, even though I occasionally trip some of the mechanisms here.


Some of us have, you will find. (-:

But, that said, I'm more interested in the discussions about verification, neutrality, and the reasons that people have for still clinging on for grim death that, in a few hours, will likely be pushed down onto a second page by that huge comment thread currently in the middle of this page and above them.


Shouldn't it be x.co now? Also doesn't t.co embed some tracking information as well?


The redirect has always been a bit annoying and I found on mobile if you click a link, see it load the t.co URL, then quickly go back and click again, it instead loads the actual linked URL. So I just instinctively do this now.


The link doesn't magically change. It might be partially cached when you go back, so it seems to load the final URL immediately, but it's not. It's always going to t.co first. C'mon.


can confirm from europe. shouldn’t this be illegal in america?


It is probably not illegal in America. Would it be illegal in Europe? Because (at least w/r/t Threads) it is an anti-competitive practice?


Net neutrality was repealed in 2018.


And then reinstated again at the state level. It never went away, this just doesn't apply because Twitter isn't an ISP (in that sense).


TIL! Looks like it's only some of the states, and that is a good point about Twitter not being an ISP. Twitter is incorporated in Delaware, which has not passed state-level net neutrality, so I don't know if it would have mattered anyway.


It basically just matters what California does. There's enough customers there that it's easier to apply those policies nationwide.

Incorporated in Delaware hardly ever affects anything except corporate law.


Network Neutrality has historically been more about malfeasance by ISP networks which have higher/more-concrete barriers to entry, as opposed to the websites.


I also noticed for certain threads, if you open in the browser, it does not let you read the other part of the thread.

You have to open in the app.


That sounds like a play on the "slow ban":

> Selective downtime, where the troll finds that the website is down (or really slow) quite often. Not all of the time, because that would tip them off. Trolls are impatient by nature, so they eventually find a more reliable forum to troll.

https://ask.metafilter.com/117775/What-was-the-first-website...


Usually, slow bans aren't done for people who pay you money and add value to your platform.


NYT and others are still adding value to the platform even if users become discouraged from clicking those links and navigating away (choosing instead to just stay on Twitter and read the summary and reactions there). Nudges like this help Twitter have their cake and eat it, too.


Once referral pageviews take a hit, I suspect the NYT will be reevaluating their options.


Great point! I just wanted to name the technique. I am actually an active speaker and developer against all secretive content moderation actions.


A variation on this is returning an error message of sorts.


Is it possible to build a browser extension to bypass this redirection?


It is called the Tail Call Optimization for a reason. /s


People are still using Twitter?


The justifications I've seen go along the lines of "I can't go elsewhere because my followers are here." .... But if you left, they'd likely follow.

It's like a microcosm of capitalism. The users don't realize they hold all of the power, I guess.


Reddit too, it's crazy to me.


[flagged]


>I bet rightwing sites had this issue under former management.

If Twitter's previous administration had been doing intentionally bad things as your baseless assertion implies, Musk would be posting the code or database tables and showing it everywhere to prove to the world how bad they were.

The fact that Musk is quiet here demonstrates your assertion is not simply baseless, it's provably false.


Unless you can prove it it's a baseless assertion.


[flagged]


"Journalist bad" is apparently your prior.


I'm not sure how you reach this conclusion considering the other posts on this thread show this is related to cookies, which is why clicking the same link twice doesn't have the delay the second time.


[flagged]


If you cannot abide by the very simple rules of HN, please don't bother making an account and commenting.


Xitler


i mean, you can stop visiting the site, no? just leave, bro. it's not that hard. there are other means to connect to people.


You can both stop using the site as a user and still call out the awful stuff they do.


At some point though, the complaining becomes nonconstructive whining. It was already time to move on by the time it changed to X.


Unfortunately not. All of my local government agencies - Police, Fire, DOT, Weather Service, Emergency Management updates, etc. are exclusively on twitter - they frequently post things there they don't even post to their own websites.


At the very least, I can get at the info if it's on twitter.

Anything that's exclusively on a facebook page may as well not exist to me.


I'm in the same boat :/.


You're right! Which is why making Twitter's product worse when there are active competitors taking big bites out of their business seems... dumb?


yep, it also seems to me that the helmsman of that site is dumb. good thing i left that site many years ago.


you could maybe show some testing? if that's true, then it should be measurable


Test it yourself: they link to their site from their profile, and it's definitely delayed.

https://twitter.com/nytimes


Just tried this - loaded almost instantly. But that's only the link in the profile. Not shared links in tweets.

Leaning towards there's something else going on deep in the DNS/ad servers/CDN/who knows. This isn't the first time I've seen/heard of resolving delays with t.co... maybe it's even just something with legacy non-SSL links being redirected, etc.


You’re really twisting yourself into a buzzword cloud there aren’t you?

The link you clicked in the NYT bio is not a t.co link - I assume you noticed that but still are using it as counter-proof?


On mobile, my quick test didn't even look at where the link pointed... because all links on Twitter are t.co, aren't they? Weird that you think any links aren't t.co. It's how they track analytics on everything.

No buzzwords there, just a suspicion that there's something else underlying this, with the various technologies in play even on a 'simple' link click.


Both the nyti.ms link and nytimes.com are actually t.co links for me. First one was slow, second one was fast.


A good test might include a bunch of domains, checking the timing on each. Could we demonstrate the delay is on t.co and not on NYT?


I went through my own Twitter feed, and found 10 non-NYT links. All redirected almost immediately through t.co via wget, only lagging on the destination sites.

I also tried 5 NYT links. All had a very consistent 5 second delay through wget.

I could do more, but I don't care to. Everyone knows Elon has gone redpill, so it wouldn't surprise me if he's "owning the libs", but there also could be a dozen other reasons Twitter might do something like this (including plenty that are not nefarious). I just don't care to dig more...

Edit: I suppose I could have given the specific URLs, but I don't know if/how much t.co links leak info, so I'm not keen to do that. But the delay is absolutely on t.co and not the destination sites, at least as far as external users are concerned. It's possible that t.co queries the sites first before redirecting, and if e.g. the NYT is throttling their traffic that's what's delaying things. I don't know how to disambiguate that, but it's definitely a theory worth considering...
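For anyone repeating this measurement, here's a rough Python sketch (the function name and structure are my own, not from the tests above). Assuming t.co serves your client its interstitial page, a client that doesn't follow the HTML meta refresh stops at t.co, so timing a single GET isolates t.co from the destination site:

```python
import time
import urllib.request

def first_hop_seconds(url, fetch=None):
    """Time one GET of `url`. When t.co serves its meta-refresh
    interstitial, a client that doesn't act on the HTML refresh never
    touches the destination site, so this measures t.co alone."""
    if fetch is None:  # a fake fetcher can be injected for offline testing
        fetch = lambda u: urllib.request.urlopen(u, timeout=30).read()
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start

# Usage (hits the network; URLs are examples from this thread):
#   for link in ("https://t.co/4fs609qwWt", "https://t.co/DzIiCFp7Ti"):
#       print(link, round(first_hop_seconds(link), 2), "s")
```

Comparing a handful of NYT links against links to other domains this way would show whether the extra ~5 seconds sits on the first hop or on the destination.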


Actually, re: my edit - I think it really is worth considering whether there might be an accidental delay here that's on the NYT side. It's totally possible that Twitter is hitting the sites it redirects people to before it actually sends them there, for either analytics purposes or otherwise, and I'd trust the NYT devops less than Twitter's w.r.t. making sure things were fast.

Incompetence before malice, etc...


It's not.

If you use wget, you see that the delay happens during the first hop with t.co

It also happens with threads.net, instagram, facebook, blueskyweb.xyz


I mean, a good test is to go to go to their website in another browser tab and click around. You're over-complicating this.


That's extremely slow..


https://news.ycombinator.com/item?id=37130129 has some wget commands that absolutely confirm it, at least with a couple examples. You can run the commands yourself in the terminal.


they already told you they tested it and you don't believe them.

what else could they say that would make you believe them?

you might as well just test it yourself like i did with time wget. it's not like you're going to believe anything anyone writes.


Tried a few links myself; the pattern seems to hold


It seems “it’s a private platform and they can do what they want” only applies when the mob wants [insert social network]’s undesirable behavior.


when does "it's a public square" apply?


Only when the mob says so, apparently


I wouldn't expect an SLA on a service without one.

This is like the Cogent/HE peering drama.


There are a number of reasons this could be that are not necessarily nefarious. It's odd to jump straight to "something evil is going on"

Tell me this: does Twitter have some kind of "play nice" code that slows down inbound clicks through to a site so it doesn't DDoS other sites? I can easily imagine a scenario where anti-DDoS code would allow small sites to pass through quickly, yet sites under heavy "click through" load are slightly throttled.


Then click delays would appear random to any single client.


This wouldn’t reduce total requests made so it would be a weird anti-DDOS measure.


Indeed. A five-second delay only means the DDoS starts five seconds later.


If true and intentional, then this is a strong move by Musk against his ideological opponents. Hard to believe he has the cognizance to recognize them as such but maybe he purged more of the 3-letter agency folks from X than it seemed.



