They want to at least try to reduce data scraping. At least attempt. Do you have...

CaptainFever · 2024-02-16T03:36:12 1708054572

Ignore it. What's the reason to restrict it? Scraping is important to interoperability.

nonrandomstring · 2024-02-16T04:46:52 1708058812

> data scraping.

What we old people call "reading"? But if that's what you crazy kids call it these days.

> Do you have a better idea because we’d love to hear it

Oh well, now you mention it, it'd be rather nice if Elon Musk went and stuck his head up the back of a cow.

Anyway gotta go scrape some more HN posts...

TiredOfLife · 2024-02-16T08:50:29 1708073429

Have cows not suffered enough?

rurp · 2024-02-16T03:18:34 1708053514

I mean, yes? There are many techniques for blocking or throttling high-volume scraping. You don't even need to understand the techniques; plenty of companies sell this as a service.
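To make "throttling" concrete, here is a minimal sketch of one common technique, a per-client token bucket. The class name, the `rate` and `capacity` parameters, and keying on a `client_id` string are all assumptions for illustration, not anything a particular site or vendor is known to use.

```python
import time


class TokenBucket:
    """Per-client token bucket (illustrative sketch, hypothetical names).

    Each client may burst up to `capacity` requests, and tokens refill
    at `rate` tokens per second.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.buckets = {}           # client_id -> (tokens, last_seen_time)

    def allow(self, client_id):
        now = self.clock()
        # A client seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(client_id, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[client_id] = (tokens - 1, now)
            return True
        self.buckets[client_id] = (tokens, now)
        return False
```

In practice the `client_id` would be an IP address, an API key, or an account, and a rejected request would typically get an HTTP 429 response. A scraper spread over many IPs gets one bucket per IP, so on its own this only raises the cost of scraping.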

That's beside the point though. There's no actual need to force logins; it's just something Elon wants. Given what a dumpster fire Twitter has turned into, the rational move for most people is probably to just forget about the site at this point.

somenameforme · 2024-02-16T04:43:18 1708058598

While this is tangential, you've stoked my curiosity. What could prevent scraping? In this case, they're combating scraping from other major businesses, so another company doing something like setting up thousands of distinct IPs to scrape from at rates specifically intended to mimic organic usage would not be difficult.

Only thing I can think of is starting to get into gaming-site type territory where you end up trying to do things like analyze mouse position, click patterns (every 30 seconds at exactly the 0,0 pixel or whatever), and so on. But this sort of stuff is a cat and mouse game, where I think the cat is generally going to be at a pretty big disadvantage.
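As a rough sketch of the kind of click-pattern check described above: flag a session whose clicks arrive at near-constant intervals and always land on the same pixel. The function name, the `(timestamp, x, y)` input shape, and every threshold here are made up for illustration.

```python
from statistics import pstdev


def looks_scripted(clicks, min_clicks=5, interval_jitter=0.05,
                   max_distinct_points=1):
    """Heuristic sketch: True if clicks look machine-generated.

    clicks: list of (timestamp_seconds, x, y) tuples.
    All thresholds are hypothetical defaults, not tuned values.
    """
    if len(clicks) < min_clicks:
        return False  # too little data to judge

    times = sorted(t for t, _, _ in clicks)
    gaps = [b - a for a, b in zip(times, times[1:])]
    # Near-zero spread in the gaps means a fixed timer, e.g. every 30 s.
    regular = pstdev(gaps) <= interval_jitter

    # Humans rarely hit the exact same pixel every time.
    points = {(x, y) for _, x, y in clicks}
    return regular and len(points) <= max_distinct_points
```

A scraper defeats this by adding random jitter to both timing and position, which is the cat-and-mouse problem: each check like this is cheap to evade once it is known.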

aftbit · 2024-02-19T00:04:58 1708301098

IMO literally nothing, as long as the analog hole exists. The best they can do is make it more expensive. Requiring an account is one way to do that. Another option is to take the RIAA route and sue everyone involved. Of course they will likely fail, but they can weaponize the legal system against less well-resourced companies.

A better idea would be to sell access to the firehose and API at a reasonable cost such that it makes more sense to pay them for this rather than set up a whole scraping farm, aka the Netflix model. Unlike Netflix, they won't be crippled by 3rd parties unlicensing content.