Twitter rewrites developer policy to better support research and ‘good’ bots (techcrunch.com)
173 points by ajaviaad on March 10, 2020 | 100 comments



While I hope this is a step in the right direction, I'm not too wowed by this new policy. The term "good bot" is incredibly vague, and I wasn't able to find a much better definition within the policy change itself. The limitations they want to set on bots aren't too limiting either: basically, if you pinky swear to behave, you're a good bot. They mention things like no bulk following or spam-posting content, but surely these things could be limited in some way by their API, with whitelisted exemptions for the academic groups they want to target?

I also find fault with this statement from the article:

> Going forward, developers must specify if they’re operating a bot account, what the account is, and who is behind it. This way, explains Twitter, “it’s easier for everyone on Twitter to know what’s a bot – and what’s not.”

It may be known to Twitter who is and isn't a bot, but unless something has changed, there is no public facing way to know if an account is a bot. I have never understood this. Sites like Mastodon and Discord clearly label bot accounts, but Twitter has never done so. They are fine labeling accounts arbitrarily as 'verified', but not clearly identifying bots. This would be a great step forward if they want to redefine bot behavior on the site.


I personally object to these sorts of editorial labels. Bots are operated by humans; these are human accounts. They’re just alts. They should not be treated differently. There’s no such thing as a “bot post”: just human beings using different types of software to author posts.

Literally every single “bot post” was posted by a real person. That person’s posts should not be segregated as a result of their client software.


As a person who runs a bot (@sfships) as well as a real Twitter account (@williampietri), I strongly disagree. It's like saying that there's no difference between talking to the store manager and standing next to an in-store TV running a commercial on a loop.

Twitter is mainly about human-to-human connection, and that should be the presumption for any account one comes across. Other uses should be obvious, or at least declared.


I think instead of disambiguating between good and bad bots, they should just allow 'bot accounts': @NAME - bot. If bots want to do their thing, that is fine, and there's certainly a place for bot accounts; it's just when they mascaraed as people.

I'm not sure how you separate them out entirely, I would assume that twitter knows though.


(slightly offtopic, hope you don't mind)

I believe the term is: masquerade.

Though "mascaraed" is technically correct, it just denotes wearing mascara, not "masking" oneself as someone else.


I'm fairly sure "mascara" isn't valid as an intransitive verb (transitive would be applying mascara to something), so "when they mascaraed as people" would still be incorrect. Wearing mascara would be "when they were mascaraed as people". (Although it's probably a typo rather than a word choice error, so <shrug>.)


> I strongly disagree. It's like saying that there's no difference between talking to the store manager and standing next to an in-store TV running a commercial on a loop.

No. The commercial in a loop is clearly recognized as such. Twitter is more like the difference between talking to the store manager on the phone and talking to a voice script on the phone which claims to be the store manager.

Good luck getting your problem resolved with the voice script. Some voice scripts are good enough that some people can't tell the difference. And that's arguably fine! If you can't tell the difference, then is there a problem?

But enter in the poor soul who can't get that voice script to go off script to solve that customer's problem. Or, nefariously, the voice script that can cause problems en masse because there isn't a human in the loop, or even if there is then the human simply doesn't understand that they're causing problems.


Bots warp a Twitter user's ability to estimate how many human actors are involved in a given interaction. Twitter would like to cut down on astroturfing. One way to do that is to make it significantly harder to automate that process.

These rules make it easier for bot authors who aren't interested in astroturfing to declare themselves as such.


> Bots warp a Twitter user's ability to estimate how many human actors are involved in a given interaction

I think that's a feature, not a bug. Alts and pseudonymous speech are some of the most important speech.


I will go further. If the thought was original and came from the bot, we should not treat its message as second class.


Discord is more of a piece of malware than a website.


Ok, but could you please stop posting unsubstantive comments to HN? You've done it repeatedly already, and we're trying for a bit better than that here. https://news.ycombinator.com/newsguidelines.html


Lots of those that build on the Twitter platform get burned at some point. They have shown not themselves to be a reliable partner. Any company doing social media management at some point gets hurt by changing APIs and increasing fees, sometimes causing fatal damage to their business.


>They have shown themselves to be a reliable partner

I hope you mean the opposite? Twitter has long been an unreliable partner in the developer ecosystem.


Not sure if edited, but your quote is slightly off. Should be

> They have shown not themselves to be a reliable partner

The "not" there feels like it's in an ambiguous spot, but at least gives a faint connotation of the author's intent, which reads as you'd like it to.


Even GP flailed it and put the 'not' in the wrong place. Should be "They have shown themselves not to be a reliable partner."


Also acceptable would be "they have shown themselves to be a reliable partner... not"


That could be interpreted as "they have not shown themselves to be a reliable partner" (ie we don't know) instead of the correct "they have shown themselves to not be a reliable partner" (ie we know they aren't), though.


yeah it should be not (themselves to be (a reliable partner))


Perhaps they mean Twitter is reliably unreliable?


Now that scraping has been deemed legal by the highest courts, how can anybody still get burned by API fees?


Scraping, while legal now, is still not free. A scrape does not give you the delta between two points in time without a completely new scrape, and you cannot economically or sustainably scrape the entirety of a large site like Twitter.

If I’m wrong I’d love to learn from someone smarter than me.

Even google doesn’t scrape the entire web every day.


You're not wrong. The Library of Congress started archiving every public tweet in 2010 but abandoned the effort in 2017, mainly because of the sheer volume.

https://www.npr.org/sections/thetwo-way/2017/12/26/573609499...

I think it might be feasible to scrape a subset: a single account or a few accounts at most. But even then, you're still fighting problems ranging from wonky HTML generated through JavaScript to sudden, unannounced changes to the DOM structure. It's not a managed API, after all.


"Legal" and "won't be interfered with" are not the same thing. That it's legal to scrape doesn't mean they can't put captchas, IP blocks, etc. in your way.

Scraping doesn't really help when publishing tweets, either.


I thought the point of the LinkedIn lawsuit was that they were trying to stop them from scraping?


LinkedIn was trying to do so via legal measures (specifically, alleging violations of the Computer Fraud and Abuse Act in a C&D letter).


I had a bot once, which would tweet a single word from the 300k most used words in German every 15 minutes (there is a corpus for that). The bot even had "Bot" in its name.

It got several strikes from Twitter because of "hate speech". Yes, the words were unfiltered - of course. That's language. The longest strike lasted for 5 days. It goes without saying that I disputed every single strike, to no avail.

I finally took the bot down.
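
For the curious, the mechanics of such a bot are tiny. Here's a minimal sketch of roughly how it could be wired up, assuming tweepy against the v1.1 API and a newline-separated word list; the file name, credentials, and posting order are illustrative placeholders, not what I actually ran:

    import time
    import tweepy

    # Placeholder credentials -- substitute your own app and access tokens.
    auth = tweepy.OAuthHandler("API_KEY", "API_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth)

    # Hypothetical corpus file: one word per line, most frequent first.
    with open("top_300k_german_words.txt", encoding="utf-8") as f:
        words = [line.strip() for line in f if line.strip()]

    for word in words:
        api.update_status(word)   # one word per tweet, unfiltered
        time.sleep(15 * 60)       # every 15 minutes

At four tweets an hour, a 300k-word corpus takes roughly eight and a half years to exhaust.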


I mean, in German you can cram a pretty large amount of hateful information into a single word, since many words are a combination of words. I'd be curious to see what specifically was flagged.


Well, mostly related to 1933-1945 references, if I recall correctly. Basically the "N-word" category, but related to Jewish people. I think one tweet also said "Judenvernichtung" ("extermination of the Jews").

So, yeah, in hindsight, I should have modified the corpus. But at that time, around 2017, I was pretty angry at how Twitter handled these strikes.

P.S.: The account is still online and has 775 followers. Maybe I should revive it in a more "harmless" way. It was fun.


If your intent was to show the most commonly used German words, then modifying the corpus for political or rule-breaking reasons kind of takes away its value, doesn't it? Twitter just isn't a place that tolerates all cultures the way an impartial academic would.


Literally the definition of censorship.


Yes, censorship was a major part of fighting the Nazis in the largest war ever fought on Earth.


Yes, indeed, that was the intention. Also having a long-running kind of "art installation". The end was projected for 2024.

I thought about rebuilding it as a website. It could also work if I'd use the "fediverse".

Mhm, gets me thinking...


> Twitter says since it introduced a new developer review process in July 2018, it has reviewed over a million developer applications and approved 75%.

There is still no re-review process, however. I have an original apps.twitter.com account and used the new development experience form within the first few weeks.

I essentially requested API access for personal projects and was denied with no substantive explanation.

There is a form to submit a platform request hidden away in the support area (literally, it's not even on the page that lists all of the contact forms) but having used it about five times over a number of years has led to no response.

I would like to be a developer, contributing to the ecosystem I've been a member of for just over 10 years, but Twitter seems to make this exceedingly hard, with no recourse available :(


"Twitter rewrites developer policy" is a frequent enough occurrence that I'd be highly reluctant to build anything material on top of the ever-shifting sands that comprise their APIs.


> I'd be reluctant to use their API's.

I'm not sure what you mean - what choice do you have? You're either building a twitter bot, or you aren't and it doesn't matter?

Also this isn't much (if any?) of an API change, it's a policy change. And even then it doesn't look like it should negatively affect you unless you were doing something nefarious already.


The choice is of what to do with one's time. If Twitter had been a reliable partner to developers, more people would make things for it.


Fair enough, that's a good point.

Now that you mention it, a bot-oriented twitter clone might be interesting...


> I'm not sure what you mean - what choice do you have?

I can choose not to interact with their platform at all. I can also find other ways of interacting with them (scraping or headless browser automation).

> Also this isn't much (if any?) of an API change, it's a policy change. And even then it doesn't look like it should negatively affect you unless you were doing something nefarious already.

If you are investing time and effort into an endeavour (whatever it is) it helps to know whether ground is likely to disappear from under you.

Microsoft, while many people don't like them, have kept many ancient APIs working: e.g. we have code from 2005 that still works with the latest .NET Runtime with rather minimal changes. This gives you confidence that going forward you won't face a lot of churn.

Policies, while they aren't APIs, are important in that they should be clear, applied fairly, and not change on a whim. A lot of people are sceptical that Twitter will do any of those, based on previous behaviour.


I've been supporting their API for several years with a 3rd-party library. It's always been mixed feelings: remembering the early days when the ecosystem was exciting and growing, and then the intermediate years when it felt like developers and businesses that built on their platform were getting screwed. Overall, it would be great if they could figure out how to clean up the bad actors yet build a thriving platform for innovation - a balance they seem tortured to achieve. While a lot of developers have come and gone, I'm still hoping for better.


Years ago I wrote a bot to correct people's poor grammar (specifically "I could care less"), stayed within the API limits, and clearly labeled the account as a bot. Banned by Twitter within a week. Their policies have always been arbitrary and selective, and this kind of vague language ensures that will continue.


I’m glad it got banned, frankly. Those sorts of bots on Reddit are extremely spammy and annoying.

Being a “grammar nazi” used to be a thing on Reddit and I’m glad it died out.


For what it's worth, I always appreciate having my grammar corrected, if it's wrong. I've actually never understood the opposite view: having someone tell you when you're wrong is pretty much the best way to get better at it.

(No, I don't go around correcting strangers' grammar, but only because I know that most people are a lot more insecure about learning that they're wrong about something.)


Some would argue that as long as you are able to communicate effectively, your grammar is not wrong.

Another critique might simply be that while it is certainly useful to understand rules of grammar, it isn't worth derailing a conversation over.


> Some would argue that as long as you are able to communicate effectively, your grammar is not wrong.

Sure, but this is sort of begging the question. Bad grammar can subtly affect the effectiveness of communication in ways that sometimes aren't even explicitly clear to the participants. Grammatical rules don't exist (solely) because someone decided to be stuck up for no reason.

> Another critique might simply be that while it is certainly useful to understand rules of grammar, it isn't worth derailing a conversation over.

Sure, this roughly describes why I don't correct strangers in practice. But note that the "derailment" is just another side effect of the insecurity I describe, not an inherent quality of correcting grammar: when someone corrects my grammar, spelling, or diction, my reaction is "Thanks! <rest of comment back on topic>". I don't see how this is derailing at all.


> Bad grammar can subtly affect the effectiveness of communication

The type of grammar that gets corrected is usually not bad to the point of causing misunderstandings. Simple things like "could care less", "would of", etc. Every once in a while there's a mistake that's so bad that it makes things ambiguous, but that's not common. And it's not what these annoying bots are replying to in any case.

> I don't see how this is derailing at all

These bots can send threads off into linguistics discussions and the like that aren't related to the linked article or self text at all. And in reddit's case, there are "meta" bots that respond if people reply "bad bot" or "good bot" to a bot's response.

> Grammatical rules don't exist (solely) because someone decided to be stuck up for no reason

Grammatical guides existing does not mean advice is always welcome. I say guides because a lot of supposed rules are prescriptive opinions, and there's no arbitrator that can deem a rule to be "correct". English does not have a language council, unlike French or Korean.


"Derailing" maybe implies more intensity of effect than I intended.

I meant simply that changing the topic is a mild "derailment", so to speak, in that it redirects the attention of the readers away from the topic at hand.


> Those sorts of bots on Reddit are extremely spammy and annoying.

You could block them.

> Being a “grammar nazi” used to be a thing on Reddit and I’m glad it died out.

It's a shame it died out. Just like lots of things on reddit. The good/fun/interesting things are gone, only stale propaganda and silly nonsense is left.

It's funny how some people think that something shouldn't exist because they personally don't like it. The entitlement and privilege.


Reddit is a proper noun and should be capitalized.

> The good/fun/interesting things are gone, only stale propaganda and silly nonsense is left.

Did you mean, “are left”?


> Reddit is a proper noun and should be capitalized.

True. But I'm a rebel.

> Did you mean, “are left”?

Yes. Thank you for pointing that out. We need more of you helpful grammar nazis in this world.


> True. But I'm a rebel.

As are the people you’re trying to correct.


True. It's why I was defending and supporting grammar nazis.


My point is there’s no reason to defend them, people have a right to use grammar in any way that lets the communicate effectively.


> The good/fun/interesting things are gone

Being a grammar nazi isn't good, fun, or interesting. It derails conversations, encourages know-it-alls, and the corrections are often not correct. /r/badlinguistics exists for a reason.

Giving native speakers unsolicited advice is annoying, even if one is correct.

> only stale propaganda and silly nonsense is left.

There are a lot of niche subreddits with great content. Filtering by keywords and a few power users makes for a better experience.

> You could block them.

I hate this argument. The fact that reddit has an API does not mean we should tolerate junk replies from bots. Blocking a bot only blocks it for me; that bot is still polluting discussions.


It violated no rules, and was all in good fun. All it did was reply with "Are you sure you couldn't care less?"


This is actually very explicitly against the rules:

> The reply and mention functions are intended to make communication between Twitter users easier. Automating these actions to reach many users on an unsolicited basis is an abuse of the feature, and is not permitted. For example, sending automated replies to Tweets based on keyword searches alone is not permitted.

https://help.twitter.com/en/rules-and-policies/twitter-autom...


I'm glad it got banned too - bot replies are the worst.

And since it's nit-picking, it's worth noting that the "could/couldn't care less" thing isn't at all as clear-cut as you are making out.

In the early 1990s, the well-known Harvard professor and language writer Steven Pinker argued that the way most people say "could care less" - the way they emphasize the words - implies they are being ironic or sarcastic, along the lines of Yiddish phrases like "I should be so lucky!", which typically means the speaker doesn't really expect to be so lucky. Michael Quinion of the wonderful World Wide Words website makes the same argument.

and

Both Merriam-Webster and dictionary.com have weighed in and say “could care less” and “couldn’t care less” mean the same thing. Their reasoning is that both phrases are informal, English is often illogical, and people use the two phrases in the same way. “Could care less” has come to mean the same thing as “couldn’t care less.”

https://www.quickanddirtytips.com/education/grammar/could-ca...



Replies are considered a spammy way of getting attention.

This wasn't Twitter arbitrarily banning you, but people repeatedly reporting your tweets. The option is right there, and aptly called "Uses the reply function to spam".


It probably did that even if you wrote a comment saying something like:

"I could not care less" is a malapropism we should strive to avoid.


A reason why it is bad: I tweet only a little, and when I do, I'm interested in the 2 or 3 replies I get. Those I can handle with notifications turned on. A seemingly funny bot then innocently interrupts that. If that happened to me, I'd report it to keep my peace.


That is pretty spammy and, frankly, obnoxious. Not only is language constantly evolving, chances are that people didn't actually ask to be corrected (on commonly used and understood language), much less by a bot.


> chances are that people didn't actually ask to be corrected

If you removed every reply that the parent tweet didn't ask for, there would be nothing left.


There’s a difference between not asking for a reply, and not asking for corrections for your grammar.

Correcting native speakers when they didn’t ask is obnoxious. Particularly from an automated bot.


I freely admit I did it mostly for the lulz, but the whole point of twitter is that people are shouting into the ether. Hardly anyone "asks" for responses from other accounts. My account has as much right to post and reply as theirs did.


and your account has as much right to be banned as any other. It's clear that twitter does not want bots like that on their platform.


Haha, right? I remember writing my first bot on reddit; it was named after Obama, and it made rate-limited "You're welcome!" comments to all of those "Thanks, Obama" eyerolls during his first term. Someone else quickly made a bot to thank it whenever it commented, and they got along very well. I thought it was hilarious at the time.

But it only lasted about a day. Come on, who makes these rules? :P

Good times, when you come to remember them...and now that I do, that was also the first time I spun up an AWS instance. These sorts of fun and games are good for learning, but I do understand why platforms dislike them, especially once they go mainstream.


The issue of correctness in this case is not so simple: http://itre.cis.upenn.edu/~myl/languagelog/archives/001201.h...


Yeah, that sounds horribly obnoxious. Glad it was banned.


Meanwhile, as a corollary of their caprice, my @fiveobot is still going strong after tweeting hourly for going on three years.


That's a semantic error, not a grammar error!


I stopped being annoyed by it when I decided it was short for "I could care less, I guess, but only if I tried realllly hard."


Oh hi there!


o/


I wrote a similar one to correct people's misspellings of the word 'definite'. It got banned quickly, but it's clear why. Unsolicited @ mentions are against the API terms. Was fun while it lasted but definitely deserved the ban.


The issue I see is that it limits the usage to those associated with Academic Institutions. What if you're a data scientist, or an interested person, not working in Academia? Any research you'd want to do and publish isn't good enough, regardless of the research itself? Seems unfortunate.


That's life, my man. If you think you're going to get the same access as Raj Chetty, you're in for a rude surprise. The society of academia is built on trust, and since people abhor providing information, academics who have other means of demonstrating trustworthiness will be the ones with access.

This is true of everything from IRS data to a startup's data. I worked at a company which shared data extensively with researchers. But we weren't idiots. We gave it to people we trusted. Not every clown claiming to be an independent researcher. That's just sensible.


> The issue I see is that it limits the usage to those associated with Academic Institutions. What if you're a data scientist, or an interested person, not working in Academia? Any research you'd want to do and publish isn't good enough, unrelated to the research itself? Seems unfortunate.

Today, data is gold. Companies may give it to academic institutions for free, but that's mostly because they hope the academics will publish their work and cite the original source data, which is effectively free marketing.


From their perspective, it's a way of offloading the otherwise large amount of work needed to evaluate a given researcher. Universities have reputations to protect, and they have things like deans and provosts and IRBs. Whereas any random interested person could be the next Cambridge Analytica. So, like so much regulation, it's unfortunate but made necessary by jerks.


I run a couple of 'good' bots:

https://twitter.com/thehugfairy - Sends Twitter hugs. You sign up at hugfairy.com to tell it who to hug. This one gets a lot of use. The website has about 1,500 DAU.

https://twitter.com/smithsonianbot - A more recent one, which sends random photos from Smithsonian's Open Access collection.

Overall I'm glad to see Twitter clarifying their rules. I've had API keys revoked without warning, my accounts deactivated, etc. Twitter dev support is always nice and I've gotten the bots running again, but it would be great to know in advance how I can avoid getting shut down.


Has anyone ever been refused a developer account? I wanted to write some scraper and filled out the form, but they rejected it, told me in the email there was no appeal, and I guess that's it?


Yes. I got rejected because their email asking for more info went to spam, and that rejection was final, with no appeal permitted.

I could understand a final rejection after several rounds, but on the very first one? Because of a non-response? Oof.

Luckily, this was just on my personal account, not for work.


Actually, yes, I recall now this is exactly what happened to me! I applied and forgot about it, assuming it didn't matter to them when I responded. What a bummer.


> Has anyone ever been refused a developer account?

https://twitter.com/search?q=twitter%20denied%20access%20API...


As someone who has made GPT-2-based Twitter bots, I think the reasonable compromise to the arbitrary app approval process is to allow limited bot access (tweeting only, no replies) through the API without requiring app approval. That would cover the majority of genuine fun bot use cases.
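
To make that concrete, a tweet-only tier would cover something as simple as the following. This is a rough sketch, assuming the Hugging Face transformers pipeline for generation and tweepy for posting; the prompt, model choice, and credentials are illustrative, not a description of how Twitter's approval process actually works:

    import tweepy
    from transformers import pipeline

    # Placeholder credentials -- would come from an approved (or, ideally,
    # an approval-free tweet-only) app.
    auth = tweepy.OAuthHandler("API_KEY", "API_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth)

    # Generate a short GPT-2 completion and trim it to tweet length.
    generator = pipeline("text-generation", model="gpt2")
    text = generator("Today I learned that",
                     max_length=60,
                     num_return_sequences=1)[0]["generated_text"]

    api.update_status(text[:280])  # post only: no replies, no mentions, no follows

Everything spammy that the policy worries about (bulk following, unsolicited replies) simply isn't reachable from a surface like that.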


As long as they stop leaving apps dead in the water whenever they decide to maybe develop something remotely related, as has happened several times in the past, then I guess this is good? My previous experience with them isn't super encouraging, unfortunately.


This could very well be a "they let Jack remain CEO" dangle, so I'm going to be waiting for another shoe to drop. People will jump on it right away and one of those people will be the first to get burned by whatever develops (and I'm pretty sure there will be burning in the future), and we'll hear about it then. That's the point at which I'll decide to participate or not.


Never trust the Twitter API or waste time developing on it. Many burnt bridges.


Hard to believe they won't artificially stifle or prohibit successful use cases, like every other time people built on Twitter. There used to be a whole ecosystem of successful alternate clients, link shorteners, image/video hosting, tweet analytics and other software and it all got deliberately destroyed as Twitter wanted to appropriate the engagement those developers had.

https://www.accuracast.com/news/social-media-7471/twitter-ba...

> In effect ad networks like ‘Sponsored Tweets’ and ‘Ad.ly’ will be discontinued.

https://thenextweb.com/twitter/2012/08/17/twitter-4/

> These changes effectively kill off the growth of the third-party client ecosystem as we know it.


We need real human customer support, not some automated, algorithm-based system.


I don't like the term "good bot", and I really don't like the term "'good' bot", as if bots are always evil. We should just think of them as "useful bots", "useless bots", or "malicious bots".


Seems like we're going full circle with Twitter now.


I know I will be downvoted, but I don't care: why are people using Twitter? To me it's the most vile and disgusting platform: no signal, just noise. On top of that, the people that run it are effectively coming up with arbitrary progressive left-wing policies on restricting free speech: for proof, watch this: https://youtu.be/DZCBRHOg3PQ After seeing it, I realised Twitter/Facebook/Google/Apple are evil corporations taking away your freedom while you cheer it on. Why are you using Twitter's platform? For business? Do you want to make business from Twitter's users, who are mind-numbed, angry, twitching zombies? Are there no business opportunities elsewhere? Anyone who is doing any sort of business related to social media is wasting their time: it's a broken system that deserves to be boycotted. To anyone who would call me angry: I am saying this so you open your mind about the pointlessness of social media. I don't use it, and I seem to live a nice life; the only time I get angry is when I think about social media, like in this post. Oh well, this is HN after all, born and bred in Silicon Valley, where such blasphemy will not be tolerated. But let's see.


Had to look up how many engineers they have, around 4000. That's a lot.


Bots just need to be banned or API-limited... Twitter needs to be re-humanized.


They're just picking winners that align with their own ideology at this point. The dangers of authoritarianism are latent in any power structure - whether government, corporate, or other.

This is bad for society.


>They're just picking winners that align with their own ideology at this point

Which is why Twitter and Facebook and LinkedIn and all of them are actually publishers, akin to the Washington Times or Fox News, and don't (currently) deserve §230 protection.


I mean at a certain point, does a difference in volume not become a difference in kind? Washington Times and Fox News are effectively "whitelisting" the content that goes onto their sites, whereas the stance that Twitter/Facebook/LinkedIn are taking here is a blacklist. Those are different things.


>does a difference in volume not become a difference in kind?

Why should it? They built the firehose, they can build the valves. It's not reasonable to argue "whoopsy, too hard to do now!" when moderation has been around for DECADES, which means they consciously decided not to implement it anywhere along the line. Even if Zuckerberg was already in over his head by that point, Thiel knew about this stuff. So did Parker. FB pays smart people who know about this stuff, too; they're just presumably prevented from implementing it.

This really is a "steering wheel that doesn't fly off while you're driving" situation, but FB refuses to secure steering wheels until they can figure out a way to do it without humans.

It's their problem if blacklists don't work. Why can't (won't) they use whitelists if that's the sure-fire solution? They can, they just don't want to. Look how many teeth had to be pulled just to get attribution on ads that were proven to be problematic.


Are they different from the consumer's perspective? It seems like adding a small amount of information to a blank page and removing a large amount of information from a full page can have the same result. In programming terms:

( 0xFF & 0x01 ) == ( 0x00 | 0x01 )

People still see the same thing, regardless of how the decision to show it is reached.



