Is there some obvious reason not to measure requests per minute rather than second? Or is it an offhand joke?
Some systems I've worked on had APIs that averaged less than one per second, but I don't think we want to be measuring in millibecquerels. Some have measured on millions of requests per hour, because the hourly usage was a key quantity, as rate limits were hourly as well.
>Is there some obvious reason not to measure requests per minute rather than second?
It's much less obtuse to say something like "average req/min" or whatever, but then again you can't write a cool blog post about misusing an SI unit for radioactivity and shoving it into a nonsensical context.
In my experience, rate limits are more often per second. It's easy to talk about kilo or mega-units, so this isn't as big an issue as the awkwardness of talking about very very low volume services. Maybe those (generally) inherently don't care about rates as much?
In my perception there is a difference between 1 req/s as a rate limit and 60/min. The difference has to do with bucketing. If we agree that the rate limit is 1/s, I expect to be able to send exactly that, and sometimes 2 within the same second. However, if we agree on 60/min, then it should be fine to spend all 60 in the first second of a minute, or averaged out, or some other distribution.
This also helps with the question I always get when discussing rate limits: “but what about bursts?”. 60/min already conveys that you are okay receiving bursts of 60 at once, in contrast with 1/s.
In my experience it is exactly the low-rate services that care about rate limits, as they are the most likely to break under higher load. Services that already handle 100k req/s typically don’t sweat a couple extra once in a while.
An effective rate limiting system has multiple bases in my experience, depending on what the goal is. But I usually implement the configuration as a list where you can define how many requests are allowed at maximum per how many units of time.
E.g. to prevent fast bursts you limit it to 1 request per 1 second, but to avoid someone sending out 86400 requests a day you also cap them at 100 per 3600 seconds (1 hour) and 1000 per 86400 seconds (24 hours).
Whichever limit they hit first will stop it. That isn't hard to implement if you know how to deal with arrays, and it prevents long-term abuse while still allowing fast retries if something went wrong.
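A rough sketch of that whichever-limit-hits-first idea (names are illustrative; a production limiter would keep its timestamps in shared storage like Redis rather than in memory):

```python
import time

class MultiWindowLimiter:
    """Allow a request only if it passes every (max_requests, window_seconds) rule."""

    def __init__(self, rules):
        # rules: list of (max_requests, window_seconds) tuples
        self.rules = rules
        self.timestamps = []  # times of accepted requests

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop timestamps older than the longest window to bound memory.
        longest = max(w for _, w in self.rules)
        self.timestamps = [t for t in self.timestamps if now - t < longest]
        for max_req, window in self.rules:
            recent = sum(1 for t in self.timestamps if now - t < window)
            if recent >= max_req:
                return False  # this rule would be exceeded
        self.timestamps.append(now)
        return True

# The example config from the comment above:
limiter = MultiWindowLimiter([(1, 1), (100, 3600), (1000, 86400)])
```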
I guess there's a difference between talking about how many requests a system is capable of handling, and how many they actually get.
At least when I encountered the discussion initially (some thirty years ago), I'd say we usually talked about how many requests the system was capable of handling. Then requests per second was the obvious unit, since a request usually took less than a second to process (obviously depending on the system and so on - but mostly), so using that unit often gave a fairly low, comprehensible number.
Was it ten? A hundred (very impressive)? Perhaps even a thousand (very, very impressive!)?
Multiply those numbers by 60, and there's suddenly a lot more mental gymnastics involved. By 3600 and you're well into "all big numbers look the same" land.
Right - it feels like going skin deep on types and then complaining they didn't solve for very deep problems.
Like yes, it would be nice for Map(ICar[] cars, keys).wingspan to throw a type error because cars is typed and we know keys can't include things not in ICar.
But to say that Map(Any[] things, keys) should have ahead of time type checking seems like you're not really using types except when inconvenient. Which might be taken as a no true scotsman or "holding it wrong" argument but... Maybe they are holding it wrong.
(Speaking as a former Windows/CLR PM now working in a Ruby monolith... It's hell and indeed trying to add types via sorbet has been miserable and useless)
To torture the metaphor further - it's also a personal dj, with an audience and customer of 1. Somewhat by definition there can be no outlandish requests, certainly not "play this entire piece".
If I told the DJ at my wedding to play an album front to back, and they transitioned to Aerosmith, I'd be tapping a friend to run the music the rest of the night.
It can't be quite that simple because you have a couple additional problems to solve - (effectively restating bits of the article poorly and partially)
1. You don't want these to be replayable (give your JWT to someone else to use) so they need to be bounded in some ways (eg intended website, time, proof it came from you and not someone else).
2. You don't want the government to know which website you're going to, nor allow the government and the website to collaborate to deanonymize you (or have the government force a website to turn over the list of tokens they got). So the government can't just hand you a uuid that the website could hand back to them to deanonymize.
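The binding in point 1 amounts to checking claims before accepting a token. A minimal Python sketch of just that check (field names follow the standard JWT registered claims "aud" and "exp", but this is only an illustration of the binding idea, not a full verifier - no signature check, no holder-of-key proof):

```python
import time

def verify_bound_token(claims, expected_audience, now=None):
    """Reject a token payload that is expired or intended for a different site."""
    now = time.time() if now is None else now
    if claims.get("aud") != expected_audience:
        return False  # replayed on a site it was not issued for
    if claims.get("exp", 0) <= now:
        return False  # expired
    return True
```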
The SD JWT and related specs solve for these, which is how mDL and other digital IDs can preserve privacy in this situation.
> You don't want these to be replayable (give your JWT to someone else to use) so they need to be bounded in some ways (eg intended website, time, proof it came from you and not someone else).
But these are the things that make it non-anonymous, because then instead of one token that says "is over 18" that you get once and keep forever, everyone constantly has to request zillions of tokens. Which opens up a timing attack, because then the issuer and site can collude to see that every time notbob69 signs into the website, Bob Smith requested a token, and over really quite a small number of logins to the site, that correlation becomes uniquely identifying.
Meanwhile we don't need to solve it this way, because the much better solution is to have the site provide a header that says "this content is only for adults" than to have the user provide the site with anything, and then let the user's device do what it will with that information, i.e. not show the content if the user is a minor.
The cryptography provides nothing to establish that this separation is actually being maintained and there is plenty of evidence (e.g. Snowden) of governments doing exactly the opposite while publicly claiming the contrary.
On top of that, it's a timing attack, so all you need is the logs from both of them. Government gets breached and the logs published, all the sites learn who you are. Government becomes corrupt/authoritarian, seizes logs from sites openly or in secret (and can use the ones from e.g. Cloudflare without the site itself even knowing about it), retroactively identifies people.
I'd review the setup here. You're missing the critical distinction that the cryptography supports - separating entirely (in time and space) the issuance of the cred to the user and the use of that cred with a website.
Unless you're getting the device logs from the user's device (in which case... all of this is moot) there is no timing attack. Six months ago you got your mobile driver's license. And then today you used it to validate your age to a website anonymously. What's the timing attack there?
If the driver's license can generate new anonymous tokens itself then anyone can hook up a driver's license to a computer and set up a service to sign for everybody. If it can't, whenever you want to prove your age to a service you need to get a new token from a third party, and then there is a timing correlation because you're asking for the token right before you use the service.
The article proposes a hypothetical solution where you get some finite number of tokens at once, but then the obvious problem is, what happens when you run out? First, it brings back the timing correlation when you ask for more just before you use one, and the number of times you have to correlate in order to be unique is so small it could still be a problem. Second, there are legitimate reasons to use an arbitrarily large number of tokens (e.g. building a search index of the web, content filters that want to scan the contents of links), but "finite number of tokens" was the thing preventing someone from setting up the service to provide tokens to anyone.
Blocking said search indexes is probably a good thing.
I'm thinking perhaps a system where you feed it a credential, a small program runs and maintains a pool of tokens that has some reasonably finite lifespan. The server that issues the tokens restricts the number of uses of the credential. Timing attacks are impossible because your token requests are normally not associated with your uses of the tokens.
And when you use a token the site gives back a session key, further access just replays the session key (so long as it's HTTPS the key is encrypted, hard to do a replay attack) up to whatever time and rate limits the website permits.
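The pool idea above can be sketched in a few lines. This is hypothetical (token issuance is stubbed out with random bytes; a real system would call the restricted issuer), but it shows the key property: refills happen ahead of need, so fetch times don't correlate with use times:

```python
import secrets
import time

class TokenPool:
    """Maintain a pool of short-lived anonymous tokens, refilled in advance."""

    def __init__(self, target_size, lifespan_s):
        self.target_size = target_size
        self.lifespan_s = lifespan_s
        self.pool = []  # (token, expires_at) pairs

    def refill(self, now=None):
        now = time.time() if now is None else now
        self.pool = [(t, exp) for t, exp in self.pool if exp > now]
        while len(self.pool) < self.target_size:
            # Stub: a real implementation would request this from the issuer,
            # which enforces the per-credential usage limit.
            self.pool.append((secrets.token_hex(16), now + self.lifespan_s))

    def take(self, now=None):
        now = time.time() if now is None else now
        self.pool = [(t, exp) for t, exp in self.pool if exp > now]
        return self.pool.pop()[0] if self.pool else None
```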
> Blocking said search indexes is probably a good thing.
I feel like "we should ban all search engines" is going to be pretty unpopular.
> And when you use a token the site gives back a session key
And then you have a session key, until you don't, because you signed out of that account to sign into another one, or signed into it on a different browser or device etc.
> The server that issues the tokens restricts the number of uses of the credential.
Suppose I have a device on my home or corporate network that scans email links. It's only trying to filter malware and scams, but if a link goes to an adult content barrier then it needs tokens so it can scan the contents of the link to make sure there isn't malware behind the adult content barrier.
If I only have a finite number of tokens then the malware spammer can just send messages with more links than I have tokens until I run out, then start sending links to malware that bypass the scanner because it's out of tokens.
Search engines should not be using website search capabilities. That's putting an undue load on the systems. A board I'm involved with recently had to block search for guests because we were getting bombarded with guest searches that looked like some bot was taking a web query and tossing it around to a bunch of sites. Many of them not even in English.
Imo these are nice to haves. The physical system of ID cards already has these problems but works well enough.
People can loan their ID to someone else (ask college kids with an older sibling...)
When you use your physical ID, the government frequently can deanonymize you either through automated databases (especially when purchasing drugs) or subpoenaing for camera footage, visitor lists, etc
But one overlooked advantage of manually copying JWTs is that the user doesn't have to blindly trust they're not hiding extra information. They can be decoded by the user to see there's only what should be there.
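Decoding the payload really is that simple, since a JWT is just three base64url segments joined by dots. A small Python sketch of the inspection step (deliberately skips signature verification, since the point here is only to see what claims the token carries):

```python
import base64
import json

def decode_jwt_payload(jwt):
    """Decode the middle (payload) segment of a JWT without verifying it,
    so a user can inspect exactly which claims the token carries."""
    payload_b64 = jwt.split(".")[1]
    # JWTs use unpadded base64url; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))
```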
So does Google send a header for each search result when you look up "Ron Jeremy" so that some results get hidden, or does the browser just block the whole page?
Sending all the "bad" data to the client and hoping the client does the right thing puts a lot of complexity on the client. It's a lot easier to know things are working if the bad data never gets sent to the client - it can't display what it didn't get.
Google would send a header that it is appropriate for all ages (I'm not sure how the safe search toggle would interact with this, the idea is just a rough sketch after all).
When you click on a search result, you load a new page on a different website. The new page would once again come with a header indicating the content rating. This header would be attached to all pages by law. It would be sent every time you load any page.
Assuming that the actual problem here is the difficulty of implementing reliable content filtering (à la parental controls), the minimally invasive solution is to institute an open standard that enables any piece of software to easily implement the desired functionality. You can then further pass legislation requiring (for example) that certain classes of website (e.g. social media) include an indication of this as part of the header.
Concretely, an example header might look like "X-Content-Filter: 13,social-media". If it were legally mandated that all websites send such it would become trivially easy to implement filtering on device since you could simply block any site that failed to send it.
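The on-device filter for that hypothetical header would be a few lines. A sketch under the assumptions above (the header name, its "min-age,category,..." format, and block-on-missing behavior are all from the comment's proposal, not any real standard):

```python
def should_block(header_value, viewer_age, blocked_categories):
    """Decide whether to block a page from a hypothetical
    'X-Content-Filter: <min_age>,<category>,...' response header."""
    if header_value is None:
        return True  # site failed to send the header: block by default
    parts = [p.strip() for p in header_value.split(",")]
    min_age = int(parts[0])
    categories = set(parts[1:])
    # Block if the viewer is too young, or any category is on the blocklist.
    return viewer_age < min_age or bool(categories & set(blocked_categories))
```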
> A lot easier to know things are working if ...
Which is followed by wanting an attested OS (to make sure the value is reliably reported), followed by a process for a third party to verify a government issued ID (since the user might have lied), followed by ...
It's entirely the wrong mentality. It isn't necessary for solving the actual problem, it mandates the leaking of personal data, and it opens an entire can of worms regarding verification of reported fact.
The Twitter layoffs being used as proof of _anything_ is misguided no matter what you're trying to say.
If success is losing half their revenue, reverting to revenue numbers from a decade ago, I gotta know what failure looks like. You might argue that the revenue losses aren't correlated to their headcount changes, and you'd probably make a good argument, but I mean... it's not a great one.
Everyone predicted twitter would crash and burn within months of the layoffs.
It didn't.
Anyone who has worked at a large company knows that 1/2 the staff there is stuck keeping the lights on because it is easier to hire a warm body than fix tech debt.
I've worked at companies that are literally 10x more effective than other competitors in the market purely due to good engineering practices.
Even within large companies, you can have orgs that are dramatically more effective than others, often due to having to work under just the right set of resource constraints. Too little and no investments in the future, too much and it becomes easiest to build fast and hire people to duct tape the mess that is left behind.
> Everyone predicted twitter would crash and burn within months of the layoffs.
It did, just not obviously. Twitter used to be the store brand social network, vanilla and reliable but not overly obnoxious. It made good money from brand advertisers like Ford, General Mills, and Sony. City governments felt ok with using it to distribute community information. The platform tried its hardest to stay middle of the road and not let things sway too far one way or the other.
Today it is a real time bidding marketplace for changing public discourse. You simply buy blue checkmark accounts in bulk and spread your message free of any content moderation or safeguards. So the Chinese, Russians, and Saudis can get into a bidding war over what rural whites believe to be fact.
With the ad revenue sharing program you don't even need to write the content anymore (one of the biggest things foreign influence campaigns struggle with). Just find someone who is saying the "right" thing already and promote them. Twitter in turn underscores the authenticity of these voices by adding "transparency" features that list where someone is from - because your average person does not know a damn thing about proxies.
You and the poster above disagree about the state of Twitter.
Twitter had been a growth company, it was early/missed the market with Vine, but was showing ad growth.
Now, as a private company, backed by the world's richest man, sovereign wealth funds, and banks that have written down their stakes, it has different economics than a tech / growth company.
Its ad revenue is now not in the ballpark of the Fortune 500 or trendy Instagram ads, but somewhere between Reddit and the sin-site markets.
The purpose of Twitter is IMO no longer to be profitable.
For a man with a trillion dollar fortune it’s just his personal equivalent of Fox News, a way to shape the nation’s conversation.
Plus a way to get data for xAI.
In that regard it’s a huge success. I use Grok to find out about stuff on X and it’s very effective. Grok is also nowhere near as bad as it should be (it’s still not great).
A way to get data for xAI? Eh, I guess. But it's a source of bad data. Most social media is, even the best case is stuff like Stack Overflow. It wouldn't surprise me if this was at least a strong component of why Grok called itself "Mecha Hitler".
Huge success? Unfortunately I have to agree, given the US government still ended up integrating it despite the Mecha Hitler incident.
> I use grok to find out about stuff on X and it’s very effective.
As with all of these things, I have to ask: How confident are you that it's telling you true things, rather than just true-sounding things? My expectation is Grok will be overtraining on benchmarks (even relative to the others, who will also be doing so at least a bit), and Grok's benchmarks will include twitter reactions, and it will be Goodhart's-law-ing itself in the process to maximally effective rhetoric rather than maximally effective (even by the standards of other LLMs) "truth-seeking".
* plural, not "the", it also works in at least the UK as well as the US
You can ask Grok for “find me this tweet on X, with direct links for sources” and it will do that. It’s basically a super charged fuzzy search engine for X which is great, since a lot of my searches are half remembered tweets that I’d like to find again.
So it’s accurate in the sense that it’s accurate finding things on X. I don’t really use it for anything else.
Thanks, that makes sense. I read too much into your previous comment and thought you were finding out more about things beyond twitter after they were discussed on twitter.
I can't think of a more quintessential crash out of a major brand than Twitter from the past couple years. For a significant percentage (>10% publicly, I'm confident much more than that internally) of users it became unattractive.
If Microsoft did something that resulted in 300 million users leaving it would be considered crashing and burning, but I guess when Elon does the same proportion someone will show up to explain why losing half your revenue is better than losing all of it.
I just want to know who those people are so that I can pitch them on my next investment fund.
> It’s the same for his cars, they haven’t suddenly got worse at building them.
Actually, they demonstrably have. The Cybertruck is a technical and commercial disaster.
You're correct that most people don’t want to buy from someone like Elon Musk. A huge additional problem for Tesla, though, is that instead of focusing on the business that he's paid to run, its CEO has busied himself with far-right demagoguery for the last couple of years. While that was going on, a variety of Far Eastern companies quietly brought a bunch of EVs to market, that are mostly at least as well-made as Tesla's vehicles, while also being cheaper.
On the roads where I live, I now see about ten of these competitors' cars for every Tesla.
I don't know if I've seen "tech debt" do serious damage to any company, and I've been around a long time. I've definitely seen whole teams grind to a halt in pursuit of someone's idealized vision of the "perfect way to organize code" though. They always couch it in the language of tech debt, but really it's just the loudest person's preferred way to shuffle files around - and usually in the direction of more complexity and not less.
Proving a negative and all that. I’ve definitely seen it do crazy damage: features that should take a week take six months and turn out to need another year of fixing.
But that’s the easy part, the hard part is how it affects culture and how the skilled people leave because they’re severely underutilized.
So when some of us talk about tech debt, we’re not talking about perfect code or file structure; it’s about painting a wall in tropical rain, building a house during an earthquake, etc. Count yourself happy, I guess.
> I don't know if I've seen "tech debt" do serious damage to any company, and I've been around a long time
Just to provide a counter data-point, I've certainly seen companies not being able to move anymore because of tech debt. It's not for nothing that so much has been written about it, and about the ways to fix it.
Your other point stands - the resume-driven development is also a real problem.
> Everyone predicted twitter would crash and burn within months of the layoffs.
I remember people celebrating and praising Musk, predicting a new era of free-speech Twitter that earns tons of money and is massively effective.
Meanwhile, it lost value, lost income, became a nazi echo chamber, and is overall a much worse version of itself. It did not "crash and burn" simply because Musk was willing to pay a huge amount of money for all of that. What it shows is that the original engineering was good and reliable, actually.
For that to be true, the revenue loss would have to be related to the loss of headcount, e.g. due to downtime or other issues. Rather the revenue loss was due to an advertiser boycott driven by Elon's rejection of woke politics. That wouldn't apply to other tech companies that made similarly deep layoffs.
Look at it this way. Could Google lay off 70% of employees and keep the lights on, even still launching some new features on core properties, whilst preserving revenue? It'd be surprising if the answer was no.
It doesn’t cost much to keep the lights on. As far as I know, X post-acquisition is not investing in innovation anymore.
Musk might have been right that shifting to KTLO mode was a good idea, but the company would still be better off if someone other than him had bought it and done the same thing.
Valuation is not important. These are private market transactions. Ultimately all the investors made out well, and it seems the company is more profitable than it was before. At least expenses are way down and if those expenses are correct then they are paying off the debt and making a good profit.
I couldn't possibly disagree with this more. Since the acquisition Twitter/X has had far more features at a far faster pace than in the 10 years prior. They've added all sorts of great stuff, and recently have been near the top of the charts in the Apple App Store.
I've never seen the motivation behind buying Twitter to have been revenue, or free speech for that matter. Elon wanted a unique content source to train LLMs on and he got it. Whether that proves out as a good training dataset is still up in the air, but I can't imagine he cared about Twitter revenue.
This was not even intended for the LLM training use case; that is an afterthought. He installed a president whom he can use. The Twitter acquisition helped him achieve this.
Oh sorry, no I meant them shoving Biden into the candidacy despite clear health issues, shutting down any primary challenge, and swapping Harris in late in the game when it was abundantly clear Biden's health was a problem.
As someone on the great, late 8 (https://fixthel8.com/) in Seattle, I'd happily give up my stop to help it be on time more often. I have three other stops I can walk to within ten minutes of me.
SF is another good example of too many stops. It's honestly comical and I stopped riding the bus in SF at times because the stop count was painful.
Imagine the delays are so prominent, someone decides to make a website for CTA (call-to-action) and semi-regularly shares updates on it...
I've been to Seattle once (ex-Amazon here), when the DevCon was held in town while my team was located in Bellevue. I took the initiative to rent a bike for a day ($60 for a drop-bar gravel bike). I must say that although I did not beat the time between Day-1 (the office across the Spheres) and Bingo (the Bellevue office), it was not far off. Even compared to the "shuttles" Amazon operated: the shuttle took about 1h while the ride takes around 1h15m. (Plus sweat.)
> P.S: I would say I am in a "fair" shape as I ride quite a lot throughout the year.
Which of the cities used as examples in the articles are "sparse"? LA? Pittsburgh is one of the smaller ones listed and while the bus network there is very hub and spoke, it's also still semi usable.
But to call NYC, LA, Philly, Chicago, Minneapolis, Houston, etc sparse doesn't seem very accurate. Yes, LA is vast, but I wouldn't call it sparse.
While no one would ever navigate by learning what the mosaics mean, it's a fantastic setup for the expected audience of commuters. Give it a month and your brain would associate a given color with your stop coming up soon, and make navigation easier.
I remember having read a story about some wild dogs in Moscow apparently having learned to use the subway and establishing their own "commute schedule".
I always wondered how the dogs would identify the station to leave the train - counting stations or understanding how the announcements worked felt too "smart". But I imagine the simplest way for them would be to just learn the design of different stations over time and jump off once they see a familiar design through the windows.
If I had to make a guess, I'd go with the dogs recognizing the smell. Dogs apparently don't have terribly good vision, but, as I'm sure we all know, a very good sense of smell.
Dogs, the smart ones, have fantastic memories, a perfect sense of direction, and a basic faith in their own abilities.
That feral dogs (though they must be very social with humans) have learned the subway isn't surprising; it's out on the edge, but it's not that far. My mom's Airedales would go walkabout, shoulder to shoulder, and just sit outside doors they wanted to go through, places they had zero business being, and get let in, and later out again. They got used to being driven home by various volunteers and the police, never taken to the pound.
I taught two dogs to hitchhike by raising a paw; people would stop, let me in as an afterthought... in the back, dog up front.