> However, as data from that external feed was unavailable, the price of the value of the Index instead defaulted to -1
To me, this reeks of a couple of bad experiences I had repeatedly as a back end developer coordinating with the front end:
1. Demanding that my services always returned correct-looking data, even if it wasn’t available, because they couldn’t (or wouldn’t) handle errors. The story always went like this: if a service returned a 4xx, it would trigger an exception in their front end code, and they didn’t want to tackle the complexity of handling potential errors from a bunch of different back end services. I’ve had this one get escalated to management before, who fortunately supported me.
2. Resisting efforts to ensure that potentially dangerous values were displayed in attention-getting ways. Anything that didn’t look aesthetically harmonious was liable to get “fixed” without any consideration for why it was presented in a visually jarring way in the first place. As a back end engineer I sometimes lost this battle, since it was outside of my jurisdiction, but when I could, I looped in users or a product manager to help stop the madness.
Maybe not as bad, but similar ... I've had a frontend dev say they needed to see designs from the designer before they could add any kind of error message or indicator. The status quo is just silence; the page just stops doing anything. This is not unusual in various SPAs I've seen around the web. It is unfortunately a typical experience to click something ... no response? Click again? Wait how long? Refresh? Hello?
Have the designer design it, sure, but until that happens you have to show something for auth expired (tab suspended for a long while), connection failed (flaky wifi), or a 5xx from the service. As much as I may try to make that never happen, it is possible (database failover for a few seconds?), and the design can't just be "there shall be no errors" - you can't decide that.
That's often a problem with designers -- they only design the happy path, and often only the end of it. No in-between states, no understanding that latency and errors exist, no digging into boundary conditions and edge cases. It's a typical not-my-job shrug, booting the problem upwards so their manager will make them aware of those pesky things.
A few decades ago, we chose the most awful color combinations for the UI of an internal system to encourage the designer to get on with it (they wore multiple hats). By the time I'd left a year or so later, the people using the system had not only accepted these disturbing choices, but claimed to have grown fond of them.
Google's internal Memegen platform used to have a disturbing amount of #FF00FF in the UI. Since it was internal and not well funded, there was no designer associated with it. The only full-time engineer at the time asked a 20%-er if #FF00FF looked good, partially as a joke, but not knowing the 20%-er was color blind. The 20%-er responded with a strong positive. The full-timer thought that the 20%-er was saying that it would be funny to launch that way as a joke, so all accents in the UI became memegenta.
I Google'd "#FF00FF" to see how bad it was, but instead got back a rather pleasant blue. Confused, I tried again and got a completely different color. Same exact search, different color. I've now repeated the same simple "#FF00FF" search and I'm up to five different colors, none of them magenta.
How do you fuck up something as simple as a hex color code?
(Hilarious story by the way. For a hot second when I saw the blue come up I thought maybe I was colorblind, too.)
You're right, that's really weird...
Initially didn't see that 'cause I don't usually include the '#' in my searches so it just comes up with the colorhexa result (which is clearly more accurate lol)
> if a service returned a 4xx, it would trigger an exception in their front end code, and they didn’t want to tackle the complexity of handling potential errors from a bunch of different back end services
I’ve seen this when the frontend team is building a fat client/smart client/Single Page App but they really want to be building a thin client/dumb client/server-rendered app. Basically it’s a technological mismatch, and client teams feel a lot of pressure to build an SPA but don’t like/want/need the extra complexity that comes with it.
Oh man, I would LOVE for all APIs to always return real HTTP response codes. It's crazy frustrating to get a 200 OK with an empty object or something instead of a payload with the expected/agreed-upon format. If someone can parse/handle a response from an API they can absolutely add a bit of code to handle a handful of response codes.
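The "handle a handful of response codes" point really is only a few lines of code. A minimal sketch, with entirely illustrative names and an illustrative choice of which codes to handle; note it also treats a 200 with an empty payload as an error, per the complaint above:

```python
# Classify an HTTP status into an outcome the UI can act on, instead of
# letting any non-200 bubble up as an unhandled exception.

def classify_response(status, body):
    if 200 <= status < 300:
        if not body:
            # the dreaded 200 OK with an empty object
            return ("error", "empty payload despite 200 OK")
        return ("ok", body)
    if status == 401:
        return ("reauth", "session expired, prompt for login")
    if status == 404:
        return ("not_found", "show an empty state")
    if 400 <= status < 500:
        return ("client_error", "request rejected (%d)" % status)
    if 500 <= status < 600:
        return ("retry", "server error (%d), retry with backoff" % status)
    return ("error", "unexpected status %d" % status)
```

One dispatcher like this, shared across services, is all the "complexity" the frontend actually has to take on.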
The ambition to hide errors from users, for example, is completely unhelpful if you think about it.
"Something went wrong". Sure, many or most users won't be helped by a specific error message, but they can use it to ask someone who will be. Especially when large tech companies forget to offer support, a sensible error message is better than nothing. Sometimes error messages lead you down the wrong path if you don't know the internals of an infrastructure, sure. But I still don't know which brain is responsible for the "guideline" of hiding essential information from users. It is just objectively wrong.
It would be child's play to hide the "ugly information" in an extra menu, window, popup, or anything. But doing UX isn't an excuse for bad engineering.
> Demanding that my services always returned correct-looking data, even if it wasn’t available
Ah yes, the age-old "special values will signal an error, so we don't have to deal with error handling" approach. Which is always either a pitch for reinventing error handling in novel and ad hoc ways (usually bad), or for disregarding error handling altogether (always bad, sometimes entertaining).
Spot on. Part of it is wanting errors-as-values without knowing what that is (which is good and contains a kernel of real insight), and part of it is assuming that error handling can be a negligible fraction of the work (which is the opposite of the truth and can only be achieved by doing a poor job of it.)
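The difference between a sentinel and errors-as-values can be sketched in a few lines (names and the -1 convention are illustrative, echoing the Index-defaulting-to-minus-one quote at the top of the thread):

```python
def get_price_sentinel(feed, symbol):
    # -1 "means" unavailable... until someone multiplies by it
    return feed.get(symbol, -1.0)

def get_price_explicit(feed, symbol):
    # errors-as-values: return (price, error) and make the caller look
    if symbol not in feed:
        return None, "no quote for " + symbol
    return feed[symbol], None

feed = {"ACME": 42.0}
# The sentinel happily participates in downstream arithmetic:
assert get_price_sentinel(feed, "MISSING") * 1_000_000 == -1_000_000
# The explicit error can't be mistaken for a number by accident:
price, err = get_price_explicit(feed, "MISSING")
assert price is None and err is not None
```

The sentinel silently poisons every calculation it touches; the explicit pair refuses to be used as a number until someone handles it.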
These sorts of errors are frequent and pervasive in finance.
Not just numbers. I worked on a ref data system with minimal input validation. Copy/pastes would leave in whitespace, newlines, etc. This would then break the trading systems that tried to execute on them.
Aged 17/18, at a summer work program at a fund (~2000), I was given Excel and Bloomberg and had to calculate risk across client portfolios. Messed up the manual input of one by a zero, resulting in the whole portfolio getting rebalanced, then un-rebalanced.
The same team passed off a trade via paper slip saying it’s yen. The person entering it didn’t recognize the yen currency symbol, assumed it was a 7 and traded the full number prefixed with 7 in yen instead.
I've been in the room when it happened. In one memorable incident, a junior trader I sat opposite bought 100 bn Japanese Yen instead of 10 bn. His excuse? "I lost track of the zeros". We'd only hired the guy because he had a PhD and "PhD's are smart" - lol. In his defense the UI was awful. He left soon after...
Me too. Fat fingers happen much more often than is reported, partly because it's really hard to make UX that is functional for high-performing individuals trying to respond quickly in pressure situations while also protecting them from the consequences of errors.
For example, in one old FX trading system I know about you would type in the cross you wanted (eg say GBP if you wanted GBP/USD) and the amount and then if you hit enter it would execute. If you hit F5 it would execute that number of million of that cross. So say you wanted 10 million, you could type 10000000 and hit enter or you could just go 10 F5 and it would immediately execute 10 million. (ie if you hit F5 you didn't also need to press enter).
F6 was the same, but for billion. And yes it originally had an "are you sure?" chicken box but there was only one flag to disable all chicken boxes and one of the chicken boxes on another part of the system was such a constant pain in the ass that everyone had chicken boxes disabled. So if you fatfingered and hit F6 when you meant F5 you could literally execute a thousand times as much as you intended.
Other than massive wrong amounts, the other one I've seen a lot is people buying something when they thought they were selling and vice versa.
A dialog box allowing the user to “chicken out” if they have second thoughts about some important action. Usually they say something like “Are you sure? yes/no”
A "fat finger" of only 10x is not that much of a fat finger and may not be detected, since it's probably nothing out of the ordinary compared to the brokerage account's size. The fat finger in TFA, however, is quite a bit bigger: $444bn instead of $58m.
Except that apparently they had such warnings for individual trades, but the bundle was implemented as "for(trade in bundle) transact(trade)". Which design has a certain appeal, standing on the shoulders of the peculiar handling of each trade, but with the terrible flaw that the bundle is never considered in its totality.
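The missing aggregate check is a one-liner on top of that loop. A sketch with made-up limits (the numbers and names are illustrative, not from the incident report):

```python
PER_TRADE_LIMIT = 1_000_000_000   # illustrative thresholds
BUNDLE_LIMIT = 5_000_000_000

def execute_bundle(bundle, transact):
    # Consider the bundle in its totality first...
    if sum(abs(t) for t in bundle) > BUNDLE_LIMIT:
        return False
    # ...then stand on the shoulders of the per-trade handling.
    for trade in bundle:
        if abs(trade) > PER_TRADE_LIMIT:
            return False
        transact(trade)
    return True
```

With only the per-trade checks, a bundle of many individually unremarkable trades sails through even when the total is absurd; the aggregate check catches exactly that case.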
How do you buy 10,000,000,000 (let alone 10x that much) of something as a junior trader without someone signing off on it? There's nothing at all to stop catastrophic actions being taken by anyone in the organization?
At the time, no. There wasn't any checking. And while it looks like a lot, it wasn't insanely expensive to unwind. Sometimes these kinds of mistakes make money. My own 'mistake profit/loss' was around +1m USD.
The yen was probably worth around a hundredth of a euro or dollar at the time, so that's 100 million or a billion worth of a very liquid currency, likely among the most liquid trading pairs. As such you need to trade big to make money there: get the timing right and you make it, get it wrong and you lose some.
My main takeaway from that was that software checks aren't an acceptable replacement for hardware interlocks... A lesson we seem to be collectively unlearning
I like github's solution to deleting a repository.
A popup makes you type the repository name before confirming deletion.
In this case, if the value is unusually high, say greater than 100 billion, make the trader type "100 billion" before proceeding.
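The check itself is tiny to implement. A sketch, with an illustrative threshold and function name:

```python
def confirm_large_value(value, typed, threshold=100_000_000_000):
    if value <= threshold:
        return True              # no extra friction for normal sizes
    # Above the threshold, force the user to retype the amount,
    # not just click OK on a dialog they've learned to ignore.
    return typed.strip().replace(",", "") == str(value)
```

Retyping the number means the user has to actually read it, which is the whole point: an "Are you sure?" box can be clicked through on autopilot, but "444,000,000,000" cannot be typed on autopilot.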
I confess that I also ignore popups and just click OK. This is where I appreciate GitHub asking me to type the name of the repository, just to make sure that I know what I am doing.
As for the second story, "I’m kind of with Revolut here, but on the other hand, if they’re so sure it’s a scam maybe they should just not allow the transfer? Even if the customer sends the selfie? Perhaps you want a hard block here."
Do people want a block? Every story about a bank blocking a transfer draws tons of outrage. It's my money, I can send it where I want, no questions, you can't stop me, how dare you.
At Millennium Bank in Poland, every time I need to move a bigger amount of money they ask for my handwritten signature, which always fails to validate. Of course, the more I fail, the more nervous I get about it, and the worse I draw my signature. If I need to move big amounts it is because I need to, like buying a house or something life-changing.
Several times I thought: what am I going to do now? What is my appeal? The appeal is simple: ask to change the signature beforehand and then proceed with the operation. Damn stupid.
It is like presenting my authentication token, failing and then escalating my privileges to obtain the higher order verification token required.
Humans are bad at processing large sets of similar data occurring repeatedly.
It comes up everywhere:
- On-call alerts received by my engineering team for our microservices usually self-resolve, so the first action taken by the engineers is to "just wait and see if it's a false alarm." We work to reduce the number of alerts overall to cut the noise.
- I'm reminded of "Cigna saves millions by having its doctors reject claims without reading them" https://news.ycombinator.com/item?id=35304017 In order to keep up with the claims, they just auto-rejected them and relied on appeals to filter out the "noise."
- We hear so much about Tesla's "FSD (Supervised)" and its request that drivers "don't become complacent," but it happens anyway. After enough time behind the wheel with FSD enabled, we are swayed by the string of successes to become overly trusting of the tech.
> On Tuesday, the National Transportation Safety Board said that a crash last year on the Washington subway system that killed nine people had happened partly because train dispatchers had been ignoring 9,000 alarms per week. Air traffic controllers, nuclear plant operators, nurses in intensive-care units and others do the same.
Every time I've been in a hospital (visiting people), I have a frequent sensation of wanting to get out of there. I hope/imagine I will never need to be in a hospital, because I know it will be hell for me. I have extremely good hearing, and am very affected by sound. Visiting people in hospital requires serious effort from me to maintain focus on the person and those I'm there with, and ignoring such a shocking degree of noise I can barely believe it's the accepted norm. I can't believe how many devices and processes purposely make noise as a part of _normal operation_, where there's not even anything amiss!
My wife was in the hospital because she’d essentially stressed herself to the point of a heart arrhythmia. The irregular rhythm meant the equipment was basically useless at accurately determining her heart rate and the readings fluctuated wildly.
So what did she do for several hours? Laid in a bed anxious about the medical concerns, anxious about life, and listened to the monitoring equipment go off multiple times a minute as she crossed various alert thresholds, went back under them, crossed them again, repeat the entire time she was there.
I asked the staff to turn the alarms off or at least adjust the thresholds.
They just short of flat out said they wouldn’t because if they disabled or adjusted them and anything happened it would be on their head.
So my wife spent half a day being told to try and relax while the machine monitoring her heart and respiration and blood pressure constantly screamed at her that everything was wrong.
I've been involved in some systems that routinely spit out dozens of warnings per day. After 'fixing' some of them - maybe getting a system down to a few per week, bug reports started coming in that the system was 'broken'. Because people noticed the boxes and messages they routinely ignored were gone, and this must be a problem. Lots of re-education may need to happen on a large scale to actually 'fix' alarm fatigue across the board.
I work on a system with notoriously noisy alarms and message logging - my first step is to tell the customer to ignore all of it.
Instead, I tell them to do two things to check system health:
1. Is it working normally? If so, nothing is wrong.
2. Walk by the hardware once a week and check for idiot lights; check the system dashboard if one of the idiot lights is on. If no error is found, it's a hardware problem (failed supply, drive, fan); if an error is found, it's a failed server - either way, call us.
There are a handful of cases not covered by this, but they're limited to time sync failures - which are critical but generate no user-facing alarms.
This triggered a repressed memory. I was once on an ops team responsible for an application, written in Java, that had been turned over to us. The application ran in Tomcat, and prior to turning it over to us, the external vendor had "helpfully" built some monitoring and alerting for it. One of the metrics they monitored was how many errors per minute and per hour were appearing in the application's logs. When they turned this application over to us, they did not properly hand over the monitoring system or even tell us that it existed, and it was sending alerts to another external vendor team which had no association with my ops team.
Several years went by... then, because we finally had the time, our ops team decided to audit and improve this application, including rewriting significant portions of its code, as was pretty typical when we took over an application built by an external vendor. Long story short, we resolved almost all of the errors by the expedient of simply interacting correctly with the database behind the application, instead of relying on whatever horrific dumpster-fire output Hibernate was sending to the DB. That's when things got super interesting: the external vendor team receiving these monitoring alerts (unbeknownst to my team) had had more than 100% turnover in the intervening years, losing all knowledge of this, but suddenly started getting a bunch of critical-urgency alerts because the application was no longer emitting the number of errors expected by the alarm thresholds. This ended up triggering a massive company-wide incident call, even though there was no external customer impact. I'm leaving a lot of details out, but let's just say it sucked up days of my life and was a massive waste of time for everyone involved.
Fun example of fixing something and causing an "issue" because behavior changed from expectations/baseline.
I always wondered: Since Alarm Fatigue is well known, why do UX designers and/or product owners keep insisting on adding new alarms for things? Surely we know they increasingly don't work, the more you have.
The better solution is to redesign / engineer systems which automatically solve the alarm condition so no person needs to know about the alarm. That usually requires many people to work together and requires far more complexity to solve, both initially and in ongoing upkeep.
Because modern design philosophy has, as a rule, thrown everything we learned before 2010 out the window because that's not what google does.
"But the A/B tests show consumers prefer it!". Oh? So you have a degree in stats and experience running scientific tests? No? You're just a programmer who only knows how to enable the A/B test feature of your favorite javascript library?
Your A/B tests aren't testing what you think they are testing.
There’s some small fraction where someone believes it would be helpful but was mistaken. What you’re describing is definitely the vast majority.
When something unexpected or out of the ordinary happens, what do you do about it?
- Nothing? Why did you do nothing? This was clearly a problem. This is your fault.
- Figure out appropriate error handling? That’s hard work.
- Just throw up an alert on everything and make it someone else’s job to sort through it all? Perfect, job done.
When entering and exchanging potentially very large amounts of money, it behooves you to risk-manage entered, encoded, and transmitted prices carefully, with something like the following: NDC.
> When the trader checked the value of the inputted basket, they were presented with a figure of negative 58 million for the value of the basket (58 million multiplied by -1). The trader saw a ValAtBM of -58,000,000, which was the number they expected to see, and thus they clicked Execute to continue to the next check.
It doesn't seem that this would have helped here: the trader entered exactly the number that they intended to, just that their intent was misconceived.
Perhaps. And perhaps another compounding factor was error-prone UX, such as not displaying total estimates prominently, or the presence of options leading to order-of-magnitude differences.
But the intent of what I suggested is to guard against many classes of errors including data entry, storage, transmission, and interpretation.
Stupid question since I’m not in the fintech space. Why is a trader responsible for manually filling out an order to hedge against a sell order from a customer?
> Why is a trader responsible for manually filling out an order to hedge against a sell order from a customer?
The only manual component is specifying the size. As to why the client isn't doing it, this order might have been placed over the phone. Or it may be part of a more-complicated strategy that required some finesse.