
Something I noticed a long time ago is that going from 90% correct to 95% correct is not a 5% difference, it’s a 2x difference. As you approach 100%, the last few 0.01% error rates going away make a qualitative difference.

“Computer” used to be a job, and human error rates are on the order of 1-2% no matter what level of training or experience they had. Work had to be done in triplicate and cross-checked if it mattered.

Digital computers are down to error rates of roughly 10^-15 to 10^-22 and are hence treated as nearly infallible. We regularly write code routines where a trillion steps have to be executed flawlessly in sequence for things not to explode!

AIs can now output maybe 1K to 2K tokens in a sequence before they make a mistake. That’s 99.9% to 99.95%! Better than human already.

Don’t believe me?

Write me a 500 line program with pen and paper (not pencil!) and have it work the first time!

I’ve seen Gemini Pro 2.5 do this in a useful way.

As the error rates drop, the length of usefully correct sequences will get to 10K, then 100K, and maybe… who knows?
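
To put numbers on that, here's a minimal back-of-the-envelope sketch (assuming token errors are independent; the run lengths and the ~99% human figure are the ones quoted above):

  # Per-token accuracy implied by an expected error-free run of N tokens,
  # and the chance of getting a whole sequence right at a given accuracy.

  def accuracy_for_run(expected_run_length):
      # Geometric model: expected run before the first error is 1 / error_rate.
      return 1.0 - 1.0 / expected_run_length

  def p_flawless(per_step_accuracy, n_steps):
      # Probability that n consecutive steps are all correct.
      return per_step_accuracy ** n_steps

  for run in (1_000, 2_000, 10_000, 100_000):
      print(f"run of ~{run:>7} tokens -> per-token accuracy ~{accuracy_for_run(run):.4%}")

  # The pen-and-paper challenge: 500 flawless steps in one pass.
  print(f"at 99% per step (human 'computer'): {p_flawless(0.99, 500):.1%}")
  print(f"at 99.95% per token (LLM estimate): {p_flawless(0.9995, 500):.1%}")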

There was just a press release today about Gemini Diffusion that can alter already-generated tokens to correct mistakes.

Error rates will drop.

Useful output length will go up.


I don't think the length you're talking about is that much of an issue. As you say, depending on how you measure it, LLMs are better at remaining accurate over a long span of text.

The issue seems to be more in the intelligence department. You can't really leave them in an agent-like loop with compiler/shell output and expect them to meaningfully progress on their tasks past some small number of steps.

Improving their initial error-free token length is solving the wrong problem. I would take something with less initial accuracy than a human, but equally capable of iterating on its solution over time.


You have low expectations here. People used to enter machine code on switches and punched paper tape, so yes, they made sure it worked the first time. Later, people had code reviews by marking up printouts of code, and software got sent out in boxes that couldn't be changed until the next year.

Programmers who "iterate" buggy shit for 10 rounds until they get it right are a post-Google push-update phenomenon.


Been there, done that. I made mistakes and had to try again or correct the input (when that was an option).

> I love the BoM.

You have no reason to. They're not a well-run, efficient organisation, even by government department standards.

> over http:// urls because of the overhang of farm equipment which can't handle TLS https: connections.

This is the public narrative, and is a brazen lie.

THERE IS NO SUCH FARMING EQUIPMENT!

Certainly not in 2025.

Their site DOES HAVE a certificate, and supports HTTPS, right now.

They just refuse to let you use it, redirecting a successful HTTPS connection back to HTTP.

If there were farming equipment out there (which models? Vendors? Affecting how many farmers?), it would be broken right now, because port 443 is open and listening. If it can connect to port 443, then it doesn't need to be redirected back to port 80, because HTTPS worked. If it can only connect to port 80, then the presence of port 443 makes no difference.
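
Anyone can check this for themselves; here is a minimal sketch using only the Python standard library (whether you see the redirect will of course depend on what their servers are doing when you run it):

  import http.client

  # Does port 443 complete a TLS handshake and answer an HTTPS request,
  # and does the response try to bounce you back to http://?
  conn = http.client.HTTPSConnection("www.bom.gov.au", 443, timeout=10)
  conn.request("GET", "/", headers={"User-Agent": "https-check"})
  resp = conn.getresponse()
  print("HTTPS worked:", resp.status, resp.reason)
  print("Location header:", resp.getheader("Location"))  # an http:// URL here is the redirect
  conn.close()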

I keep pointing this out, and "true blue aussies that looooove the BoM" keep arguing about this. The BoM spends hundreds of millions on IT, tens of millions on consultancies like Accenture, but they can't manage a $50 certificate and a bog standard HTTP/HTTPS endpoint.

My iPhone, right now, can't connect to most BoM web pages because of this stupid, stupid issue! There is no mystical broken farming equipment, but there definitely are inconvenienced users like me!

After I'd been posting this rant for years and years in various forums, a former BoM employee finally fessed up: they were the one who tried switching HTTP to HTTPS years ago and "broke" their internal systems. It wasn't non-existent farming equipment; it was BoM's own internal services that fell over.

Why?

Because they hadn't updated any of it in decades, and the software was so old that it couldn't handle "modern" cipher suites and TLS versions. Modern being "newer than SSL 3.0"... in 2015.

That's not the sign of a competently run organisation worthy of admiration.


Farmers have a reason to be sceptical:

The BoM caused hundreds of millions of dollars of damage to livestock farming: they warned of a huge drought in 2022, which triggered massive livestock culling. Instead, there were years of above-average rainfall.


The BoM cop this both ways. People don't look to the candlestick error bars on their statements, and their long range weather forecast is outside the window of the ABC article we're discussing. I get how upsetting it is to de-stock but if their best belief is the risk is high, what else are they meant to do? If they hadn't said this, and there'd been the drought, they would have got your scorn too, right?

If you know of a climate/weather bureau of comparable scale and footprint doing a better job, we'd all like to know. AFAIK there isn't one: most of the alternatives work at a different scale, with different climate forcing functions.


This is always the top comment in these kinds of threads, and I see this as an indication that the current state of CI/CD is pathetically proprietary.

It’s like the dark times before free and open source compilers.

When are we going to push back and say enough is enough!?

CI/CD desperately needs something akin to Kubernetes to claw back our control and ability to work locally.

Personally, I’m fed up with pipeline development inner loops that involve a Git commit, push, and waiting around for five minutes with no debugger, no variable inspector, and dumping things to console logs like I’m writing C in the 1980s.

You and I shouldn’t be reinventing these wheels while standing inside the tyre shop.


We've had open source CI. We still have it. I remember ye olde days of Hudson, before it was renamed Jenkins. Lotsa orgs still use Jenkins all over the place, it just doesn't get much press. It predates GHA, Circle, and many of the popular cloud offerings.

Turns out CI/CD is not an easy problem. I built a short-lived CI product before containers were really much of a thing... you can guess how well that went.

Also, I'll take _any_ CI solution, closed or open, that tries to be the _opposite_ of the complexity borg that is k8s.


Having a container makes debugging possible, but it's still generally going to be an unfriendly experience, compared to a script you can just run and debug immediately.

It's inevitable that things will be more difficult to debug once you're using a third party asynchronous tool as part of the flow.


Products don’t get to be informed about the factory in which they are made, or the shop in which they are to be sold.

> extremely fast IO.

I wonder how big a competitive edge that will remain in an era where ordinary cloud VMs can do 10 GB/s to zone-redundant remote storage.


GB/s is one metric, but IOPS and latency are others that I'm assuming are Very Important for the applications that mainframes are being used for today.

IOPS is the most meaningless metric there is. It's just a crappy way of saying bandwidth with an implied sector size. 99% of software developers don't use any form of async file IO and therefore couldn't care less. Async file IO support in Postgres was released only a month ago; it's so niche that even extremely mature software that could heavily benefit from it didn't bother implementing it until then.

Microsoft SQL Server has been using async scatter/gather IO APIs for decades. Most database engines I've worked with do so.

Postgres is weirdly popular despite being way, way behind on foundational technology adoption.
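
For anyone unfamiliar, "scatter/gather" just means a single I/O call that fills (or drains) several buffers at once. A minimal sketch of the scatter side using os.preadv on Linux (the filename is made up; true asynchronous submission on top of this is what io_uring or Windows overlapped I/O provide):

  import os

  # One preadv() syscall fills several buffers from a single file offset,
  # instead of issuing one read() per buffer.
  fd = os.open("datafile.bin", os.O_RDONLY)  # hypothetical data file
  try:
      buffers = [bytearray(8192) for _ in range(4)]  # four 8 KiB "pages"
      nread = os.preadv(fd, buffers, 0)              # scatter-read from offset 0
      print(f"read {nread} bytes into {len(buffers)} buffers with one syscall")
  finally:
      os.close(fd)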


> Microsoft SQL Server has been using async scatter/gather IO APIs for decades. Most database engines I've worked with do so.

Windows NT has had asynchronous IO since its VAX days ;-)

> Postgres is weirdly popular despite being way, way behind on foundational technology adoption.

It's good enough, free, and performs well.


I constantly hear about VACUUM problems and write amplification causing performance issues bad enough that huge users of it were forced to switch to MySQL instead.

I've been involved in a couple of those cases, where a large company ran into an issue, and chose to solve it by migrating to something else. And while the issues certainly exist (and are being addressed), the technical reasons often turned out to be a rather tiny part of the story. And in the end it was really about internal politics and incentives.

In several such cases, the company was repeatedly warned about how they had implemented some functionality, and that it would cause severe issues with bloat/vacuuming, etc., along with suggestions for how to modify the application to avoid those issues. Their 10x engineers chose to completely ignore that advice, because in their minds they had constructed an "ideal database" and concluded that anything behaving differently was "wrong", and that it was not their application that should change. Add a dose of politics, where a new CTO wants to rebuild everything from scratch, engineers with NIH syndrome, etc. It's about incentives: if you migrate to a new system, you can write flashy blog posts about how the new system is great and saved everything.

You can always argue the original system would have been worse, because everyone saw it had issues; you just leave out the detail that you chose not to address those issues. The engineering team is unlikely to argue against that, because that'd be against their interests too.

I'm absolutely not claiming the problems do not exist. They certainly do. Nor am I claiming Postgres is the ideal database for every possible workload. It certainly is not. But the worst examples that I've seen were due to deliberate choices, driven by politics. But that's not described anywhere. In public everyone pretends it's just about the tech.


Politics is an unavoidable aspect of larger groups, but it gets a lot worse when coupled with wrong incentives that reward heroic disaster mitigation over active disaster avoidance.

When you design a system around a database, it pays off to plan how you will mitigate the performance issues you might face in the future. Often a simple document is enough, explaining which directions to evolve the system in depending on the perceived cause: adding extra read replicas, introducing degraded modes for when writes aren't available, moving some functions to their own databases, sharding big tables, and so on. With a somewhat clear roadmap, your successors don't need to panic when the next crisis appears.

For extra points, leave recordings dressed as Hari Seldon.


Guaranteed sustained write throughput is a distinguishing feature of mainframe storage.

Whilst cloud platforms are the new mainframe (so to speak), and they have all made great strides in improving their SLA guarantees, storage is still accessed over the network (plus extra moving parts: coordination, consistency, etc.). They will get there, though.


Latency is much more important than throughput...

On-site.

Speed is not the only reason why some org/business would have Big Iron in their closet.


The worst part is that setting up static web content hosting with something like an Azure blob store, or just an NGINX server somewhere, is hilariously trivial.

This is an afternoon's effort for the junior intern, but was "too hard" for these people.


That costs money to maintain, even if it's just a few bucks a month. I've seen plenty of Chinese companies using mega/gdrive/etc just because it's free. I used to think it was just cheapness, but depending on the company it can be a huge hassle to set up recurring small bill items. At my current company for example, it's much easier to pay $5-10k once than pay $5/mo.

With Google Drive specifically, scale becomes an issue though. Once too many people download a given shared file, it gets flagged as a possible piracy operation. Not sure about MEGA, but they also have some limits (although for normal usage this shouldn't be an issue).

Nope, Azure blob storage hosting can be completely free.

The "natural ID" for people design reminds me of a story from a state department of education: They had two students, both named John Smith Jr. They were identical twins and attending the same class.

They had the same birth date, school, parents, phone number, street address, first name, last name, teachers, everything...

The story was that their dad was John Smith Sr in a long line of John Smiths going back a dozen generations. It was "a thing" for the family line, and there was no way he was going to break centuries of tradition just because he happened to have twins.

Note: In very junior grades the kids aren't expected to memorise and use a student ID because they haven't (officially) learned to read and write yet! (I didn't use one until University.)


Same first name? For twins?

I find it very difficult to believe this is not prevented by some law.


Me too, but it seems to be an oddly common occurrence: https://www.google.com/search?q=identical+twins+with+the+sam...

> will absolutely swear that there will never be two people with the same national ID...

I suddenly got flash-backs.

There are duplicate ISBN numbers for books, despite the system being carefully designed to avoid this.

There are ISBNs with invalid checksums that are nonetheless in official use, with the invalid number printed in the barcode and everything. Either the check digit was calculated incorrectly, or it was simply a misprint.
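
The check digit itself is trivial to verify, which is what makes the misprints so baffling. A quick sketch for ISBN-13 (ISBN-10 uses a different weighting scheme):

  def isbn13_is_valid(isbn):
      # ISBN-13: digits weighted 1,3,1,3,... must sum to a multiple of 10.
      digits = [int(c) for c in isbn if c.isdigit()]
      if len(digits) != 13:
          return False
      total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
      return total % 10 == 0

  print(isbn13_is_valid("978-0-306-40615-7"))  # True: a commonly cited valid example
  print(isbn13_is_valid("978-0-306-40615-3"))  # False: same prefix, wrong check digit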

The same book can have hundreds of ISBNs.

There is no sane way to determine if two such ISBNs are truly the same (page numbers and everything), or a reprint that has renumbered pages or even subtly different content with corrected typos, missing or added illustrations, etc...

Our federal government publishes a master database of "job id" numbers for each profession one could have. This is critical for legislation related to skilled migrants, collective workplace agreements, etc...

The states decided to add one digit to these numbers to further subdivide them. They did it differently, of course, and some didn't subdivide at all. Some of them have typos with "O" in place of "0" in a few places. Some states dropped the leading zeroes, and then added a suffix digit, which is fun.

On and on and on...

The real world is messy.

Any ID you don't generate yourself is fraught with risk. Even then there are issues such as what happens if the database is rolled back to a backup and then IDs are generated again for the missed data!


  The states decided to add one digit to these numbers to further subdivide
  them. They did it differently, of course, and some didn't subdivide at all.
  Some of them have typos with "O" in place of "0" in a few places. Some
  states dropped the leading zeroes, and then added a suffix digit, which is fun.

Any identifier that is composed of digits but is not a number will accumulate a hilariously large number of mistakes and alterations like the ones you describe.

In my own work I see this all the time with FIPS codes and parcel identifiers -- mostly because someone has round-tripped their data through Excel which will autocast the identifiers to numeric types.

Federal GEOIDs are particularly tough because the number of digits defines the GEOID type and there are valid types for 10, 11 and 12-digit numbers, so dropping a leading zero wreaks havoc on any automated processing.
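
The usual defence is to never let the identifiers become numbers in the first place. A minimal pandas sketch (the file and column names are hypothetical):

  import pandas as pd

  # Keep identifier columns as strings so leading zeros survive the round trip.
  df = pd.read_csv("parcels.csv", dtype={"geoid": str, "fips": str})

  # Sanity check: valid GEOIDs are 10, 11 or 12 digits; anything shorter
  # almost certainly lost a leading zero somewhere upstream.
  bad = df[~df["geoid"].str.fullmatch(r"\d{10,12}")]
  print(f"{len(bad)} rows with suspicious GEOIDs")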

There's a lot of ways to create the garbage in GIGO.


The panic about these is way out of proportion with the real risks. Modern systems have all sorts of over-voltage protection, and we no longer use "telegraph wires" directly connected to vulnerable electronics like speakers and amplifiers.

All modern telecommunications are over fibre or radio links.


How much do wire distance and intended voltage matter? All the power electronics are almost certainly protected by caps, but are big office Ethernet runs long enough to cause issues? What about coax cables? It seems like, with how many more cables we have now, one of them probably has a design that would cause notable inconvenience.

You need at least hundreds of kilometers for the effects to become significant (as in "tens of volts"). Nothing on the small scale will be affected.

Power lines might be the most vulnerable part, actually. The geomagnetic field can induce current that will bias the core of transformers, causing them to overheat. This can lead to blackouts if the networks are close to capacity, and it's suspected that the 2003 North East Blackout was at least partially caused by them.
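
To put rough numbers on the distance question: the induced voltage scales roughly as geoelectric field times conductor length. A back-of-the-envelope sketch (the 0.1 V/km field strength is an assumed storm-time value for illustration; severe storms can be an order of magnitude higher):

  # V ≈ E_geo × L: induced voltage grows with conductor length.
  e_field_v_per_km = 0.1  # assumed storm-time geoelectric field, V/km

  cable_runs_km = {
      "office Ethernet run": 0.1,
      "campus coax run": 1.0,
      "regional feeder": 50.0,
      "long transmission line": 500.0,
  }

  for name, length_km in cable_runs_km.items():
      volts = e_field_v_per_km * length_km
      print(f"{name:24s} ~{length_km:6.1f} km -> ~{volts:6.2f} V induced")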


Geomagnetically induced currents generally become a problem over hundreds of kilometers. Long-range electricity transmission lines are the main worry, I believe, and solar storm events have occasionally knocked out large grids.

I have no idea what the correlation is between particle flux (the metric reported here) and the actual geomagnetic variation which induces the current (a varying magnetic field causes voltage). Basically, the charged particles zoom past Earth, then loop back from the magnetotail towards the poles. The magnetohydrodynamics cause effects large enough to modulate the magnetic field on Earth's surface.

"We have this long conducting loop" is the issue. The Earth is one component of the loop.


Over-voltage protection is not generally provided by caps, but by MOVs, Zener diodes, spark gaps, etc. [https://en.m.wikipedia.org/wiki/Surge_protector]

Essentially, components that kick in when a voltage exceeds a certain limit, allowing that excess voltage to shunt to ground instead of continuing to build up in the circuit.

Similar devices in pneumatics or hydraulics are pressure relief valves [https://en.m.wikipedia.org/wiki/Relief_valve], and they provide similar functionality: giving an easier/lower-resistance path for the high voltage/pressure so that delicate things downstream don't fail.


There was a blackout in Canada caused by a solar storm in 1989. Has this changed dramatically since then?

What about ground to satellite communication?

That would be radio links. Not many satellites connected to the ground by copper cable.

Yeah, I was asking generally with regard to panic involving "computers and everyday electrical devices", which I thought would be obvious, apologies.

Honestly, I'd be more worried about the upper atmosphere puffing up and taking out low satellites, like what happened with a Starlink launch last year.

As a customer of these types of businesses: yes, we can tell. We do care, and we are ready to drop the shitty products of these bad companies at the first opportunity.
