Terence Tao on GPT-4 (mathstodon.xyz)
207 points by admp on April 12, 2023 | 223 comments



I mean, one of the leading mathematicians of the moment is expected to spend his time collating conference statistics? Does he not have an assistant for this kind of thing?


Terence Tao makes only about half a million or so a year (his salary at UCLA is public). Not sure about other income, but it's pretty much in the same ballpark as good doctors, lawyers, accountants at the Big 4, etc. At that range of income, whether they would have an assistant or not is job-dependent, so no surprise here either way.


Dawg. If $500k isn’t in the ballpark to have an assistant, then nobody outside of actual CEOs or Fortune 500 executives has an assistant. I know people who make less than $200k and have an assistant. I feel certain there are people who make less than $100k who have an assistant.


How does someone making less than 200k afford an assistant?


The assistant's salary comes from the same corporation that pays the figure's salary, not from the figure themselves.

To the GP's point, a friend of mine makes 80k but is still provided an assistant. That's in the entertainment industry, where PAs are dirt cheap and union rules prohibit multitasking.


Their company could realize that giving them one would be a wise use of resources.


The same way that someone can afford to make $200k in the first place: someone else is paying those people and their assistants to get things done.


It's a surprise even if you restrict your point of view to the narrow economic interests of his employer, UCLA, which benefits from Tao's discoveries. Shelling out another 50k or so for an assistant, even if it boosts his productivity by only 10%, is a no-brainer.

If you expand your point of view to the interests of all mathematicians, or society at large, the waste here is even more extravagant.


An assistant who could do what Tao did is not going to be $50k.


The assistant's job would be to take low-hanging fruit non-research administrative tasks off his hands - pretty much like the tasks here.


The assistant would have gone through the documents manually and that’s very much a $50k/yr task


He made ~650K gross in 2021: https://ucannualwage.ucop.edu/wage/

That is a bit more than 1/9th of what the head football coach at UCLA makes, who has two assistants (offensive and defensive coordinators) who each earned a bit more than Tao. Additionally, UCLA still paid more than $3M in 2021 to another head football coach whom they had fired in 2017. The university seems to have fairly clearly defined priorities.


UCLA football is a profit center. It costs 32MM to operate but brings in 37MM in revenue. I suppose Tao might bring in some government grants (maybe?) but I doubt it's anywhere near the same ballpark.

https://www.collegefactual.com/colleges/university-of-califo....


It's not entirely clear what year those numbers refer to. Overall, the 2019-2022 period seems to have been extremely unprofitable: https://www.latimes.com/sports/ucla/story/2023-01-26/ucla-at...

But even IF this were a profitable activity, should a university really be in the business of running an enterprise that literally causes irreversible brain damage (Story about USC, but undoubtedly applies to all football programs: https://www.si.com/college/2020/10/07/usc-and-its-dying-line...)?


You’re confusing the football program for the athletic program as a whole. If you scroll down in the link I posted you’ll see every other athletic program at UCLA is indeed running at a significant loss. Football is the only profitable one.

With respect to the brain damage, he spent much longer in the professional setting than in college, so it's more likely the damage came from there. I don't think colleges should be in the business of restricting what students are allowed to explore based on the possibility they might choose a career where brain damage may occur. And if they did, Engineering should be the first to go. Just think of all the brain damage resulting from the professional engineers of the military industrial complex!


Note that the article is not just about Junior Seau. Of the 12 linebackers who played on the USC team he was on, 5 died before age 50, and he was the only one of them with a significant NFL career (another one played a single season; the other 3 never played professional football).


> Not sure about other income, but it is pretty much just the same ballpark as good doctors, lawyers, accountants at big 4 etc

Either salaries in the US are much higher than I thought, you have a very idiosyncratic definition of "good", or you are significantly overestimating how much people make.


That’s a different topic, I think. So you can consider my definition of “good” idiosyncratic, or substitute it with “highly paid” or something.

The point is that among the people I know who are making 500k, the majority of them don’t have assistants. And I do think that extrapolates to the population of people with 500k incomes too (no proof here, just a hunch).


Don't doctors have a lot of the administrative duties, preparation of equipment, etc., done by other staff in the practice? For practical purposes, wouldn't that count as having an assistant?

In the case of this post, one could argue that collating statistics should be done by ICM staff rather than the committee chair.


Having a personal assistant has become a rarity, but I know a lot of people who are paid far less than 500k and can indeed delegate this kind of work. I work in consulting and I would most likely ask a junior member of the team to do data crunching. Similarly, we share an assistant in my department whose job is to take care of scheduling complicated meetings, organising business travel, and making sure that the documents required for billing have been collected.


Maybe a bit of everything, though in my experience people outside the US significantly underestimate how much successful professionals in the US earn.

For example, this article [1] says that to be in the top 1% of earners in the US a family needs to make $600k or more. There are 330 million people in the US, meaning that 3.3 million people are in a household earning $600k or more. I think that's a lot!

[1] https://smartasset.com/data-studies/what-it-takes-to-be-in-t...


It may be a lot in absolute terms but does not represent the experience of the other 99% of the population.


So you think 'good accountant' and 'world class mathematician' are equivalent, based on their salary?


No I don’t, but they have to pay for an assistant all the same. I was describing how things are, not how they should be.

The university could have paid for Terence Tao’s assistant, but then they could have paid him more too.


I think their company or institution should be paying for the assistant, just as they pay for the office, security, executive, HR, etc.


Only?


I thought about that word, and whether to put air quotes around it. But Terence Tao is one of the few who could claim to be one in hundreds of millions, if not billions.

So yeah, “only”.


So, you’re saying that collating conference statistics in a spreadsheet is worth a $500K yearly salary? I would have thought that you can get that cheaper.


The entire world now has a brilliant assistant for these kinds of things - it's ChatGPT.

Every numbskull to every professor now has the ability to generate information on a scholarly level and solve problems that could take a human years of training and experience to produce.

I'm really trying hard not to get all doom and gloom about this stuff, but I'm honestly worried. It keeps me up at night.


I try to maintain an optimistic outlook because the AI race is now unstoppable anyhow.

Mass unemployment is a likely outcome, but modern states already have amazing safety nets in place and they're not far away from a universal basic income.

As technology has improved, so has life on almost all relevant metrics (education, longevity, poverty, crime etc.) and even though technology amplifies greed, criminals do not prevail because they are heavily selected against seemingly regardless of the level of technological sophistication.

As for paperclippers, there is actually no indication that larger neural nets misinterpret human intention. Rather, larger NNs mostly become better at understanding. Current AIs, when prompted to make paperclips, do not interpret this as us wanting to turn the entire world into paperclips, but in terms of the more sensible interpretation of meeting the economic demand for paperclips, hence giving advice on how to optimize the machinery for producing paperclips etc. Small models do tend to find weird, unintended shortcuts, but my hunch is we are seeing this less with very large NNs.

Lastly, we can always employ AIs to control other AIs. AI scammers will be counteracted by scam detection AIs. Same for fake news, mind control, drone warfare, propaganda, etc. AI will allow us to make better political arguments against mass surveillance and for freedom, and it will make journalism more efficient at exposing crime and injustice. Demonopolizing AI is hence extremely important, so that every misuse of AI or misaligned AI can be counteracted by other AIs.


>> Mass unemployment is a likely outcome, but modern states already have amazing safety nets in place and they're not far away from a universal basic income.

Very few Western and East Asian nations have that. 80% of the population doesn't have a safety net.


True, but AI will accelerate progress and make every product extremely cheap, almost free. It will be extremely cheap even to support a population 1000x as large as today's (including cleaning up after them, filtering water and soil, etc.).


>Every numbskull to every professor now has the ability to generate information on a scholarly level

I'm hopeful about the future use of AI, but this strikes me as potentially overly optimistic. Or possibly misinterprets what academics do.

ChatGPT is very good at collating existing information. I don't know that it is anywhere near the level of creativity necessary to generate novel ideas (especially those that have to be rooted in reality to solve engineering problems).

For example, if you ask "How can we redesign a Wankel engine to achieve 2% better efficiency?" it does give a good summary of the main mechanisms that impact efficiency (thermal losses, friction, combustion efficiency) but it doesn't give any actionable, novel ways of implementing them. It's basically a 101 course summary of combustion engines. "Generate better materials" is not an actionable solution. So unless you think a freshman/sophomore understanding is all that's needed to solve our big problems, we've still got a ways to go before we can turn the reins over to AI.


>has the ability to generate information on a scholarly level

I think a lot of people misunderstand what a scholarly level is. ChatGPT is not even close.


It’s academia. Only admins get secretaries. I’m only joking a little bit, as I once was a professor


I was going to make the same joke! If he were an Assistant Vice Under-Dean he'd have 10 assistants, but sadly he's just an academic.


His assistants are probably better employed doing research than messing around with spreadsheets. And the secretary might not have the necessary competence for such a task. And if they employed a person for just such tasks, they would have loads of downtime.


Yeah, I was thinking that 50 years ago he’d probably have had a secretary or admin person to take care of that.


As a professor, my understanding is that professors used to have assistants for these sorts of things in the distant past, but this is much less so nowadays, outside of deans or higher-ups. In a university setting, only people in leadership positions seem to be able to get them. This isn't true everywhere, e.g., my academic medical collaborators typically have assistants and so do the lab heads associated with non-profit research organizations (e.g., Salk). Department heads may have them, but even that seems to be happening less and less.

Unfortunately, professors currently have to spend a lot of time doing tedious administrative/organizational work unrelated to research/teaching.


Isn’t this what graduate students are for?


As a mathematician, he probably doesn't bring that much grant money in. The money to hire an assistant doesn't exist, especially in a public university like UCLA. Assistants tend to be more common in fields funded by the NIH, because the grants and the overheads are much bigger.


GPT is the assistant. Saves on employees; that would be the pro-corporate take.


Seems like his assistant is GPT-4 :)


I used GPT-3 yesterday for the first time myself - I wanted some code to list the DynamoDB tables available to me (I had to run it from an EC2 instance and I wanted to see what that particular EC2 instance could see). I was pretty impressed - it put the code together so I asked it for a POM and it put that together. The POM failed with an error because it was trying to load from Maven central using HTTP, so I asked ChatGPT how to fix that error and it told me how to update my settings.xml file (it did make a mistake, but it was an easy one to fix). Like - I knew how to do all of that, but it probably would have taken a half hour vs about 5 minutes with ChatGPT.
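
For comparison, the same listing task in Python with boto3 is only a few lines (a sketch; the code described above was Java, per the POM, and the region below is just an assumption):

    import boto3

    # List the DynamoDB tables visible to whatever credentials/role this
    # instance is running with. Region here is just an assumption.
    client = boto3.client("dynamodb", region_name="us-east-1")

    paginator = client.get_paginator("list_tables")
    for page in paginator.paginate():
        for name in page["TableNames"]:
            print(name)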


ChatGPT is great with well-known tools and libraries. I've been able to build HTML/JS apps in minutes. Then I tried building a MAUI app. It started out pretty well, but when I tried to add charts it failed (it's important to mention this was my first MAUI app). I tried different approaches; in the end I went back to HTML/JS.

That led me to a realization: people will likely discard lesser-known libraries or technologies more easily than before. Popular technologies will become even more popular, since we'll be 10x more productive with them compared to lesser-known ones, where we won't have LLM superpowers.


I agree and have had the same thought. This is gas on the fire for winner-take-all dynamics. It's already hard for new languages and systems to catch up fast enough to technically weaker existing options because those have mature tooling. But when the robot is fluent in Java and can't understand $newlang it's going to be even harder.


It's way too early to tell. I'm betting that projects will be able to fine tune LLMs against their code bases and documentation, making it easier to adopt new open source projects because mini-GPTs will help new users get set up with their exact use case instead of having them dig through a bunch of semi-irrelevant examples.


When I ask it about Swift, I'm often surprised by how good it is.

When I ask it about Objective-C it bullshits and fabricates APIs.

When I ask it about python or bash scripts it is astonishingly good


A couple of weeks ago I made an absurdly efficient arithmetic expression language compiler in Python, complete with an RD/Pratt parser, a control-flow-graph IR, an optimizer (dead code elimination, algebraic simplification, etc.), and a simple x86 codegen. ChatGPT wrote most of it.

Then I asked chatgpt to port it piece by piece to C.

Took a while to get the hang of GPT-4's token length limitations, as well as fixes here and there... 1.5 hours.

But damn... Most programmers are just not capable of doing this at all.


Exactly. It makes mistakes, but so do I. It makes mistakes far faster, however, because it takes about 5 seconds to finish each attempt, so they seem far more prevalent. However, I can fix its mistakes almost as fast as I can fix my own, so this speeds up iteration nicely. It's not a "perfect tool", but nothing is.


The most shocking thing for me here is that one of the best mathematicians of our times has to do fucking random data-to-Excel gruntwork.

What AI seems to be replacing is mostly bullshit jobs in this case, that should have been handled by someone else.


This doesn't surprise me.

Our species has proven itself to be exceptionally bad at finding the middle ground between giving individuals the freedom to self-direct and allocating resources efficiently at scale.

These resource allocation problems exist at the national level, but also within even some of the smallest organizations and certainly everywhere between.

Outside of FAANG, how many development teams really have a dedicated DevOps teammate? A DBA? A tester? A sysadmin? Hell, a sysop?!

SWEs outside of FAANG are so accustomed to wearing dozens of hats that many would still consider the wisdom in "The Mythical Man-Month" a pipedream.


Even in some of the FAANG that's not the case. Not sure about all of them though.


There's quite a bit of drudgery in academic positions, organizing/running meetings/conferences, editing journals, scrounging for reviewers, grant writing, and navigating all of the red tape of universities.


Someone ought to start a charity to hire personal assistants for people like Terence Tao. Or get some government tech grant working on that. That would probably be one of the most efficient investments in progress you could make.


It is so depressing that we are using ChatGPT for data collation. You can never ever know if it is going to silently alter your data unless you check it for correctness afterwards, which no one is doing. The tech won't seem so awesome when someone is left off the credits for a talk (as in this case). We are adding a new point of failure, and no one should believe any curated stats from OpenAI about GPT-4 being X% more accurate for a second.


Yes, that's why I get people to do these repetitive tasks instead. Humans are well known to have a 0% failure rate, especially when they are bored out of their minds.


>You can never ever know if it is going to silently alter your data unless you check it for correctness afterwards which no one is doing. The tech won't seem so awesome when someone is left off the credits for a talk (in this case).

Terry mentions running some checksums after processing on the linked page. I'd imagine most people working on serious tasks do something similar.


https://mathstodon.xyz/@tao/110172819887751038 I'm glad that he spot-checked, but the checksum thing is nonsense. To fully check that the output was correct (not just the right raw count of speakers but properly apportioned), you would need to compare against a collation of the data produced by another means (i.e. manually going through and building a spreadsheet, in this case), which would be doing the work this is claiming to avoid.


How are folks parsing PDFs with GPT4? Every 3rd party tool that supposedly uses GPT has failed for me on even the most regular of PDF bank statements. Currently I am parsing the text contents manually using pdfbox to reconstruct historical bank statements, but this is error-prone due to page breaks and run-on lines.


I had success with having it guess the formatting, which was a pretty cool experiment.

I had a table in a PDF of registers and associated information, and I copied and pasted the text directly into ChatGPT as a big block, and asked it to structure the data as a table again based on its best understanding of the data, knowing it came from a table. To my surprise, it did a really good job. A couple small edits here and there were needed to change some formatting and it missed a couple values, but overall it took me a couple minutes to edit and I was on my way.


Chrome's "Search images with Google" is already good at extracting text. I wish they could recognize table formats within an image or a page and allow them to be exported to Google Sheets. That would be a game changer.


I think he just copy-pasted everything from the PDF to GPT4? He says:

> with the only tedious aspect being the cut-and-paste between the raw data, GPT4, and the spreadsheet


Note: I haven’t worked with PDF bank statements.

My current solution is pdfplumber → GPT-3 API. I played around with a few different options, and this is personally what’s worked best for my use cases.
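
A minimal sketch of that kind of pipeline, assuming the 2023-era openai Python package (ChatCompletion), a hypothetical statement.pdf, and a prompt of my own invention:

    import pdfplumber
    import openai  # pre-1.0 openai package; assumes OPENAI_API_KEY is set in the environment

    # Pull the raw text out of the PDF, page by page.
    with pdfplumber.open("statement.pdf") as pdf:  # hypothetical file name
        raw_text = "\n".join(page.extract_text() or "" for page in pdf.pages)

    # Ask the model to restructure the extracted text; prompt and model are placeholders.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You convert messy bank-statement text into CSV."},
            {"role": "user", "content": "Extract date, description, amount as CSV:\n" + raw_text},
        ],
    )
    print(response["choices"][0]["message"]["content"])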


Same deal for me; I've tried a few AI PDF pipelines with no good output. If I feed it cleaned-up delimited text then everything's good[1], but by the time I get to that point I might as well be using Tabula + Orange / R. I'll keep trying though, because damn, PDF files need to go die under a rock.

[1] Well... kinda. The model has some weird ideas about what an edge table is.


I'm guessing that manually feeding it a table at a time, with prompts tailored to the shapes you see in that table, is more effective than using a tool that tries to do that for you. Automated tools on top of GPT will be great when they work but they introduce a new communication joint where things can fail.


It is kind of a shame for our academia (society?) that a mathematician of Terence Tao's caliber had to do data entry jobs like this manually before.


I'd really like to see sessions of people using things like ChatGPT for non-trivial things.

People keep saying it saves them time and look forward to the future.

For the life of me I can't figure out how to use it.


We decided at work to run a little experiment with GPT3 to see if/how-much it was 'worth it'.

Since baseball is back and most of us are fans, we decided to write a baseball simulator. We each had a Friday afternoon to write one up. Half of us got to use the free GPT3, and half had just regular googling. After the jam, we'd compare notes at the bar and see what the difference, if any, was.

Holy cow, was there ever a difference.

Those without GPT3 got pretty far. Got the balls and strikes and bases and 9 innings. Most got extra innings down. One even tried integrating ERA and batting stats into the probabilities of an event occurring, but was unable to get it done.

The GPT3 group was estimated to be 2 weeks' worth of work ahead of the googling group. Turns out, there is a whole Python library for baseball simulations and statistics. The googling group didn't find that, but GPT3 suggested it outright on the first query for everyone using it. This group got the basics of the game done in ~30 minutes. Managed to get integration with actual MLB statistics. Built somewhat real physics simulators of balls in play and distances, adjusted for temperature and altitude. Not all of them at once, but a lot of really great stuff.

Aside: Did you know that MLB publishes, in real time, all 6 degrees of freedom for a ball, from where it leaves a pitcher's hand to where a catcher/batter interacts with it? They put out the spin rates in three axes! Wild stuff.
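
(The comment above doesn't name the library, but pybaseball is one package that exposes exactly this kind of Statcast data; a minimal sketch, with column names worth double-checking against the current docs:)

    # Assumption: pybaseball, or something like it, is the library the GPT group found;
    # the comment above doesn't name it.
    from pybaseball import statcast

    # Pull pitch-level Statcast data for one day of games.
    pitches = statcast(start_dt="2023-04-07", end_dt="2023-04-07")

    # Each row is one pitch; release speed and spin rate are among the columns.
    print(pitches[["pitcher", "release_speed", "release_spin_rate"]].head())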

Our conclusions were that it's totally 'worth it' and is a ~20x multiplier in coding speed. It spits out a lot of really bad code, but it gets the skeletons out very quickly and just rockets you to the crux of the problems. For example: it gave out a lot of gibberish code with the Python baseball library, like trying to pass a date into a function that only takes names. But it gives you the correct functions. Easy enough to go and figure out the documentation on that function.

Like I said, it's a ~20x multiplier for our little experiment.

Action Items for management: Pay whatever you have to and let us use it all the time.


So how would GPT fare at writing a simulation for a problem ... that has no source code or even literature for it in the (crawlable) public domain?

Also, as to what the GPT group was able to produce -- sure it was a lot of code, and apparently quite a bucket of features -- but did it actually produce a usable simulation? Or even a coherent statement of what a "baseball simulation" should do, actually, and how its accuracy is to be measured?

I'm not casting aspersions here - I'd really like to know.


It does not produce a usable simulation of baseball right out of the box. It'll give you skeleton code that you kinda have to then fill in yourself. But it's really good skeleton code. Like, the functions are used wrong, but they're the correct functions. The explanation of the code that it gives you is really spot on, though. Like, yes, those are the correct steps a coder should implement.

It's easy enough to try it out for yourself too! Give yourself a challenge and see where it takes you.


I'll definitely give it a whirl sometime. And I appreciate the detailed field report.

It's just that, if someone gave me 3 hours, and asked me to come back with constructive, actionable progress toward creating a simulator for X (where X is sufficiently rich and complex, like baseball) -- I wouldn't mess around with skeleton code at all.

Instead I'd try my best to come up with a statement of what the simulator should do, and why.


Yeah, in our case it was baseball and most of us are fans. So, we all knew what to do and what 'good' looked like. It was still pretty open ended though, which was fun. It was good to see what my coworkers came up with and the different approaches taken.


It has allowed me to write a ~1200-line Python program that tests power supplies, sending commands over our LAN to multiple instruments and serial commands to the supplies themselves, and stores all the readings and results in Excel, all nicely formatted.

My knowledge of programming doesn't extend far beyond "the basic theory of programming" and it took about 3 days total. Without GPT4 it would probably have taken me 3 weeks. Nor would it even have been attempted, because the old tester "worked" with lots of manual intervention and frequent data losses (it ran on a WinXP laptop from 2004, and relied on analog syncing signals between devices).
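
The rough shape of that kind of script, as a sketch with made-up port names and SCPI-style commands (real supplies define their own syntax), is:

    import serial                      # pyserial
    from openpyxl import Workbook

    PORT, BAUD = "/dev/ttyUSB0", 9600  # hypothetical port; commands below are SCPI-style guesses

    wb = Workbook()
    ws = wb.active
    ws.append(["setpoint_V", "measured_V", "measured_A"])

    with serial.Serial(PORT, BAUD, timeout=2) as psu:
        for setpoint in (3.3, 5.0, 12.0):
            psu.write(f"VOLT {setpoint}\n".encode())   # set output voltage
            psu.write(b"MEAS:VOLT?\n")                 # query measured voltage
            volts = float(psu.readline().decode().strip())
            psu.write(b"MEAS:CURR?\n")                 # query measured current
            amps = float(psu.readline().decode().strip())
            ws.append([setpoint, volts, amps])

    wb.save("psu_test_results.xlsx")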


I used GPT-4 to write a simple chat website so I can talk to it without having to do manual API calls. It created a working version from the first try plus me telling it about a few errors. In an hour I had improved it to a really polished look.

I recently used GPT-4 for matplotlib. I wrote a simple PDE solver and wanted it to create a function that saves a simple 3D array as an animated 2D plot. It did it right away. I could ask it for improvements and it did those too.
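
For anyone curious, that task is compact with matplotlib's FuncAnimation; a sketch of one plausible approach (the post doesn't say which API GPT-4 actually used):

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation

    def save_animation(frames, filename="pde.gif"):
        # frames: array of shape (n_steps, ny, nx); saved as an animated heatmap.
        fig, ax = plt.subplots()
        im = ax.imshow(frames[0], origin="lower", animated=True)

        def update(i):
            im.set_array(frames[i])
            return (im,)

        anim = FuncAnimation(fig, update, frames=len(frames), interval=50, blit=True)
        anim.save(filename, writer="pillow")  # pillow writer avoids needing ffmpeg
        plt.close(fig)

    # Toy stand-in for PDE output: a spreading Gaussian blob over 40 time steps.
    y, x = np.indices((64, 64))
    frames = np.array([np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (20.0 + 5.0 * t))
                       for t in range(40)])
    save_animation(frames)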

Both of these tasks are easy, and if you are working every day in web development or with matplotlib I am sure you can do them in 5 minutes. But in my case, each of them might have taken half a day. And even if I could do it in 1 hour, that would be 1 hour of furrowed-brows staring at stackoverflow. Using GPT-4 is just extremely easy.

From my experience I claim that GPT-4 can also solve more complex problems. I think if I iteratively ask for features on top of what it has given me, I can get up to 3-4 times the amount of features before the whole code becomes too complex to handle. This is just a guess.


Depends on what you do. I had some Pandas code that I wanted to be slightly faster. I pasted it in and asked it to optimize the code. It did exactly that, with an explanation at the bottom of why it was doing each particular thing. It was correct, the code was faster. I’ve used it for a bunch of things like that.
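
The typical shape of that kind of optimization is replacing a row-by-row apply with a vectorized expression; a generic sketch with hypothetical columns:

    import numpy as np
    import pandas as pd

    # Hypothetical data: the columns below are placeholders, not the original code.
    df = pd.DataFrame({"price": np.random.rand(100_000),
                       "qty": np.random.randint(1, 10, 100_000)})

    # Slow: Python-level loop over every row.
    slow = df.apply(lambda row: row["price"] * row["qty"], axis=1)

    # Fast: one vectorized multiplication over whole columns.
    fast = df["price"] * df["qty"]

    assert np.allclose(slow, fast)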


How much text was the code you fed it?


complicated tasks are just a series of trivial tasks, wouldn't you agree?


It depends on the context. A complicated spider web is just a series of connected strands, but pulling on one impacts all the others to various degrees. The point being, a contextual understanding of the systemic effects becomes important when deciding what to do on each "trivial" task. To the OP's point, I'm not sure it's been convincingly shown that ChatGPT has a strong contextual understanding. (In fact, that's also a major shortcoming of humans when we over-simplify complex models.)


Heart surgery is just a series of simple cuts and stitches!


wow, the site is so slow. getting "This resource could not be found" errors on reloads.

anyway, here's the text:

> Today was the first day that I could definitively say that #GPT4 has saved me a significant amount of tedious work. As part of my responsibilities as chair of the ICM Structure Committee, I needed to gather various statistics on the speakers at the previous ICM (for instance, how many speakers there were for each section, taking into account that some speakers were jointly assigned to multiple sections). The raw data (involving about 200 speakers) was not available to me in spreadsheet form, but instead in a number of tables in web pages and PDFs. In the past I would have resigned myself to the tedious task of first manually entering the data into a spreadsheet and then looking up various spreadsheet functions to work out how to calculate exactly what I needed; but both tasks were easily accomplished in a few minutes by GPT4, and the process was even somewhat enjoyable (with the only tedious aspect being the cut-and-paste between the raw data, GPT4, and the spreadsheet).

> Am now looking forward to native integration of AI into the various software tools that I use, so that even the cut-and-paste step can be omitted. (Just being able to resolve >90% of LaTeX compilation issues automatically would be wonderful...)

Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is. Why was the data so disjoint to begin with? Why is Latex so hard to work with?

In this case GPT-4 is used to solve a problem that shouldn't have even been one to begin with. The administrators of the ICM could've simply exported the raw data as a Google Sheet (for example) and his problem could've been trivially solved even without GPT-4.


> Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is.

incredible what mental hoops people will jump through to disqualify AI!

the reality is that a lot of things have a terrible UI with unorganized data. that's why this tool is so amazing - because it doesn't matter anymore.


To reinforce this point, we have known for a very long time that better UX and data quality has innumerable benefits. Remember back when everybody was excited about web 2.0 and all the amazing mashups which would soon become possible?

If you are producing data then exposing it in a nice programmable format is an extra cost and generally provides you no benefit. It usually hurts you, if people stop visiting your site and see fewer of your ads!

This is "really" a problem of incentives. It is usually not possible to capture any of the positive externalities of exposing your data. So maybe we could convince everybody in the world to switch to using different browsers with a native micropayment system; that might incentivize everybody to release all data as clean machine-readable tuples.

What I'm saying is, the phrase "Better UX and data quality" ignores just how hard that solution really is. It turns out training an LLM over most of the internet is _easier_ than global coordination.


i.e. semantic web was dead on arrival.


But chatgpt can crawl a semantic web and use it without us knowing.

I have asked a langchain bot for Wikidata IDs for specific places and links to the page, then to read it and answer facts about the places, and got very good results instead of made-up numbers.

Wikidata links to FIPS codes, OSM ids, GeoNames and that gives us an opening to link against the cool datasets from Flickr, Foursquare and others who have created gazetteers.

To me, Semantic Web was dead on arrival because of its UX, but now a semi-smart agent can help us get past the UX problems and jump from plain text to json output.


I think it's a great tool but I think a generous interpretation of the OP is that it's solving problems that aren't necessary in the context of a better overall system/process.

A woodworking example is that a planer is great tool that helps you make nice flat surfaces. But, to a certain extent, it's a downstream fix that wouldn't be necessary if a carpenter was using a better overall process. I.e., if their upstream process for cutting/ripping wood made nice flat surfaces to begin with, the awesomeness of the planer becomes moot. (Apologies to the legitimate woodworkers if this analogy is off).

Where tools like GPT becomes invaluable is when you have no control over those upstream processes but still need to get the job done. But leveraging a tool for a downstream fix when upstream fixes are possible is usually a less-good approach to creating good systems.


You mean it "only" solves problems in the real world and not in the ideal world?


That's not what I mean, unless you assume that you have no control over other elements of the process.

To torture the woodworking analogy, your assumption is that the carpenter has no control over ripping the boards. In some instances that may very well be the case, but there will also be instances where the carpenter does have influence over creating the boards, or even wholesale control over ripping them. In those cases, using a planer to fix poorly ripped boards may not be the best approach.


How often do you actually have control over the entire process of anything? Even in website development, unless you're going to handle raw TCP sockets, you're going to build on top of someone else's tools one way or another. And in the business world, you almost always have to deal with other teams, other people, other priorities. Even when you run a company, not all of your employees and partners can always do things exactly the way you want. Having a flexible tool that works in the real world, on real-world data, on top of real-world processes, is incredibly valuable.


I think you're engaging in some dichotomous thinking. I'm not making the claim you'd have to have "control over the entire process". What I'm cautioning against is just looking to the tail end of a process and assuming that's where you need to add leverage.

Even in your examples, yes, you have to work with other teams. "Control" doesn't mean you have dictatorial control over those teams. But it does sometimes mean you have to build relationships, leverage what you can, and explain the value to those that do have some modicum of control. The idea that we just throw our hands up and jump to workarounds is often an excuse for taking the short-term easy path at the expense of a better long-term solution.


I like your planer analogy, though it is indeed off for woodworkers, but in a way that's rather nuanced, so it doesn't inhibit getting the idea across to normal people who don't know a lot about the intricacies of working with wood.

There are a couple of reasons why it would be hard to change the upstream process to not necessitate planers. The main one is that logs are typically ripped into boards when the wood is still green, and in the process of drying, boards change shape and dimensions: they bow, cup, warp, and shrink, and you might still need a planer to bring them back to flatness and to desired final thickness.


I guess the only way to change it would be to rip them when they are green, dry them to a decently low moisture content, and then plane them again. It's still a bit off, since wood is never static and the differential in moisture between the shop and your shop can also change the wood. You really do need to dimension it after it has stabilized in your shop for a spell.


Nope.

The cost of implementing a better process for all carpenters is significantly higher than all carpenters still using the bad process plus _one_ AI being able to clean it up for the carpenter, plumber, translator, developer (you name it, you got it).

Not even entering laziness/corposlowness gardens etc.


This is a misunderstanding of how processes typically work, for a couple of reasons. For one, it assumes the "cost of implementing better process for all carpenters is significantly higher". This may be the case for Tao, where he has limited control over the inputs, but probably not the case for the woodworking analogy, for a variety of reasons. You are essentially advocating for "rework" to fix problems, which is considered an unnecessary waste in process design.

"Corposlowness" is just another name for "bad processes". It supports the claim rather than negates it. Using AI to overcome bad bureaucracy makes it a workaround, not an idealized process. What often happens when implementing workarounds rather than good processes is that the workaround can create bloat and waste of its own and overtime, not really fix the problem. Like hiring more administrators for a large organization, they can take on a life of their own, eventually becoming divorced from the problem they were intended to solve.

Again, I'm not saying that AI is misapplied in Tao's case. I'm just cautioning that it's not a panacea for bad processes. In many ways, it can be misused as a band-aid for bad processes, just like creating excess inventory is a band-aid for bad quality control.


you have better explained my own point, haha


i still don't get it. i only understand libraries of congress and cars.


> incredible what mental hoops people will jump through to disqualify AI!

There are high peaks and troughs in AI buzz right now.

Yes, on the one hand you've the but-can-it-dance crowd.

On the other hand, Terence Tao on GPT-4. I mean, I'm not weird for expecting the story here to either be about GPT4 helping Terence Tao with some difficult newfangled proof, or Tao talking about the math behind large language models. Instead this boils down to

GPT4 even does the work of some of the smartest mathematicians in the world[1]

1: by parsing some web pages and PDFs for their meetings


> GPT4 even does the work of some of the smartest mathematicians in the world[1]

> 1: by parsing some web pages and PDFs for their meetings

Like that old joke about the guy who impressed people by claiming he had helped a brilliant mathematician solve a problem that had stumped him. And the punchline is something like "yeah, and it only took me a few minutes, all I had to do was replace his timing belt."


The insight I think is that if smart mathematicians can use it to do drudge work they then have more time available to do smart maths.


I wonder why he doesn't have a secretary. Isn't he like one of the top mathematicians in the world? Why the heck are they having him do drudge work?


Profs don't get secretaries. There's usually one admin assistant for the entire dept. and they're busy with applications for admissions, scholarships, grants, etc.


Or grad students, for that matter. The notion that Tao should have to monkey around with Excel in person is just bonkers.


He'd be a seriously bad professor if he wasted the time of the people he was supposed to be teaching with copying badly formatted data for a talk.


Mathematicians do not do math all day. There is probably a point of diminishing returns where doing more math is not useful.


The first time you copy and paste your disorganized data into ChatGPT and then into a spreadsheet, it's fun because you know how much longer it took before.

The hundredth time you do it you're going to be like "why is this so f'in annoying still."

Today's interface to language models is subpar for a lot of applications. Lots of room to improve that. A tool can be both amazing and still be just another step on the road to something truly seamless - just like how now it's "tedious" to use the computer for it instead of mailing/faxing paper forms around and filling out tables by pen and pencil.


> the reality is that a lot of things have a terrible UI with unorganized data. that's why this tool is so amazing - because it doesn't matter anymore.

How naive. How do you know it's right? Ah, you have to manually do the calculation anyway to confirm; this is what Tao ended up saying in a reply to someone asking as much.

AI is great, but it's not a silver bullet, since its correctness can never be 100% under the current LLM framework.


How naive to think regularly structured spreadsheets are the answer to this problem. There is a famous joke about "all your budgeting spreadsheets being useless when you discover you have a formula error in one cell that propagates throughout the whole spreadsheet".

You have to do those same confirmation calculations anyway when you use a spreadsheet. In my experience the utility of something like what ChatGPT can do is still unparalleled.


I'm convinced people like you are in for such a rude awakening.

I don't understand how you can hold this position with AI considering it's only the beginning.


Ensuring correctness by a human is existential to the work itself, not a mere annoyance. Yes AI can do it, but the more work you give to the AI, the less control you will have on the quality of the work.

This AI future you're wanting is an Idiocracy-like world where nobody knows how anything works and everything is in decay.


> considering it's only the beginning.

Some of us are just burned out on the hype cycles and prefer not to count our chicks before they hatch.


I think we're all in for a rude awakening regarding what societal changes AI actually makes, but that's neither here nor there.

You're making the same mistake you're accusing them of making - assuming to know the future at the beginning. You're assuming that fixing these issues will prove to be trivial or at least inevitable. Sure, recent progress has been swift, but if you recall, it was damn-near stagnant for decades. Some were even claiming we were in the middle of an "AI Winter" and could not see the spring!

Based on all currently-available evidence, the current techniques that we utilize for generative AI are unreliable, in terms of accuracy of derived facts. It will require either a different or complementary approach to iron that out, or we're going to have to start seeing some _very interesting_ emergent properties from scaling higher. This stuff could show up tomorrow, or it might never show up at all! But the _current_ LLM framework does not look like it can do what we're looking for here, not reliably, certainly not 100% reliably.


ChatGPT is impressive, but I do not think it is a silver bullet, even for data cleansing and processing tasks. I would use your words and say that people who think ChatGPT solves the problem of data cleansing once and for all are the ones who will be in for a rude awakening.


I can't wait for 3D printers to take over manufacturing or self-driving cars to self-drive. Still waiting for the metaverse. Or crypto to take over banking. We are not even at those points for AI. You might see the future... let's get a 3D printer in every home first.


What’s the use case of a 3D printer in every home? Why is that even relevant in comparison to how useful AI can be? The use cases are already there, today.


Ask those people who rode the hype wave 10 years ago. They had grand plans to make Star Trek replicators happen a few centuries before they were predicted.

I suspect the satire whooshed over your head.


Can’t wait for the internet, web, cell phones, or social networks to change society… AI is closer to these than the others.


AI only has to be as reliable as a human. The task is ultimately trivial and clearly in the capability of GPT-4. It is easy to statistically verify if GPT-4 has higher or equal correctness than a human. Of course Tao did not do this here, but it will be done.


Right, but at the very least we can strive for consistency and efficiency, if only to save cost and energy. ChatGPT may save time, but not energy or money, assuming your idea is to just totally rely on LLMs for data processing (which is why it "doesn't matter" for you anymore).

If somehow magically from the beginning of computing we had a natural language interface to a computer's operations, we would still have arrived at particular standards/specs for data formats. There would still be something like xml/json/csv. Indeed, I'd wager there would still at a certain point be some kind of high level programming (or otherwise formal) language adopted, to answer the particular clumsiness of natural language itself [1].

Putting aside any issues of reliability, its simply not sustainable (economically, environmentally) to put all our work into this stuff. Even if it does shine with one off stuff like this.

1. https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...


And to the extent that it does matter it has already been shown that in many areas the best UI is simple human readable text.


AI is the better UX! I've been using ChatGPT to do all sorts of things--proofreading, generating "diffs" on unstructured documents, getting text web-ready by substituting HTML-entities--that I already have tools for but use so infrequently that I waste time looking up exactly how I need to specify the request. Using ChatGPT, I can just use English. It's amazing. And frustrating, because now I want this built in to everything, and it's not ... yet.


And AI can be used to make improvements to existing UIs; so much of existing UI is just so damn stupid.

If you don't like this tool, don't use it! If you like, it, use it!


GPT = a new UI: more conversational, with easier transformation to commands.

I think a lot of people miss that because it's being shown in search instances and companion apps.


> a lot of things have a terrible UI with unorganized data. that's why this tool is so amazing - because it doesn't matter anymore.

Do you imagine a future where machines output data that can barely be read by humans, but can only be managed with the help of AI? Honest question.


This is an advertisement for something like the semantic web, not AI or language models.


And let me know when the semantic web becomes successful after 2+ decades trying. The semantic web was always pretty much doomed to failure because it imposed a large cost on the content creator, who themselves get little benefit out of the structure.


The bigger reason why it was a failure is not that it imposes a cost in terms of work on the content creator; it's that it imposes a lack of control on the party that considers itself the 'owner' of the data. If every webpage that right now is plastered with ads and various junk to get you to subscribe/form a relationship/amplify the content were just the structured data, and a default way to render that data were present in every browser, then most content monetization strategies would fail. Follow the money, not the effort.


Nail on the head. But can you imagine what it would have been like if hakia had been a thing instead of the SEO-spam, ad-infested Web that Google and Co. gave us?


That's a hard question. I really have no idea, but I would have loved a browser that takes in structured data and displays it in a way that I control any day over the junk that the web has become.


> becomes successful after 2+ decades trying

Rich pro-ai argument.


Fair point. Still, the semantic web is dead because we already solved the problem with a better solution, which is open APIs.

The idea that everything would work great if only all of our data was structured and easily parseable everywhere just leads me to ask "Do you not interact with humans on a regular basis?"


Actually I know a guy who's been working on the semantic web for like two decades, and his project is finally gaining some traction right now precisely because he can now leverage AI to sort it out and turbocharge his application.


I'm very skeptical. It seems to go against the grain.


What do you mean? "Semantic web" has been around since XML was touted as the data silver bullet, promising structured and semantically organised data 25 years ago. Countless startups tried: XML databases, digital asset management, countless oh-so solutions... Until today.


Not to mention elimination of pointless work.

I needed to gather various statistics on the speakers at the previous ICM

Why did this work need to be done? Are even the people who say they want this data actually going to use it for something productive? Is there something revelational in this data?


He was the chairman of the ICM structure committee. Having stats on how previous ICMs were organized is obviously useful to him. It's not "people who want this data" who ordered him to gather it (not that anyone could order Tao anything), he wanted the data to use it himself. He had the choice between sifting manually through dozens of webpages or automating the task.


> Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is. Why was the data so disjoint to begin with?

Because the world is not a database. His sources were formats meant for human consumption, not machines. This will never change; people are lazy, greedy, or fear "data leaks", so we will never have a machine-first format that everyone will use.

I mean, that guy is using LaTeX, others have Markdown, or org-mode, or Microsoft Word; none of those are meant for easy scripting. This is why AI will be such a game changer, because it will close this gap, converting human formats to machine formats when necessary.


I find the "human consumption" here funny, because it's the "human" part which led the author to turn to machines which can do it better.


Well, yes, this is kinda the whole point of the argument. That's what LLMs bring to the table - machines that can better deal with data and documents meant for human consumption.


> Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is.

Umm, and I want a pony? The world doesn't come in perfectly structured data formats. I was pretty amazed when I pasted in some of my medical lab test results and ChatGPT was accurately able to parse everything out and summarize my data. It worked extremely well.


> The world doesn't come in perfectly structured data formats.

In Tao's case, the data was organized already, simply not given to him.

In any case, Tao, per the post, had to manually calculate to confirm that GPT-4 was correct (he implies as much in a reply to a comment asking if he checked for correctness).


It doesn't matter what the data format was originally in - what mattered is that Tao didn't have access to that format.

I'm sure the company that did my medical labs also has my data in a structured format somewhere. So what, should I call them up and demand they give me my data in a spreadsheet? I can go down that useless path, or I can paste my data on ChatGPT and get results in 15 seconds.


> I'm sure the company that did my medical labs also has my data in a structured format somewhere. So what, should I call them up and demand they give me my data in a spreadsheet? I can go down that useless path, or I can paste my data on ChatGPT and get results in 15 seconds.

Yes, go ahead and send OpenAI all of your HIPAA data.


Lol, the good ol' "HIPAA bogeyman" strikes again.

It's my data. The entire point of HIPAA is that I own the data and I can send it to whomever I want if I decide to. I get value out of it, others may not choose to do it, that's their right. But I'm pretty sure sharing my CBC results is not going to be the death of me.


Checking a table for correctness and filling up a table are not the same thing. I also check for correctness when I manually count entries. It's an extremely error prone task.


How exactly could you check for correctness without effectively finding out the answer? I mean, this discrepancy is fundamentally the P = NP problem.


NP is the class of problems for which it's hard to find the answer but easy to verify. The fact such a class even exists should hint to you that finding the answer and checking correctness can be two different things.


The point of my comment is that whether P = NP is still to be determined in the first place. In the case of ChatGPT, it's easy for it to respond, but how difficult is it for you to verify whether it was correct or not? How does that compare with the difficulty of doing the task to begin with?


Let's take an example from the demo of =GPT3() as a spreadsheet feature: https://twitter.com/shubroski/status/1587136794797244417 /// https://news.ycombinator.com/item?id=33411748

Consider the time it would take to do manual entry for each of those examples compared with the time it takes to verify that the generated content is correct.

"Are these the correct state and zip codes?" is much faster than typing it by hand. You just ask yourself "is MA the correct state code for Massachusetts? Yep, next" rather than "Massachusetts that's... (lookup) MA, type MA; next that one is MI, already a state code ..." and down the list.

I would be willing to contend that GPT will do the list faster and with better accuracy than a human doing the same work (which would also need to be checked for correctness).


>I was pretty amazed when I pasted in some of my medical lab test results and ChatGPT was accurately able to parse everything out and summarize my data

I think an important question is how much faith we give in the answer (especially with medical data!). There are lots of examples of great uses but also a number of examples of hallucinations and just plain bad summaries. When the stakes are high, the conviction needs to be couched in the risk of it being wrong. We need to be cognizant of the automation-trust factor that sometimes makes us place unwarranted trust in these systems.


> Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is. Why was the data so disjoint to begin with? Why is Latex so hard to work with?

> In this case GPT-4 is used to solve a problem that shouldn't have even been one to begin with.

Friction. These problems shouldn’t exist, but they do anyway, and they’re everywhere. Anything human is inevitably going to be imperfect and messy to some greater or lesser degree, introducing friction into dealing with it. Especially as we produce more and more unstructured or semi-structured data, which AI is particularly good at wrangling. If AI can help us cut through that friction significantly faster and more accurately, that’s a win.


Let's just hope that someone doesn't task the AGI with eliminating friction and it realizes that humans are the problem.


Ironically, interacting with ChatGPT still comes with a lot of human-like friction and imperfection.


IME most of that is of the form "I'm sorry, but as a large language model I..." and seems to have been drilled into it with training to dodge tricky subjects.


> Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is. Why was the data so disjoint to begin with? Why is Latex so hard to work with?

I’m fairly convinced that for many problems (including this one), AI is not the best solution but will become the preferred solution. “Best” doesn’t get in the way of “good enough” when convenience is at stake (and making a good UX is often very, very inconvenient).


I don't think it's the case that AI or ChatGPT is the best solution, since it can be a local optimum. But it's currently the "best" solution available to the average Joe. The fact that in this case study the average Joe is probably one of the best mathematicians of our time makes the situation much more interesting and intriguing.

I'm in the middle of reading Steve Jobs' own words in the book Make Something Wonderful. Apparently, throughout his whole life he frequently discussed providing and creating easy access to the computer, because ultimately people do not want to program; they want to use the computer instead. He mentioned how Morse code was never as popular as the later telephone technology because people refused to learn and use the unintuitive code, even though it only takes about 40 hours for an average person to learn all of Morse code. As a trained communication engineer I can very much relate to this, because even though we learnt much of the underlying technology that enables Morse code or the telephone to work across the Atlantic ocean, most of us can't be bothered to learn Morse code and use it if we can help it, because the telephone made it redundant and it is counterintuitive to use.

Granted, now we know that the telephone, or circuit switching in general, is suboptimal for solving human communication problems and needs. As of now the Internet, packet switching, and multimedia approaches are probably the best solutions, but the telephone served as a potent solution, and still one of the best for communication, until very recently.


Solving for every last mile problem in UX is hard though. A lot of AI value is being a flexible, generic interface to software automation.


We can probably even think of AI as a generalized meta-interface to everything - there’s probably no better UI/UX than natural human language that we’ve all been speaking since childhood.


I don’t know that it’s a better UX, but it is a viable UX, and that is very valuable


And here’s an exchange about how he validated the data was correct - which to me seems like the most important part:

> Out of curiosity, are you sure that GPT did it correctly? If yes, is it because you were able to "spot check" it in a few places? Or have you used GPT enough that you trust it with a task like this? Or is this for some kind of internal use where a few small errors is unimportant, and you only need the broad strokes to be correct?

> Yes, yes, and yes. (There were some obvious checksums, for instance by computing the total number of speakers several different ways.)

Still cool, but the circumstances above aren’t always true. In fact, in a lot of meaningful work, none of them probably are.

If GPT is like an assistant whose work you always have to double check in order to be sure of the validity of the results, that seems to be a pretty big caveat to me. In that case it’s probably better to just write a script whose output you know to be deterministic, even if it takes slightly longer (and the bigger your dataset to verify, the more likely it is that it’ll take less time to write a script than to validate results one by one).
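For example, a minimal deterministic sketch in Python (assuming, purely hypothetically, that the cleaned-up data ends up as a CSV with a "session" column; the file and column names are made up):

    import csv
    from collections import Counter

    # Deterministic tally: same input, same output, every run.
    def speakers_per_session(path: str) -> Counter:
        with open(path, newline="") as f:
            return Counter(row["session"] for row in csv.DictReader(f))

    print(speakers_per_session("speakers.csv"))  # hypothetical file/column names

Once something like this is checked against a small sample, it stays correct for any input size, which is exactly the property a stochastic model can't promise.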

The scary part is if/when companies/bureaucracies/governments/etc start using GPT for all sorts of Important Tasks and skip the validation part because they assume the machine will always get it right.


> If GPT is like an assistant whose work you always have to double check in order to be sure of the validity of the results, that seems to be a pretty big caveat to me.

Sounds like every junior developer fresh out of college I've had to work with. Interns are even worse. Even senior devs mess up from time to time. That's why we do code review and other QA. Trust no one.


True, but one obvious difference is that a senior contributor can typically validate the work of a more junior one in less time than it took to produce; that's why the overall equation works. And if you decide to keep the more junior person, it's probably because you feel they're improving and becoming more trustworthy over time.

Another key difference is that a deterministic script, once it has been determined to be valid, will be valid for all subsequent runs, regardless of input size. That is not the case for ChatGPT, which might be correct on some runs and not others, or do well on small inputs but start messing up when thousands (or more) of data items are involved.

Again, I still think ChatGPT is really cool, but thinking about all these nuances about the nature of the work that is actually being done when you use ChatGPT vs writing a script vs handing it off to an intern seems crucial to the debate.


HackerNews' hug of death hitting a site with normally 4k active users. Not that surprising...


it's plain text. it can be cached. it's embarrassing.


Feel free to share your know-how https://github.com/mastodon/mastodon


Mastodon's architecture inherently does not prioritize speed of first page load. Any PR would be rejected :D


I thought what and when to cache was one of the hardest problems in computer science.


Something like Mastodon is likely 90% reads, or even more. How often is the OP editing their post? When they post, cache it. Kind of like what I did by including it in my original post. Caching is a hard problem because sometimes invalidation results in serious consequences, therefore knowing when to invalidate becomes potentially intractable. In this case there would be few, if any.
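A toy write-through sketch of that idea (purely illustrative, not Mastodon's actual code): the rendered page is rebuilt only when the post changes, so the read path is just a dictionary lookup.

    posts: dict[int, str] = {}
    rendered: dict[int, str] = {}

    def render(text: str) -> str:
        # Stand-in for real HTML rendering.
        return f"<article>{text}</article>"

    def save_post(post_id: int, text: str) -> None:
        posts[post_id] = text
        rendered[post_id] = render(text)  # "when they post, cache it"

    def get_post_html(post_id: int) -> str:
        return rendered[post_id]          # reads never touch the renderer

    save_post(1, "Hello, fediverse")
    print(get_post_html(1))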


it was cache invalidation, and naming things. Just throwing stuff in cache is easy.


Why would it be difficult to cache static text?


Perhaps they've optimised for the normal levels of traffic they receive, not unanticipated spikes. That's hardly an embarrassing choice.


I'm not talking about the administrator of this instance, I'm talking about how mastodon is fundamentally designed. Even an iPhone could probably serve the (static) text of Tao's post to a million people on a 5G connection. Computers are fast.


I run Mastodon on a $5 VM. If I hit HN homepage, my site will probably be slow as well. I don’t see how that reflects on Mastodon though.


What I am saying is that there's a way to design a site on a $5 VM serving static text that wouldn't be slow even if it were #1 on HN.

if you disagree with that then we'll have to agree to disagree.


Wow I didn't know Mastodon failed that massively in converting people from Twitter. Then again, maybe it's not that surprising.


That's not how the fediverse works, Mathstodon is just one of many servers in the network (running the Mastodon software).


yes, and unfortunately almost all of the servers are unable to cope with even a moderate amount of traffic

Twitter is a far better user experience in that respect, to the point that I actively avoid clicking Mastodon links now because they fail so often.


The joys of federation


> Why was the data so disjoint to begin with?

Because real life is messy.

> Why is Latex so hard to work with?

Because people that use latex want to be able to brag they use latex

(just joking for the second one.... unless ?)


> Because real life is messy.

This doesn't make sense - the data inherently is centralized. Tao implied as much.


>Better UX and data quality is. Why was the data so disjoint to begin with? Why is Latex so hard to work with?

Do you mean build a better, task-specific UX and normalize the data for only this task? Why spend that time, even as the developer of whatever improved app/system you're thinking of, when he can just turn to AI?

I'd rather lean on GPT-4's ability to generalize highly specific technical work than ruminate on each individual app's shortcomings and how it could be better than using AI if we just... put more effort into it.


> AI, though amazing, isn't really the solution. Better UX and data quality is.

In this case, AI was the tool used to have a better UX. People build something with use-case X in mind, and use the best tool for that job. Piping that into use-case Y requires some duct tape and plumbing. It turns out that AI is great at that sort of repurposing.

It's great if the output of someone using the best tool for their job is also nearly the best tool for the flexible infinity of other use cases, which is what flexible-enough duct tape gets us.


> wow, the site is so slow. getting "This resource could not be found" errors on reloads.

Oh dear. Federation at its finest /s

If an instance is too small, it will easily fall over under heavy traffic. If it is too big, it is highly centralized, and even then it struggles to scale to 300,000+ simultaneous users.

Eventually, they would all end up re-centralizing behind Cloudflare. And once that goes down, all of the biggest Mastodon instances go down at once.

As for GPT-4 in mathematics (and LLMs in particular), it goes in line with what I said before. Fundamentally, these LLMs cannot reason transparently, nor can they directly produce their own mathematical proofs with thorough explanations that mathematicians can work with. They only pick the low-hanging fruit of summarizing existing text.

If Mr. Tao can see their limitations, surely that puts the hype around unexplainable, magical black-box LLMs to rest, exposing them as great sophists and bullshit generators.


I would not discount the potential of LLMs to support research-level mathematics. There's interesting research into combining the LEAN theorem prover with LLMs, e.g. this recent paper: https://arxiv.org/abs/2202.01344

"... automatically solving multiple challenging problems drawn from high school olympiads."

This combination of GPT and LEAN is the first thing that has finally got me really interested in the potential of LEAN. And LEAN very directly addresses the "bullshit" issue, since it formally verifies whether or not a proof is correct (think of it as a compiler, like Rust's, but for mathematical proofs).
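As a tiny illustration of what "formally verifies" means (a toy example I'm adding here, not something from the paper): the Lean checker only accepts this file if the proof really does establish the stated theorem.

    -- Toy Lean 4 example: rejected at compile time unless the proof is valid.
    theorem sum_comm (a b : Nat) : a + b = b + a := by
      exact Nat.add_comm a b

Swap in a bogus proof and it simply won't compile, which is the formal analogue of a failing build.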


Better source data is indeed a better solution, but the source data is often not under one's control.


> In this case GPT-4 is used to solve a problem that shouldn't have even been one to begin with.

But the problem DID exist, and problems like that are likely to continue to exist for the foreseeable future.



They should have done it right the first time? That's not the world we live in.

Coworkers, associates, clients, family - everyone sends us funky data. Sometimes it's a PDF of text instead of a text file, or a picture of their computer screen with a paragraph in it, or copy-pasted text with unwanted formatting, etc, etc.

And the point of this article wasn't really about the tedious job of copy/pasting, it was that once the data was ready, instead of hunkering down and having to work through a problem with an unknown number of variables, GPT-4 can drop an answer in seconds flat. Maybe a few more if you have to tweak your prompt a few times.

Part 2 here is that we have a problem to solve, namely how data is sent and received, and that it all needs to end up in a structured format that's usable by our AI tools.

I'm thinking that writing a script that can dump textual data from literally anything I'm looking at, so that it's AI-ready, makes a lot of sense; then step 1 is taken care of as well for my own personal GPT-solvable situations.
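A rough sketch of what that dumping step might look like in Python when the source happens to be a PDF (using the pypdf library; scanned, image-only PDFs would need OCR instead, and other sources would need their own readers):

    from pypdf import PdfReader

    # Dump the text layer of a PDF so it can be pasted straight into a prompt.
    def pdf_to_text(path: str) -> str:
        reader = PdfReader(path)
        return "\n".join(page.extract_text() or "" for page in reader.pages)

    print(pdf_to_text("registrations.pdf"))  # hypothetical file name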


> Ironically, Tao's post convinces me that AI, though amazing, isn't really the solution. Better UX and data quality is. Why was the data so disjoint to begin with? Why is Latex so hard to work with?

AI is just that. A UI/UX for any/your data.


Yeah, I waited a good minute of it trying to load before I gave up.

Anyone that got in can share what it says?


I accessed the message through a Mastodon client instead. They are a lot better at federation than the various instance sites, IMO, with a more consistent interface since there's no possibility of being sent out to a non-home-instance page.


I didn’t even know there were Mastodon clients, though I should have assumed. Know any good ones for Linux and Mac?


I use Tusky/Android and that's the limit of my knowledge


Check the comment to which you replied (maybe it was added after)


And how does he know the results are correct without verifying each and every entry?


I find this a real concern when it comes to the proposition of "replace every programmer with GPT".

I think the reality is, there are cases where it matters, and cases where it doesn't matter (at least, not very much).

I work on aerospace software systems. I suspect that most software developers would be surprised at the process and rigor involved. That does not mean that humans are flawless; of course not. But there are going to be a lot more hurdles to replacing all of the humans with non-deterministic machines.

There are numerous software components on a modern airplane. How good is good enough? If the overall system works correctly 99% of the time, is that good enough? There are about 16 million flights annually in the United States alone. If 160,000 of them crash, is that good enough?

I mean, like, if Microsoft Word failed to save your document 99% of the time, you might be irked. If Professor Tao's data was only 99% accurate, he probably still has a reasonably good picture of the information he needs.

Regardless of if a human or a machine is writing the code, there are different levels of software criticality. As humans, we don't apply avionics software development process to iOS games because it's not worth the extra time and expense. Likewise, there will be a spectrum of where it makes good enough sense to automate tasks with AI and where it doesn't.

Which is not to dismiss AI! It's amazingly useful technology. But there really are places where above average accuracy is needed, and AI that works great for handling some tasks might not suffice for other tasks. If you truly need all of the results correct, then a nondeterministic neural network is probably not the best path to be on. There are other methods; even other automated methods.


> I work on aerospace software systems. I suspect that most software developers would be surprised at the process and rigor involved.

I did this once: a fuel estimation program for 747s. The degree of understanding of the problem required to create something that would pass review was off the scale compared to any other program I've ever written. It is also the only time the customer wanted to get it right rather than focusing on what it cost (probably less of an issue anyway, since the savings would pay for the development many times over). I'd love to always be able to work like that.


Imagine Terrence Tao manually entering data into spreadsheets!!


Wait.... is it possible to tell chat gpt to grab data from a page?


And AI is a way to enable better UX and cleaner data...


Yeah bold prediction I know but I think basically everyone will move back to twitter by the end of the year


> Yeah bold prediction I know but I think basically everyone will move back to twitter by the end of the year

You do know Mastodon is a federation of servers, right? I predict most people will move off this particular one to a faster one.


Sure, but on twitter you don't have to think about which instance you are on


Looking at the actual input/output prompts he shared (https://mathstodon.xyz/@tao/110178324935054737), this seems like not very much. These tasks seem really trivial regardless. Entering line breaks that were removed when copy and pasting from a PDF, and looking up the most basic functionality in the spreadsheet software he was already using.
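For what it's worth, restoring the line breaks could plausibly be a one-liner if the data were regular; here's a crude, hypothetical regex sketch that assumes each record starts with a numbered "N." marker (the real data may not be this tidy):

    import re

    # Reinsert a newline before each "N." record marker that got glued onto the
    # previous line when copying text out of the PDF (illustrative pattern only).
    def restore_breaks(blob: str) -> str:
        return re.sub(r"\s+(?=\d+\.\s)", "\n", blob)

    print(restore_breaks("1. Alice Smith 2. Bob Jones 3. Carol Wu"))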


Unrelated: Is there a stats/ml mastodon instance that’s comparable to Mathstodon? Hopefully populated with technical people rather than marketing and hype-oriented posts.

I find the site a pleasure to use compared to other social media. So much less annoyance with pop-ups and banners and engagement traps everywhere, though it's slow to load right now.




What's really the most astonishing thing about all of this is that one of the top 10 brains in the United States is doing this administrative bullshit in the first place. A well run society should find a way to hire someone to do this for him.


… the point is he doesn’t need to hire someone at all. He can do this task with GPT-4 in minutes.


how do you upload PDFs to GPT-4, is it available through the ChatGPT interface?


Two questions:

Does the data fit in the "active memory" of gpt-4, or do they now have a workaround for that?

In what form was the input prompt given (or, how does gpt-4 know when the raw data starts and stops)?


Am I the only one wondering why one of the greatest minds of this generation doesn't have an assistant to free up his time for some more worthwhile use of it?


You assume that just because someone is brilliant, they use 100% of their time trying to make discoveries?


A lot of this stuff will be built into the OS or other software pretty soon, within a year or so. Likely as a subscription, because you really can't run GPT-4-quality models locally.


Terence Tao doing a data entry job? The Peter Principle strikes again!

https://en.wikipedia.org/wiki/Peter_principle


Wouldn’t the Peter principle in this situation be more like a very good data enterer getting the job of Terence Tao?


LOL, this is funny. I'm imagining some bushy tailed assistant finishing all the data entry and asking for another task and being told to prove the Collatz Conjecture.


this mastodon instance is not a fan of HN's HOD


Just use MonsterWriter


I honestly can’t even read HN AI threads anymore.

The constant downplaying and denial is unreal.

“Why doesn’t a person like him have better data? This isn’t cool for AI; it’s simply a bullshit problem that should’ve never existed.”

“How does he know the data was accurate? These things are just hallucinating parrots.”

Isn't a normal human just as likely to be inaccurate or make mistakes? The world isn't filled with perfect data in every format. Good grief.

Perhaps I get worked up and comment here in hopes that the AI overlords will view me favorably based on my online comments. My real frustration, though, is with the significant number of "usually smart" people who continue to downplay such incredibly cool technology.


This looks like a case of the contrarian dynamic:

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...

(The parent comment was upvoted to the top of the thread. I've downweighted it now. It's standard moderation practice to do that with generic and/or indignant comments because they tend to attract upvotes and get stuck at the top of threads, making the threads less interesting.)


It's a massive overcorrection against the ten guys on twitter (now mastodon probably) saying that the singularity is two weeks away.


The AI boom has been 6 years of massively-overhyped vaporware followed by 6 months of massively-overhyped services that occasionally deliver the goods. People need some time to recalibrate.


Site took 37 seconds to load.



