Hacker News
Creating personalised data stories with GPT-3 (scottlogic.com)
47 points by ColinEberhardt on Dec 9, 2021 | 16 comments



A good article, and I appreciate all the discussion on prompts too. However, the author is a bit too quick to underestimate the language model's math:

> GPT-3 is quick to learn that the narratives should include these comparisons, but often gets the maths wrong:

> > During these runs he has climbed 62,599 feet, that’s the equivalent of climbing Mount Everest six times (but a bit less steep)

> Everest is ~29,000 feet, so this runner has climbed Everest roughly twice.

I think it's the author here who gets the math wrong. An Everest climb is closer to 10,000 feet, making the GPT-3 math alarmingly accurate.
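To make the disagreement concrete, here is the arithmetic behind both readings, using approximate public figures (the summit height and South Base Camp altitude are my assumptions, not from the article):

```python
# Two readings of "climbed Mount Everest six times", figures approximate.
TOTAL_CLIMBED_FT = 62_599            # runner's total elevation gain, from the article
EVEREST_SUMMIT_FT = 29_032           # summit height above sea level
EVEREST_ASCENT_FT = 29_032 - 17_598  # gain from South Base Camp (~17,600 ft)

print(TOTAL_CLIMBED_FT / EVEREST_SUMMIT_FT)  # ~2.2: the article's "roughly twice"
print(TOTAL_CLIMBED_FT / EVEREST_ASCENT_FT)  # ~5.5: close to GPT-3's "six times"
```

So if "an Everest climb" means the ascent from base camp, "six times" is in the right ballpark.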


Author here! I'll hold my hands up to getting the maths wrong.

With Strava, you can choose metric or imperial units. However, I discovered that this only affects activity distances (which the API reports in km or miles). Elevation is always reported in metres (via the API).

I've fixed that now: https://github.com/ColinEberhardt/running-report-card/commit...
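Roughly, the conversion the fix needs is a sketch like this (the function name is mine, not the actual commit, which assumes the API's metres must be converted explicitly for imperial users):

```python
METERS_TO_FEET = 3.28084

def elevation_in_feet(elevation_m: float) -> float:
    """Strava's API reports elevation in metres regardless of the
    athlete's unit preference, so convert explicitly for imperial users."""
    return elevation_m * METERS_TO_FEET

print(round(elevation_in_feet(8849)))  # Everest's height: 29032 ft
```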


Mount Everest, as far as I know, is about 8,850 metres high, which is roughly 29,000 feet. The article snippet you quoted seems right to me; the author's maths checks out.


Right, but you can't climb that. The author's maths takes Everest's full height from a Google search and ignores the real context of climbing: no climber ascends the mountain's entire altitude, because the climb starts from base camp, not sea level. Nobody can relate to climbing that full altitude on one mountain, because the physical ground isn't like that.

I guess what I'm saying is the AI is showing better understanding of climbing context than the author is giving credit for.


Yes, according to this the elevation gains are ~3500m which is ~11,000 feet: https://www.strava.com/challenges/Strava-Climbing-Challenge-...


It's implausible that GPT-3 wrote that because it knew the actual amount of climbing involved in an Everest ascent and tossed it in without comment, in a way that would look wrong to most people who read it.

It's far more likely that the program cobbled together a variety of factoids and so arrived at its wrong-sounding statement the same way it makes many other clearly wrong or illogical statements.


> It's implausible that GPT-3 wrote that because it knew the actual amount of climbing involved in an Everest ascent and tossed it in without comment, in a way that would look wrong to most people who read it.

It sounds like you're suggesting that if it could get the number right, it would probably also be able to model people well enough to understand they would consider it a mistake.

I'm not highly confident of its ability to do the former, but it has zero information on what sort of person is interacting with it, correct? I don't see how it can possibly know.

Whereas, it must have the fact embedded in it.


> It sounds like you're suggesting that if it could get the number right, it would probably also be able to model people well enough to understand they would consider it a mistake.

We don't have to reason with counterfactuals here. GPT-3 just reproduces language. It isn't trained to be right; it's trained to predict and reproduce language, so it has no incentive to be right. And the pattern of saying something correct but wrong-sounding is quite rare in my experience. Most people who do that do it by accident as well, I think.

It also has no model of people, and it knows no numbers. Everything it spits out is a response to context, which does allow it to seem to model people and seem to know numbers.

No doubt it has an association with some article describing the actual distance of an Everest ascent, as well as many other facts and misinformation about Everest. It will spit all of this out in context.


All I want is a tool that allows me to input dull text in bullet-list format, and then converts it faithfully into paragraphs of beautiful prose.

So basically like running a summarizer in reverse.


Noted lol

Quick naive implementation idea:

1. Collect text of Wikipedia pages of famous poems

2. Parse the texts into a list in a format like {prompt: [The Wikipedia summary section text for the poem], poem: [The poem itself]}

3. Map through the list to create strings, eg 'prompt: The poem Sunflowers and Daisies is about the fleeting nature of reality.\n poem: Roses are red, violets are blue, I want to pick sunflowers and daisies with you'

4. Join the list to create a bunch of examples to feed to GPT3

5. ??

6. Profit!

This API design is hereby released into the public domain.
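Steps 2-4 above could be sketched like this; the example data is invented, and the final line assumes you send the joined string to a completion model with "poem:" left open:

```python
# Toy sketch of the few-shot prompt construction described above.
examples = [
    {"prompt": "A short poem about the sea at night.",
     "poem": "The dark waves whisper / under a borrowed moon."},
    {"prompt": "A poem on the fleeting nature of spring.",
     "poem": "Blossoms open, blossoms fall; / the orchard keeps no count."},
]

# Step 3: map each pair into a "prompt: ... / poem: ..." string.
shots = [f"prompt: {ex['prompt']}\npoem: {ex['poem']}" for ex in examples]

# Step 4: join the examples, then append the input you actually want
# versified, leaving "poem:" open for the model to complete.
few_shot = "\n\n".join(shots) + "\n\nprompt: A poem about debugging at 2am.\npoem:"
print(few_shot)
```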


jarvis.ai is really good for this use case


Interesting, but too bad they specialized in marketing copy.


To anyone who knows about the subject: what is the best GPT-like model you can still run on a laptop? (With about 16 GB of RAM, or 3-8 GB of VRAM.)


GPT-2 is the most recent available from OpenAI. Otherwise you'd need to look at EleutherAI's GPT-Neo which comes in various sizes.
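A back-of-envelope way to see what fits: the published parameter counts times bytes per parameter give a rough floor on memory for the weights alone (fp32 with no activation or framework overhead is an assumption; real usage is higher):

```python
# Rough weight-memory estimates for publicly released GPT-style models.
MODELS = {
    "gpt2-xl": 1.5e9,       # largest GPT-2
    "gpt-neo-1.3B": 1.3e9,  # EleutherAI
    "gpt-neo-2.7B": 2.7e9,  # EleutherAI
}
BYTES_PER_PARAM = 4  # fp32; halve for fp16

for name, params in MODELS.items():
    gb = params * BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{gb:.1f} GB of weights")
# gpt-neo-2.7B needs ~10.8 GB: fine in 16 GB of RAM, too big for 8 GB of VRAM.
```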


Do this for people's stock portfolios and you have a good startup idea right there.

Or you could digest financial, health and content consumption data to create a personalized report with action items.

Very promising.


I agree, that also looks like a great use case. I'm sure there are many more to come.

One thing I love about GPT-3 is that it uses the same common language for the "data people" and the "non-data" people. The same description works for both.

This makes it so easy to explain what you are doing. I think that is a huge advantage over existing text generation methods.



