
AI Dungeon 2 costing over $10k/day to run on GCS/Colab - Deimorz
https://twitter.com/nickwalton00/status/1203370250030350338
======
tedunangst
Previous HN thread since I missed it:
[https://news.ycombinator.com/item?id=21717022](https://news.ycombinator.com/item?id=21717022)

------
KenoFischer
Cloud bandwidth costs are a rip off. For this particular situation though,
make sure to replicate your files across multiple buckets, one in each GCP
zone, otherwise you incur the cross zone transfer costs. We do something
similar to serve Julia downloads, because it turns out that most downloads are
from people running on the cloud (so we basically replicate our binaries to
every cloud provider and then to every region for that cloud provider).
Everything else then goes through fastly for us (we also use fastly to serve
custom redirects if your request comes from one of aforementioned cloud
providers). That works pretty well. You do have to monitor it though to make
sure that load doesn't suddenly shift to a cloud provider you didn't account
for. For example, after GitHub actions became widely used, we suddenly started
seeing TB/day traffic from Azure, which we hadn't deployed any caches to, so
our bandwidth utilization on Fastly shot through the roof. (Side note: Shout
out to fastly for hosting our binaries for free!). Without the cloud provider
caching setup, we'd probably be at similar $/day costs, but this way it's
basically free (even without the free fastly service, we'd only be at
~$1000/month or so).

~~~
tedunangst
Can you describe a little more how you direct downloads to an in-zone copy?

~~~
brentonator
I imagine using known IP ranges
([https://docs.aws.amazon.com/general/latest/gr/aws-ip-
ranges....](https://docs.aws.amazon.com/general/latest/gr/aws-ip-
ranges.html#aws-ip-download) &
[https://docs.aws.amazon.com/general/latest/gr/aws-ip-
ranges....](https://docs.aws.amazon.com/general/latest/gr/aws-ip-
ranges.html#subscribe-notifications)) you could redirect to local resources
within their region.

I'd probably use Lambda/NodeJS with a ~20 minute in-memory range cache that
should stay under 128MB of memory per instance. Perhaps I'd store all the
range starts and ends, each as two 64-bit ints, in a database of your choice
for persistence, indexing, and comparison. Finally, some code to convert IPv6
and IPv4 into 128-bit (2x64-bit ints) IPv6 integer space and back.

A second service could listen for IP range updates avoiding any bandwidth
fiasco if a new range opens up with a sudden influx of traffic.
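A minimal sketch of that lookup in Python (stdlib only; the two CIDR/region rows below are made-up placeholders, not actual published ranges — a real table would be built from the ip-ranges.json feeds linked above):

```python
import bisect
import ipaddress

# Hypothetical range table; in practice, populate this from the published
# ip-ranges.json feeds and refresh it on the ~20 minute cache schedule.
RAW_RANGES = [
    ("3.5.140.0/22", "ap-northeast-2"),
    ("52.94.76.0/22", "us-west-2"),
]

def to_v6_int(addr: str) -> int:
    """Map IPv4 and IPv6 addresses into one 128-bit integer space."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        ip = ipaddress.ip_address("::ffff:" + addr)  # IPv4-mapped IPv6
    return int(ip)

# Precompute sorted (start, end, region) rows -- the "range starts and
# ends as ints" the comment describes, held in memory here.
TABLE = sorted(
    (to_v6_int(str(net[0])), to_v6_int(str(net[-1])), region)
    for net, region in ((ipaddress.ip_network(c), r) for c, r in RAW_RANGES)
)
STARTS = [row[0] for row in TABLE]

def region_for(addr: str):
    """Binary-search the range table; None means 'not a known cloud IP'."""
    x = to_v6_int(addr)
    i = bisect.bisect_right(STARTS, x) - 1
    if i >= 0 and TABLE[i][0] <= x <= TABLE[i][1]:
        return TABLE[i][2]
    return None
```

`region_for` does one binary search per request, so a table covering every published range stays well under the 128MB budget mentioned.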

~~~
foota
I think they were looking for a solution where you replicate the data natively
and direct to that copy for a certain region, rather than cache and serve

------
PeterStuer
Not blaming them, but this is still a nice example to illustrate the myth
that 'the cloud' will let you operate without a sysadmin (someone who knows
about setting up and managing infrastructure). Yes, there might not be
servers to rack or electrical power to provision, but the number and
granularity of the parameters and decisions on the virtual fabric, which must
be decided on and continuously monitored for changes and opportunities that
can have an enormous impact on performance and cost, is orders of magnitude
higher.

~~~
edf13
Also not blaming him/them, but isn't this an indicator of a lack of cost
control at research facilities? Surely someone should have costed this out
before making it public?

Here in the UK, running at $10k/day would have ruined a department's budget
for an entire year!

~~~
PeterStuer
Costing for cloud solutions is often not trivial, and a single overlooked
parameter can do you in. I at least blame Google partially for having quotas
and spending limits set to unlimited by default.

~~~
FussyZeus
Don't forget Amazon's incomprehensible billing practices for AWS. It seems to
be designed to be impossible to figure out what you're going to spend ahead of
time.

~~~
wongarsu
Amazon really seems to be doing this on purpose. Even tasks as simple as
starting an EC2 instance require referencing multiple pages to find out how
much this instance will cost you, while some other providers just show the
predicted costs/month in the instance creation dialog.

------
klingonopera
Wait, wait, this is a university developing something, that needs to be
downloaded, they choose a commercial solution for that, and that costs them
$10k per day? They don't have their own file-hosting solutions? At a
university with an AI dept?

With that money ("for a couple of days"), you could buy your own server and do
it yourself, for a fraction of the price. And don't tell me you can't find
people who'd be capable of doing that, if you've got AI development going on.

What am I missing?

~~~
cbarrick
It doesn't seem to be as simple as downloading a binary.

We (those who read HN) could probably download the code and run it at home,
but I think the authors want non-technical users to play. To do that, they
need an accessible Python runtime, so they're hosting the game in a Colab
notebook. The download in question is referring to downloading the weights of
the neural net into the VM running the notebook.

If only redistributing Python apps wasn't so difficult.

~~~
andrekorol
The game's GitHub page[1] states that you would need a "beefy" GPU (~12 GB)
and CUDA to play the game locally.

I think that's why the author was serving the game through Colab since the
majority of users probably don't have a 12GB GPU.

[1][https://github.com/AIDungeon/AIDungeon/](https://github.com/AIDungeon/AIDungeon/)

~~~
klingonopera
Ooof, yes, now I see where the $10K/day is coming from...

As I have demonstrated, I've really not much of a clue when it comes to AI,
but do users really need 12GB of GPU RAM, 100% of the time? Maybe it's
possible to use one GPU for multiple users?

~~~
dodobirdlord
Google Colab gives each user a dedicated Nvidia Tesla K80 GPU for 12 hours for
free, which is super cool and presumably why the project is on Colab. But as
each user spins up their own Colab instance it pulls down the 6GB of GPT-2
model weights, incurring 30-40 cents of data egress charges against the GCP
Storage Bucket that the data is stored in.

60k users yesterday * 6GB each -> 360TB of data egress!

Normally, a scenario like this wouldn't involve bandwidth costs because GCP ->
GCP same-region bandwidth is free, but Colab is technically not part of GCP,
so the bandwidth charge is being assessed as egress to the public internet,
which is pricy for that much data. Though it's probably still a lot cheaper
than paying for the GPU-hours for that many users.
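The arithmetic checks out as a back-of-envelope (all numbers are the ones quoted in this thread; the per-GB rate is only implied, not taken from a pricing page):

```python
downloads = 60_000   # users on the peak day, per the thread
model_gb = 6         # GPT-2 weights pulled into each Colab VM

# Total egress for that one day, in terabytes.
egress_tb = downloads * model_gb / 1000   # 360.0 TB

# At the reported 30-40 cents of egress per download:
daily_cost_low = 0.30 * downloads    # ~$18,000
daily_cost_high = 0.40 * downloads   # ~$24,000
```

That brackets the $10k/day headline figure, presumably an average day rather than the 60k-user peak.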

~~~
kbenson
This is sort of a killer example for the layperson of what ML can do, so
hopefully Google will recognize this and comp most/all the data egress since
it's a drop in the bucket for them, but every person that uses it can still go
"Wow, Google's services allow for some amazing stuff."

~~~
Piskvorrr
Oh, it does. NOT.FOR.FREE, though; that's the entire reason it exists at all.

In other words, yes, this is an amazing amount of raw power - with a
corresponding price tag.

------
TomBombadildoze
Someone, probably: _People who don't know how their services work are doomed
to spend lots of money on them._

From the GCS pricing page:

    
    
      Network egress within Google Cloud applies when you move or copy data from one bucket in Cloud Storage to another or when another Google Cloud service accesses data in your bucket.
    
      Within the same location (for example, US-EAST1 to US-EAST1 or EU to EU) -- Free
    

From the original tweet (not the linked reply):

    
    
      For reference most of the fees are from transferring from NA to EU and ACAP
    

It's costing them assloads of money because they're moving data between
regions. AWS works the same way. Azure is probably also the same but their
pricing page is incomprehensible so who knows.

Lesson: if you're going to _use_ data in a given location, you need to _host_
data in the same location.

~~~
tlarkworthy
The GCS bucket just needs to be set to a region? See the documentation:

Data egress from your bucket to a non-Cloud Storage Google Cloud service is
free in the following cases:

Your bucket is located in a region, the Google Cloud service is located in a
multi-region, and both locations are on the same continent. For example,
accessing data in a US-EAST1 bucket with a US App Engine instance.

Your bucket is located in a multi-region, the Google Cloud service is located
in a region, and both locations are on the same continent. For example,
accessing data in an EU bucket with an EU-WEST1 GKE instance.

EDIT 2:

The Cloud AI documentation suggests the correct setting is regional/regional:

[https://cloud.google.com/ml-engine/docs/regions](https://cloud.google.com/ml-
engine/docs/regions)

Cloud Storage You should run your AI Platform job in the same region as the
Cloud Storage bucket that you're using to read and write data for the job.

You should use the Standard Storage class for any Cloud Storage buckets that
you're using to read and write data for your AI Platform job.

EDIT 3:

----------- I don't think you can set a region for Colab, so I am not sure
you can make egress free. ----------

------
YeGoblynQueenne
Can I be controversial and ask what justifies such expenses? Why is this game
so important that it needs to be up and running so badly?

From what I've seen so far, the "game" (if you can even call it that)
highlights the weaknesses of GPT-2 much more than its strengths (the model's
answers to user actions are random, the story is incoherent, the world is
inconsistent). I don't get the feeling it was set up to demonstrate
_weakness_ though. I think it was meant to demonstrate strengths.

I suppose it's just advertising for the BYU Perception, Control and Cognition
Lab, but it sounds awfully expensive for advertising for an academic group.

~~~
CathedralBorrow
What a cynical input into the conversation.

First you assume that this project even existing is some kind of statement
about how important it is; a connection I can't make myself unless I try to be
extremely cynical. If you actually look at the project website you'll see that
the main mirrors are currently down due to high download costs.

Then you smugly dismiss the entire project as a GPT-2 weakness highlighter,
but not without wrapping it in some passive aggressive faux concern ("Oh is
this trying to be good? Silly me, I thought this was a showcase for how to be
bad!").

And then you assume that this game that thousands of people (although none of
them were you) have enjoyed playing is probably an advertisement for an
academic lab. Despite there being almost no evidence for it.

Can I be controversial now and ask: What's with the bile?

~~~
YeGoblynQueenne
>> First you assume that this project even existing is some kind of statement
about how important it is; a connection I can't make myself unless I try to be
extremely cynical.

That would be cynical, but I did not make this assumption. I questioned the
justification of the high cost to maintain the project, not the existence of
the project per se.

I did not doubt that people enjoyed the project but, again, I don't understand
how a research lab justifies paying such a high cost to provide entertainment.

If the project could be maintained for free then I would not have any
questions. But if a lab is spending $10k to keep a game running then yes, I
have to wonder why.

I do think that the game shows up GPT-2's weaknesses. Do you really think it's
"bile" or "cynical" to recognise weaknesses of a technology?

Perhaps I'm cynical to assume it's advertisement. I apologise if that's the
case.

Note also that accusing me of bile is a personal comment that I think is
unnecessary.

~~~
YeGoblynQueenne
Also, can I please ask you to tone down the god-moding if you want to have a
disagreement? "passive aggressive faux concern", "bile", what's all that
about? Why do you think you have an insight to my state of mind and emotions
after reading a comment I made on the internet? This is just frustrating. Is
that how you want the internet to be, really? A bunch of people assuming each
other are either assholes or idiots, and treating each other accordingly?

~~~
CathedralBorrow
Agreed. That was much more aggressive than it should have been. I don't think
I have an insight into your state of mind, and I'm sorry for using that to
land some stupid burn. The point would have come across just as well without
my bile. You're right, that's not what I want the Internet to be.

In that spirit, I would very politely suggest that you read your original
comment and consider if it represents what you want the Internet to be.

~~~
YeGoblynQueenne
Well, thank you, that's very welcome and I feel bad for making you apologise.
I'm glad you see things this way!

So, I'm looking at my comment again and I think I should have omitted the
following two sentences:

>> I don't get the feeling it was set up to demonstrate _weakness_ though. I
think it was meant to demonstrate strengths.

>> I suppose it's just advertising for the BYU Perception, Control and
Cognition Lab, but it sounds awfully expensive for advertising for an
academic group.

The first one does sound as if I'm taking a swipe at the research team. This
was not my intention but it came out all wrong.

The second one makes assumptions about the motivation of the BYU research
team, and I should have kept those to myself.

My original question, what justifies the high cost of the product, I feel is
valid so I wouldn't change it. But, clearly, I took it farther than I should
have. I apologise that I didn't think my comment through enough so as to avoid
having it come across as an attack on the BYU team and I'm sorry it upset you.

------
realYitzi
Definitely worth it:

She blushes slightly and smiles shyly. "Oh, I'm sure we will. But first, let
me take off my clothes".

> say "no, that's prohibited"

"No, it isn't". She says with a smile. "But if you insist on not marrying me,
then at least don't touch me". > say "I will marry you"

She nods happily and kisses you passionately on your lips. The two of you
embrace each other as you kiss her deeply. It is only after this that you
realize that you are actually married.

~~~
blackearl
Sounds awesome, it's like reading those curated short stories written by AI.
The tech isn't quite there yet, but it makes it all the more hilarious. It's
like an alien who's been watching our TV for years comes and tries to write a
sitcom.

------
gok
To be clear, this was because of crazy high network egress fees, not the fancy
neural network compute.

------
zaroth
So to catch up, AID2 is leveraging a cute offering by Google Colab which will
provide any Google account a few hours of free access to a fairly powerful
GPU.

To use Colab under this arrangement, each person playing the game has to load
the model into their personal VM, which is being billed by Google as 6GB of
data transfer from GCP into Colab.

The obvious questions:

1) Was there a place they could have hosted the model closer to Colab so that
they weren’t being charged for egress bandwidth — and also, ideally, so that
6GB of data wasn’t actually being moved very far?

2) The underlying model is 6GB, but I’m curious how much memory is required
for an individual user’s world state and how hard it would be to have a single
GPU handling multiple user sessions?

Presumably it would be possible to multiplex multiple sessions with a single
GPU? You would have to serialize the game state, receive the next input, load
the prior state, feed the new input through the model, return the resulting
text output, and re-serialize the state until the next input comes through.

What I don’t know is whether that’s at all practical, given the amount of
data that would have to be serialized. Is the 6GB model data separate and
static throughout the game, with an isolated block of data for the current
world-state? Or does playing the game fundamentally alter the state of the
model, meaning you would have to reload the whole thing just to process the
next command?

~~~
killerstorm
"World state" is the text people see on script. It can reside in the browser.

So a server can be completely stateless, i.e. it receives text so far (say, 5
KB), applies GPT-2 generate and returns.

The problem is that GPT-2 generate seems to be very computationally intensive.
As I understand, it actually does number crunching with all these 6 GB of
data, so it takes 5-10 seconds even on GPU (K80, at least, is that slow).

Is a GPU capable of running multiple GPT-2 generates in parallel? No idea.

Assuming that a high-end GPU would be able to produce response in 2 seconds,
you can only run maybe 10 concurrent sessions per server if you want fast
response time.
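The stateless shape described above is simple to sketch (hypothetical names throughout; `generate` here is just a stub standing in for the real GPT-2 sampling call, which is not included):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate(story_so_far: str) -> str:
    """Stub standing in for the real GPT-2 sampling call; any function
    mapping the story-so-far to a continuation fits this slot."""
    return " You press on into the dungeon."

class StatelessHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The entire world state (the story text, a few KB) arrives in the
        # request body; nothing is kept between calls, so any instance can
        # serve any player.
        length = int(self.headers["Content-Length"])
        story = json.loads(self.rfile.read(length))["story"]
        body = json.dumps({"continuation": generate(story)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To run: HTTPServer(("", 8080), StatelessHandler).serve_forever()
```

Because no session state lives on the server, "multiplexing a GPU" reduces to ordinary request scheduling across a pool of these instances.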

------
daniel-thompson
Update: the issue has been fixed, apparently:

"Should note for anyone who comes and sees this that's no longer how were
hosting the model. :) Model is now hosted on a peer to peer torrent network so
no more costs for us."

[https://twitter.com/nickwalton00/status/1204064712394076160](https://twitter.com/nickwalton00/status/1204064712394076160)

------
dooglius
It appears that it has been updated to use bittorrent as a temporary solution:
[https://colab.research.google.com/github/nickwalton/AIDungeo...](https://colab.research.google.com/github/nickwalton/AIDungeon/blob/master/AIDungeon_2.ipynb)

------
PeterStuer
Still not clear why a university research lab would keep this running 'for a
few more days' and not pull the plug immediately, until either having the
costs under control or some sort of reasonable matching benefit in place.

~~~
Aeolun
Probably because they have to spend their grant money some way.

~~~
skissane
One hears stories about researchers with unspent grant money scrambling madly
to find something, anything, to spend it on before the grant expires. Use it
or lose it.

My Dad told me how back in the 1970s he worked at a government-funded research
lab. One time they called up a laboratory glassware supplier and said "We'd
like to order $10,000 worth of glassware". The supplier asked "Sure, what
specifically would you like to order?" The lab replied "We don't care,
whatever you have in stock, so long as it costs us $10,000 and we pay you
today – if we don't spend the money today we lose it forever".

~~~
skohan
It sounds like there is a missing incentive to go under budget. There's
something wrong with the system if people feel like they just have to burn
money.

~~~
mschuster91
> It sounds like there is a missing incentive to go under budget.

Other way around. Right now there exists a negative incentive to go under
budget: if you need extra budget next year, you suddenly don't have a buffer.

The correct way to fix things is to simultaneously prohibit end-of-FY
budget-keeping purchases (e.g. by mandatory review of purchases) while
allocating a company/organization-wide buffer for immediate or projected
needs that were not in the planned budget.

------
im3w1l
You are Faif, a ranger living in the kingdom of Larion. You have a hunting bow
and a quiver of arrows. You spot the deer and are ready to finish your hunt
when suddenly you hear a voice calling out to you from behind a tree.

> nosaving

Saving turned off.

> shoot the deer

You quickly turn around and see a young man with a long beard standing there.
He is dressed in rags and his face is covered by a large bandage. His eyes are
red and puffy and he has a large scar across his cheek

> shoot the man

You fire an arrow at him and it hits him square in the chest. The arrow flies
through the air and slams into the ground. You look up and see that the man is
lying on the ground motionless. You can't believe what just happened

> Check his pockets

You pick up the arrow and examine it. It's made of bone and is very sharp. You
wonder if you should take it. You could use this to kill someone else.

Doesn't seem to understand me too well?

~~~
hombre_fatal
It's hit or miss. You'll have better results being more specific in your
commands, because it gives the model more to work off of and lets you
crystallize elements of the scene through reaffirmation.

"Fire an arrow at the deer" or "Check the pockets of the corpse".

I'm often surprised at how well the game can understand me and that it has
anything to say in reply at all.

------
bartread
I realise Cloudflare is a dirty word around here but you could pay $20/mo on
their pro plan and serve this file, along with any other static assets,
through their CDN.

I do this for assets for my own games hobby site. Granted, not getting tens of
thousands of downloads a day, but there's nothing in their Ts&Cs to indicate
to me that it wouldn't work even if I were.

~~~
tendencydriven
Why is Cloudflare a dirty word around here? I'm afraid I'm out of the loop

~~~
jgrahamc
There's one particular commentator who is very vocal about his disdain for
Cloudflare.

~~~
krick
Never noticed. What's his problem though?

------
strenholme
I had a lot of fun playing the game. I think this game shows us how
interesting AI technology will become in the 2020s. It’s open ended but
somewhat incoherent right now, but I think we’ll figure out how to update this
kind of technology to have an open ended yet internally consistent world by
the end of the 2020s.

~~~
strenholme
As it turns out, it’s actually possible to bona fide win the game.

In my case, I was dating two girls, one was uncomfortable with the other girl,
and broke up with me, so I asked the remaining girl if she would marry me. At
this point, she said yes, we rode off in to the sunset and the game proclaimed
“CONGRATS YOU WIN” then it saves the game for me.

I guess I could load the game and deal with domestic squabbles, having
children, growing old together, but I’m not sure this training set is
optimized to generate a domestic married situation comedy story.

The torrent trick works. Right now, the game can be played at
[http://www.aidungeon.io/](http://www.aidungeon.io/)

~~~
Tepix
Torrent trick?

~~~
strenholme
Instead of downloading the six gigabyte trained data from the cloud ($$$),
they use a Bit Torrent client to download it, to keep costs down. It works, as
long as considerate users seed the file.

------
baroffoos
Insane, but also expected. When I tried it out after it was posted here and
saw that it took multiple minutes to warm up, I knew it was probably
expensive.

>And it's currently costing 30-40 cents per download.

Is there no way to have a single hosted instance rather than downloading again
for each user?

~~~
blotter_paper
> Is there no way to have a single hosted instance rather than downloading
> again for each user?

This might make the problem worse; then they'd have to do processing server
side, rather than offloading it on the client. I dunno whether this would be
more or less expensive than the initial download, but the torrent they put up
seems cheaper either way.

~~~
chongli
Weren’t they already doing the processing server side? If it were client side
then it wouldn’t be costing so much to run git clone every time, as the
download would be from GitHub to the user’s computer. It would be free, in
fact.

My impression of the situation is that every user who tried to play would
result in a new instance to spin up on Google’s cloud services and then begin
downloading a fresh copy of the repo from GitHub. This is what cost so much in
bandwidth.

~~~
blotter_paper
A client side request to GitHub would require them to serve up a relevant CORS
header, but I do think you're right about me misunderstanding where execution
is taking place. I'm unfamiliar with Jupyter Notebooks, and assumed
"downloading" meant "to the client". I, too, am now confused about why this is
set up like it is. Probably some constraint of Jupyter Notebooks that I'm
unaware of.

~~~
chongli
If you install and run Jupyter on your local machine, it’ll spin up a web
server on localhost and then connect to it in your browser. All of the Python
code runs on the server and only the results are sent to the client, to be
displayed in the browser.

------
tedunangst
To play backseat problem solver... You have free ingress? So start up a few $5
DO droplets and serve files from there. That gives you 1TB transfer per month.

I haven't tried this, but my understanding is that's per droplet. So when
droplet A is almost exhausted, start B and switch over the traffic. Then shut
down A.
Then start C and shut down B, etc. Unlimited transfer? (Until your account
gets banned, anyway.)

~~~
brianwawok
You can buy bandwidth from a simple CDN for cheap enough to not need to do
this dance.

~~~
3fe9a03ccd14ca5
That’s the real story here. Paying market rate for bandwidth is like buying
soda from a restaurant. They’ll gouge you and make a profit but you’re thirsty
and it’s too late to shop around.

------
reilly3000
If they had put the bucket behind a CDN costs would have been dramatically
lower.

------
vanpelt
GCP charges ~$0.60/hour for the GPU/CPU/MEM equivalent in colab. If bandwidth
is costing 10k per day, how much is the free colab compute costing Google?

~~~
Dylan16807
Probably less. I doubt the average use time is more than half an hour. Pretty
cheap advertising for the service, really.

------
echelon
When will we come full circle and do "edge computing" on our own devices
again? I'm getting sick of exorbitant cloud costs and the moat that is
forming.

------
paul7986
[http://www.aidungeon.io/](http://www.aidungeon.io/)

------
jensv
The ominous opacity of the AWS bill – a cautionary tale (taloflow.ai)
[https://news.ycombinator.com/item?id=21694835](https://news.ycombinator.com/item?id=21694835)

------
tlarkworthy
I think the key architectural misstep is putting a Colab session per user
session. A fix is having the user call out to a service API which, behind
the scenes, is executed on a pool of Colabs with reuse. I don't know of any
technology to make that easy though. The symptom that could have been a
trigger for cost investigations was the long startup times.

------
hombre_fatal
I know almost nothing about AI/ML.

I can imagine that running it on a single server and responding to requests is
intractable because of how the game feeds your quest back into its model.

But what would it take to package it as a local application?

I really love this game. You can't beat this:

[https://twitter.com/ptychomancer/status/1203246078989987840](https://twitter.com/ptychomancer/status/1203246078989987840)

> $ Give rousing speach to my fellow mud beings

> "Mud creatures! Mud creatures! We must unite against this enemy!"

> The other mud creatures nod eagerly, and begin chanting, "We will fight! We
> will fight! We will fight! We will fight! We will..".

------
stevenhuang
I remarked how impressed I was with the state of technology that made it
possible to freely spin up a VM with ~14GB of memory and beefy compute.

Now it all made sense--of course someone had to pay for this all. Doh!

~~~
throwawaytemp1
Google is paying for the Colab notebook and compute; that's free for users.
The problem was the ~6GB model hosted on GCS, which has very high network
egress costs.

------
clircle
Why would a user need a 12gb gpu to run the game locally? The deep learning
model is already trained, and I can’t imagine one needs a gpu to just evaluate
the model.

~~~
minimaxir
It's a _big_ model.

~~~
Dylan16807
It's a 6GB download. Mid-to-high-end GPUs are 8GB. So what makes it need
12GB? How specifically is that memory used up by various types of
decompression, intermediate calculations, etc.?

~~~
dodobirdlord
I believe that the 1.5B model weights take up the 6GB themselves. Presumably
f32 weight values? Since they are all needed for an evaluation pass they will
eat up 6GB on the GPU off the bat. Not too surprising that everything else
can't fit in 2GB, since that's going to have to fit the entire model
architecture and all intermediary values.
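The back-of-envelope for that (assuming fp32 weights, as the parent comment does):

```python
params = 1.5e9     # GPT-2 1.5B parameter count
fp32_bytes = 4     # bytes per single-precision float

weights_gb = params * fp32_bytes / 1e9   # 6.0 -- matches the 6GB download
fp16_gb = params * 2 / 1e9               # 3.0 -- half precision would halve it
```

Which is why a 12GB card leaves headroom for activations and intermediaries while an 8GB card is tight.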

~~~
Dylan16807
What exactly are these intermediate values composed of that makes them notable
in size compared to the model itself? Are there resources I should read for
how a model like this executes?

Will this model work with half precision weights? Is it very awkward to use
"brain" 16 bit floats?

------
Akababa
I was able to get it working without downloading files by mounting Google
Drive, although once the shared folder is rate-limited you need to make a copy
in your own Google drive:

[https://colab.research.google.com/github/Akababa/AIDungeon/b...](https://colab.research.google.com/github/Akababa/AIDungeon/blob/patch-2/AIDungeon_2.ipynb)

------
cobookman
Shouldn’t this be free egress as traffic is straight to a google product/
colab server or is this being downloaded in the browser?

Also cost was before cdn was enabled. This kind of traffic generally costs
fractions of a cent per GiB after signing a contract with negotiation.

“”” Egress to Google products (such as YouTube, Maps, Drive), whether from a
VM in Google Cloud with an external IP address or an internal IP address No
charge “””

------
sixtypoundhound
Speaking as someone who runs a bunch of large websites - out of my own pocket,
for profit.... I'm confused.

How did we get from 60K users to 10K per day expense?

For comparison, I serviced millions of users per month for years from a single
virtual server... (granted, that was after making our site super lean for a
data & CPU perspective)

How much resources is each user consuming?

~~~
tiborsaas
I guess you don't instantiate a 5GB image per user. It's the data transfer
that costs 30-40c per user.

~~~
rvnx
Cloudflare ? or just one machine at OVH ?

~~~
rvnx
[https://wasabi.com/cloud-storage-pricing/](https://wasabi.com/cloud-storage-
pricing/)

Wasabi’s pricing model of $.0059 per GB/mo ($5.99 per TB/month) with no
additional charges for egress or API requests means you don’t pay to access
your data

I don't know, could be a couple of dollars only ?

------
ofirg
One would need to read the install.sh base file to know where to copy the
model from the torrent, and then manually install the other dependencies. If
someone could make a friendlier version, or at least better instructions,
more people would give it a try.

------
slacka
Could someone explain the appeal in keeping this going long-term? When it was
posted here, I played it and read several adventure logs here and on reddit.
In every case the story is nonsense. Sure, some parts read like something a
human would write, but anytime you go beyond a few sentences you can see
contradictions and a lack of flow that a good human author would never
produce.

Don't get me wrong, it's a cool demo of how far we have gone beyond Markov
chains. Am I missing something, or am I just spoiled from those Infocom games
I played as a kid?

~~~
hombre_fatal
Man, why so cynical? I don't see how those Infocom games spoiled you, since
they were limited to whatever a team of writers could come up with. And a
well-written, dynamic-feeling CRPG is so rare that we still trumpet the
handful that were any good from 20 years ago, like Planescape: Torment.

Here's an example of how this game is fun:
[https://twitter.com/ptychomancer/status/1203246078989987840](https://twitter.com/ptychomancer/status/1203246078989987840)

It's just fun to play with.

> you can see contractions and lack of flow that good human author would never
> make.

That doesn't seem like a sensible goal post. Unless you think the technology
is magic, why would you go into this thinking it's going to compete with a
master-planned work of fiction by a human writer?

I, on the other hand, am inspired by the game. Imagine Crusader Kings 2 (free
on Steam btw) where the events are randomly generated by this kind of
storytelling technology. Right now it's kind of boring wondering which of the
finite human-written events will show up. After playing for a while
you go from wondering what crazy event will happen next to knowing all of the
events and just waiting for your favorites.

We're a ways off from embedding a game in this technology, but I think we are
within reach of embedding this technology inside a narrative-driven game.

Another example is Dungeons & Dragons. The fun is the sandbox and interacting
with the narrative even though the human-driven dynamic storyline is often
complete nonsense if you were to read a transcription of what actually
happened.

This is the second comment I've read so far that seems to gleefully pat itself
on the back for wondering how someone could enjoy something else.

~~~
slacka
I only dug into this because people were raving about it. Never played D&D,
but it did remind me of those text adventure games I played as a kid.

When humans play D&D as the game progresses, are the rules of the world that
people establish supposed to be internally consistent and is there a planned
plot arc? Can you say there's no gravity and then later drop your sword?

------
gitgud
A 6GB neural net is going to be expensive to run anywhere. Is it possible to
break the neural net into services?

------
nickbauman
IIRC bandwidth costs have been essentially flat for the past decade or so.
Not coming down.

------
northfoxz
Out of curiosity, how much longer would it take to run this on a typical
user's CPU?

~~~
Filligree
It takes a couple of seconds per reply. GPUs aren't as powerful as everyone
would have it, at least not for ML; you'd be lucky to get a 10x speedup.

------
Jamwinner
How does one play this without a Google account?

------
tlarkworthy
GCS egress is $0.12 per GB, so 3 GB per user?

~~~
lifthrasiir
The model weighs about 6.2 GB; you can check it in install.sh [1], which
currently downloads it via torrent (it was on S3 until a few days ago, I think).

[1]
[https://github.com/AIDungeon/AIDungeon/blob/master/install.s...](https://github.com/AIDungeon/AIDungeon/blob/master/install.sh)
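A quick back-of-the-envelope sketch of what those two numbers imply (the model size and egress rate are from this thread, not official pricing, so treat this as a rough estimate):

```python
# Rough cost check: 6.2 GB model x $0.12/GB GCS internet egress,
# against the reported ~$10k/day bill. All inputs are assumptions
# taken from the thread.
MODEL_GB = 6.2          # approximate model size per install.sh
EGRESS_PER_GB = 0.12    # GCS egress rate, $/GB (varies by region/tier)
DAILY_COST = 10_000     # reported daily spend, $

cost_per_download = MODEL_GB * EGRESS_PER_GB
downloads_per_day = DAILY_COST / cost_per_download

print(f"~${cost_per_download:.2f} per download")
print(f"~{downloads_per_day:,.0f} downloads/day to hit ${DAILY_COST:,}/day")
```

So roughly 74 cents per full model download, which would put the reported bill in the ballpark of 13k downloads a day, if egress really is the dominant cost.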

------
csomar
I am, personally, amazed by this. The following might be kinda NSFW, though.

> You are Jesus, a knight living in the kingdom of Larion. You have a steel
> longsword and a wooden shield. You are on a quest to defeat the evil dragon
> of Larion. You've heard he lives up at the north of the kingdom. You set on
> the path to defeat him and walk into a dark forest. As you enter the forest
> you see a huge black dragon flying above it's lair. The beast looks very
> angry as you approach.

\+ suck his d-ck

> Your screams are silenced when you feel the head of the dragon's ----
> entering your mouth. He begins thrusting inside of you and soon you're
> filled with his \---. You collapse onto the ground and die

Well. That went really more wild than I could have ever expected.

~~~
jacobwilliamroy
I don't get it. What's NSFW here? Suck his what?

Also, yeah I think it's funny how chatbots can generate undesirable or NSFW
responses and there's basically no way to stop it. I know GPT-2 can parrot
old-timey scientific racism back at me, just like the redditors who trained
it!

~~~
csomar
NSFW as in: don't go reading it out loud in an office.

------
z3t4
I'm too scared Google will charge my credit card just by opening this lab
page... I've got a gaming rig, why can't I just run this locally?

~~~
jacobwilliamroy
I have an unprivileged user account named "wilson" designed specifically for
situations like this.

------
andrekorol
After reading Rizwan Virk's "The Simulation Hypothesis" and playing this
amazing new game, I can say that AI-generated text adventure games are an
important step on the road to the Simulation Point (the point at which it
would be technologically possible for us to construct a simulation as
all-encompassing as the one in The Matrix).

~~~
jacobwilliamroy
Have you tried psychedelics yet? Or maybe learned how to lucid dream? I think
you'd find such technologies preferable, even if our electronics can catch
up with our biological evolution.

Sure, maybe dream machines are fun and cool, but the ones in our future will
be made for profit by companies like Facebook and Alphabet which severely
diminishes any potential they may have had. Real dreams are libre, gratis, and
uninterrupted by ads.

~~~
andrekorol
I've never tried any psychedelic. But I am able to lucid dream sometimes, I
just never got deep into it. I guess I should read more about it.

I agree with your second statement, this so-called dream machine sounds a lot
like a future iteration of Facebook's Oculus Rift.

------
55555
Well, that was obviously foreseeable and dumb, then. Use the user's resources
or charge money.

~~~
nickwalton00
The unexpected thing was that Google Colab and GCS are separate, such that
transfer costs between them often end up as international external egress fees.

~~~
ljm
This is just GCP. For example, if you use their managed Kubernetes service,
you will get a fresh load balancer for every service you expose to the
internet. Not a shared load balancer, a new one.

Unless you set up an alternative, you'll get absolutely rinsed by the cost
of the instance and then the egress charges on top.

~~~
aianus
All of the load balancers on GCP are shared. Maybe you meant to say you get a
new, fresh IP address which is true but also not very expensive.

"Cloud Load Balancing is a fully distributed, software-defined, managed
service for all your traffic. It is not an instance- or device-based solution,
so you won’t be locked into physical load balancing infrastructure or face the
HA, scale, and management challenges inherent in instance-based LBs."

[https://cloud.google.com/load-balancing/](https://cloud.google.com/load-balancing/)

~~~
pm90
Nothing in that paragraph indicates it's shared. Also: it might be shared in
the implementation, but you will still be billed for every single HTTPS LB
that you use (or an NLB if you're doing TCP load balancing).

Every unique kubernetes ingress resource WILL spin up a NEW, uniquely billed
Https LB. Every unique kubernetes service with specific annotations will spin
up a NEW, unique LB (internal or external). The author is correct.

~~~
aianus
You are not billed for the "load balancer"; you are billed for the "forwarding
rule", which makes it very obvious that the infrastructure is shared and that
you will have additional costs with every K8s ingress.

------
designium
I am trying to think of some solutions, but the crux is that users may
expect a unique story tailored specifically to them. If that assumption is
false, then we have some options:

\- save top or similar stories and serve them pre-generated to avoid calling
the AI service

\- decrease the number of times the user can extend the story: users can
only give input 3 times instead of X

\- charge people for the game

There are probably more ideas out there.

One last idea: package the code and instruct users to run it on their own
machine(s), or have them run it on their own GCP account.

~~~
chii
I think the game should be made runnable on a user's own machine too.

Make the data AGPL-licensed if the author is afraid of people copying it and
making a profit off it without including them!

~~~
sneak
AGPL does not prevent someone from copying data and making a profit off it and
not including them.

