
How We Built r/Place - dkasper
https://redditblog.com/2017/04/13/how-we-built-rplace/
======
ChicagoBoy11
I love write-ups like this because they are such a nice contrast to the too-
common comments on Reddit and HN where people claim that they could rebuild FB
or Uber as a side project. Something as superficially trivial as r/Place
requires a tremendous amount of effort to run smoothly, and there are
countless gotchas and issues that you'd never even begin to consider unless
you really tried to implement it yourself. Big thanks to Reddit for the fun
experiment and for sharing with us some of the engineering lessons behind it.

~~~
josephg
> the too-common comments on Reddit and HN where people claim that they could
> rebuild FB or Uber as a side project.

I do and don't agree with you. What's really going on here is that development
time scales linearly with the number of decisions you need to make. Decisions
can take the form of product questions - "what are we making?" - and
development questions - "how should we implement that?".

There are three reasons why people feel confident saying "I could copy that in
a weekend":

\- When looking at an existing thing (like r/place), most of the product
decisions have already been made. You don't need to re-make them.

\- If you have a lot of development experience in a given domain, you don't
need to make as many decisions to come up with a good structure - your
intuition (tuned by experience) will do the hard work for you.

\- For most products, most of the actual work is in little features and polish
that's 'below the water' and not at all obvious to an outsider. Check out this
'should we send a notification in Slack' flowchart:
[https://twitter.com/mathowie/status/837807513982525440?lang=...](https://twitter.com/mathowie/status/837807513982525440?lang=en)
. I'm sure Uber and Facebook have hundreds of little features like that. When
people say "I could copy that in a weekend", they're usually only thinking
about the core functionality. Not all the polish and testing you'd actually
need to launch a real product.

With all that said, I bet I _could_ implement r/place in less than a weekend
with the requirements stated at the top of that blog post, so long as I don't
need to think about mobile support and notifications. That's possible not
because I'm special, and not because I'm full of it, but because most of the
hard decisions have already been made. The product decisions are specified in
the post (image size, performance requirements) and for the technical
decisions I can rely on experience. (I'd do it on top of Kafka. Use CQRS, then
have caching nodes I could scale out, and strong versions using the event
stream numbers. Tie the versions into a rendered image URL and use nginx to
cache ... Etc.)
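
That Kafka/CQRS idea can be toy-modeled in plain Python. This is my own
illustration of the pattern, not Reddit's design: the in-process `PlacementLog`
stands in for a Kafka topic, and `CacheNode` plays the role of a scale-out
read replica whose version number could be baked into a cacheable image URL.

```python
# Toy sketch of the event-sourced approach described above (names are mine):
# every placement is appended to an ordered log, and each read-side cache
# node rebuilds its board from the log, tagging snapshots with the last
# sequence number applied.

class PlacementLog:
    """Append-only log standing in for something like a Kafka topic."""
    def __init__(self):
        self.events = []  # (seq, x, y, color)

    def append(self, x, y, color):
        seq = len(self.events) + 1
        self.events.append((seq, x, y, color))
        return seq

class CacheNode:
    """Read side: consumes the log and serves versioned snapshots."""
    def __init__(self, width, height):
        self.board = [[0] * width for _ in range(height)]
        self.version = 0  # last sequence number applied

    def catch_up(self, log):
        # Apply only the events this node hasn't seen yet.
        for seq, x, y, color in log.events[self.version:]:
            self.board[y][x] = color
            self.version = seq

    def snapshot(self):
        # The version could be baked into a rendered image URL
        # (e.g. /board/<version>.png) so nginx can cache each state forever.
        return self.version, [row[:] for row in self.board]

log = PlacementLog()
log.append(1, 2, 7)
log.append(3, 0, 4)

node = CacheNode(4, 4)
node.catch_up(log)
version, board = node.snapshot()
```

The point of the strong version numbers is that every cache node that has
applied the same prefix of the log serves an identical board.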

But I couldn't implement r/place that quickly if reddit didn't already do all
the work _deciding_ on the scope of the problem, and what the end result
should look like.

In a sense, I think programming is slow and unpredictable because of the
learning we need to constantly do. Programming without needing to learn is
just typing; and typing is fast. I rule out doing the mobile notifications not
because I think it would be hard, but because I haven't done it before. So I'd
need to learn the right way to implement it - and hence, that's where the
scope might blow out for me. That's where I would have to make decisions, and
deciding correctly takes time.

~~~
ChicagoBoy11
First off, it's a bit unfair to me that you were the one to make this comment,
since your very extensive experience and deep knowledge of real-time systems
makes you uniquely qualified to disprove my point!! hehe

But in all seriousness, I agree with most of what you said - I think I'm just
more bearish on people's ability to infer those many decision points without
being given the blueprint like we were in this article.

If you are a senior engineer at FB and you decide to make Twitter in your
spare time, I buy that lots of experience and knowledge gained at your job can
probably get you going fairly quickly. But I have never seen an example of
engineers discussing sophisticated systems like these where crucial aspects of
its success in terms of implementation didn't rely on some very specific
knowledge/study of the particular problem being solved that could only be
gleaned after trying things out or studying it very carefully. The
representation of the pixels is a great example in this case -- they go into
wonderful detail about why they decided to represent it the way they did,
which in turn informs and impacts how the rest of the stack looks like.

I think at one point Firebase had, as one of their example apps, something
that very closely mirrored what they did with r/Place, so I agree that one
could probably build something "roughly" like it somewhat quickly. I agree
that in general knowledgeable individuals could probably hack together things
which would in some form resemble popular services we know today. The devil is
in the "roughly," though. I think that often what makes them THE giant
services we know are things that let them handle scale which very few of us
have ever needed or know how to deal with, or the fact that they have combined
tons of "below the water" features and polish like you mentioned. Given that
most of the web apps we use are basically CRUD apps which we all know how to
build, maybe we need to give more weight to these "below the water" features
in terms of how much they actually contribute to the success of the
applications we use.

~~~
tripzilch
> The representation of the pixels is a great example in this case

That's funny because I thought (assuming you're referring to the bit-packing
of the pixels) that it seemed one of the more obvious choices (to do something
_else_ would have been more remarkable to me).
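
For readers who haven't seen it, the bit-packing in question is simple: with a
16-color palette a pixel needs only 4 bits, so two pixels share one byte. A
minimal Python sketch (my own, not Reddit's code; the low/high-nibble
convention is an arbitrary choice):

```python
# 4-bit packing: 16 colors, two pixels per byte. A 1000x1000 board is
# exactly 500,000 bytes, which matches the ~500kB figure mentioned
# elsewhere in this thread.

WIDTH = HEIGHT = 1000
board = bytearray(WIDTH * HEIGHT // 2)  # 500,000 bytes

def set_pixel(x, y, color):
    assert 0 <= color < 16
    i = y * WIDTH + x
    byte, odd = divmod(i, 2)
    if odd:  # odd pixels live in the low nibble (a convention, not Reddit's)
        board[byte] = (board[byte] & 0xF0) | color
    else:    # even pixels live in the high nibble
        board[byte] = (board[byte] & 0x0F) | (color << 4)

def get_pixel(x, y):
    i = y * WIDTH + x
    byte, odd = divmod(i, 2)
    return (board[byte] & 0x0F) if odd else (board[byte] >> 4)

set_pixel(0, 0, 13)
set_pixel(1, 0, 5)   # shares a byte with (0, 0)
```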

Beyond that though, I have zero experience with real-time networked systems.
Especially with a gazillion users and that everybody gets to see the same
image, that seems _hard_.

The _cleverest_ solution that I read in the article, that I really liked but
probably would never have thought of myself (kinda wish I would, though) was
the part where the app starts receiving and buffering tile-changes right away,
and only _then_ requests the full image state, which will always be somewhat
stale because of its size, but it has a timestamp so they can use the buffered
tile-changes to quickly bring its state up to time=now. Maybe this is a common
technique in real-time networked systems, I don't know, but it was cool to
learn about it.
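
The catch-up trick is small enough to sketch. This is my reconstruction of the
idea from the article, with hypothetical names, not Reddit's client code: the
client buffers deltas while the (stale) full snapshot downloads, then replays
only the buffered deltas newer than the snapshot's timestamp.

```python
# Sketch of "subscribe first, snapshot second": updates that arrive before
# the full board does are buffered, then replayed against the snapshot.

class CanvasClient:
    def __init__(self, width, height):
        self.board = [[0] * width for _ in range(height)]
        self.buffer = []      # updates received before the snapshot lands
        self.synced = False

    def on_tile_update(self, ts, x, y, color):
        if self.synced:
            self.board[y][x] = color               # live: apply immediately
        else:
            self.buffer.append((ts, x, y, color))  # pre-snapshot: just buffer

    def on_full_snapshot(self, snapshot_ts, board):
        self.board = [row[:] for row in board]
        # Replay buffered updates newer than the snapshot; older ones are
        # already reflected in the snapshot itself.
        for ts, x, y, color in self.buffer:
            if ts > snapshot_ts:
                self.board[y][x] = color
        self.buffer.clear()
        self.synced = True

client = CanvasClient(3, 3)
client.on_tile_update(ts=5, x=0, y=0, color=9)   # arrives while snapshot loads
client.on_tile_update(ts=11, x=1, y=1, color=3)
stale = [[0, 0, 0], [0, 7, 0], [0, 0, 0]]        # snapshot taken at ts=10
client.on_full_snapshot(snapshot_ts=10, board=stale)
```

The key invariant: after `on_full_snapshot`, the client's board equals
snapshot plus every update with `ts > snapshot_ts`, so no change is lost or
double-applied no matter how stale the snapshot was.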

------
seankimdesign
What a fantastic writeup. I had some vague ideas about the challenges involved
in building an application of such scale, but the article really makes clear
the number of decision points encountered as well as why certain solutions
were selected.

I also like the way the article is broken down into the backend, API, frontend
and mobile. This isolated approach really highlights the different struggles
each aspect of the product has while dealing with what is essentially a
shared concern: performance.

What I also found interesting is the fact that they were able to come up with
a pretty accurate guess in terms of the expected traffic.

> "We experienced a maximum tile placement rate of almost 200/s. This was
> below our calculated maximum rate of 333/s (average of 100,000 users placing
> a tile every 5 minutes)."

Their guess ended up being a good amount above the actual maximum usage, but
it was probably padded against the worst case scenario. The company that I
work for consistently fails to come up with accurate guesses even with our
very rigid user base, so it's pretty impressive that Reddit could accommodate
the unpredictable user base that is the entire Reddit community.

~~~
eblanshey
Agreed, great write-up. Anyone have other recommended links to write-ups about
specific challenges and how to make them scale?

~~~
petercooper
Here are some recent ones that have been popular:

[http://highscalability.com/blog/2016/4/20/how-twitter-
handle...](http://highscalability.com/blog/2016/4/20/how-twitter-
handles-3000-images-per-second.html)

[https://redditblog.com/2017/1/17/caching-at-
reddit/](https://redditblog.com/2017/1/17/caching-at-reddit/)

[https://blog.twitter.com/2017/the-infrastructure-behind-
twit...](https://blog.twitter.com/2017/the-infrastructure-behind-twitter-
scale)

[https://www.netlify.com/blog/2017/03/16/smashing-magazine-
ju...](https://www.netlify.com/blog/2017/03/16/smashing-magazine-just-
got-10x-faster/)

[https://nickcraver.com/blog/2016/02/03/stack-overflow-a-
tech...](https://nickcraver.com/blog/2016/02/03/stack-overflow-a-technical-
deconstruction/)

[https://engineering.linkedin.com/blog/2016/10/instant-
messag...](https://engineering.linkedin.com/blog/2016/10/instant-messaging-at-
linkedin--scaling-to-hundreds-of-thousands-)

[http://techblog.netflix.com/2016/08/building-
fastcom.html](http://techblog.netflix.com/2016/08/building-fastcom.html)

[http://highscalability.com/blog/2016/9/28/how-uber-
manages-a...](http://highscalability.com/blog/2016/9/28/how-uber-manages-a-
million-writes-per-second-using-mesos-and.html)

Or a recent writeup on a challenge building IMDB's forums in 2001(!)
[https://www.beatworm.co.uk/blog/internet/imdb-boards-no-
more](https://www.beatworm.co.uk/blog/internet/imdb-boards-no-more) .. fun and
scary in equal measures.

(I'm plucking these from things that were popular recently in our
[http://webopsweekly.com/](http://webopsweekly.com/) :-))

------
eatitraw
> We used our websocket service to publish updates to all the clients.

I used /r/place from a few different browsers with a few different accounts,
and they all seemed to have slightly different views of the same pixels. Was I
the only one who experienced this problem?

When the /r/place experiment was still going, I assumed that they grouped
updates into some sort of batches, but now it seems like they intended all
users to receive all updates more or less immediately.

~~~
d23
Yeah, we went into it a bit in the "What We Learned" section, but that was
most likely during the time we were having issues with RabbitMQ. I believe it
was mostly fixed later on, but either way, we found a new pain point in our
system we can now work on.

~~~
atombender
Surprised you're using RabbitMQ. It's one of those things that works great
until it doesn't (clustering is particularly bad), and then you have almost
zero insight into the issue and have to resort to the Pivotal mailing list.

Have you looked at NATS at all? We're using it as a message bus for one app
and it's been fantastic. It is, however, an in-memory queue, and the current
version cannot replace Rabbit for queues that require durability.

~~~
fernandotakai
i've been using rabbitmq heavily (as in, the whole infrastructure is based on
two rabbitmq servers) for a long time and i've never seen it fail.

tbh, i never used clustering (because it's one of the shittiest clustering
implementations i've ever seen) but we do use two servers (publishers connect
to one randomly and consumers connect to both) and it seems to handle millions
of messages without any issues.

of all servers i've ever used, rabbitmq is by far the most stable (together
with ejabberd).

~~~
lobster_johnson
RabbitMQ is decent if you don't use clustering (which, I agree, is shitty). I
have some quibbles with the non-clustered parts, but nothing big.

Right now, the main annoyance is that it's impossible, as far as I understand,
to limit its memory usage. You can set a "VM high watermark" and some other
things, but beyond that, it will — much like, say, Elasticsearch — use a large
amount of mysterious memory that you have no control over. You can't just say
"use 1GB and nothing more", which is problematic on Kubernetes where you want
to pack things a bit tightly. This happens even if all the queues are marked
as durable.

~~~
fernandotakai
yeah we have dedicated machines to rabbitmq because it's basically memory
hungry. but i like it that way because it's only going to crash if the machine
crashes.

------
writeslowly
I thought it was interesting that one of their requirements was to provide an
API that was easy to use for both bots and visualization tools. I remember
reading some speculation when this was running that r/place was intentionally
easy to interface with bots, while there were also complaints that the whole
thing had been taken over by bots near the end.

~~~
gramstrong
Without bots, I doubt that /r/place would have been very interesting. It's a
nice thought that a million random strangers can be cohesive without
automation, but for some reason I don't find that to be particularly
realistic.

~~~
saulrh
As a concrete example, as far as I can tell the entire Puella Magi Madoka
Magica section, starting from Homura Did Nothing Wrong next to Darth Plagueis
The Wise, was hand-crafted and hand-maintained. On their Discord they were
actively discouraging community members who wanted to use bots.

------
ag_47
Now I'm curious: are there any websites that do something similar to /r/place?
(hackathon idea?)

Also, reminds me of the million dollar front page [1].

[1]
[https://en.wikipedia.org/wiki/The_Million_Dollar_Homepage](https://en.wikipedia.org/wiki/The_Million_Dollar_Homepage)

~~~
DanHulton
A long time ago, I built
[http://www.ipaidthemost.com/](http://www.ipaidthemost.com/), which is kinda
related, at least to TMDH anyhow. Far, far less collaborative than /r/place,
but similar in terms of staking out ownership.

~~~
jacquesm
Hilarious. That's got to be the most direct monetization strategy since tmdhp.

------
wyager
Why use Redis and multiple machines instead of keeping it in RAM on a single
machine? I'm not claiming the Reddit people did anything wrong; they have a
lot more experience than me here obviously. I'm just trying to figure out why
they couldn't do something simpler. 333 updates/sec to a 500kB packed array,
coupled with cooldown logic, should have a negligible performance cost and can
easily be done on a single thread. That thread could interact via atomic
channels with the CDN (make a copy of the array 33 times a second and send it
away, no problem) and websockets (send updates out via a broadcast channel,
other cores re-broadcast to the 100K websockets). Again, I'm not saying this
is actually a better idea, this is just what I would do naively and I'm
curious where it would fall apart.
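
The single-machine design being proposed can be toy-modeled to see its moving
parts. This is my sketch of the commenter's idea (not a claim about what would
hold up in production): one writer thread owns the board and applies
placements from a queue, and readers only ever see immutable snapshot copies
it hands out.

```python
# Single-writer sketch: the board is owned by one thread; all mutation
# flows through a queue, so no locking on the board itself is needed.

import queue
import threading

WIDTH = HEIGHT = 100

def writer(inbox, snapshots):
    board = bytearray(WIDTH * HEIGHT)  # one byte per pixel for simplicity
    while True:
        msg = inbox.get()
        if msg is None:                  # shutdown sentinel
            snapshots.put(bytes(board))  # hand out a final immutable copy
            return
        x, y, color = msg
        # Cooldown checks and websocket fan-out would hang off this point;
        # periodically copying `board` for the CDN is just bytes(board).
        board[y * WIDTH + x] = color

inbox = queue.Queue()
snapshots = queue.Queue()
t = threading.Thread(target=writer, args=(inbox, snapshots))
t.start()

inbox.put((3, 4, 7))
inbox.put((0, 0, 2))
inbox.put(None)
t.join()
final = snapshots.get()
```

At 333 updates/s on a 500kB board the writer thread is nearly idle; the open
question the commenter raises is everything around it (failover, deploys, the
100K websocket fan-out), which is where the single-machine story gets harder.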

~~~
bsimpson63
> keeping it in RAM on a single machine

> 500kB packed array, coupled with cooldown logic, should have a negligible
> performance cost and can easily be done on a single thread. That thread
> could interact via atomic channels with the CDN

That's not simpler.

We used tools that we're already using heavily in production and are
comfortable with.

~~~
wyager
> That's not simpler.

With respect to your experience in the matter, I strongly disagree. What I
described is complicated to say, easy to implement. What the OP describes
("use redis") is easy to say, complicated to implement. Not just in terms of
human work time (setting up the redis machine and instance, connecting
everything together), but also in terms of number of moving parts (more
machines, more programs, etc.).

> We used tools that we're already using heavily in production and are
> comfortable with.

That's entirely fair, and what I figured was the most likely explanation.

------
vmasto
> Users can place one tile every 5 minutes, so we must support an average
> update rate of 100,000 tiles per 5 minutes (333 updates/s).

It only takes a couple of outliers to bring everything down. I'm not exactly
well-versed in defining specs for large scale backend apps (not a back-end
engineer) but it seems to me that preparing for the average would not be a
wise decision?

For example, designing with an average of a million requests per day in mind
would probably fail, since you get most of that traffic during the daytime and
far less at night.

Could anyone more experienced shed some light?

~~~
andoon
The entire reddit website goes down every night, especially during weekends,
sport matches, etc, so there you have your answer.

~~~
celticninja
Is that hyperbole or are you really experiencing that much downtime of Reddit?
I see it occasionally but it's never down for long, the odd "servers are busy"
message usually disappears after a single refresh.

~~~
ClassyJacket
The search is consistently broken, but the rest of the site seems to have good
uptime now, much better than it once was.

~~~
celticninja
Search has never worked. I always use Google site: modifier and have never had
an issue.

~~~
Macha
Search worked pretty reliably up until 7-8 months ago for me

------
amyjess
So I got hit by an unfortunate bug on the first day of /r/place.

I was trying to draw something, one pixel at a time, and all of a sudden,
after a bunch of pixels, it stopped rate-limiting me! I could place as many as
I wanted! So I just figured that they periodically gave people short bursts
where they can do anything. This was backed up by my boss, who was also
playing with /r/place, saying that the same thing happened to him not long
before that (yes, my whole team at work was preoccupied with /r/place that
Friday). So I quickly rushed to finish my drawing.

And then I reloaded my browser... and it wasn't there. Turns out that what I
thought was a short burst of no rate limiting was just my client totally
desyncing from Reddit's servers. Nothing was submitted at all.

Not too long after that, another guy on my team got hit by the same bug. But I
told him what happened with me, so he didn't get his hopes up.

~~~
johansch
It happened to me as well. I did verify that my changes actually took effect
(from the same IP, but in incognito mode). Didn't bother to check if the
changes stayed.

------
nathan_f77
This is fantastic. I learned a lot, and it seems like they nailed everything.

I really enjoyed the part about TypedArray and ArrayBuffer. And this might be
a common thing to do, but I've never thought about using a CDN with an expiry
time of 1 second, just to buffer lots of requests while still being close to
real-time. That's brilliant.
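
The mechanism is just a cache header. A minimal sketch of the idea (my
illustration, not Reddit's code; the function name is made up):

```python
# With max-age=1, a CDN serves every viewer from its cache and re-fetches
# the board from the origin at most about once per second. Origin load is
# therefore bounded by the TTL, not by the number of concurrent viewers,
# while the image stays at most ~1 second stale.

def board_response(image_bytes):
    headers = {
        "Content-Type": "image/png",
        # Shared caches (the CDN) may reuse this response for 1 second.
        "Cache-Control": "public, max-age=1",
    }
    return headers, image_bytes

headers, body = board_response(b"\x89PNG...")
```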

------
ziikutv
Startup tech writers, take note. This write-up has been more helpful than many
in the past. Thank you very much, Reddit developers.

------
nitwit005
Given the scale described, it sounds like they could have had a single machine
that held the data in memory and periodically flushed to disk/DB to support
failing over to a standby.

~~~
bsimpson63
You're basically describing how we used redis for this project.

~~~
nitwit005
I suppose so, but then what did you gain from the extra hop to redis?

~~~
bsimpson63
Not having to implement redis ourselves.

------
tupshin
_Our initial approach was to store the full board in a single row in Cassandra
and each request for the full board would read that entire row._

This is the epitome of an anti-pattern. I sincerely hope that this approach
was floated by somebody who had never used Cassandra before.

Even if individual requests were reasonably fast, you are sticking all of your
data in a single partition, creating the hottest of hot spots and failing to
leverage the scale out nature of your database.

~~~
spyspy
This entire project is just an elaborate hack day project. There's no reason
to fault them for trying new and interesting hacks to get it off the ground.
They realized it wasn't the right method and moved on. End of story.

------
maaaats
Interesting how big the Norwegian and Swedish flags got, given our small
populations.

~~~
stevekemp
I thought the exact same thing, about the Finnish flag - complete with
Moomins.

~~~
pimeys
And the almost-dictator of a president, Urho Kekkonen. I kind of understand
this; I was born in that country...

------
pitaj
r/Place is really awesome. This is how you grow the community. The 2D and 3D
timelapses are super cool to watch, as well. Glad Reddit decided to make this
a full-time thing.

~~~
kzrdude
Full time? I haven't heard that anywhere.

------
ThomPete
This is why experimentation is so important and why I always love when people
do things just to do them.

It's literally like exploring the digital universe and reporting on some of
your findings.

Great writeup!

------
blurrywh
FULL 72h (90fps) TIMELAPSE:
[https://www.youtube.com/watch?v=XnRCZK3KjUY](https://www.youtube.com/watch?v=XnRCZK3KjUY)

------
archagon
Thank you for the fascinating writeup! How long did the whole thing take to
put together?

~~~
tyrust
First commit was on Jan 20 [0], so that provides a lower-bound for how long
they spent on it.

[0] - [https://github.com/reddit/reddit-plugin-place-
opensource/com...](https://github.com/reddit/reddit-plugin-place-
opensource/commit/68498bab9300f43ae4273dd4719dcecb081126f7)

------
hopfog
This is amazing and I got so many ideas on how to tackle the scaling issues I
have with my own multiplayer drawing website. In the aftermath of r/Place I
went into some of the factions' Discord servers and posted my site, getting
50-100 concurrent users which caused a meltdown on my server. It was a good
stress test but also a wake-up call.

Again, amazing write-up. Thank you!

------
antoniuschan99
Looks impressive!

How big was the team? How long did it take to complete this project? Is the
code going to be open sourced?

~~~
aw3c2
[https://github.com/reddit/reddit-plugin-place-
opensource](https://github.com/reddit/reddit-plugin-place-opensource)

------
biot

      > At the peak of r/place the websocket service was
      > transmitting over 4 gbps (150 Mbps per instance
      > and 24 instances).
    

What does Reddit use for serving up this much websocket traffic? Something
open source, or is it custom built?

~~~
dkasper
It's open source and custom built. [https://github.com/reddit/reddit-service-
websockets](https://github.com/reddit/reddit-service-websockets)

------
77pt77
Now where can we get a dump of all the data?

Like

timestamp, x, y, color, username

~~~
calosa
One of the reddit data scientists dumped it here...
[https://data.world/justintbassett/place-
events](https://data.world/justintbassett/place-events)

~~~
aw3c2
> Oops! We can't find that page.

~~~
justintbassett
my fault :). I have to get a few things ready for a public release of more
data

~~~
77pt77
Is this data:

[https://www.reddit.com/r/place/comments/6396u5/rplace_archiv...](https://www.reddit.com/r/place/comments/6396u5/rplace_archive_update/)

Complete?

I've been working with that.

~~~
Ajedi32
That data was recorded by the community. It took everyone a while to figure
out how to grab a snapshot of the canvas though, so a few hours of data near
the start are missing.

------
a_bonobo
Weird question - why does that bash script use absolute paths for standard
tools like awk and grep (/usr/bin/awk instead of just awk)? Is this some best
practice I know nothing about?

------
Cofike
As someone looking to expand my knowledge of big systems and building at
scale, I find this kind of resource invaluable!

------
calosa
If any data science-y folks want to work with the raw data, you can find it
here... [https://data.world/justintbassett/place-
events](https://data.world/justintbassett/place-events)

~~~
Ajedi32
"Oops! We can't find that page."

~~~
calosa
Looks like the owner changed it to be private :/ Hopefully they'll open it up
again.

~~~
ReverseCold
Did anyone download it already? Reshare?

------
svarrall
Anyone have any insight into how much something like this 'cost' Reddit,
resource-wise? Is the main outlay in time, with the server costs already
covered by their infrastructure, or does the high traffic add enough to make a
difference?

------
_hamilton
/r/place is probably the coolest project that happened this year so far.

~~~
taftster
This year? I'm thinking more like this decade. It's gotta be up in the top 10
of ever. On so many levels, /r/place was fascinating; and I didn't even come
across it until after it had finished!

------
replface
Similar to this?
[https://www.youtube.com/watch?v=9_uX5yXSOwU](https://www.youtube.com/watch?v=9_uX5yXSOwU)

------
rohankshir
anyone know what framework they used to do visualizations?

~~~
bsimpson63
[https://grafana.com/](https://grafana.com/) for the graphs and
[https://www.draw.io/](https://www.draw.io/) and
[http://www.fiftythree.com/](http://www.fiftythree.com/) for the diagrams.

------
the_arun
Good article. One security issue I see: the error page shows the Nginx version
(nginx/1.8.0). No need to reveal details of the webserver or its version!

------
huangc10
Can someone link to the full resolution final image? Been trying to find it.
Thanks!

~~~
Ajedi32
Here you go:
[https://i.imgur.com/ajWiAYi.png](https://i.imgur.com/ajWiAYi.png).

And for those interested, here's some additional stats:

\- The original announcement about /r/place:
[https://www.reddit.com/r/announcements/comments/62mesr/place...](https://www.reddit.com/r/announcements/comments/62mesr/place/)

\- Full timelapse of the canvas over the course of all 72 hours:
[https://www.youtube.com/watch?v=XnRCZK3KjUY](https://www.youtube.com/watch?v=XnRCZK3KjUY)

\- Heatmap of all activity on the canvas over the full 72 hours:
[https://i.redd.it/20mghgkfwppy.png](https://i.redd.it/20mghgkfwppy.png) by
/u/mustafaihssan

\- Timelapse heatmap of activity on the canvas:
[https://i.imgur.com/a95XXDz.gifv](https://i.imgur.com/a95XXDz.gifv) by
/u/jampekka

\- Entropy map of the canvas over the full 72 hours:
[https://i.imgur.com/NnjFoHt.jpg](https://i.imgur.com/NnjFoHt.jpg) by
/u/howaboot (explanation:
[https://www.reddit.com/r/dataisbeautiful/comments/63kuy6/oc_...](https://www.reddit.com/r/dataisbeautiful/comments/63kuy6/oc_heatmap_of_the_most_pixels_changes_happend_on/dfv3n83/))

\- Map of all white pixels that were never touched throughout the event:
[https://i.imgur.com/SEHaUSJ.png](https://i.imgur.com/SEHaUSJ.png) by
/u/alternateme

\- Most common color of each pixel over the last...

    
    
      - 72 hours: https://i.imgur.com/C5jOtl1.png by /u/howaboot
    
      - 24 Hours: http://aperiodic.net/phil/tmp/place-mode-24h.png by /u/phil_g
    
      - 12 Hours: http://aperiodic.net/phil/tmp/place-mode-12h.png by /u/phil_g
    
      - 6 Hours: http://aperiodic.net/phil/tmp/place-mode-6h.png by /u/phil_g
    
      - 2 Hours: http://aperiodic.net/phil/tmp/place-mode-2h.png by /u/phil_g
    

\- Average color of each pixel over the course of the experiment:
[https://i.imgur.com/IkPOwIh.png](https://i.imgur.com/IkPOwIh.png)

\- Atlas of the Final Image: [https://draemm.li/various/place-
atlas/](https://draemm.li/various/place-atlas/) by /r/placeAtlas/ (source
code: [https://github.com/RolandR/place-
atlas](https://github.com/RolandR/place-atlas))

\- Torrents of various canvas snapshots and image data:
[https://www.reddit.com/r/place/comments/6396u5/rplace_archiv...](https://www.reddit.com/r/place/comments/6396u5/rplace_archive_update/)

\- The post announcing the end of /r/place:
[https://www.reddit.com/r/place/comments/6382bb/place_has_end...](https://www.reddit.com/r/place/comments/6382bb/place_has_ended/)

(It took a while for members of the community to realize what was happening
and start recording snapshots of the canvas, so there are a few time periods
early on that got skipped.)

~~~
19eightyfour
This was such an amazing demonstration of human collective collaboration. It
sort of makes me feel like humans could do anything, even though the result
is, in some sense, trivial. As well as simply being enjoyed, this could be
studied in so many ways: competition of memes and cultural representations,
'evolutionary' convergence upon some optimum, mainstream vs fringe, accepted
vs taboo concepts, implicit spontaneous emergence of behaviour norms for
participants - self-regulating systems. I also like this timelapse, which
contains an overview and then a close-up of each of 12 sections (333 x 250):
[https://www.youtube.com/watch?v=RCAsY8kjE3w](https://www.youtube.com/watch?v=RCAsY8kjE3w)

------
webdwarf
It's an awesome project!

------
mozumder
> We actually had a race condition here that allowed users to place multiple
> tiles at once. There was no locking around the steps 1-3 so simultaneous
> tile draw attempts could all pass the check at step 1 and then draw multiple
> tiles at step 2.

This is why you use a proper database.

I'd probably add a Postgres table to record all user activity, and use that to
lock out users for 5 minutes as an initial filter. Have triggers on updates to
then feed the rest of the application.
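
To make the suggestion concrete: the race in the quoted passage disappears if
the cooldown check and the timestamp write happen in one atomic statement.
A self-contained sketch of that idea (my code, not Reddit's; sqlite3 stands in
for Postgres, and the `last_place` table is made up):

```python
# Closing the check-then-act race with a relational store: the WHERE clause
# makes "is the cooldown over?" and "record the new placement" a single
# atomic UPDATE, so two simultaneous attempts can't both pass the check.

import sqlite3

COOLDOWN = 300  # seconds between placements per user

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE last_place (user_id TEXT PRIMARY KEY, placed_at REAL)")

def try_place(user_id, now):
    """Return True iff this user may place a tile at time `now`."""
    with db:  # one transaction: the check and the update are indivisible
        # Ensure a row exists; 0.0 means "never placed".
        db.execute(
            "INSERT OR IGNORE INTO last_place VALUES (?, 0.0)", (user_id,)
        )
        cur = db.execute(
            "UPDATE last_place SET placed_at = ? "
            "WHERE user_id = ? AND placed_at <= ?",
            (now, user_id, now - COOLDOWN),
        )
        return cur.rowcount == 1  # 0 rows updated -> still cooling down

first = try_place("alice", now=1000.0)   # allowed
second = try_place("alice", now=1001.0)  # 1s later: blocked
third = try_place("alice", now=1300.0)   # cooldown elapsed: allowed again
```

The same test-and-set shape works in Postgres, memcached (via CAS/add), or
Redis; the point is only that the check and the write must be one operation.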

~~~
d23
So in that case, each pixel would be stored as a separate row in a relational
database? And to query the whole canvas you'd query a million rows on every
read?

I lean towards just using the ratelimiting stuff we already have in place (via
memcached, which we talked about in a previous post). We just overlooked it.

~~~
mozumder
I'd most likely have two tables - one for user activity and one for each pixel
(1 million rows only in that table). Selecting a million rows from that pixel
table might be 200ms or whatever. I'd still have Redis cache, though, since
you're getting 100ms.

~~~
developer2
Consider exactly what you are proposing. One table to store the entire history
(one billion or more rows). A second denormalized table, whether updated at
the application layer or via triggers, to store the most recent update to each
of the one million cells (1000x1000 pixel grid = one million data points).

The simple fact of introducing a one-million-row read for the latest data of
each "pixel cell" is fairly insane. You _must_ have a cache for such data.
"I'd still have Redis cache, though" is not even debatable. It doesn't _have_
to be Redis, but it definitely has to be a cache of one kind or another.

~~~
mozumder
So, I just did a SELECT * from a table with 1 million single-byte character
rows, and it ran in 90.51ms:

    
    
      place=> explain analyze select * from board_bitmap ;
                                                            QUERY PLAN                                                       
      -----------------------------------------------------------------------------------------------------------------------
       Seq Scan on board_bitmap  (cost=0.00..14425.00 rows=1000000 width=6) (actual time=0.009..57.295 rows=1000000 loops=1)
       Planning time: 0.160 ms
       Execution time: 90.510 ms
      (3 rows)
    

And, with triggers from an activity table, the entire write operation can be
made atomic so there aren't any race conditions.

I don't think you understand how fast Postgres is on modern hardware. What
took a large cluster 5 years ago can be done on a single system with a fast
NVMe drive today. We really might not even need Redis in this situation.

And, yes, I have to deal with viral content, so this is right up my alley.

------
brilliantcode
This reminds me of
[http://www.milliondollarhomepage.com/](http://www.milliondollarhomepage.com/)

------
johansch
The front-end UX for scrolling that bitmap was, quite frankly, horribly
designed.

~~~
bsimpson63
What was wrong with it?

~~~
lima
It gets weird when the cursor leaves the box while dragging. Now, when you go
back inside, you're still in drag mode since the box did not get the "mouse
up" event and you end up selecting and dragging random text.

~~~
johansch
In the end this does not seem to have mattered. Reddit's hardcore
"contributors" are the kind of people who enjoy a challenge, even when it's
stupid. I think it even turned into some kind of pride for some of them, being
able to "master" an idiotically programmed system.

Myself, I just get so frustrated about the idiocy.

~~~
zodiakzz
Reddit has a place for everyone, you might find yourself at home on
[https://reddit.com/r/iamverysmart](https://reddit.com/r/iamverysmart)

~~~
johansch
Oh please just go back to reddit.

------
GrumpyNl
Same was done in Holland several years ago: the one-million-pixels site. Each
pixel was sold for a dollar, and all of them sold.

~~~
frandroid
In real time though?

------
look_lookatme
This is all great, but your search still never works. It has been like that
forever.

------
subkamran
This is awesome, but man, reading the canvas portion was a bit distressing. I
wonder why they didn't use a game engine for this? All the work they did has
already been implemented in several JS game engines, such as the one I help
maintain (it's free and OSS),
[https://excaliburjs.com](https://excaliburjs.com). We support all the
features they needed, including mobile & touch support. They could have also
used Phaser ([http://phaser.io](http://phaser.io)) I bet... that has WebGL
support for even faster rendering on supported devices.

~~~
madlee
Hi, I wrote the majority of that part of the project (canvas stuff) & that
section of the article – the simple answer is that I have a lot of experience
working with the canvas API directly, but little to no experience using any of
the popular JS game engines out there (I played around with Phaser years ago,
but not very much). I don't think it would've saved me any time to be honest.

~~~
subkamran
That's totally fair, I get the sentiment. Great job nonetheless!

