

Creating multi-game highscore lists in ArangoDB - solvr
https://www.arangodb.com/2015/04/creating-multi-game-highscore-lists/

======
onion2k
_So let’s pick the option that stores highscores and screen names in separate
places, and brings them together only when needed in a leaderboard query._

Is that actually a good idea though? Leaderboard queries are likely to be the
most common thing you do with the data. You'd save a join by storing the
username with the score data (as you'd never want one without the other), and
you wouldn't save much space by separating them, given that usernames are
only a few tens of bytes each.

~~~
ifcologne
I think a more useful example would be adding further information from the
user's profile to a leaderboard.

That could be the name in a particular game or other data from the user's
profile. Or - limit the leaderboard by filtering on age, region or whatever.
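
A minimal sketch of the kind of filtered leaderboard described above, using Python's built-in `sqlite3` rather than ArangoDB. The table layout, column names, and the `regional_leaderboard` helper are assumptions made for illustration and do not come from the article:

```python
import sqlite3

# Hypothetical schema: profile data lives in users, scores reference it by id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, region TEXT, age INTEGER);
    CREATE TABLE scores (user_id INTEGER, game_id TEXT, score INTEGER);
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [(1, "alice", "EU", 30), (2, "bob", "US", 25), (3, "carol", "EU", 41)],
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, "tetris", 900), (2, "tetris", 1200), (3, "tetris", 1500)],
)

def regional_leaderboard(conn, game_id, region, limit=10):
    """Return (name, score) pairs for one game, restricted to one region."""
    return conn.execute(
        "SELECT u.name, s.score "
        "FROM scores s JOIN users u ON u.id = s.user_id "
        "WHERE s.game_id = ? AND u.region = ? "
        "ORDER BY s.score DESC LIMIT ?",
        (game_id, region, limit),
    ).fetchall()

eu_leaders = regional_leaderboard(conn, "tetris", "EU")
```

Here the join earns its keep: the filter column exists only in the profile table, so it cannot be avoided by copying the screen name into the score row.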

~~~
onion2k
Assuming usernames are unique, you could do that just by using the username as
the key to join the score table to the user table on. Everything would be the
same as the example in the article, but without the join needed when you're
just getting the names with their scores.

Optimising a database for storage is really only a good idea when the
additional redundant data is going to take up _huge_ amounts of space. Storage
is cheap. Making a query run faster isn't. Generally speaking you're better
off optimising to reduce the number of joins and subqueries because they're
the things that usually slow you down.
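
As a rough sketch of that layout, assuming unique usernames: the score row carries the username itself, so the plain leaderboard needs no join, while the same column can still be joined to a user table when profile data is wanted. This uses Python's `sqlite3` for illustration; the schema and the `top_scores` helper are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The username is stored with each score and doubles as the key into users.
conn.executescript("""
    CREATE TABLE users (username TEXT PRIMARY KEY, region TEXT);
    CREATE TABLE scores (game_id TEXT, username TEXT, score INTEGER);
    CREATE INDEX idx_scores_game ON scores (game_id, score);
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [("alice", "EU"), ("bob", "US")]
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("tetris", "alice", 900), ("tetris", "bob", 1200), ("pong", "alice", 15)],
)

def top_scores(conn, game_id, limit=10):
    """Return (username, score) pairs for one game, highest first, without a join."""
    return conn.execute(
        "SELECT username, score FROM scores "
        "WHERE game_id = ? ORDER BY score DESC LIMIT ?",
        (game_id, limit),
    ).fetchall()

leaders = top_scores(conn, "tetris")
```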

~~~
lobster_johnson
That would cause problems when users are renamed. You want foreign keys (which
Arango doesn't have?) and transactions, and you also want to make sure that
every name change updates all foreign tables. Changing it in one place
eliminates all possible mistakes, which is why surrogate keys were invented.

The way I like to approach performance issues like these is to cache or
precompute denormalized data as needed, and make sure the master data is as
clean and normalized as possible. You could have a materialized view, or a
trigger that produces the denormalized version, or just store the data in
Memcached.
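
One way this could look, sketched with Python's `sqlite3`: the master data is normalized around a surrogate `id`, and a precomputed leaderboard is kept in a cache that is rebuilt after a rename. The plain dict standing in for Memcached, the schema, and both helper functions are assumptions for illustration only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scores (
        user_id INTEGER NOT NULL REFERENCES users(id),
        game_id TEXT NOT NULL,
        score INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, "tetris", 900), (2, "tetris", 1200)],
)

cache = {}  # stand-in for Memcached: game_id -> list of (name, score)

def rebuild_leaderboard(game_id, limit=10):
    """Precompute the denormalized leaderboard from the normalized tables."""
    rows = conn.execute(
        "SELECT u.name, s.score "
        "FROM scores s JOIN users u ON u.id = s.user_id "
        "WHERE s.game_id = ? ORDER BY s.score DESC LIMIT ?",
        (game_id, limit),
    ).fetchall()
    cache[game_id] = rows
    return rows

def rename_user(user_id, new_name):
    """Change the name in exactly one place, then refresh every cached board."""
    conn.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    for game_id in list(cache):
        rebuild_leaderboard(game_id)

rebuild_leaderboard("tetris")
rename_user(2, "robert")
```

Reads hit the cache and never pay for the join; the rename touches a single row, and the denormalized copy is regenerated rather than patched.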

------
ritonlajoie
A little bit off topic: we are trying to use Redis in my office as a
high-speed in-memory k/v cache, with on-disk saving and reloading
capabilities.

Problem is, we must stick with the MS Open Tech versions since we are running
in a Windows environment.

It appears the 'fork' hack those guys made is not very stable. Every once
in a while, the fork process fails for some unknown reason, and we are
actually trying to fix the code ourselves to understand why it fails.

Since we have spent a lot of time on it, maybe it's time to move on, as this
'stable' release from MS Open Tech appears to not be very stable at all.

What would people recommend? Is there any production-ready solution that
exists, with Windows stability in mind?

~~~
virmundi
If you're able to use 64-bit Windows, ArangoDB can do this. You can shard
and replicate it. Documents can be simple k-v stores.

------
Fiahil
If, like me, you got lost in the overwhelming universe of available storage
technologies, I found this a bit useful (but incomplete):
[http://nosql-database.org/](http://nosql-database.org/)

~~~
bjerun
You can also check out
[http://db-engines.com/en/ranking/graph+dbms](http://db-engines.com/en/ranking/graph+dbms)
which gives a ranking of various technologies (graph, document, search).

------
anon4
I evaluated Arango for a project a couple of years ago. In the end we had to
drop it, due to

1) Keeping all data in memory -- we didn't have enough RAM to keep everything
in memory

2) Extremely slow startup time -- it was basically just reading the log of
statements and loading things in memory

If those two are fixed now, Arango might be worth a serious look.

~~~
irickt
From the FAQ: [https://www.arangodb.com/faq/](https://www.arangodb.com/faq/)

"A database in ArangoDB can be larger than the available main memory. Although
ArangoDB stores all data on disk for durability reasons, it is a “mostly main
memory database”. This means that the working set (the set of pages that are
frequently accessed) should fit into main memory. It’s left to the operating
system to determine the working set and to transfer pages between main memory
and secondary storage. The data that are currently not needed are kept only on
secondary storage.

"This is in principle also true for indexes: a part of the index (the working
set) could be in main memory and the remainder could be on secondary storage
if not used frequently."

You may also be interested in the roadmap:
[https://www.arangodb.com/roadmap/](https://www.arangodb.com/roadmap/)

~~~
assface
> It’s left to the operating system to determine the working set and to
> transfer pages between main memory and secondary storage

Sigh. Good ol' mmap. They should ask MongoDB how that went for them. Then they
should read Stonebraker's 1981 paper:

[http://db.cs.berkeley.edu/cs286/papers/os-cacm1981.pdf](http://db.cs.berkeley.edu/cs286/papers/os-cacm1981.pdf)

People never learn.

------
fsk
Or you could use a relational database and do:

    SELECT * FROM scores
    WHERE game_id = :game_id
    ORDER BY score DESC

