
Topics in Advanced Data Structures [pdf] - htiek
http://web.stanford.edu/class/cs166/handouts/100%20Suggested%20Final%20Project%20Topics.pdf
======
jhayward
I was going to roll my eyes about stuff no one will ever hear of, much less
use in their day job, but there are some really relevant structures here.
Finger trees, cache-oblivious structures, R-trees, etc., just to name a couple
from a random page or two. The "why they're worth studying" summaries are
gold.

Thanks!

~~~
bhaavan
A doctor mostly has to treat the common cold and influenza. But that's no
excuse for her / him to not know the purpose of the left pulmonary vein. The
more basics they know, the better doctors they are. Sorry if this analogy is a
bit extreme, but I want to make a point.

~~~
jhayward
I think a more apt analogy would be if you would disqualify doctors who don't
know the nucleotide sequence of the genes that code for the cytochrome p450
enzyme process from treating your cold or flu.

The left pulmonary veins are gross anatomy, akin to knowing what an 'if'
statement is. Knowing the current state of the art in determining a minimum
complexity for an operation on an obscure tree implementation is deep domain
knowledge used by only a few specialists.

------
Insanity
The material seems to be for this course:
[http://web.stanford.edu/class/cs166/](http://web.stanford.edu/class/cs166/)

There are more slides and info on that site :)

~~~
copperx
It's a pity there are no video lectures. Are there lectures out there for a
similar algorithms class?

~~~
reubenmorais
MIT 6.851 Advanced Data Structures, Spring 2012:
[https://www.youtube.com/playlist?list=PLUl4u3cNGP61hsJNdULdu...](https://www.youtube.com/playlist?list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf)

~~~
jbn
I took that class that year, one of the best classes I ever took. A ton of
work, too.

------
amelius
> Traditional data structures assume a single-threaded execution model and
> break if multiple operations can be performed at once. (Just imagine how
> awful it would be if you tried to access a splay tree with multiple threads.)
> Can you design data structures that work safely in a parallel model – or,
> better yet, take maximum advantage of parallelism? In many cases, the
> answer is yes, but the data structures look nothing like their
> single-threaded counterparts.

Makes me wonder which data structures have "parallel" versions besides the two
mentioned.

~~~
dominotw
Would all of Scala's default (immutable) data structures fit the bill?

~~~
dominotw
I was just curious. Not sure why this is downvoted.

------
bogdanoff_2
This kind of stuff fascinates me. General software engineering seems
comparatively boring. I was considering going back to university to do a PhD
in CS, and this is making me realize that research in algorithms/data structures
would actually be viable (still lots of stuff to discover).

Does anyone here know what it is like doing research in these areas? Any
general advice?

~~~
xtracto
I've got a PhD in CS, and these structures fascinate me as well. The Bloomier
Filter reads amazing.

Unfortunately, my day-to-day job has almost nothing to do with these algorithms;
it is mainly software architecture and maintainability.

I think the majority of these algorithms are not really something you will
find yourself implementing in day-to-day work. And at most I can see myself using a
library with one of those. Even if you do research, it will have to be very
focused on data structures and algorithms to really get deep into some of
these.

Still, very enjoyable.

~~~
sterlind
Just today I finished implementing radix heaps in C# to optimize some of my
Dijkstras. When I googled I saw zero implementations in C#, so I may be the
first to write one. It worked well, giving me a 3x speedup in practice over a
d-ary heap and very low GC pressure, which is fortunate as my process was
exceeding 256GB of RAM.

Rolling your own data structures is pretty vital if you're working on
algorithms.
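
For anyone who hasn't met radix heaps, here is a minimal Python sketch of the idea. It is not the C# implementation described above. It assumes non-negative integer keys below 2^64 and that no pushed key is smaller than the last popped key, which Dijkstra with non-negative integer edge weights guarantees. The names `RadixHeap` and `dijkstra` are illustrative only.

```python
class RadixHeap:
    """Monotone min-priority queue for non-negative integer keys.

    Assumption: every pushed key is >= the most recently popped key.
    """

    def __init__(self, key_bits=64):
        # Bucket 0 holds keys equal to `last`; bucket i > 0 holds keys whose
        # highest bit differing from `last` is bit i - 1.
        self.buckets = [[] for _ in range(key_bits + 1)]
        self.last = 0
        self.size = 0

    def push(self, key, value):
        if key < self.last:
            raise ValueError("key is smaller than the last popped key")
        self.buckets[(key ^ self.last).bit_length()].append((key, value))
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty RadixHeap")
        if not self.buckets[0]:
            # The first non-empty bucket contains the overall minimum.
            i = next(i for i, b in enumerate(self.buckets) if b)
            bucket, self.buckets[i] = self.buckets[i], []
            self.last = min(key for key, _ in bucket)
            # Every redistributed item lands in a strictly lower bucket.
            for key, value in bucket:
                self.buckets[(key ^ self.last).bit_length()].append((key, value))
        self.size -= 1
        return self.buckets[0].pop()


def dijkstra(graph, source):
    """graph: dict mapping node -> list of (neighbor, non-negative int weight)."""
    dist = {source: 0}
    heap = RadixHeap()
    heap.push(0, source)
    while heap.size:
        d, u = heap.pop()
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heap.push(nd, v)
    return dist
```

Each item can only move to a lower-numbered bucket when it is redistributed, so it is moved at most `key_bits` times over its lifetime. That is where the cheap amortized cost per operation comes from.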

------
lame88
Does anyone in their work find that they are able to employ data structures
like this, and if so, what do you work on? I've almost always had to delegate
all my state to a database using default indexes, etc., which is productive,
yet a little disappointing, because I'm always applying my brain power instead
toward more mundane tasks.

~~~
petschge
I do plasma simulations and recently had the problem of finding the distance
to the nearest neighbor for each of the particles in the simulation. Doing
that naively is O(n^2) and took hours even for small test problems. Building
an R-tree once and using it for nearest-neighbor look-ups brought that down to
5 minutes.

libspatialindex lacks documentation, but worked really nicely. The rtree
interface in python is much friendlier.

~~~
arman_ashrafian
I’m taking Advanced Data Structures at UCSD right now and our first assignment
was making a K-D Tree and an efficient KNN Classifier. It was surprisingly
simple and the efficiency difference between the KD Tree and brute force
implementation was quite drastic.
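
To give a feel for how small such a tree can be, here is a rough Python sketch of a static k-d tree with a single-nearest-neighbor query, plus the brute-force baseline. It is not the course assignment. It assumes points are equal-length tuples of numbers and Python 3.8+ for `math.dist`, and the function names are made up for illustration.

```python
import math


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples (point, left, right); None is empty."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point splits the set
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, query, depth=0, best=None):
    """Return (distance, point) for the stored point closest to `query`."""
    if node is None:
        return best
    point, left, right = node
    d = math.dist(point, query)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, depth + 1, best)
    return best


def brute_force_nearest(points, query):
    """O(n) scan per query, used here as the baseline."""
    return min((math.dist(p, query), p) for p in points)
```

The pruning test in `nearest` is what lets a query skip whole subtrees, while the brute-force version has to look at every point each time.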

If you only build the tree once and do no insertions what is the benefit of an
R-Tree vs KDTree?

~~~
petschge
I actually do plan to update the tree as I insert additional particles in
locations where the distance to the nearest neighbor is large.

------
sidcool
I love learning Algorithms and Data Structures. The issue is that I don't get
to use these frequently, not even basic DS. Most of what I need exists in the
language or some framework, and if I am to implement it from scratch, I am
sure I will do worse. The only time I really use this knowledge is during
interviews.

~~~
collyw
I (did) feel exactly the same. Now after working for 17 years I realize that I
barely ever use this stuff, and when asked this stuff in an interview I do a
lot worse than I did straight out of university. I am, however, a much better
software engineer than I was then.

------
collinmanderson
I'm surprised no one mentioned the "Crazy Good Chocolate Pop Tarts" algorithm.
That one took me by surprise :)

[https://www.researchgate.net/publication/51952511_De-amortiz...](https://www.researchgate.net/publication/51952511_De-amortizing_Binary_Search_Trees)

Hilarious

------
lichtenberger
Hm, where else are Lowest Common Ancestors in tree structures useful? Storing
ORDPATH/DeweyIDs, for instance, lets you simply check for a common prefix (they
are hierarchical node labels). I think maybe for locking in a storage system
with multiple read/write transactions, or to determine which of two given nodes
is the first one in preorder, which is useful for XQuery/XPath processing (to
determine the so-called document order). Can anyone think of other usages? Or
for having hierarchical node labels in trees?
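
A small Python sketch of the common-prefix idea, assuming plain Dewey labels stored as tuples of integers (so `(1, 3, 2)` stands for node 1.3.2). It ignores the extra encoding ORDPATH uses to allow inserts without relabeling, and the function names are hypothetical.

```python
def lowest_common_ancestor(a, b):
    """The LCA of two Dewey labels is their longest common prefix."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def is_ancestor(a, b):
    """True if the node labeled `a` is a proper ancestor of the node labeled `b`."""
    return len(a) < len(b) and b[:len(a)] == a


def precedes_in_document_order(a, b):
    """Preorder test: tuples compare lexicographically and a prefix sorts first."""
    return a < b
```

With labels like these, no tree traversal is needed: ancestry, LCA and document order all fall out of comparing the labels themselves.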

~~~
htiek
You can use lowest common ancestor queries in conjunction with suffix trees to
solve a lot of interesting string problems. For example, take two indices
within a string, find their corresponding suffixes in the suffix tree, and
then take their LCA. That gives you an internal node corresponding to the
longest string that appears starting at both indices (this is called their
"longest common extension.") You can use this as a subroutine in a bunch of
genomics applications.
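
As a toy illustration of that query in Python: the sketch below builds an uncompressed suffix trie (quadratic size) in place of a real suffix tree, and finds the LCA by walking parent pointers rather than with a constant-time LCA structure. It assumes the character `$` does not occur in the text and that the two indices differ. All names are illustrative.

```python
class Node:
    def __init__(self, parent=None, depth=0):
        self.parent = parent
        self.depth = depth      # number of characters on the path from the root
        self.children = {}


def build_suffix_trie(text):
    """Insert every suffix of text + '$'; leaves[i] is the leaf for suffix i."""
    text += "$"                 # terminator makes every suffix end at its own leaf
    root = Node()
    leaves = []
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            child = node.children.get(ch)
            if child is None:
                child = node.children[ch] = Node(node, node.depth + 1)
            node = child
        leaves.append(node)
    return root, leaves


def lca(u, v):
    """Naive LCA: lift the deeper node, then climb both in lockstep."""
    while u.depth > v.depth:
        u = u.parent
    while v.depth > u.depth:
        v = v.parent
    while u is not v:
        u, v = u.parent, v.parent
    return u


def longest_common_extension(leaves, i, j):
    """Length of the longest string starting at both index i and index j (i != j)."""
    return lca(leaves[i], leaves[j]).depth
```

For example, in "abcabcab" the suffixes starting at 0 and 3 share "abcab", so the LCA of their leaves sits at depth 5. A compressed suffix tree with a proper LCA structure answers the same query in constant time after linear preprocessing.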

~~~
lichtenberger
Thanks, great use case and I have to say I have to read about genomics... :-)

------
bondant
Does anyone know of books that explore in depth recent developments in advanced
data structures (or just not-well-known advanced data structures)?

------
beeforpork
The list is great, but unfortunately has no pointers to documentation. I find
nothing online for some of the topics.

Where can I find the description and discussion of 'ravel trees'?

~~~
nestorD
They also caught my eye. I finally found a paper here (the correct
name seems to be RAVL tree):
[http://sidsen.azurewebsites.net/papers/ravl-trees-journal.pd...](http://sidsen.azurewebsites.net/papers/ravl-trees-journal.pdf)

~~~
beeforpork
Super, thanks for being better at searching! (The name makes more sense this
way.) :-)

------
winrid
This is awesome, thank you!

Used nested R-Trees in a personal project recently (sharded in-memory spatial
DB for a game), which is not something I thought I'd ever have to do.

------
vaibhavsagar
My favourite somewhat obscure data structure that I didn't see on the list:
Hash Array Mapped Tries.

------
criddell
What font are they using? I find it very unpleasant to read on a 4k screen set
to 125% scale. Or maybe it's Firefox.

~~~
criddell
I loaded the page on Chrome and the text is definitely heavier (and more
readable) but also less sharp.

------
hasahmed
Where can I learn more about ravel trees?

~~~
sus_007
From one of the comments on this thread.
[http://sidsen.azurewebsites.net/papers/ravl-trees-journal.pd...](http://sidsen.azurewebsites.net/papers/ravl-trees-journal.pdf)

------
mountainofdeath
Great! More fodder for interview questions /s.

~~~
twoquestions
That's probably it; any use of these algorithms in business software would
need to justify the extra training your fresh-out-of-school
replacement would need to maintain it.

------
pizza
This is quite cool, thanks.

------
panbabybaby
nice sharing !!

------
maimeowmeow
Please provide the solutions in git repo. Thanks.

~~~
inetsee
Paraphrasing (almost every) math book: "Solutions are left as an exercise for
the reader."

