Scanning the topics here, what seems really interesting is how basic data structures are, naturally, containers for data, but these advanced data structures are more like "mix-ins" or effects, qualities that can be added to a container. Time travel most obviously, but also the various optimizing tree containers.
Not sure how to make this an interesting question. But it's interesting. How many effects can we add to a given structure? Can we make these additions automatically? Can we combine all of these or are some incompatible with each other? What are the cost-benefits between these?
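To make the "time travel" effect concrete: full persistence can be retrofitted onto even a plain linked list via path copying, where every update returns a new version and old versions remain readable. A minimal sketch (my own illustration, not from the course):

```python
# "Time travel" added to a plain singly linked list via path copying:
# every update returns a new version; all old versions stay readable.

class Node:
    __slots__ = ("value", "next")
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def push(head, value):
    # O(1): the old version 'head' is untouched.
    return Node(value, head)

def set_at(head, index, value):
    # Copy only the path from the head to the changed node.
    if index == 0:
        return Node(value, head.next)
    return Node(head.value, set_at(head.next, index - 1, value))

def to_list(head):
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out

v1 = push(push(push(None, 3), 2), 1)   # version 1: [1, 2, 3]
v2 = set_at(v1, 1, 99)                 # version 2: [1, 99, 3]
# v1 is still intact: both versions coexist.
```

The same idea, with more bookkeeping, underlies the persistent structures covered in the course; path copying is just the simplest way to see the "effect" being layered onto an ordinary container.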
I find it interesting that a common story of people who did big things is that nobody would have predicted they'd end up doing the things they did.
Reminds me of my own reflections on things that I have done that I never would have expected I'd be doing. As well as this recent article by the creator of Entrepreneur magazine.
"literally nobody, myself included, would have once predicted that I’d be running a magazine called Entrepreneur"
So, I suggest you just read everything you can, getting the gist of things so if you encounter something similar you know where to look.
I also suggest looking for little tricks you can apply again and again. For example, some of the data structures in that course use a trick where you break the structure into a top part, using one algorithm, and a bunch of leaf parts, each of size maybe O(log n), that are handled using another algorithm. Often this combination does better than either algorithm by itself.
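As one illustration of that trick (my own example, not one from the course): keep a small "top" summary per block and fall back to brute force inside each block. Here the block size is about log2(n), as described above, though sqrt(n) blocks are another common choice:

```python
# Indirection trick: split the array into small blocks, keep a "top"
# summary (here, a per-block sum), and brute-force inside a block.

import math

class BlockedRangeSum:
    def __init__(self, data):
        n = max(len(data), 2)
        self.block = max(1, int(math.log2(n)))   # leaf size ~ log n
        self.data = list(data)
        self.tops = [sum(self.data[i:i + self.block])
                     for i in range(0, len(self.data), self.block)]

    def update(self, i, value):
        # O(1): adjust one leaf and one top entry.
        self.tops[i // self.block] += value - self.data[i]
        self.data[i] = value

    def range_sum(self, lo, hi):
        # Sum of data[lo:hi]: whole blocks via tops, edges brute-forced.
        total, i = 0, lo
        while i < hi:
            if i % self.block == 0 and i + self.block <= hi:
                total += self.tops[i // self.block]
                i += self.block
            else:
                total += self.data[i]
                i += 1
        return total
```

The top level answers whole-block queries in one step while the brute-force leaves keep updates cheap; the combination beats either approach applied to the whole array.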
1. Where can I find that algorithm you are talking about?
2. The algorithm you described seems a bit more complex than the algorithms taught at the undergrad level, so what are the prerequisites for understanding it that I should be aware of?
3. What books or papers would you suggest for reading?
4. How the hell do you remember something you read decades ago :)?
It turns out this can be done (to sufficient accuracy) in nearly linear time using something called the fast multipole method.
I wouldn't consider the "O(log n) chunks" trick all that complex, btw, although Tarjan did have to point it out to me when I should have used it.
Wow, you worked with Tarjan. How cool!!!
I only kept looking because I didn't think a useless site would make it to the top of HN. People who find the course site another way might not have a reason to keep looking.
"Smart data structures and dumb code works a lot better than the other way around." -- Guy Steele
"The Representation Principle: Once a problem is described using an appropriate representation, the problem is almost solved." -- P. H. Winston
"Show me your [code] and conceal your [data structures], and I shall continue to be mystified. Show me your [data structures], and I won't usually need your [code]; it'll be obvious." -- Fred Brooks
"I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships." -- Linus Torvalds
"Rule 5: Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming." -- Rob Pike
"What are tensors? The facts of the universe." -- Lillian Lieber
I found that the greatest barrier to deal with seemingly hard concepts was getting the right intuition on what "might" be happening under the hood. For me, this is absolutely crucial even if my intuition about a "thing" might not be completely correct.
As someone who didn't learn a lot of the basics in a classroom, I dive in head-first and read papers, a lot of papers, to help build this mental model. Even though I don't understand everything in them, it lets me pick up patterns for solving similar problems I might be having.
For the things you don't understand, you can always fill in those gaps in your mental model as you dig deeper into the abstraction cake and repeat the whole "intuition"-building loop.
I find the same applies to math and other domains that build on a bunch of primitives that have to be really well understood before you can reason at higher abstractions.
I feel software is no different... dig deep enough, there will be a hashmap of some sort.
My favorite so far is Skiena's "The Algorithm Design Manual". I especially love the computational geometry section.
Just genuinely curious because I love reading technical books but I can never finish them cover to cover.
If it's a technical book I probably read and enjoy it and forget most of it. If I see a similar problem it might jog my memory and then I can refer back. Only works for good books. It's a different experience than directly interacting with the examples by coding/working them out yourself.
Even that is probably overkill for a lot of people, practically speaking.
On the other hand, if you're learning for fun, you should study the thing you're most excited about because you will do the most interesting things with knowledge and skills that you're passionate about.
Ultimately, in my personal experience, the latter course has also led to career advancement.
From a practical perspective, the best sign is if you're spending a lot of time trying to wrangle performance or scalability and most of the bottleneck is inside the standard data structures.
E.g. if you're spending too much time or memory doing dictionary lookups, then it may be worth looking into Bloom filters. But if dictionary lookups are a trivial component of your overall performance, there are no real gains to be had from swapping in a fancier data structure.
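For anyone unfamiliar with the Bloom filter idea: it's a bit array plus k hash functions, and a membership test answers either "definitely not present" or "probably present" (false positives possible, false negatives impossible). A rough sketch, with sizes chosen arbitrarily for illustration:

```python
# Minimal Bloom filter sketch: a bit array plus k derived hash
# positions. A query can only err by saying "maybe present" for an
# item that was never added; it never misses an added item.

import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive k positions by salting one cryptographic hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))
```

The typical deployment matches the comment above: consult the filter first, and only pay for the expensive dictionary (or disk/network) lookup when it says "maybe".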
There's a follow-up point I'd like to make too. Skill with data structures is more about knowing the right questions to ask, rather than memorizing an encyclopedic list of a whole bunch of exotic structures. When you understand the major tradeoffs and design considerations underlying data structure design, then it's usually easy to find a solution in the literature even if you were completely unaware of it prior.
Effectively that means something like being able to translate between business requirements and academic language. "Oh, I need something that works on directed graphs, handles cycles, logarithmic in space, linear in average case time, resistant to adversaries, respects cache locality during lookup, and behaves deterministically."
The best way to learn this stuff is to read a bunch of famous papers in data structures. Again, the point isn't to memorize a zoo. It's to get comfortable with the already well-developed framework that previous researchers have used. It's very unlikely that your business problem is truly unique at a theoretical level, so most likely someone before you has already figured out the tradeoffs and considerations that you are facing.
I'd love to see a catalog of data structures that documented those characteristics; do you know of a good resource like that?
For me personally, the best resource has been simply to work through famous papers in the space. Blogs like the Morning Paper, centered around breaking down papers, are also helpful.
I mostly disagree. You won't need to implement them, but a working knowledge of the basics is the basis of any performance analysis and can make a huge difference in system performance in day to day programming. At some point all the small inefficiencies also add up.
I would go so far as to say that data structures and basic algorithms (e.g. Dijkstra) are more important for becoming a competent engineer than any of the practical tools he listed. It's like the difference between a mechanic and a mechanical engineer.
There is also a reason they are taught very early in a computer science degree.
To use your mechanic and mechanical engineer analogy: They're not even mechanics, they simply change the oil and light bulbs from time to time, along with cleaning the interior.
Not saying that it's not useful - you'd get a much better understanding of things...but this obsession with learning everything under the sun just to be "ready" for the job is overkill
Yes, the majority of work is on solved problems and we need far more mechanics than engineers. But it's not so clear-cut: a lot of actual software engineers work on solved problems most of the time and only spend a small fraction of it on the engineering part (which is probably similar to other engineering disciplines). But when you do spend that small amount of time on engineering work, these basics are important, and I've realized multiple times that I could make better decisions because of them. And I'm fairly certain that's the case for most engineers with an academic background or self-taught basics, even if they don't realize it.
In my experience knowing how to implement such things isn't very useful in daily practical programming. Knowing that they exist, what problems they solve, the pros and cons that is incredibly useful day to day. Especially if you can recognize when you've accidentally strayed into implementing one of them under the guise of solving a business problem. You can then apply known patterns or have the foresight to say "Yo this is a solved problem, let's find a library"
I have the same question, except I already have a partial answer (i.e. less interested in going after it, I sometimes do these things if I'm interested enough). I'm currently working my way up 6.006, but I wonder why I'd have to do advanced data structures. My guess is: I think I don't, because it's more of a research area than anything else.
I would need to take a quick look to see if there are some interesting data structures solving some interesting problems.
Personally, I'd be interested in time travel and memory hierarchy (caching) as topics. I think they might have some immediate practical utility in those areas.
I studied a lot of Intel's 64 and IA-32 Architectures Software Developer Manuals, particularly the first and third volumes, and have forgotten a lot of details of what I read. I should have waited until I was ready to invest time in writing an OS or a first-stage bootloader or something. The broader things were useful, but I learned a lot of details that I haven't used and have forgotten. I'm sure it's still useful in the sense that I can quickly pick it back up if I ever needed to, but still. Same for all those things I read from the OSDev Wiki.
EDIT: so is the coauthor! http://joebergeron.io/posts/post_four.html
Since this is meant to be a graduate level research course, I wonder if any cutting edge topics have been added since? Were there any major advances or new ideas in fundamental data structures in the past few years?
And then it lists five requirements...do I get the MIT course description equivalent of a Knuth check now?
Nearest I might have got was trying to use a bloom filter in a vehicle tracker, but nope.
What learning this stuff generally has given me is indirect: the ability to dig down and optimise because I understand systems. As I read up on e.g. sorting, I came to understand that quicksort is very much not always the quickest sort, and fixed bad performance bugs caused by its misuse.
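The classic misuse is easy to demonstrate (a toy of my own, not from the comment): with a naive first-element pivot, already-sorted input degrades quicksort to quadratic work, which is exactly the kind of bug that understanding the algorithm lets you spot:

```python
# Naive quicksort with a first-element pivot, instrumented to count
# partition comparisons. Already-sorted input hits the O(n^2) worst
# case; shuffled input stays near n log n.

import random

def quicksort(a, counter):
    # counter: one-element list accumulating partition comparisons
    if len(a) <= 1:
        return a
    pivot, rest = a[0], a[1:]          # naive first-element pivot
    counter[0] += len(rest)
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort(left, counter) + [pivot] + quicksort(right, counter)

def comparisons(data):
    counter = [0]
    assert quicksort(data, counter) == sorted(data)
    return counter[0]

n = 400
random.seed(0)
shuffled_cost = comparisons(random.sample(range(n), n))  # roughly n log n
sorted_cost = comparisons(list(range(n)))                # exactly n*(n-1)/2
```

Production sorts avoid this with randomized or median-of-three pivots (or switch to a different algorithm entirely, as Timsort does), which is why knowing the failure mode matters more than being able to reimplement the sort.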
Or understanding Btrees so I can get very good performance out of DBs, even if I've never written a Btree and frankly never want to.
Stuff like that. Maybe one day I'll find the need for dynamic programming and learn it properly, until then it's just a waste of time.
But use them directly, no. Surely sux.
Looks like a great course though!