My background is chemistry/chemical engineering. I applied for a data scientist position, and the phone interview included a problem where I was asked about my solution's complexity. I admitted I didn't know anything about complexity analysis.
Still got called back for an interview on site, but the weekend before I powered through this course. Unsurprisingly, it came up in the on-site and they were really pleased I had learnt about it. I got the job.
I also found it useful to implement all the algorithms in Python.
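For example (not one from the course specifically, just the sort of exercise I mean), binary search is only a few lines of Python:

    def binary_search(items, target):
        # Returns the index of target in a sorted list, or -1 if absent.
        lo, hi = 0, len(items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if items[mid] == target:
                return mid
            if items[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1

    print(binary_search([1, 3, 5, 7, 9], 7))  # 3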
Done well it looks amazing, elegant, and efficient, but in the wrong hands you'll lose your hands.
"Enough Rope to Shoot Yourself in the Foot: Rules for C and C++ Programming" 
It's also perhaps the best mixed metaphor I've ever encountered.
While Rust might not become a replacement for the Python layer, it may replace the C/C++/Fortran layer with all their speed and low-level optimization, yet provide good (and especially safe!) abstractions on top of that.
Currently, people try to use C++ to fill that gap, but I'd love to see Rust's type system, borrow checker, and macro system used instead of C++ templates.
As opposed to Mathematica, MATLAB, etc.
Python works well when you have to convey your algorithm in a nice succinct functional manner, but that's optimizing for cuteness, not understanding.
I always say that Python is like the English of programming languages: it's an amalgamation of other languages, programming paradigms and a rich set of libraries, and often serves as a 'Common Tongue' to glue disparate processes together. There is an inherent risk to that, though. Because it's so easy to transition between object/procedural/functional modes, because it's so easy to tie together so many different interfaces, because it's so easy to `import everything`... it is often too easy to avoid a proper separation of concerns. The flexibility comes with a disincentive towards discipline.
All of the pedantry of systems-level language performance and practices aside, even the context switch between 'algorithm language' and 'application language' promotes better discipline in both areas.
where your program doesn't even compile if you don't give it the right invariant.
In my opinion, an understanding of data structures is _much_ more useful for a data scientist than algorithms.
Why should data scientists know about algorithms? Because data scientists are typically interviewed by computer scientists/software engineers, and that's what they tend to ask.
I recently conducted many phone and on site interviews for a data scientist position. I didn't ask about algorithms.
In my experience, building performant web applications is much more about things like reducing bundle size, making sure animations are hardware-accelerated, being smart about _when_ you do complex work... The cost of using an O(n^2) algorithm over an O(n) algorithm will rarely have a tangible impact, since you don't typically deal with item sets that large on the client.
Not debating that CS fundamentals are valuable, just that they are _far_ from the most important skillset to have. Give me a JS hacker who understands page load times over a CS grad who writes his own bucket sort algorithm any day.
Apologies if I took it the wrong way!
I say this as a self-taught programmer who studied a non-CS engineering discipline well after learning about big-O.
The databases implement the binary search, not the back-end dev.
For the average programmer, IMHO learning about data structures/algorithms makes you a better programmer, but it's not that essential.
And the front end is used for some pretty crazy things these days, e.g. 3D rendering. Sure, if you're making a social network for cats it probably isn't an issue, but then not everyone is.
When your UI is based on 25 megabytes of lists and maps, you do a lot of iterating and filtering. In our current implementation, it's possible, but difficult, to produce a case where it spends as much as a quarter of a second doing that in one go. But only the first time; if it takes that long (anything over 100ms), we memoize the call so the next time you get it for free. Binary search hasn't yet been required, but it's on our short list of performance improvements to apply once the backing data grows enough to require them.
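The memoization trick itself is simple; sketched in Python for brevity (the function here is a made-up stand-in, not our actual code):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def expensive_filter(threshold):
        # Stands in for the real iterate-and-filter pass over the backing data:
        # slow the first time, free on every repeat call with the same argument.
        return sum(x for x in range(10_000_000) if x % threshold == 0)

    expensive_filter(7)  # slow: does the full pass
    expensive_filter(7)  # instant: served from the cache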
Another project from last year was a tool to consume very large (250-300 MB) XML dumps from a very large, very expensive line-of-business system used by our finance department, and produce a wide variety of aggregations to simplify verifying the output of said very large, very expensive system, which I gather may not always be able to perform arithmetic correctly.
This one wasn't a team effort; it was one of those things where you get the spec on Friday afternoon and the deadline is Monday morning. (Not something I'd tolerate on a regular basis, but when the stakes are "it's this, or the SVP Finance sends a helicopter to retrieve the VP IT from a cruise liner"...) But it's also a frontend-only application, because installers take time to get right, too, and in any case something so simple has no need for a backend.
It takes about three minutes to fully parse, process, and report on a 250 MB XML dump. It leaks no resources, does not hang the UI thread, and presents a cute Bootstrap progress bar along with a table of running totals, so the user knows what it's doing. (Not gonna lie, I was showing off a little with the table. It's fun to watch numbers blur as they spin upwards in value!)
I concede that this tool doesn't do much work with long lists - mainly just the values of interest from the raw dump, which are several lists of maximum cardinality on the order of ten thousand, and the negligibly short lists used to maintain parser state information. But I do gather a certain impression that you're one of those sadly behind the times folks who still thinks of the browser/JS platform as a cute toy and a decent document reader, but not up to anything remotely resembling Real Computational Work, and I thought I'd include this example, as well as the other, in order to help disabuse you of that rather outdated notion.
But yeah, that's admittedly a tenuous thread.
The idea of "Software Engineering" was born with the Software Crisis report in the 1960s.
Later on, practicing SEs would study things like Design Patterns, so things evolved over time.
EDIT: Software Engineers probably should know some relevant basics, but programmers could never come up with a "body of knowledge" like real engineers have.
It all depends what you work on and what new developments keep coming out...
All the really interesting jobs require knowing data structures, algorithms and discrete mathematics well.
It also helps that Robert Sedgewick has been in compsci forever (he got his PhD in 1975) and is one of the subject matter experts in algorithms.
Robert Sedgewick also has fantastic algorithms courses on Coursera.
Am I to understand that it is "just" recursion with caching?
I do recommend reading this chapter:
After reading this chapter, you can try to understand why Dijkstra and Floyd-Warshall shortest path in graph algorithms work.
These classical and fundamental algorithms combine graph theory with dynamic programming.
"just caching" is also hard to apply to e.g. the tower of hanoi problem.
Spotting where it can be successfully applied is the hard part.
Recursion isn't required; iterative DP solutions are a thing too.
Maybe you could explain better...?
> Then why did recursion work so well with divide-and-conquer? The key point is that in divide-and-conquer, a problem is expressed in terms of subproblems that are substantially smaller, say half the size. For instance, mergesort sorts an array of size n by recursively sorting two subarrays of size n/2. Because of this sharp drop in problem size, the full recursion tree has only logarithmic depth and a polynomial number of nodes.

> In contrast, in a typical dynamic programming formulation, a problem is reduced to subproblems that are only slightly smaller—for instance, L(j) relies on L(j − 1).
I think I got it. The distinction isn't that clear-cut anyway. For example, I could implement a mergesort with caching and say that I'm technically doing dynamic programming, even though the cache would get hit exactly zero times. It would sort of be "trivially" dynamic programming. On the other hand, the first time I calculated Fibonacci numbers recursively, I added a cache. This would be "memoization". In other words, the Wikipedia article is correct, but some problems don't benefit from the caching.
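To make that concrete, here's the Fibonacci example in Python (a quick sketch of all three flavors):

    from functools import lru_cache

    def fib_naive(n):
        # Plain recursion: recomputes subproblems, exponential time.
        return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

    @lru_cache(maxsize=None)
    def fib_memo(n):
        # Memoization: the same recursion plus a cache, linear time.
        return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

    def fib_dp(n):
        # Bottom-up dynamic programming: iterative, no recursion at all.
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

    print(fib_naive(20), fib_memo(20), fib_dp(20))  # all 6765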
Thanks for the references.
Roughgarden's class is advanced and expects mathematical maturity. You may find his course quite fast and rough if you are a beginner.
Sedgewick's class is much easier. He is a bit boring and tries to use "real life" examples (in some instances) from the physical sciences to make the material relatable. This, in my opinion, detracts from the material. Also, he doesn't always fully explain where he got some of the big-Os here and there.
My advice? Follow MIT's OCW course (it uses CLRS). Supplement it with Algorithms Unlocked, the Khan Academy link in the OP, and CLRS. If you use those 4 resources and put in the work, you'll understand the material.
All 4 sources have Thomas Cormen's touch to them (he is the C in CLRS), so you'll find them consistent as you read from one source to the other. After reading/hearing the same thing about 4 different times in 4 different ways, it'll begin to click.
Order of easiness is probably Khan Academy > Algorithms Unlocked > MIT Algorithms Course > CLRS.
Algorithms Unlocked is like "pre-CLRS" and Khan Academy's version is the TL;DR version of Algorithms Unlocked.
Hope this helps.
Below are the links,
If you have to pick only one, go with the MIT OCW, and snag a copy of CLRS. I got mine from my local library, and it gave me more than enough time to work through the problem sets from the MOOC.
I find there are a lot of online resources for those looking to learn algorithms and data structures but I've had trouble finding the same breadth and depth of resources surrounding the math behind CS (discrete math, probability, etc.). Any suggestions?
Where does Roughgarden's course fit in to this?
- 6.00.1x https://www.edx.org/course/introduction-computer-science-mit... (started)
- 6.00.2x https://www.edx.org/course/introduction-computational-thinki... (March)
All are good and are pitched at various levels of complexity. The Princeton course uses Java. Okay if you're into that sort of language/thinking. MIT is using Python. I found one using Lisp:
"Systematic Program Design" ~ https://www.edx.org/xseries/how-code-systematic-program-desi...
Algorithms unlocked: https://www.amazon.com/Algorithms-Unlocked-Press-Thomas-Corm...
Both are by the same author as the one in this article (Thomas Cormen).
I think a much better and free alternative is:
PS: I'm an experienced programmer (Perl).
It's nearly a third of the length of CLRS, and half that of Sedgewick. Much more precise, yet it offers more in that it talks about common problem-solving use cases for data structures and algorithms, rather than going through the theoretical proofs behind them.
Pair this up with this excellent lecture by the authors Sedgewick and Wayne:
Which is great, except it takes you to the main list of subject matters, and algorithms isn't in there.
So I'm not able to view this link on my iPad unless I uninstall KA?
One of the authors, Professor Balkcom, is our advisor as well.
And I'm kind of smirking right now, because asymptotic notation got butchered again.
I've spent the good part of this semester trying to get my head around a very formal, very dense script for my own algorithms course. And I finally cracked asymptotic notation.
Maybe I'm just dense. But if that's the case, I'm sharing a classroom with others who are equally dense.
We dealt with all 5 classes: big O, little o, Theta, big Omega, and little omega. We're required to always give the "most exact" classification for best/avg/worst, including "does not get as fast as" or "does not get as slow as".
I'm willing to write a "freshman friendly" write up if someone's willing to post it or use it. I'm shit at self publishing.
The follow-up to that is understanding what you're counting and why, i.e. branches vs. statements vs. dereferences vs. logical I/Os vs. physical I/Os...
Not just the everyday examples of what constitutes an algorithm, but the voice, presentation, etc.
However, once you understand the iterative version, it's probably easier to understand how the recursion is actually working.
The general idea is that something takes O(f(n)) time if it takes at most C·f(n) time for some constant C and all but finitely many values of n. The 'all but finitely many values' is what makes this definition 'asymptotic'. Basically, 'O(f(n))' ignores constant factors and the behaviour at 'small' n (i.e. small inputs); the reasoning behind this is that, roughly speaking, an algorithm in O(f(n)) is faster than any algorithm not in O(f(n)) provided you make the input big enough.
The little o, big Omega, big Theta are just small variations on this, which won't be too hard to understand if you get the general concept, and really the distinction isn't too important usually, just know that O(f(n)) gives an upper bound, not necessarily the best possible upper bound. To understand the big-O notation better it might help to have some basic knowledge of limits.
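For reference, the formal version of that definition (my own restatement, same content as above):

    T(n) \in O(f(n)) \iff \exists\, C > 0,\ n_0 : T(n) \le C \cdot f(n) \text{ for all } n \ge n_0
    T(n) \in \Omega(f(n)) \iff \exists\, C > 0,\ n_0 : T(n) \ge C \cdot f(n) \text{ for all } n \ge n_0
    T(n) \in \Theta(f(n)) \iff T(n) \in O(f(n)) \text{ and } T(n) \in \Omega(f(n))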
As someone with a Computer Science degree, I can say that the only times I have ever used any of the algorithms I learned directly was when writing low-level C and Go. I'm willing to bet 90% of programmers, even those that write "advanced" programs, do not use them day to day.
Every algorithm and data structure worth anything was abstracted out into easy-to-use libraries years ago.
Ditto mathematics past roughly Algebra 1. Every now and then I try to brush up on calculus or linear algebra or something, but it takes so much time to keep those skills up when you're not using them at work. Like keeping up foreign language skills while living somewhere you rarely encounter native speakers.
I'm not talking about implementing my own sorting algorithms every time I need to sort things; I'm talking about recognizing that a particular problem can be efficiently represented by data structure X and solved with algorithm Y. Hopefully I'm writing none of this code myself, but a programmer who doesn't in theory know how it all works is never going to be able to compose pre-written code into an optimal solution. I know, because I've been that programmer.
I don't know what you mean by "advanced" programs. I guess I fall within your 10% by definition, since I can anecdotally say that having and using this knowledge has made a big difference for me. I would say that I use this stuff 5% or less of my time, but for that 5% it can make the difference between run time of under a second vs. multiple days, and between a brittle, defective solution and a rock solid one.
Not to mention that it's extremely handy for getting hired in the first place, and it's just plain fun.
Hardly a day goes by that I don't evaluate what kind or depth of loop (if any at all!) is needed to generate an output.
Knowing that a problem at hand requires a certain solution is important.
If the reasonable defaults aren't good enough... you're in the 10%.
Sure, in some situations it might be obvious, but maybe not obvious for others.
And things that are immediately, blindingly obvious to a 5 year experience programmer may not be obvious to a newbie.
Ex: imagine if you didn't know what a hashtable or a dictionary was and just used single variables for everything.
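A contrived Python sketch of that difference:

    # Without a dict: one variable per key; fine for two, hopeless for two thousand.
    count_apple = 2
    count_banana = 5

    # With a dict: scales to any number of keys, with O(1) average-case lookup.
    counts = {'apple': 2, 'banana': 5}
    print(counts.get('banana', 0))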
For that matter, the choice of sorting algorithm can have security implications. QuickSort on user-facing data can easily become a DoS vulnerability.
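To illustrate (a sketch assuming a naive first-element pivot; real attacks target specific library implementations): an already-sorted input drives this quicksort to its O(n^2) worst case.

    import sys
    sys.setrecursionlimit(10_000)

    def quicksort(xs):
        # Naive first-element pivot: sorted input gives maximally unbalanced
        # partitions, so O(n^2) comparisons and O(n) recursion depth.
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        return (quicksort([x for x in rest if x < pivot])
                + [pivot]
                + quicksort([x for x in rest if x >= pivot]))

    quicksort(list(range(5000)))  # ~12.5 million comparisons, 5000-deep recursion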
Each "tool" you learn inside and out allows you to build something new. The better you learn them, the more you can build.
Can't help but think of it in literal lego terms.
    mylist = ['a', 'b', 'c', 'd']
    if 'd' in mylist:
        print('found')  # membership test on a list is O(n): it scans element by element
Now, how does the performance compare when using a set data structure?
    myset = {'a', 'b', 'c', 'd'}  # braces, not parentheses: parentheses would make a tuple
    if 'd' in myset:
        print('found')  # membership test on a set is O(1) on average: a hash lookup
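A quick empirical check (a sketch using timeit; exact numbers will vary by machine):

    import timeit

    data = list(range(100_000))
    as_list, as_set = data, set(data)

    # Looking up the worst-case element: the list scans all 100,000 entries,
    # the set does a single hash lookup.
    print(timeit.timeit(lambda: 99_999 in as_list, number=100))
    print(timeit.timeit(lambda: 99_999 in as_set, number=100))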
So the answer is "you will become good at tech interviews and be able to get a job"
In all seriousness, algorithms and data structures are essential to computer science theory. When trying to get a job as a software engineer, technical interviews will mainly be problems about algos and data structures.
Algorithms are recipes or instructions.
Usually it's learning algorithmic techniques for solving various computational problems and implementing algorithmic coding problems in a programming language.
Hope that helps.
Early in my career, actually before my career, I taught myself the available language on our department's mini-computer (don't remember the language or the computer), so I could automate parts of my and my colleagues' job.
I had to sort something, and I innocently "invented" bubble sort to do it. The system administrator visited me the day I first ran it, and told me to stop doing that.
Soon after, I quit and went to school. :)
Also, to add some signal to my noise: Tom Cormen is apparently one of the designers of the course. He wrote the gigantic encyclopedia of algorithms, as well as a shorter, more entry-level book on the subject. You'd be in good hands learning the topic.
And the topic is one that teaches you how to reason about your programs in a way that works across languages and disciplines.
A lot of people will say to learn this because it's the "core" of programming, but in industry practice I've seen pure functions (functions that return the same result when the same parameters are passed in) becoming more commonplace than trickier functions that mutate a variable differently during each iteration, which makes program state harder to reason about. Algorithmic problem solutions usually involve non-pure functions that are not intuitive until you've seen them before. Front-end UI is better built from several pure functions.
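A hypothetical Python illustration of the contrast (made-up names, not from any codebase):

    # Mutating style: each call changes the caller's data in place.
    def apply_discount_mutating(cart, pct):
        for item in cart:
            item['price'] *= (1 - pct)

    # Pure style: same input always gives the same output; the caller's
    # data is left untouched, which makes state easier to reason about.
    def apply_discount_pure(cart, pct):
        return [{**item, 'price': item['price'] * (1 - pct)} for item in cart]

    cart = [{'name': 'book', 'price': 10.0}]
    discounted = apply_discount_pure(cart, 0.2)  # cart itself is unchanged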
When working with algorithms, you're trying to solve problems efficiently. Your programs should be fast, and the wait for a solution should be short.
I'm making an explicit claim: Python is no better than any other language for implementing algorithms. HN please prove me wrong in an objective way so we may all learn?
It will, and this person chose Python.
> HN please prove me wrong in an objective way so we may all learn?
no one cares about the choice of language. This is about learning algorithms.
Beyond that, if your interviewer is fluent in python, list comprehensions can greatly shorten the amount of boilerplate you have to write.
I'll definitely second the list comprehension point, though. Between that and pleasant string support, a lot of standard interview answers are maybe 50% as long in python as Java. Not easier, necessarily, but faster to write and cleaner to read.
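For example, with any interview-ish task (here, squaring the even numbers):

    # Loop version: four lines of boilerplate.
    squares = []
    for n in range(20):
        if n % 2 == 0:
            squares.append(n * n)

    # Comprehension: the same thing in one readable line.
    squares = [n * n for n in range(20) if n % 2 == 0]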
Even if you aren't using python, it will give you more room to work (the other part of this is divide the board before you start).
In my case, I'm most familiar with Python.
I'm not a Java expert; can you do something similar in Java? Maybe with generics?
With Python, your algorithm will work with the primitives without specifying types in either the algorithm or the test code. You only need to specify a type (maybe a new class, and definitely its instantiation) if you want to use something more abstract.
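A small sketch of what I mean (a made-up example; anything supporting '<' works unchanged):

    def insertion_sort(items):
        # Works on ints, floats, strings, etc., with no type declarations
        # in the algorithm or the test code.
        result = list(items)
        for i in range(1, len(result)):
            j = i
            while j > 0 and result[j] < result[j - 1]:
                result[j], result[j - 1] = result[j - 1], result[j]
                j -= 1
        return result

    print(insertion_sort([3, 1, 2]))           # [1, 2, 3]
    print(insertion_sort(['pear', 'apple']))   # ['apple', 'pear']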
This would be true of Ruby and other dynamically typed languages. For beginner algorithms courses, at least, a dynamically typed language makes more sense, since the algorithms taught are only clouded by typing. Complexity theory still applies and whatnot.
If you're learning algorithms that do heavy crunching, a lower-level language, probably with static types, may be used; but by that point you're not going to mind the type info.
https://www.reddit.com/r/programming/comments/9nipa/joel_on_... ( https://www.joelonsoftware.com/items/2009/09/23.html )
http://www.lessonsoffailure.com/tag/paul-graham/ <-- note "flamewar" in the title!
Dictionaries, sets and lists can be nice though.
Go here ( https://www.thecodingforums.com/threads/performance-sets-vs-... ) and search for "Terry Reedy".
Your issue isn't so much being wrong as lacking a coherent and relevant point in the first place.