This complaint comes up time and time again. "Universities should prepare students better for jobs!" "Teach more real life skills!" "Turn CS uni into trade school!"
But a computer science university program is not about scikit-learn or TensorFlow! It's about long-lasting principles, underlying mathematics, mental models and ways of thinking.
None of my computer science lectures were about how to apply that particular part of CS knowledge in some hot new Python library. It's expected that there will be some amount of time required to adjust to a company's software setup. That's not a big hurdle usually.
I'm not saying it should be only theory, though. University courses often have accompanying assignments or projects. Depending on the country in question, they often offer more hands-on, practical courses ("lab courses") as well, where you do actually go through the steps of making the theory work in real life. I had such courses where we played with microcontrollers and FPGAs to understand CPU instructions, assembly and low-level C concepts (but even there the goal wasn't to learn exactly the thing that you will use on the job. Most CS graduates will never need to program FPGAs in their day job.).
But sure, there is a place for even more data engineering training, but I don't think it's computer science university programs. Where do people like network engineers learn how to configure Cisco routers and use whatever config software they use? Where do sysadmins learn Bash, Unix, backup management etc? Not at university courses. Wherever they learn those skills, that's where data cleaning, parallelization engineering etc aspects of machine learning should be taught as well.
There’s a big gap between the whiteboard in a classroom and a blinking terminal cursor. As a teaching assistant a few years ago I spent a good chunk of my time in the lab showing otherwise brilliant computer science students things like basic terminal commands, how to read error messages, and just overall basic problem solving skills. Almost all of the students I worked with were much stronger than me in terms of theory and what I call “whiteboard computer science”, but many of those same students who aced written tests really struggled with basic roadblocks like learning language syntax to do practical assignments.
It sounds really silly, but some of the best instruction I gave was “Tab” to autocomplete a command and “Up Arrow” to re-run the last command. Whenever I would do a class demo on the projector someone would always stop me to ask how I was running commands so quickly and fluidly and how on Earth I could remember them all.
Someone (Knuth?) said that only 2% of the students in any CS class are good. I worked as a CS TA as well, and I found that to be true. The top 2% of students already had practical skills (shell, vim, git, networking, etc), usually because they'd been coding since they were 12 and learned by doing cool projects. Sometimes I saw bright students without prior experience picking up those skills quickly (much quicker than I did).
Why does CS seem unique in this? Why does the same not apply to medicine, law, physics, mechanical engineering etc.? Or perhaps it does? Would you say only a low percentage of mechanical engineering students are any good, namely those who have tinkered with mechanical contraptions from age 12?
Does CS require uniquely high intelligence / problem solving ability compared to other fields? I don't think so. Then what do those 2% have that others don't? Some have suggested frustration-tolerance, we could perhaps add detail-orientedness. But these are also needed in engineering. Perhaps the abstract, intangible nature of CS makes it harder?
I think the article is actually saying the exact opposite - that Tensorflow / PyTorch / sklearn code soup from “trade school” sources like bootcamps or quick online programs is not very valuable out in the world.
You might be misunderstanding the focus on data cleaning and feature engineering as being less specialized than say PyTorch coding but it’s exactly the opposite.
The most critical aspects of ML engineering for production are all about advanced statistics. Understanding multicollinearity, overfitting, dimensionality reduction, convergence, and time series issues like assumptions of stationarity or conditional independence effects.
Any engineer can crank out neural network software - that has pretty much zero value.
Value lies in realizing some stratification error in the data and following that lead to use a multi-level model to control for it. Value lies in realizing several key feature inputs are correlated on a seasonal basis - leading to multicollinearity - and then setting up some adaptive feature aggregation to mitigate it and dashboards with things like variance inflation factor to be able to raise alerts on it across time.
Value lies in working on small data problems and using literature review to determine the best prior to use for a Bayesian model, and doing robust posterior predictive checks to validate it.
These things require many years of education and experience dealing with statistical irregularities, understanding confounders and causal inference, understanding missing data treatments, understanding time series forecasting.
You cannot learn that in 101 courses that overly focus on the mechanics of how to type Tensorflow or sklearn code - that part can be picked up by anyone in a month or two. And mere intro to data cleaning and plotting distributions or proportions of missing data is not a substitute for actual statistical knowledge.
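To make the variance inflation factor point above a bit more concrete, here is a minimal sketch of how one might compute it with plain NumPy. The data is synthetic and invented purely for illustration (two features sharing a seasonal component plus one independent feature), and the function name `variance_inflation_factors` is hypothetical, not from any library.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 is obtained by regressing column j on the remaining columns
    (plus an intercept) with ordinary least squares.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other features
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic example (assumption: made-up data, not from the article).
rng = np.random.default_rng(0)
n = 500
season = np.sin(np.linspace(0, 8 * np.pi, n))   # shared seasonal signal
x1 = season + 0.1 * rng.normal(size=n)          # correlated with x2 via season
x2 = season + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)                         # independent feature
X = np.column_stack([x1, x2, x3])

vifs = variance_inflation_factors(X)
print(vifs)  # the first two entries should be much larger than the third
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign, but the threshold is a judgment call, and this sketch does not guard against perfectly collinear columns (where 1 - R^2 is zero).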
Or you can just use 100x more weights in a transformer, and it will learn how to write human level quality texts without much data cleaning or fancy statistics. /s
I've oscillated between these two positions for a few years now, when in truth the two positions are not really in conflict.
When we complain about universities not preparing students better for jobs, what we really mean is that universities are not doing the bare minimum that they should be doing - in case of CS, students should at least know how to program well, and be well versed in the practicalities of computing. That does not exclude learning the fundamentals (which is often denigrated as "theory").
It is just that students often have neither the theory nor the practice, and at a minimum, we're asking, they should know the practice so they can at least be useful in their jobs.
Which universities don't teach the bare minimum? I assume the article was about the US. The US has the very best CS universities in the world; do they not teach these basics? Or are we talking about smaller lower-tier American universities? I think there's also a difference in which programs you look at. There are, for example, Computer Engineering programs and also Computer Science programs, which are not the same. In Germany, there are universities (Universität) and "universities of applied sciences" (Fachhochschule), which differ in the balance of theory and practice.
My big picture point is that the complaint is really general and isn't specific to machine learning (ML is more of a click magnet here). The same could be said about other parts of CS and about the general computer-handling skills of CS graduates.
> Which universities don't teach the bare minimum?
My university did not teach source control, or the basics of good programming practices.
There were plenty of practical courses, with plenty of programming assignments among them, but the only thing that you were evaluated on was whether or not the resulting code worked.
That's not the job of a university. A reasonably smart student picks up how to work git in a few evenings with some online tutorial and some open source project. Or just when doing their homework.
In fact, a proper theoretical foundation makes this really easy. Graph theory and algebra will have taught them about DAGs and partial order, which is what git branches are. A crypto class will have taught them about hashes and signatures. Distributed systems class will have taught them about issues with synchronisation. With all that background it doesn't matter whether it's git or whatever system will be en vogue in 10 years.
Imagine a student having learnt CVS 20 years ago at university. Completely useless knowledge today. But the same student with the above fundamentals will pick up git in no time. That's what universities are for.
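As a rough illustration of the "DAG plus hashes plus partial order" view of version control, here is a toy sketch. It is not git's actual object format or hashing scheme; the `History` class and `commit_id` function are hypothetical names made up for this example.

```python
import hashlib

def commit_id(message, parents):
    """Content-derived id: hash of the message plus the parent ids."""
    payload = message + "\n" + "\n".join(sorted(parents))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

class History:
    """A toy commit graph: each commit points at zero or more parents."""

    def __init__(self):
        self.parents = {}  # commit id -> tuple of parent ids

    def commit(self, message, *parents):
        cid = commit_id(message, parents)
        self.parents[cid] = tuple(parents)
        return cid

    def ancestors(self, cid):
        """All commits reachable from cid by following parent edges (incl. cid)."""
        seen, stack = set(), [cid]
        while stack:
            c = stack.pop()
            if c not in seen:
                seen.add(c)
                stack.extend(self.parents[c])
        return seen

    def is_ancestor(self, a, b):
        """The partial order on commits: a <= b iff a is reachable from b."""
        return a in self.ancestors(b)

h = History()
root = h.commit("initial")
feature = h.commit("add feature", root)   # one branch
fix = h.commit("fix bug", root)           # a sibling branch
merge = h.commit("merge feature into fix", fix, feature)

print(h.is_ancestor(root, merge))    # root precedes the merge
print(h.is_ancestor(feature, fix))   # sibling branches are incomparable
```

The two branch tips are incomparable under the ancestor relation, which is exactly what makes it a partial rather than a total order; a merge commit is just a node with two parents.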
If the university expects to produce graduate students, who use computers to solve research problems, teaching them how to code well helps.
> A reasonably smart student picks up how to work git in a few evenings with some online tutorial and some open source project. Or just when doing their homework.
The same can be said for writing comprehensible English, yet you will lose marks on your essays for poor writing.
Ones offered by legitimate universities, that require bachelors to take ~12-18 credit hours of liberal arts courses.
Those courses typically require you to submit essays, which will be evaluated for both their form, and their function.
Given the generally poor grasp of civics, ethics, and ability to relate to other cultures that I have seen, if the job of a university is the creation of educated people (as opposed to vocational training), more liberal arts education can't go that amiss.
You're also missing the main thrust of my argument. My university offered - and required - CS majors to take a number of practical, vocational courses, with many programming assignments. But it never actually took the time to train, or grade us on the quality of our solutions to those assignments. This is not how real engineers, chemists, or physicists, or folks studying bioscience get trained.
> Ones offered by legitimate universities, that require bachelors to take ~12-18 credit hours of liberal arts courses.
I see. That's rather a peculiarity of the American system. In Europe, higher ed is more specialized, and broad, general liberal arts education ends at the high school level.
> But it never actually took the time to train, or grade us on the quality of our solutions to those assignments.
I've worked as a teaching assistant, and it's often due to a lack of staffing and time. TAs are most of the time also PhD students with projects to work on and papers to write, and giving detailed feedback on assignments is infeasible beyond a certain number of submissions. The same goes for exams: they are optimized to be easy to grade unambiguously in the shortest time possible. There's a conflict between two roles of the university: research and teaching. Research is more important for one's advancement, so teaching gets neglected.
If you don't actively make the real-world connections to the theory then most students will just memorize the coursework and then forget it later.
The number of times I have to walk through why linear memory access matters, how caches and branch predictors work is staggeringly high. In every single case they all knew the theory but never made the connection to how it applied to the task at hand.
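For what it's worth, here is the kind of small experiment one could use to make the memory access point tangible. It is only a sketch in Python with NumPy: it sums the same row-major array once along contiguous rows and once along strided columns. The array size is an arbitrary assumption, and how large the timing gap is (or whether it shows up at all) depends on the machine and its cache sizes, so no particular numbers are claimed.

```python
import time
import numpy as np

def sum_by_rows(a):
    """Walk the array one row at a time (contiguous in C order)."""
    total = 0
    for i in range(a.shape[0]):
        total += int(a[i, :].sum())
    return total

def sum_by_columns(a):
    """Walk the array one column at a time (strided in C order)."""
    total = 0
    for j in range(a.shape[1]):
        total += int(a[:, j].sum())
    return total

def timed(fn, arr):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(arr)
    return result, time.perf_counter() - start

# Row-major (C order) array; size chosen arbitrarily for illustration.
a = np.arange(1000 * 1000, dtype=np.int64).reshape(1000, 1000)

row_total, row_seconds = timed(sum_by_rows, a)
col_total, col_seconds = timed(sum_by_columns, a)

print(f"rows:    {row_seconds:.4f} s")
print(f"columns: {col_seconds:.4f} s")
```

Both functions compute the same total; only the order in which memory is touched differs. In a compiled language the same experiment tends to show the effect more cleanly, since there is less interpreter overhead in the way.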
Let’s continue the anecdotal train: my Computer science major requires a class that does teach source control. Better yet, it’s a liberal arts school and I’ll graduate with a B.A., so instead of focusing on anecdotal evidence, why don’t we talk about what ought to be?
Yes, and software engineering would teach proper source control practice, but the question of the parent poster probably is about whether university computer science studies should necessarily be about software engineering; just as physics studies mostly do not teach mechanical engineering or CAD use.
IMHO they should (or, more accurately, they should have both 'true and proper' computer science and software engineering - but the expectation is that the latter will be the thing most in demand), but there's no consensus about that, people have different opinions.
Most people these days never touch a computer until their late teenage years. How do you expect them to learn source control if they hardly have any exposure to that in their private lives?
At the end of the day you will end up with people who have excellent theoretical knowledge but no good practical skills.
There's lots of hard CS content you can learn and take exams in without writing code. Logic design, complexity theory and automata, graph theory theorem proofs, linear algebra, complex analysis, coding theory (compression, encryption, ...), algorithms, proofs about data structures, operating system theory (scheduling algorithms, deadlocks, race conditions, virtual memory), database theory, etc.
I think this is more of an expression that coding interviews are horribly poor for assessing effectiveness at delivering code to solve business problems.
Those people you say “can’t code” actually can code very well - it’s just that the question “can you pass this timed hazing trivia test in coderpad or on a whiteboard?” has no relationship to “can you code?”
I got turned down from codementor's fulltime platform because my "tic tac toe" in react wasn't complete. I didn't handle the diagonal case (which I was explicitly aware of) nor did it have unit tests.
Did I mention I had to do it from scratch, live with the interviewer, within 1 hour? That's 1 hour to plan and implement from a blank slate, starting from "hey, write me tic tac toe with react".
Never mind that it was functional, had a hook for me to handle the diagonal case if I had time and, aside from that, worked!
Never mind that I had completed a few projects on their platform already and have a great rating. That I have experience working in a few startups in Silicon Valley under my belt. Never mind that I have open source contributions and was a key speaker at a JS conference.
I can't write a complete tic tac toe from scratch WITH unit tests from a cold start and a blank slate in less than an hour.
Sometimes I feel like it's becoming a race to the bottom. As CTO of a startup right now, I spent days deliberating and planning out the db schema we use to optimize for our main use cases while allowing some deeper queries for analytics. I write stored procedures and convert business rules into working software that scales. Under our last load test, 90% of the reqs were served in less than 250 ms. I do code reviews in JS, Elixir and SQL on the regular and have trained other engineers on obscure SQL like nested joins and unnest()ing JSON arrays for analytics.
Yet somehow my value can be arbitrarily broken down to "he's not good enough because he can't scope, plan and implement a full app with unit tests within an hour".
Not really. “Basic programming” - like little helper methods for linked lists or array & tree manipulation, algorithms like sorting or hash table implementations - is an activity that professional programmers do slowly and methodically, with the benefit of whatever references they want, and with no short term time pressure or anyone actively surveilling every part of their intermediate problem solving approach, and with no requirement to also simultaneously verbalize all those choices in a way that seems to meet the social expectations of a complete stranger who is evaluating you on murky criteria.
It’s truly, truly nothing at all similar to the actual activity of “basic programming.” Highly skilled people will get flustered, forget basic facts, put their foot in their mouth, none of which has anything to do with whether they are skilled or unskilled at coding.
I was going to make sure they understood loops and maybe classes/methods.
You can ask them data questions without needing to code on the spot.
Which one of us is terrible at coming up with relevant interview questions? The guy who considers tree manipulation basic or the one who considers loops basic?
I’ve never heard of any organization asking software interview questions about the basics of loops or classes. If you are asking that, it is very, very weird of you.
Meanwhile asking about data structures and algorithms is an extremely ubiquitous industry standard for interviewing new grads and entry level programmers up through experienced veterans - and is practiced by nearly every tech company and tech recruiting agency under the sun.
Your expression of surprise about this suggests you are not very aware of the tech industry or software hiring practices.
But regardless, all the same issues would apply to asking about loops or classes that apply to asking about linked lists or sorting. In professional work, none of that is ever addressed in a time crunch with someone actively surveilling you and assessing intermediate explanations - so concluding someone is bad at coding from a poor interview performance is just flat wrong.
When I was at a major US university in the early 2000s, anything higher-level than C wasn't taught because the field changes faster than a 4-year degree can keep up with; modern languages were seen as the domain of 2-year tech schools.
I believe that has since changed, but I am not sure to what extent.
A computer science graduate degree is not a programming course certificate and should not be treated as a substitute. If you're willing to hire people who studied four years of theory with no practical applications or experience, you need to have a plan to onboard them from theory to software development.
You wouldn't hire a metallurgist as a welder, so you shouldn't be hiring a computer scientist as a programmer.
Hard disagree here. It's like saying an electrical engineer doesn't need to know how to use a multimeter until after college. Or a chemical engineer isn't responsible for knowing how to titrate something.
Programming is a tool and they should not let anyone graduate with a CS degree if they don't know how to use that tool.
Heck it almost seems like Electrical Engineers in your fantasy would make better programmers.
If the degree is called "Software Engineering", then you are definitely right. But we are talking about Computer Science, which is more like an offshoot of mathematics, for studying computation and information processing. Complexity, tradeoffs of time and storage, information representation, search, analysis of algorithm correctness and efficiency, etc. Putting together large production applications for business use cases has little to do with that.
> Programming is a tool and they should not let anyone graduate with a CS degree if they don't know how to use that tool.
CS isn't programming, software engineering, or software development. There are some places that offer degrees in the last two, and some places where you can focus a CS degree on those more practical aspects, but CS by nature is more theoretical and abstract, bearing roughly the relationship to those other things that Physics has to Mechanical Engineering or Automotive Maintenance.
Of course, you don't have places looking at Physics degrees as entry stakes for auto mechanics, but if they started doing that the problem wouldn't be with the Physics programs.
I graduated in the class of 2019. Plenty of CS courses had no programming requirement at all and a few only had on-paper coding requirements (i.e. nobody ever checked if the code worked).
I once submitted code that did not compile as I ran out of time. I got 100% on that assignment.
Whether you get a good grade on the programming portions is almost random.
I’m an engineer who helps quite a bit with hiring interviews. In my humble opinion, there’s a surprising number of fresh grad candidates who are not very skilled with theory nor practice.
It has been many, many years since I was in school. I think it’s fine that your computer science education focuses on fundamental CS concepts and the mathematics so you can easily pick up areas that require that math (ML, for example). I do think universities can do better. At my school, we had mandatory “block” classes in arts and humanities, which in my opinion, offered no value.
To be clear, I’m not saying these subjects aren’t valuable. I am saying, however, that the quality of these courses was very poor, and they could have been substituted entirely with classes related more to my discipline.
I remember sitting in some political science classes that were part of this required pool. I have no idea what we learned in there. As far as I can remember, we read political science papers that were very poorly written/extremely inaccessible. It was impossible to differentiate the authors personal opinion from objective truth of any kind. It was more or less a checkbox - I had to have so many credit hours from this pool in their curriculum to graduate. Did it force me to think critically in some manner? Not at all. It actually gave me a false sense of how “intelligent people” write.
Yes - all these years later I understand that it wasn’t just me who couldn’t understand those papers - they were pretentious, rambling nonsense. It was the opposite of effective communication, and it provided no value.
> there’s a surprising number of fresh grad candidates who are not very skilled with theory nor practice.
Is there any "entrance point" where people don't complain about this? Companies complain universities don't prepare students. Universities complain that first-year students come out of high school without necessary background. High schools complain that elementary school does not prepare their entrants etc. Elementary schools complain kindergarten doesn't prepare kids to be mature enough. Kindergarten probably complains that parents don't prepare the kids enough.
Broadly speaking, I think there is more demand for highly skilled, highly intelligent people than can be produced by any given cross-section of the population born in a given year. Sure, universities could do better, but beyond a certain point teaching doesn't work. There are people who are intrinsically motivated and soak up knowledge and seek it out in books and online (so much high quality content can be found online, especially for CS!), and there are those who just coast and do the bare minimum. I don't think you can radically improve the outcomes by changing the curriculum.
To be fair, this piece is mostly arguing against just teaching Tensorflow or Pytorch and is in favor of more general skills (with data engineering being fairly general, though as you point out also something that can be taught via concrete assignments). And as to your last point, that's pretty much the conclusion of the piece itself:
"Based on the current state of machine learning courses it is clear that AI courses will get you through the door in your effort to perform cutting edge research or landing a machine learning job, but they won’t teach you everything you need to know. To fill in the knowledge gaps that remain you will have to put in outside effort on your own. "
I guess the question is whether the outside effort needs to be addressed by universities, or by other resources.
This complaint has nothing to do with teaching hot new python libraries.
The thing is, data cleaning is no less fundamental than backpropagation. Maybe more so - learning algorithms come and go, but real-world data is always going to be inherently messy. The difference is in that we have a beautiful mathematical theory for backpropagation but not for data cleaning. So the courses that teach the former but not the latter are akin to the proverbial drunkard that searches for the lost keys under the street light - beautiful mathematical theories are easier to lecture on so they teach them instead of messier (but not less fundamental or useful) topics such as data cleaning.
To comment on "Where do people like network engineers learn how to configure Cisco routers and use whatever config software they use? Where do sysadmins learn Bash, Unix, backup management etc?" - they certainly can do at university courses.
Just as universities offer study programs and degrees in software engineering, there are also programs and degrees for network engineering, which would include not only the theoretical basis of networking but university courses for applied networking where they would learn all that you describe and much more; a university teaches a network engineer to configure routers and manage backups just as they teach a first year electronics engineer to solder stuff. Sure, a generic computer science or software engineering program will not include these courses, that's usually a separate specialization, but universities definitely do offer engineering programs.
I got my BSc in computer science and PhD in machine learning, and ended up working in a top FAANG AI research lab.
In hindsight, both when doing research for my PhD and when working as an engineer, I felt the most useful courses from undergrad were linear algebra, algorithms, calculus, operating systems, and statistics, in that order. I ended up filling the gaps in my math education later by reading textbooks and taking online courses.
IMO an undergrad program should focus on very fundamental theory. If I was in charge of designing CS programs I would quadruple the amount of credits required in math and specifically in linear algebra. You would be surprised how handy and applicable linear algebra is in ML, CV, robotics, computer graphics, finance, etc. etc. Calculus is also important but to a lesser degree.
It's a waste of time to teach TensorFlow or teach the trendiest neural network architecture at school. The knowledge becomes irrelevant in a few years, and it's fairly easy to pick it up by reading docs/papers if you know the fundamentals.
> It's a waste of time to teach TensorFlow or teach the trendiest neural network architecture at school. The knowledge becomes irrelevant in a few years, and it's fairly easy to pick it up by reading docs/papers if you know the fundamentals.
Well, kind of. You teach one or two instances of such things as a case study in how to learn a framework. Usually Software Engineering courses are the best place to do this. The point is, your ML course should probably not be spending any time on things like pytorch. A sophomore level engineering course should have already taught students how to go through the process of learning a new framework.
This answer is probably biased by what I’ve needed to use recently, but I think spectral methods are the most generally useful. E.g. spectra of symmetric matrices, SVD, Courant-Fischer theorem. You may not need this in all of, or even most of, practical ML, but knowing these things is a prerequisite in my mind for understanding PCA, CCA, LDA/QDA, multivariate Gaussians (which are foundational in probabilistic interpretations of ML), and covariance. A good understanding of inner products also conditions you to understand kernels better.
You need something a bit different for SVMs. The linear algebra there is basically the geometry of planes and half spaces. Also for optimization you need some different things, but those are typically taught under the moniker of convex analysis, not linear algebra.
Specifically, my approach to many multivariate estimation problems starts with “take the SVD, are there any properties you can use afterwards?”
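As one small illustration of the "take the SVD first" habit, here is a sketch of PCA done directly through the SVD of a centered data matrix, checked against the eigenvalues of the sample covariance. The data and the `mixing` matrix are synthetic and made up for the example.

```python
import numpy as np

# Synthetic data: 200 samples of 3 correlated features (illustrative only).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Center the columns, then take the thin SVD: Xc = U @ diag(s) @ Vt.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vt are the principal directions; squared singular values,
# scaled by (n - 1), are the variances along those directions.
explained_variance = s**2 / (len(X) - 1)

# Projecting onto the principal directions gives the PCA scores.
scores = Xc @ Vt.T

# Cross-check against the spectrum of the (symmetric) sample covariance.
cov = np.cov(Xc, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh is ascending; flip it

print(explained_variance)
print(eigvals)
```

The two printed vectors should agree up to floating-point error, which is the link between the SVD of the data and the spectrum of its covariance matrix that PCA relies on.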
It's hard to pinpoint a few topics. I'd just suggest to go through an introductory course like Gilbert Strang's intro to linear algebra to get a general understanding and build on your knowledge as needed.
The author makes a point that I relate to. I've been on the receiving end of a couple 'Statistical Learning 101' courses. These courses go roughly as he describes it in his blog post: they first teach you how to multiply two matrices, then launch into linear and logistic regression, then classification via clustering, then decision trees and SVMs, then CNNs and deep learning. Along the way, they do a lecture or two on reinforcement learning and HMMs.
In the end, I ended up with a thin smear of half-baked knowledge in my head, where I stop understanding the math once we are half-way through the material.
So how do you achieve this level of deep/intuitive understanding:
> Without understanding the mathematical underpinnings of key models and techniques in full detail, students aren’t able to quickly choose the right models for certain scenarios.
Does anyone have a good study plan with MOOCs and so on? If you have any practical advice, I would appreciate it!
I mean it really depends how deep you want to go. Like you point out that the classes you took are 101 courses. These are really just "tasting" courses. I'm sure if you decided to take more advanced/grad numbered courses, or unnumbered "topics" courses, you would have a better idea of what's going on.
In general, "having a deep understanding of models & techniques in full detail" is not well-defined. For example, analysis of linear regression is often offered as a full year-long sequence for graduate students in math/stats depts. Is this necessary for doing linear regression in practice? Not really, but who cares - it's interesting stuff in its own right. Most people need just enough understanding to finish a job.
In general, the precise medium that you use to study something isn't that important as long as it works for you, but there is a good reason that there are longstanding classic textbooks that people swear by in most fields of mathematics. I do strongly feel that, in the context of any mathematical subject, there are few substitutes for the grind - doing proofs and solving problems on your own.
Okay, but just to have at least one link in my post, I want to share this guy MathematicalMonk who used to make really great videos on ml-related stuff:
Part of the problem is that math and statistics are poorly taught most of the time. Textbooks will skip crucial steps in their proofs. Math professors will be too lazy to update their course notes based on frustrated student feedback. Courses will lack tutorials and other ways to discuss problems. I shouldn't have any reason to be watching youtube videos or khan academy to fill the gaps. But me, and thousands of others, are forced to do exactly that.
He proposes changing the syllabus of advanced/graduate-level courses to skip reviewing linear classification and backprop and make that a pre-requisite.
The author highlights not teaching enough mathematical theory behind the various techniques. I have tried Andrew Ng's course on Coursera. It uses Octave from what I remember. After a point the lack of mathematical background started to show. I have always wondered where I can find a course that teaches both the mathematical background and the hands-on programming in a balanced way.