The revolution of machine learning has been exaggerated (nautil.us)
339 points by benryon 22 days ago | 200 comments



"The widespread misimpression that data + neural networks is a universal formula has real consequences: [...] in what students choose to study, and in what universities choose to teach."

I think this is a problem that deserves more attention. A large portion of CS students nowadays choose to focus on Machine Learning, many brilliant CS students decide to get a PhD in ML in the hope of contributing to a field that will continue to develop at a pace similar to the progress we have seen in the last 7 years. This view, I think, is not justified at all. The challenges that AI faces, such as incorporating a world model into the techniques that work well, will be significantly harder than what most researchers openly admit, probably to the detriment of other CS research areas.


This is a pet peeve of mine as well. A lot of undergrads are not really looking around any more. Universities offer their first data science courses in year one. Undergrads are mapping out their paths towards becoming research scientists with dollar signs in their eyes.

I think this is not a wise career bet at all. There will always be spots for a select few 'pure ML' grads, but at the post-PhD level the really good jobs are getting pretty thin unless you come from the right adviser + right school + right publication track.

On the other hand, if you add a secondary useful skill, like having a really good understanding of distributed systems, or networking, or databases, or embedded systems, or being 'just' a solid overall software engineer, you (i) have something to fall back on if ML dreams do not work out, and (ii) have an edge by being just the right fit for specific teams, while also still competing in the general pool.

I think there will be a rude awakening on this in 3-4 years for many.


That's true, but how many people in the field started out with the right specialty to begin with?

They might be disappointed graduating post-ML bubble, especially if they have their hearts set on half a million dollar salaries, but they'll do fine with a PhD in CS.

Then again, I'm not part of the SV world, so YMMV.


> That's true, but how many people in the field started out with the right specialty to begin with?

What's hot in technology changes all the time. The smart strategy for an undergraduate is to:

1. cover the fundamentals

2. take the honors track (where the professor and the students want to be there)

3. avoid "hands-on" classes, which you can learn on the job as required

4. take at least 3 years of math. Really. Don't avoid it. If you can't do math, you can't do engineering or science.

As for me, my degree is in Mechanical Engineering, with a side of Aero/Astro. And yet I work on compilers :-)


Yes, always play dual-class, so to say. For every pair of important fields X and Y, there are numerous people who can do X very well but can't do any Y, and vice versa. If you do both X and Y well enough, you are at a huge advantage.

Trivially, being good at frontend coding and at UX design (cognitive aspects, etc) helps. Being good at designing software and being good at talking the language of your business is super helpful; this is what's required at the director / VP Eng level.

And yes, do take the math. After just 2 years of university math (as an embedded systems major), I later had to study on my own to pick up some missing algebra, some category theory, etc., to grasp SICP on one hand and Haskell on the other. This helped me become seriously better at writing and understanding code in industrial languages.


I will say that I noticed sometimes the honors courses, in college, were taught by the absolute worst professors.

I have a theory it was because, "they are honors students, they'll learn the subject regardless, and we need to put this bad teacher SOMEWHERE".

I once got out of a differential equations honors class because it was very obvious the first week, that the teacher was awful. Later, my fellow honors students told me how "lucky" I was to have switched. They were suffering through the course, while I was learning and enjoying my new class.


Never went to college but I can agree with everything you said. I would add: actually go out and build something. A portfolio is immensely valuable and has opened doors for me that otherwise never would have opened.


Could you expound on #4 some more?

Would you recommend a specific branch of math? (discrete, linear algebra, etc.)

Also, which math or general skills/fundamentals would you say have helped you the most when implementing compilers?


I've followed the same path. Out of curiosity, do you mind if I PM you to ask a few questions? I'm a couple of months out of undergrad and I'd like to pick your brain a bit.


sure


> A lot of undergrads are not really looking around any more.

My friend is doing a CS degree now and is being told by profs that you won't get a job without an ML background.


Speaking as faculty myself, the only job advice that you should listen to from an academic is how to get a job in academia. Even the ones that had a job in industry prior to their academic job have a reason for preferring their academic job. Their firsthand knowledge of the industry is stale at best, and their current information is all hearsay.

If you want to get a job in industry, you need to get advice from alumni that have (recently) gotten a job in the industry you are interested in.


The newly hired faculty at my school have mostly been ML/AI people. While it seems completely wrong that ML is needed to break into industry, is it true that research roles strongly favor ML right now?


I'd say that with some caveats, a freshly minted PhD in CS/stat/math/etc. with an ML dissertation will have an easy time finding an academic job. The caveats are that you might not get the job you really want. You might have to do a post doc first. Even if you don't have to do a post doc, you might not be at the most prestigious university, might not start out as tenure track, and might not even be in a CS/stat/math/etc. department.

You could end up at a medical school or a business school; a ton of mediocre PhDs take these jobs because Stanford and MIT never called them back. The medical and business schools think they hit the jackpot: they know they got someone mediocre, but at least they found someone. The mediocre PhDs usually end up happy because they found a job and medical and business schools usually pay well.

I'd say that a lot of the jobs that favor ML are from schools/departments that are playing catch up and trying to cash in because they saw a wave of grant money going to ML research. They are either smaller and less prestigious, or interested in the application to their field, rather than any direct interest in ML itself.

An ML focus probably isn't the boost you may have thought if you want a job at Stanford. The big schools have their pick of the litter and are already flush with ML talent, so being "an ML guy" isn't a magic serum that hides all your other flaws. You still have to be smart enough and positioned well (good school, good advisor) to get a job at Stanford.


I am in such a situation now. I will soon graduate with a PhD in ML on Bayesian networks, and I do not know what to do in the future.

Originally I learned programming so that I could write classical software while working from home. I never wanted to do anything else, but that does not pay. I could only get funding for AI research.


I don't think that's true. The vast majority of software development jobs require no ML at all. I don't see that changing in the next 10 to 20 years.


Even further, the majority of work in ML-driven products is not related to the ML model.


this 1000%.


I don't think it is true either. But my friend tells me that multiple profs tell him that he needs to learn ML.


I learned ML in school 13 years ago. I've been happily employed since then at a major tech company with increasing responsibility and the only time I've really needed that background was to call BS on a given project needing a custom ML component.


Maybe those profs are just huge fans of ML–you know, the original ML, i.e., Standard ML, OCaml, etc.


“Who needs a neural net? OCaml can already match patterns!”


That's ridiculous. There are tons of jobs in everything from devops to pure frontend to writing shit in COBOL.

ML is sexy and pays a lot, sure, but it's far from the only thing going.


Another tactic for an undergrad looking for a leg up in their career might be to do a double major of CS + some other engineering field or finance. You can really outperform your peers and become a unique asset in a given industry by having an expert understanding of the domain paired with expertise in software development.


Good combinations - CS + Finance, CS + Operations Research, CS + Economics, CS + Physics


>CS + Operations Research

this is me. I can't recommend it enough.


I loved doing OR for my final project at uni. It was great solving the travelling salesman problem for a taxi network, routing it to the correct people and picking people up in real time around an imaginary map of points. I remember thinking as I took a cab into halls from the station, "wouldn't this be amazing if I could have a device that told me exactly where each cab was at any point and maybe hook into the satnav somehow". In 1999 [1].

[1] https://www.google.com/search?rlz=1C5CHFA_enGB720GB720&sxsrf...


talk about having the right idea at the wrong time!


Ah, I would never in a million years have been enough of a shitbag to build Uber so it's a moot point. But still.

In my school all CS students took at least one OR class.

It was one of the best classes I've taken and some of the things still feel like magic.


Was it linear programming?


Do you have any suggestions for people who want to self-study in OR? I would like to know more in general but am specifically interested in applications to healthcare


My most recent previous employment was in healthcare (on the insurance side), so I can give you some relevant insight.

I do want to mention that on top of any self-studying, try to attend a talk or two or start following some feeds online that are close to actual healthcare operations. Operational teams are the ones who have to figure out what to do even when there is no good answer. The easiest way to keep up with which topics are most valuable right now is inside knowledge.

There is no purity in the OR field outside of PhDs doing their research - it is entirely about getting shit done efficiently, however possible, to the extent that the operations team can understand. The last part of that sentence is a big catch. For example, if your 'solution' has interns making judgement calls on data entry (because moving the work upstream is efficient!), you are fucked if you assume that data will be accurate.

BUT there's obviously plenty of skillset stuff you can learn to help you in a general way, so here are some important areas:

1. Linear & nonlinear programming (tool: AMPL)

2. Markov chains (good knowledge)

3. Statistics & probability (necessary knowledge)

4. Simulation (start with Monte Carlo [it is easy and you will be surprised it has a name])

5. Databases: SQL / JSON / NoSQL

6. Data structures and algorithms (big O notation / complexity)

OR work in general overlaps a lot with business analysis. The core stuff they teach you in school is listed above.

Healthcare right now has a big focus on Natural Language Processing, and on applying standardized codes to medical charts - and then working with those codes. The most common coding standard in the US is ICD-10, I believe.

Other than that it is mostly solved logistical items like inventory control systems that need a qualified operator. You do not want your hospital running out of syringes. You do not want your supplier to be unable to fulfill your order of syringes because you need too many at once. You do not want to go over budget on syringes because there's a hundred other items that need to be managed as well.

Now the important thing to keep in mind is that almost all existing companies/facilities have solved their operations problems to SOME extent, since if they hadn't, their operations teams would have fallen apart and failed. So in practice it is rare that you are going to implement some system from scratch, or create some big model. You're probably going to work with a random assortment of tools and mix data together between them on a day-to-day basis to keep track of when you need to notify people of things. With a lot of moving parts, you will have to task yourself with finding improvements and justifying them. Expenses are very likely to be higher than optimal, and you can earn extra value for yourself by finding incremental improvements.

No one is going to say: "work up a statistical model for me". They are just going to be doing something inefficiently with no idea how to do it better, and you are going to have to prove to some extent why doing it another way will be better - and also be worth the cost of training people to do it a new way. It will be monumentally difficult to convince anyone to make any sort of major change unless the operations team is redlining, so your best skill will be in being resourceful, adapting to the way things are no matter what, and improving THAT mess - not creating a new way of doing things.

Databases house the data you need to make your case. SQL was the norm, but a lot of stuff is JSON now. If you might need to work with an API, add Django to the req list.

Simulations let you test multiple scenarios of resource allocation on a complex system with moving parts (for example, resources include the number of hospital rooms as well as employees and their work schedules). Statistical analysis lets you verify whether the output of your simulations is meaningful or not. There are proprietary simulation programs that do the A-Z of this for you if you know how to configure them (ARENA), and there's pysim + numpy + pandas + ...
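A minimal Monte Carlo sketch of that idea, in plain standard-library Python rather than any of the tools named above. Everything in it is an assumption made up for illustration: the uniform 80..120 daily demand, the single-open-order reorder rule, the lead time, and the function name `simulate_stockout_rate`. It only shows the shape of "try several scenarios, compare the outcome", using the syringe-style inventory problem mentioned earlier.

```python
import random


def simulate_stockout_rate(start_stock, reorder_point, order_size,
                           lead_time_days, days=365, trials=200, seed=0):
    """Estimate the fraction of days on which demand could not be fully met.

    Hypothetical model: daily demand is uniform on 80..120 units, and an
    order of `order_size` arrives `lead_time_days` after it is placed.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    stockout_days = 0
    for _ in range(trials):
        stock = start_stock
        pending = []  # (arrival_day, quantity) for orders not yet delivered
        for day in range(days):
            # Receive any orders that arrive today.
            stock += sum(qty for arrival, qty in pending if arrival == day)
            pending = [(a, q) for a, q in pending if a != day]

            demand = rng.randint(80, 120)
            if demand > stock:
                stockout_days += 1  # could not serve everyone today
                stock = 0
            else:
                stock -= demand

            # Reorder when stock is low and nothing is already on the way.
            if stock <= reorder_point and not pending:
                pending.append((day + lead_time_days, order_size))
    return stockout_days / (trials * days)


if __name__ == "__main__":
    # Compare three reorder policies under the same simulated demand.
    for reorder_point in (200, 400, 600):
        rate = simulate_stockout_rate(
            start_stock=1000, reorder_point=reorder_point,
            order_size=1000, lead_time_days=4)
        print(f"reorder at {reorder_point}: stockout on {rate:.2%} of days")
```

Because the seed is fixed, every policy sees the same demand sequence, so differences between the printed rates come from the policy and not from luck. A real study would swap in measured demand data and then apply the statistical checks described above.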

Markov chains are related to building a model for your system. It's theory stuff, but it helps wire your brain for it. Laplace transforms are "relevant" somewhere in this category.
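For a feel of what "a model for your system" can look like, here is a small sketch of a Markov chain in plain Python. The three bed states and all transition probabilities are hypothetical numbers chosen for illustration, and `stationary_distribution` is just a helper name for this example. It estimates the long-run share of time spent in each state by repeatedly applying the transition matrix (power iteration).

```python
def stationary_distribution(transition, iterations=1000):
    """Approximate the long-run state distribution of a Markov chain.

    `transition[i][j]` is the probability of moving from state i to state j;
    each row must sum to 1. Uses plain power iteration from a uniform start.
    """
    n = len(transition)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        # One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]
        dist = [sum(dist[i] * transition[i][j] for i in range(n))
                for j in range(n)]
    return dist


# Hypothetical daily states of one hospital bed.
STATES = ["empty", "occupied", "cleaning"]
TRANSITION = [
    [0.5, 0.5, 0.0],  # empty    -> stays empty, or gets a patient
    [0.0, 0.8, 0.2],  # occupied -> stays occupied, or is vacated
    [0.9, 0.0, 0.1],  # cleaning -> ready again, or still being cleaned
]

if __name__ == "__main__":
    for state, share in zip(STATES, stationary_distribution(TRANSITION)):
        print(f"{state}: {share:.3f}")
```

Power iteration is used here only because it fits in a few lines; it assumes the chain settles to a single long-run distribution, which holds for this toy matrix but not for every chain.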

(Non)linear programming is the calculator for the field's distribution problems. In practice you create 2 files: a model file and a data file. The model file expects the data file to be in a certain format, and is written in a programming language for processing the dataset.

For example, if you manufacture doors and windows at 3 locations, and sell them at 10 other locations, and you have a budget of $10000: how much wood and glass do you buy for each manufacturing location, and how many doors and windows do you make and ship to each selling spot from each factory? The answer depends on data - the price each sells at, the cost for each plant to produce, the capacity of each plant to produce, the cost of transporting from factory to selling location, etc. So you make a model file for your problem and then you put all the numbers in a data file. You can change a number, run the model again, and see if the result changed. You can script this to test many different possible changes.
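A deliberately shrunk sketch of the model/data split, in plain Python instead of AMPL. It assumes one factory and no shipping costs, so it is much smaller than the 3-factory, 10-shop problem described above, and every number in `DATA` is made up. It also swaps the solver for a brute-force search over whole-unit quantities, which is only workable at toy size; `best_plan` is a hypothetical helper name.

```python
# "Data file": every number here is invented for illustration.
DATA = {
    "budget": 10_000,      # dollars available for materials
    "labor_hours": 400,    # factory capacity
    "products": {
        # per unit: material cost, labor hours, selling price
        "door":   {"cost": 120, "hours": 6, "price": 260},
        "window": {"cost": 80,  "hours": 2, "price": 150},
    },
}


def best_plan(data):
    """'Model file': maximize profit subject to budget and labor limits.

    Returns (profit, doors, windows). Brute-force enumeration stands in
    for a real LP/MIP solver, which you would use for anything bigger.
    """
    door = data["products"]["door"]
    window = data["products"]["window"]
    best = (0, 0, 0)
    for doors in range(data["budget"] // door["cost"] + 1):
        for windows in range(data["budget"] // window["cost"] + 1):
            cost = doors * door["cost"] + windows * window["cost"]
            hours = doors * door["hours"] + windows * window["hours"]
            if cost > data["budget"] or hours > data["labor_hours"]:
                continue  # plan breaks a constraint
            profit = (doors * (door["price"] - door["cost"])
                      + windows * (window["price"] - window["cost"]))
            best = max(best, (profit, doors, windows))
    return best


if __name__ == "__main__":
    profit, doors, windows = best_plan(DATA)
    print(f"make {doors} doors and {windows} windows for ${profit} profit")

    # Change one number, run the model again, see if the answer moves.
    more_labor = dict(DATA, labor_hours=500)
    print(best_plan(more_labor))
```

The model function never hard-codes a number; it only reads the data dictionary. That is the property that makes "change a number and rerun" cheap, and it is the same reason the model file and data file are kept separate.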

Data structures and algorithms: There are a lot of different optimization algorithms, all with different use cases, and there's no real upper limit on learning them, so this area can be a good time sink... since someone else will have already coded the implementation, but you are providing value in knowing how to use it. Therefore, you don't need to learn how to make the algorithms or what magic they perform, beyond whatever helps you understand what each is good at. Outside of research, it's unlikely this stuff will really get you anything other than maybe being able to impress at an interview - BUT who knows, maybe you find a use case for some random alg that is career-defining.

I know I ranted a bit, and I didn't proofread, but I hope there was some helpful info in there.


Thank you for the really in-depth reply! This is very helpful and gives me a lot to think about.


No problem! I'm glad it was helpful


My double major was CS + Applied Mathematics. Highly recommended.


I did CS + Finance, then got an MBA. It has set me up very nicely.


CS + Something medical, at least in the US


I majored in geophysics and it doesn't really help me as a software engineer in Houston (although I haven't tried to really exploit it for the most money, because I felt guilty about working in oil and left). What's killer though is having a master's in a field and programming experience.


I'm not in Houston or oil/gas, but trying to find a niche with the same background. I'm not seeing a great demand for this combo commercially unfortunately.


It's super easy to get a general programming job, though.


Did you have to leave Houston too?


Nope, happily employed here. What prompted you to leave?


Nope, haven’t arrived yet. Just scoping it out.


Nah, Houston is fine and there are plenty of enterprise jobs, and they are constantly hiring. If you need a reference, send me a PM and I'll give you my email address.


I’m not sure how to PM you on here...


Yeah, sorry, I'm used to reddit. Leave your email? Leave a temporary email? I would list mine but I don't like linking my online identities to my real one.


gmail is my HN handle


This is true. My first job as a dev was in fintech, but I know literally nothing about finance, nor do I care to. I was not a very good developer in that arena, because you really need domain knowledge to be truly effective in some fields.


> large portion of CS students nowadays choose to focus on Machine Learning, many brilliant CS students decide to get a PhD in ML

These claims are massively exaggerated. Students don't "choose", "decide", etc. In the US, let's be generous & say there are 50 top universities with good CS/ML depts. After students beg, plead & submit their GRE scores & recos & transcripts etc., each dept chooses on average 20 students for its incoming PhD cohort. Of them, easily 20% will wash out for sure. So at most 800 top students graduate with a PhD in ML each year (50 × 20 = 1,000 admitted, less 20%). Let me caveat by saying there are nowhere close to 50 top univs offering ML PhDs or having an intake of 20 per dept. So overall, the number is probably close to 200-300 students for the whole USA, not 800.

So all this handwringing for what some 200 kids will do? I don't care if they do a PhD in basketweaving; in the grand scheme of things, 200 is not even a drop in the bucket.


I agree that it might not be a large portion of all CS students, but it does appear that many of the incoming CS PhDs have chosen to go into ML (of my class, maybe 1/3), which seriously distracts from interest in other subfields. It's actually a joke in my department that all of the new students are pursuing ML, while almost no one is pursuing theoretical CS or other less popular subfields.

I'm also not sure why the exact number matters - 200 kids matter a lot when future professors are drawn from the pool of students who have successfully completed a PhD.


I think the problem applies to non-PhD students as well. I'm seeing a lot of interest from recent non-PhD grads in subjects other than CS wanting to steer their career toward data mining. My concern is that this re-focusing will probably lead them down paths that may leave them undistinguished relative to peers who stay within the more stable but less shiny domains where they're better prepared to succeed.

I saw the same thing happen in the late 1990s as everybody and his dog got into web design while it was hot. Few of those folks are doing that now, nor did that skill translate well into other roles since the skill set isn't fundamental to other careers.

I'm not sure ML is any different, especially deep learning, since few companies have anywhere near the necessary amount of data to successfully play that game and win.


I don’t know much about industry, but it seems to me there are two ways to look at ML. The first is that you learn some optimization, stats, math, parallel computing, and numerical methods. The second is that you learn a hell of a lot about fiddling with different network architectures and applying things to specific problems. I wonder whether the first (more fundamental) approach doesn’t have different prospects. At least in this case it can lead to career paths like a national lab.


The first path lets you become a quantitative problem solver, which is widely employable. The second path leads to being very good at specific deep learning tasks that I don't think will be in large demand in 5 years, at least outside the largest tech companies with all the data. Other companies will fulfill their business needs with fewer ML specialists, AutoML and pretrained models.


I've also noticed a trend of college graduates who, in interviews, struggle with general software engineering practices and more fundamental coding skills and CS knowledge, but have knowledge of and proficiency in ML. If you're a CS BA or MS, and don't plan to do an ML PhD, I'd suggest asking yourself if you want a data science job or a software engineering one. If the latter, remember to focus more on that. Surface knowledge of ML is a bonus, but the rest is more important.


My partner has ditched cross-training into data science and ML because of how ridiculously 'duct tape and string' the entire sub-industry is. Tooling that is pre-2000 quality if it even exists, people with math skills yet only the most rudimentary ability in best practice around data storage and maintenance, coding, domain knowledge and understanding of how much bias they're introducing into results. Her view is it's just a free-for-all of people who have little to no need to justify their output because everyone is waiting and hoping for the magic to happen.


Larger organizations (by size and longevity) tend to have staff that predate the data science and ML/AI hype train and have therefore worked out more robust tooling. Unfortunately the popularity of ML is driving a lot of adoption of trendy but not necessarily best practices.


There was a lot of this sort of thing with the web application boom as well. Lots and lots of decrying how JS libraries and frameworks were ruining the software industry.

I haven't followed that very much in a decade, though. Did that whole area end up maturing?


I've noticed this myself but wasn't sure if it was just me. Almost every CV we get has an ML slant but the candidates struggle to write a simple SQL query.

From a pure numbers perspective 95% of jobs in our industry have nothing to do with any reasonable definition of machine learning. Even those applications of ML have a large amount of traditional CS going into acquiring, reformatting and storing large datasets.


That’s definitely something I’ve seen too. Most of the CVs I get through, especially from younger devs, talk about ML. I’ve got my work cut out for me keeping my existing codebase understandable without deliberately introducing opacity!


Most of the ML useful in industry is getting commoditized at a fast pace, and not enough people understand that. The focus on modeling for applying ML in a business environment is as incongruous as people believing programming is graph algo and compiler techniques. It sometimes is, but rarely.

It is much easier to learn the basics of ML if you have good software engineering skills than the opposite in my experience. The one thing that needs time learning is experimental design and quantitative analysis, and this is rarely taught well at university before PhD.


It's also a problem that students are focusing so much on neural-network-based approaches, neglecting other techniques. There's what, like, 10 places in the world where you'll have enough compute to be able to train cutting-edge neural networks? Real world problems are just as well solved with other techniques like XGBoost, which wins pretty often in Kaggle, for example.


>There's what, like, 10 places in the world where you'll have enough compute to be able to train cutting-edge neural networks

You can just use a smaller amount of compute with transfer learning style stuff.

I can probably fine-tune a transformer model with my pocket money and beat any NLP solution from two years ago.


It may not be ideal for fundamental research but demand for ML practitioners is only likely to increase over time.

ML is pure alchemy for a business operating at scale. It’s as if a coal plant could turn its waste emissions into profit. You have this “data exhaust” from all the activity happening within your system that can be used to optimize your system at least a few percentage points beyond what is otherwise possible. A team of 5 ML engineers can improve an ad targeting system by 5 % and if the company is google that’s billions.

ML creates feedback loops of improvements in product that improve usage that lead to more data which further strengthens the moat a business has.

It totally makes sense to jump on this train. It won’t solve AI but will make a lot of people wealthy optimizing systems.


I think that there are a pretty large number of companies that put out "data exhaust" as you put it that could increase their efficiency by 5% or so, sure. Not every company, but some.

It's very unclear to me that there are a large number of companies that can increase their efficiency by much more than 5% through ML. And it's not clear to me that there is more than a couple of years' worth of projects in turning data exhaust into money for each company.

So if I were a new grad looking to get into ML, that might somewhat concern me.


>> A team of 5 ML engineers can improve an ad targeting system by 5 % and if the company is google that’s billions.

Not going to pick up on your arbitrary 5% constant here, but please elaborate how these ML engineers are any different from anyone with a quantitative background?


I think that's the key here. It's not ML magic that's driving success, it's data literacy. Steve Levitt (of Freakonomics fame) said his advice to students would be to ensure they have base knowledge in understanding data analytics regardless of their domain. ML is just the sexy subset that gets all the attention.


And ML is a tiny part of the whole data science toolchain and process. Getting the data to the point where you can use it for something interesting is probably the hardest part.


My point was that cutting edge ML can add a bit more on top of what data literacy can achieve. At scale that’s worth a lot.


What I meant was that, e.g., applying deep neural networks to large data sets can give you a few percentage points of improvement over systems implemented without them.

At scale it makes a big difference.


>> It may not be ideal for fundamental research but demand for ML practitioners is only likely to increase over time.

If research stagnates, application will also stagnate. For the industry money to keep pouring in there has to be growth, and for there to be growth there has to be progress: scientific progress.

So if something is "not ideal for fundamental research" it is also "not ideal" for business.


not everyone wants, or needs, an ad targeting system


> "The challenges that AI faces"

Isn't this a good thing? I would think that if it were easy we wouldn't need graduate degrees.


There are certainly challenges, but "incorporating a world model" has been going well recently: "Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model"

https://arxiv.org/abs/1911.08265


Not exactly. Incorporating a world or domain model usually means taking pre-existing declarative knowledge in some form (e.g. from a semantic net) and using it to aid learning.

To quote the article, "Model-based reinforcement learning aims to address this issue by first learning a model of the environment’s dynamics, and then planning with respect to the learned model."

So they've sped up learning from examples by learning a model first, but it's still learning from examples.


That seems a fairly specific meaning of the term that may not be in wider use, see eg: https://arxiv.org/abs/1803.10122


I wouldn't underestimate the power of good tools here. All the software libraries for ML are very easy to get started with, and make it very easy to prototype cool things. It seems like in other applied areas it's a lot more work to get fewer results.


I think that's exactly the problem with ML. You can get interesting looking results fast with very little effort. But then, getting from an impressive demo to something actually useful in the real world is much harder, and will lead you down endless rabbit-holes as you try to improve your results, but things only get worse, not better...


After 5 years of growth in data mining at my giant pharma, this is exactly what I'm seeing. Most ML projects remain toys while the number of them that advance into something useful can be counted on one hand.

(Of course it's a bit hard to assess the impact of a revolution like ML (esp DL) when your company already has hundreds of statisticians who have been employing similar data/experiment analysis techniques for decades, thereby diluting the signal of how disruptive novel forms of ML are within the enterprise.)


CPUs have hit their limits and are not growing. Software has stagnated, recycles concepts from 50 years ago, and is not growing. What's left to study? Either world-scale clouds or the growing field of NNs. Out of the two I'd pick the second, because it's either more fun or more unpredictable. Students are making a rational choice.


Even if those fields completely stagnated, there's great progress to be made in high performance computing, security, bioinformatics, and human computer interaction (just to name a few).


Deep learning still at least doubles every two years; this can be estimated from https://scholar.google.com/citations?user=kukA0LcAAAAJ .


One thing, though, is that lots of deep learning is becoming simplified to the point where anyone can do it, with tools like Azure ML studio and whatever the AWS offering is called (I forget).

Of course, there will always be a need for people with deep expertise, but unless you work in a very technical or prestigious field, how many compiler experts have you worked with?


"how many compiler experts have you worked with"

Did you mean CS PhDs? Because that is a better analogy. My answer would be: with many, because I am one of them.


Honest question: how common is this in tech hubs? Is the talent level that much higher? I work in the enterprise world and few people have advanced degrees.

I mean, I know there are some shops that are doing really advanced stuff, but I'm talking about your normal tech hub employer.


I don't think that people are ranked by talent. This is also true for academia. You don't have to be more talented than the next guy to show better results according to some metric. Right place + right time + avoid confrontations = a huge boost for your success.

See http://data-mining.philippe-fournier-viger.com/too-many-mach...

Roughly speaking, about 50k papers are published every year in arxiv/ML by perhaps 20k different researchers. Since some people still don't publish there, you may say there exist about 50k ML/AI/DL researchers.

Say 5k of them are pushing the field forward, most of them have a PhD. The remaining 45k deliver various local optimizations/adaptations, many of them don't have a PhD. Now it depends on the size of your normal tech hub employer. If it is big, then you have a few people from 5k and many from 45k. Otherwise, a few people from 45k.


I don't work in a tech hub but in a city with a world-leading university. Having worked for a local consultancy and a startup, I'd say over half of my colleagues have had technical PhDs and it's rare to meet people without at least a Masters. I now work for a remote company and it's probably closer to 65% with over 200 staff, but we are CS research oriented.

I'm currently studying a part-time MSc in CS because not having one is notable here (and being around academic people has inspired me to learn more).


PhD doesn't imply talent level is higher, just knowledge.


Pretty rare at your average non-tech company. Increasingly common as you get into major regional tech hubs and software as the product companies.


But how do you estimate the quality of those publications? Is knowledge about DL exploding, or is it saturating?


This person is the most cited scientist in machine learning (when measured by # citations per year) and is about to become the most cited scientist in the whole world. He has a lab of about 100 people and most of his works are at least good. I just provided an estimate. Once you see his citation count has saturated (probably in 5-10 years), you may guess that the hype is over.


But I suppose that the amount of citations can be large even if the underlying research outcomes approach a horizontal asymptote.


It can be. Still, he made some contribution to the progress in that field. Personally, I don't like the 'Canadian Mafia', I tend to prefer Schmidhuber.


Sounds like someone reeeeeallly good at nudging envelopes and writing grant proposals. Will be mostly forgotten a couple hundred years from now. Academic playwrights thought Shakespeare was a joke in his day.


"a couple hundred years from now" by whom? He contributed to AI, will not be forgotten by AI.

Edit: if you don't get what I mean by "not be forgotten": he will be the most cited scientist, so why would AI exclude all his papers from its training sets (assuming that at some stage of its development AI will learn how humans do research)? Sounds unlikely.


By human beings. And he did not contribute to AI capable of contemplating human research like a human does, because no such AI exists yet. When it does exist (if ever), it will surely not consider this person or his hundred lab workers to have contributed to AI. It will consider them to have contributed to applied statistics.

Edit: I realized these comments might be very mean to you (assuming you're the lab director with the hundred workers). I should disclaim that what I say is something I assume is true about most people who gloat about their citation counts, but it's certainly not true about all such people, and might not be true about you. Myself, I'm a bit resentful because I'm so bad at the whole academic game, so that probably makes me quite biased.


>> assuming you're the lab director with the hundred workers

You make wrong assumptions and proceed. I am not that person and thus I don't get why you attack me personally.

While AI does not exist yet, that guy contributed to its development independently of your opinion about it.

Don't attack people on HN.


>Don't attack people on HN.

Don't attack people on HN.


True. I edited it to be nicer.


What do you mean by "Deep learning still at least doubles?"



This narrative has also taken over the VC world.


That's because the media hype is completely different from the industry hype. This article is a form of media hype.

The pragmatic, already realized effects of machine learning are mostly in improving existing systems, especially classification systems, and have increased metrics by big percentages at big data companies, and more applications are being found every day. That is why real demand and interest is so high for ML.

These generative, speculative models are interesting experiments by researchers but inconsistent and far overstudied by the media, because "AI invents new physics" is more understandable and sexier than "AI improves lead generation by 6.8%"


I would say the industry definitely has its own hype. And honestly, I think the news media is a trailing indicator on that hype.

For example, there's a replication crisis that's been ongoing for several years with ML papers in computer science. I've spoken with people in the industry who have even speculated that there's straight-up academic fraud taking place. If you get a splashy, sexy finding you can get scooped up by a FAANG-like company with a fat salary before anyone gives your work a second look.


From working with many ML researchers, it's not speculation. Some aren't very proud, but if you want to get ahead and stay competitive, you have to publish your un-replicable results. The results aren't usually completely fake, but they depend on enough dishonest sleight of hand that they might as well be.


You can improve lead generation by 6.8% without ML too. A/B testing a few changes can easily accomplish the same thing.

The AI could be optimizing to attract people who are willing to give out their email address rather than actual potential customers, so it doesn't help the business. We tried using Google ads to optimize for leads before and we saw the ads were on sketchy websites promising people free stuff.

Lead generation was a bad KPI to use, but it was chosen because we can't optimize for sales: the volume is too low to train the ML models. This is the reality of many businesses that experiment with AI, and they produce junk.

ML does work well for certain domains where you have the scale of data to pull it off, but now the hype has spread where it is not appropriate.


These articles (and a lot of what G. Marcus writes) are attacking strawmen. I've never heard anyone claim that NNs will invent new theories, and I don't think that 2008 article is widely read. But for things that are hard both computationally and theoretically, like protein folding, NNs may really be revolutionizing the field even if we don't know how they do it. Scientists do not foolishly buy into every AI hype. The problems lie with VCs and funding bodies, which are indeed swayed by adding "AI" to every kind of proposal. That's a separate problem, though.


That's the whole point... at the very beginning the post cites articles that suggest or even outright say that AI is coming up with new theories (or in the CS world, see all the articles and bloviators who predict AI will replace programming as a job). Regardless of what the scientists in the field would say one-on-one, the outright-wrong portrayal of ML in the press, egged on by VCs out to flip their investments, is causing problems for anyone who wants to actually make progress in their field, or recruit students honestly.


> articles that suggest or even outright say that AI is coming up with new theories

I would contest that there are widely read articles saying those things. Yes, surely someone publishes those, but they are generally dismissed. The field is conscious of its hype (e.g. check r/machinelearning), and I don't think students are stupid either.


This is a perfect example of the 'AI effect' in action:

>"It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, 'that's not thinking'." AIS researcher Rodney Brooks complains: "Every time we figure out a piece of it, it stops being magical; we say, 'Oh, that's just a computation.'"

https://en.wikipedia.org/wiki/AI_effect


We really need another name, which refers to people uninformedly quoting 'the AI effect'.

How about "the AI effect effect"?

This happens all the time:

- X suggests that success at a particular task is a good proxy to general intelligence.

- Years later Y solves that task with a clever hack and lots of computation, but no general intelligence.

- The media breathlessly suggests we've made progress on general intelligence.

- Z points out we haven't really.

- Someone says "Oh that's the AI effect!", by which they mean the goalposts have been unfairly moved on AI.

No, they have not. There's something called general intelligence, humans can do it, and Deep Blue cannot, and it's fine to say that Deep Blue beating humans at Chess was not thinking.

It just turned out that Chess wasn't as good a proxy to thinking as we initially thought, and mentioning "The AI effect" just clouds the discussion.


> Years later Y solves that task with a clever hack and lots of computation, but no general intelligence.

While that is true, the set that contains instances of Y is only growing, and its members are each becoming less narrow.

Deep Blue was only chess, with many handcrafted parts. AlphaZero does chess, shogi and Go with no prior knowledge. Then they went from perfect-information, turn-based games to RTS with fog of war.

It's not general intelligence, but at least further down the road in behavioral complexity than jellyfish, despite some crippling limitations (e.g. separation of training and inference)


> There's something called general intelligence, humans can do it, and Deep Blue cannot

There's no such thing. General is a strong word. Humans can't for example handle more than 7 objects in working memory. We have an inbuilt limitation to how much complexity we can grasp. Programmers know what I mean.

General intelligence requires a general (infinitely complex and challenging) environment. Without it it will always be a specialised intelligence. Human intelligence is specialised in human survival, as individuals and part of society.


As you appeal to Programmers: do you believe quicksort is a general sorting algorithm?

Would it still be a general sorting algorithm if someone had just developed it and used it solely to sort numbers between 1 and 100?

It would be.

Why do you think you need a general environment to develop a general algorithm?

No one is arguing human rationality is unbounded. But our intelligence generalizes, in a practical sense, to a great many more tasks than any computer algorithm I'm aware of.

Mainstream psychologists believe in general intelligence. They even attempt to measure components of it, with good prediction of performance on unseen tasks.

I think there's a big burden if you want to argue it doesn't exist.
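
To make the quicksort point concrete, here is a minimal sketch (my own illustrative implementation, not any particular library's). Nothing in the algorithm refers to numbers between 1 and 100; it only assumes the keys can be compared with `<`, so it works unchanged on inputs it was never "developed" on.

```python
from typing import Any, Callable, List, TypeVar

T = TypeVar("T")


def quicksort(items: List[T], key: Callable[[T], Any] = lambda x: x) -> List[T]:
    """Return a new sorted list; only requires that keys support `<`."""
    if len(items) <= 1:
        return list(items)
    pivot_key = key(items[len(items) // 2])
    # Three-way partition around the pivot key.
    smaller = [x for x in items if key(x) < pivot_key]
    larger = [x for x in items if pivot_key < key(x)]
    equal = [x for x in items
             if not key(x) < pivot_key and not pivot_key < key(x)]
    return quicksort(smaller, key) + equal + quicksort(larger, key)


# The same code sorts small integers, strings, or anything with an ordering.
numbers = quicksort([42, 7, 100, 1])
words = quicksort(["pear", "apple", "fig"])
by_length = quicksort(["pear", "apple", "fig"], key=len)
```

The generality comes from the structure of the algorithm, not from the range of inputs it happened to be tested on.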


> Why do you think you need to a general environment to develop a general algorithm?

Because intelligence is not intrinsic to the agent but the result of the agent trying to maximise rewards in an environment. In the end the driving force is survival - the agent needs only to survive, any method would do. So as soon as it overcomes a challenge it stops evolving and turns to exploiting. And the list of challenges that threaten survival does not scale to infinity. Anytime you think about intelligence you need to also think about the environment and the task otherwise it is meaningless.


You state this as if it's fact but it's a point of view, not necessarily wrong but not proven either. More to the point, it's circular: "intelligence is not intrinsic to the agent but the result of the agent trying to maximise rewards in an environment" is a statement that the human mind operates on the same principles as ML algorithms. If you assume this, then not surprisingly it follows that ML algorithms can in principle do anything the human mind can do. Not everybody agrees and the question has not been empirically settled.


Well, humans can certainly understand any utterance in their native language, and there are an infinite number of those, given that natural language, as far as we can tell, is infinite (you can generate utterances forever without ever generating the same utterance twice). That is as general as anything gets.

Also, we may not be able to keep some number N of objects in memory simultaneously, but we can definitely reason about an infinite number of objects, like I did above. And if you want to handle more than N objects, you just do it N objects at a time. You can always write things down etc.

You can always extend your memory, I mean.
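
As a toy illustration of "generate utterances forever without repeating", here is a sketch with a single recursive embedding rule. The sentence template is an assumption picked for illustration, not a model of natural language.

```python
from itertools import count, islice
from typing import Iterator


def utterances() -> Iterator[str]:
    """Yield an endless stream of distinct sentences from one recursive rule."""
    for depth in count(0):
        # Each extra level of embedding produces a new, longer sentence.
        sentence = "she said that " * depth + "it is raining."
        yield sentence[0].upper() + sentence[1:]


# Take a finite prefix of the infinite stream.
first_three = list(islice(utterances(), 3))
```

Every sentence in the stream has a different length, so no two are the same, and the generator never runs out.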


Some systems are impossible to parcel out in sets of N objects we could grasp at once, for example, the DNA code.

Think about trying to explain the stock market to an ant. Then scale up, and imagine a similar situation for humans.


I'll need an example of a process we can't explain by cutting it up in chunks though. DNA and the stock market- well, maybe we don't "understand" them as such, but that's not to say we can't, ever.

I think it would be very difficult to find a process that _cannot_ be understood by humans, let alone explain why. It'd be a bit of a paradox really. "Let me explain to you this thing that is impossible to understand".


>> Humans can't for example handle more than 7 objects in working memory

citation needed


See https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus...

However stating "can't handle more than 7 objects in working memory" as fact seems a little too assertive to me :) Humans can do it, but are usually not very good at it as the number of items goes up.


Well, I think you do have a point, but on the other hand, there have been people saying "Machines will never play chess well because it will require general intelligence!"

And when machines start to play chess (or go, or Starcraft) they just say "Oh well it's just computation."

So I think it's justified to bring up "AI effect" when people say "machines will never do $(insert your favorite activity here) because it requires general intelligence!"


Since the beginning of time, people have anthropomorphized any and every thing that is complex enough to be beyond a person's understanding. Being equivalent to a human mind is the default assumption for anything. If it was a wrong assumption when applied to storms, volcanoes, Babbage's machine, Clever Hans, ELIZA, and a billion other things, then why would you assign any weight to most claims that something requires intelligence?

It makes me think of the saying about atheism, that an atheist disbelieves in just one more god out of thousands than a (mono)theist does. You disbelieve in intelligence in very nearly all of the things it's been used to explain.


It's also possible that an autistic chess-winning machine is not actually reproducing the full range of behaviors humans mean by the phrase "playing chess".


That is a good point about something I hadn't considered. I was going to disagree with you until the last sentence, which then clicked.

As AI is developed, it is also championed as "closer to real thought" and might someday solve the more general intelligence problem, only for those same experts to later realize that would never be the case and then retroactively proclaim "this was never going to solve general intelligence, but it is still AI".

The history is full of this kind of revisionism, so I agree with you.


I agree with you, but just want to clarify for others who don't understand why the parent misused the term AI effect. AI effect is more related to the fact that once you understand the trick behind the magic you no longer believe the task to require intelligence, even though it is still able to do an amazing task that wasn't possible before and had for centuries been thought of as requiring intelligence and something that only humans could do.

That the current AI techniques couldn't work for every imaginable task, and that people believed they might isn't a case of the AI effect. That's just hype, and possibly another coming case of the AI freeze.

The AI effect will be how in 5 or 10 years, CS students will learn neural nets as part of their first or second year curriculum and the whole machinery will be behind a single API to some Apache library. And someone will then use that to some cool effect and say they're doing AI, and others will laugh and say, you're not doing AI, you just used a neural net to learn weights from a big data set silly, that's not AI.

This effect happens for every task, even when AI is not involved. Until you understand how, someone doing something you can't comprehend will have you think they are a genius and maybe of a higher IQ and probably very intelligent. But learn the "how" for yourself and it'll stop being so impressive. Rubik's cube is a good example of this. Once you realize there's a trick to it, it stops being as impressive.

Some relevant quotes from the wikipedia article:

> AI effect is: As soon as AI successfully solves a problem, the problem is no longer a part of AI.

> Software and algorithms developed by AI researchers are now integrated into many applications throughout the world, without really being called AI.

> AI advances are not trumpeted as artificial intelligence so much these days, but are often seen as advances in some other field

> practical AI successes, computational programs that actually achieved intelligent behavior, were soon assimilated into whatever application domain they were found to be useful in, and became silent partners alongside other problem-solving approaches, which left AI researchers to deal only with the "failures", the tough nuts that couldn't yet be cracked

> The great practical benefits of AI applications and even the existence of AI in many software products go largely unnoticed by many despite the already widespread use of AI techniques in software. This is the AI effect. Many marketing people don't use the term 'artificial intelligence' even when their company's products rely on some AI techniques. Why not?

https://en.m.wikipedia.org/wiki/AI_effect


No, it talks about the other AI effect, where AI researchers think that generalizing the model will be relatively easy. Like, let's say someone makes an AI to play tic-tac-toe at the same level as the best human and then says that with just a little bit more work it will be able to beat a grandmaster at Go.


Right. Someone linked to "Artificial Intelligence Meets Natural Stupidity" last week, which makes that argument in detail. People in the field have been claiming Strong AI Real Soon Now since the General Problem Solver of the 1960s, which is a simple tree search algorithm. I heard a lot of that at Stanford in the 1980s, when the expert systems boom was about to crash and people thought Symbolics LISP machines were magical.

There is progress, though. Machine learning does do a lot. And it makes money, so effort will continue at a high level. AI used to be a dinky field - maybe 20-50 people at MIT, CMU, and Stanford. Now it's huge.


I've only seen the headlines whizzing by, but isn't this exactly what Deepmind(/AlphaGo/AlphaZero/MuZero) has been doing lately?


My point was that machines mastered tic-tac-toe 50 years ago and mastered Go just last year. Making a statistical model to predict a small subset of three body problems isn't very impressive at all, kinda like tic-tac-toe. It is a bit interesting that it works, but not much more than that. Adding another body, or adding starting velocities to these three bodies, would make the problem a lot less tractable, so it is very unlikely the same techniques will work there.


I think the root of the issue is that "AI" is a vague term under which almost any kind of program, or none at all, could fall. The fact that a machine can automatically calculate the sum of two numbers could be considered artificially intelligent by someone who's never seen a computer before. At the same time, no matter how far you go, any program (based on Turing-like machines) will always be a deterministic chain of cause-and-effect, which is "just a computation".

But AI in its meaningful use covers a certain set of fields in a way that makes sense. It's just the public misinterpretation nonsense that makes it look so weird.


Are computers meaningfully different to brains in that “deterministic” respect? Both are apparently based on deterministic interactions at the nanometre scale and exhibit simple behaviour that can be explained from first principles (eg. simple arithmetic, or a stroke). Both also exhibit complex emergent behaviour that can’t be explained from first principles (eg. the output of complex ML systems, or most human behaviour), which we sometimes call “non-deterministic.”

I’m happy to concede that software is artificially intelligent just as I’m prepared to concede that most animals are naturally intelligent. It’s just a question of drawing an arbitrary line when things are intelligent enough to warrant the description. The AI effect is the process of that line getting moved closer and closer to “capable of doing everything an average human can do.”


The definition of a modern human is someone who does what machines do not do, so as machines become more capable it changes what humans do.

I guess it seems very likely to me that we will have Skynet/paperclip maximizers before AGI. I mean, I think we kind of do right now in what the internet is growing into.

We're like single cells imagining what a brain would be like but we'll never know because probably we'll be part of something brainless that dominates the ecosystem, or if there is an organism with a brain, we won't be able to perceive it because we're components.


We are already components of something brainless that dominates the ecosystem. :-)


Of course, but that thing is getting more and more computerized and the software is getting more sophisticated. My point is just that if there is a phase change, we probably won't notice because we're part of it and don't perceive it.


Exactly. That's why there's the AI vs. AGI distinction.


The concept of AI to the average person outside of the computer bubble is "human intelligence made artificial"

So if it's not actually thinking, then technically, from the perspective of an average layperson, it is not real AI. However, the "AI effect" is still real, but it's related to a problem local to academics and AI laypeople.

Most non-AI-laypeople/AI academics believe AI is a machine that can think like how a person thinks.


That link honestly reads like more kool aid and deflection of criticism written by AI professionals. Why aren’t mathematicians complaining about their work being largely unknown?


I assume from the title that this is either written by Gary Marcus or quotes him extensively. (I just checked and was correct.) It would be nice to hear this sort of critique from more than one person.


hah, I made a bet with myself too that this was a Gary Marcus article. He is really riding that brand..


He has a book out with the co-author of this piece, Ernest Davis. This is probably part of the promotion of the book. The book title and subject are listed at the end of the article and the two authors of the article are listed as the book's authors.


Is he a good source of critique? Sincere question, I don’t know anything about the field.


He is rather controversial in machine learning circles. I would say he is essentially right about the fact that a lot of ml work is crap. Like most of anything is crap, so that is not a groundbreaking insight and it comes off as somewhat grandstanding to keep saying that.

He has, however, really turned this into a personal brand which many suspect he is using to advertise his startups. Many people feel he is sort of disingenuous about the actual progress being made in computer vision and NLP, even if there is still a long way to go.

To answer your question: You can read a single piece of his and maybe get some value from that, but he keeps rehashing the same points to sh*t over different new work, and there is not much to be learned from that after a while.

Disclaimer: I did not read the above article in full, I am just familiar with Gary Marcus + work in ML industry.

Edit: Keep expanding this but it got me thinking. Earlier in my career, I was also eager for the 'gotchas', for being able to point out why something is crap, exaggerated, cannot do what it promises, and so forth. As I outgrew this impulse to focus more on what is good about some work, I began spotting it in new students a lot. They would come into seminars and smugly crap all over recent papers.

I have since come to the conclusion that it is useful to have realistic internal assessment of progress in your field, but that sharing your harshest criticisms is not necessarily a good way to look smart or make progress.

It often turns out that while feeling clever about realizing limitations of some work, the field just moves forward. The people you just criticized leapfrogged while you were busy thinking about how to one-up them.


It's easy to critique a paper but hard to pinpoint its value in the broader context of a field.


He is a smart and thoughtful person, and any given article by him is worth reading (as long as you balance it with other people's takes as well), but he has quite strong opinions on the subject, which is one reason he's always quoted. The main danger in my opinion is that it's not hard to read five separate articles about how AI is overhyped and think you are getting five independent data points from domain experts, when in fact you are really getting the same data point five times.


I kind of agree with his worldview but the tone of his articles is so polemical that it's not an effective critique of the field. It just puts people's backs up.

Someone definitely needs to stick machine learning's nose into its own dirt, for its own good first of all, but it must be done in a very serious, stick-up-its-arse manner, focusing only on the practical aspects and leaving any emotion aside, and all in a calm and erudite voice, otherwise it won't have any effect.

Of course, now that I say this, there has been one such critique and it was ignored roundly, probably because Noam Chomsky Doesn't Tweet.

This is what I mean:

http://languagelog.ldc.upenn.edu/myl/PinkerChomskyMIT.html

Sorry, I can't find the edited interview (and the transcript doesn't quite get the "calm and erudite" voice across, what with all the uhs it transcribes). And I'm half-wrong about the reaction because there was a reaction, from a certain Peter Norvig, oh yes sirree there was:

http://norvig.com/chomsky.html

But I'm still with Chomsky on this.


A recipe for the typical media hype "AI learns physics" discovery

1. Find a well-known classical problem with a battle-tested deterministic model representation

2. Generate synthetic data using the model

3. Train a 1-layer neural network to score 99% in predicting the synthetic data (even if the deterministic model can score 99.999% but let's ignore the minor difference)

4. Claim AI "learns" the underlying physics and does a better job than classical models

5. World dominance
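
For the curious, steps 1-3 of the recipe fit in a few lines. This is a minimal sketch under stated assumptions: the "classical model" is free fall (distance = 0.5 * g * t^2), the "network" is a single linear unit trained by gradient descent, and the t^2 feature is handed to it up front. All names and constants are made up for illustration.

```python
import random

G = 9.81  # assumed gravitational constant for the illustrative model


def true_model(t: float) -> float:
    """Step 1: the battle-tested deterministic model (free-fall distance)."""
    return 0.5 * G * t * t


def make_data(n: int, seed: int = 0):
    """Step 2: noise-free synthetic data generated from the model itself."""
    rng = random.Random(seed)
    times = [rng.uniform(0.0, 2.0) for _ in range(n)]
    return [(t * t, true_model(t)) for t in times]  # feature is t^2


def train_linear_unit(data, lr: float = 0.05, epochs: int = 2000):
    """Step 3: fit y ~ w * x + b by batch gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2.0 * err * x / n
            grad_b += 2.0 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train_linear_unit(make_data(200))
```

The fitted weight ends up at roughly 0.5 * g, which is exactly the constant that was used to generate the data. Nothing was learned that was not put in at step 2.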


This is exactly it. Earlier this year there was a hyped up paper [0] going around claiming that they could derive physics with AI. On inspection I found that what they actually did was this:

1. Take one simple physics equation, such as F = ma

2. Generate 10^6 tuples exactly satisfying this equation (F = 6, m = 2, a = 3), with absolutely no redundant, erroneous, or extraneous information

3. Feed these to a primitive "AI" system which brute-force tries the simplest relations that could relate the three provided variables (m = Fa, a = Fm, F = ma, ...)

4. Recover F = ma, declare physicists obsolete
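
To show how little is going on, here is a toy sketch of steps 2-4 as I described them. It is not the paper's code; the candidate list and function names are hypothetical and chosen only to illustrate the idea.

```python
import random

# Hypothetical candidate relations among (F, m, a).
# Each function returns a residual that is zero when the relation holds.
CANDIDATES = {
    "m = F * a": lambda F, m, a: m - F * a,
    "a = F * m": lambda F, m, a: a - F * m,
    "F = m + a": lambda F, m, a: F - (m + a),
    "F = m * a": lambda F, m, a: F - m * a,
}


def make_tuples(n: int, seed: int = 0):
    """Generate clean (F, m, a) tuples that exactly satisfy F = m * a."""
    rng = random.Random(seed)
    tuples = []
    for _ in range(n):
        m = rng.uniform(1.0, 10.0)
        a = rng.uniform(1.0, 10.0)
        tuples.append((m * a, m, a))
    return tuples


def best_relation(tuples) -> str:
    """Brute force: return the candidate with the smallest total residual."""
    def total_error(name: str) -> float:
        residual = CANDIDATES[name]
        return sum(abs(residual(F, m, a)) for F, m, a in tuples)
    return min(CANDIDATES, key=total_error)


recovered = best_relation(make_tuples(1000))
```

With clean data and the right answer already in the candidate list, "discovery" reduces to picking the entry with zero error.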

It's such a poor imitation of the real process of discovery in science that it's frankly insulting. When people hear AI is involved with something they turn their brains off.

0: https://arxiv.org/abs/1905.11481


In this particular case I think the headline of the article is probably better than the title on HN.

Yes, in many fields ML has been exaggerated, but it’s just a matter of time. Most companies aren’t using ML effectively yet.

But I know for a fact when you improve NLP, you _can_ improve almost all white collar jobs. NLP the past two years has been crazy, and we haven’t seen anything yet.


I agree that there has been some impressive progress on NLP lately; am curious what you think the corresponding improvements in white collar jobs have been, though? Genuinely asking! Or do you think they are mostly yet to come?


From my own work:

* NER (named-entity recognition) is becoming insanely good (99+%), this can help with everything from data security to automated form filling, to automated medical billing, etc. Multiple billion dollar industries will be shaken up by this.

* My own work on data generation is pointing to reduced surface area for security purposes and improved access to quasi-data for data science and application testing[1]

* Translation from images to text descriptions is also a major thing, which when combined with other systems can do things like suggest diagnoses.

* Text generation systems are impacting how consumers interact. The chat bots these days are getting to the point where it's very difficult to identify if it's a human or not. I can't go into too many details, but see[2]

* There's also my side business, which is a search engine for people who would know the answer to a question: https://insideropinion.com/

In general, NLP is probably going to have a larger impact than driverless cars on our day-to-day lives. We're seeing insanely good story generation (e.g. GPT-2), chat bots, billing, etc. I also think WAY WAY more is yet to come. The transformer (which is spurring most of this) is only a couple years old. We'll see lots more applications as time goes on.

[1] https://medium.com/capital-one-tech/why-you-dont-necessarily...

[2] https://arxiv.org/pdf/1908.01841.pdf


I can at least give you some examples from my recent experience. A combination of text classification and NER can solve many business problems if the accuracy is high enough or you can mitigate errors.

For e-commerce search, a common problem is how to give relevant results for misspelled queries or unique queries. Text classification and/or NER can often identify the product type, brand, etc and hopefully give a relevant result which leads to a sale.

Another example is data entry or business analysis roles. Often they will manually extract data from sources and put it into Excel or some other BI tool. NLP ML has gotten good enough in just the last couple years that much of the data extraction can be automated, but you still need a human to verify accuracy.


I’d like to know more about the developments in the last two years as well. I briefly explored some Python NLP tooling more than two years ago, but it seemed like there would be too much work involved to get useful results in my white collar job by writing code in my free time. Sounds like I need to give this another go.


Cross-linguistic understanding has grown by leaps and bounds over the last two years. Classification tasks that topped out at 60% accuracy in 2017 are pushing into the 80s today.

That in part is why I don’t buy any of the “AI winter” chatter. We’ve previously never had machines powerful enough to run most useful ML nor enough data to train effectively. Today, as in right now, we have both. Things that were neat theories in the 90s are profitable businesses today.

Nobody in the ML field thinks strong AI is around the corner, or even 10-20 years from now. But that misses the point entirely. ML is useable and useful right now, and that’s not going to revert, no matter how many programmers grumble about “AI hype.”


Any resources/links you're aware of? Would be nice to learn more about the recent developments. Particularly with regards to "understanding" user-generated free text


From a professional perspective, if you think ML is a fad you are completely missing the train.


From a professional perspective, does it make sense to get on a train that is so crowded already? Step 0 is probably to take Andrew Ng's course on Coursera, but as of right now, you'd be among "2,647,287 already enrolled!" [0]

[0] https://www.coursera.org/learn/machine-learning


I'd guess they have a very high drop-out rate.

Regardless, "machine learning" is a very broad field and honestly I have no idea what an "ML engineer" is doing if they are one. It can cover any of the following:

1. Cutting-edge academic research (do better on this test set)

2. Doing data analysis to identify prediction ability

3. Creatively thinking of useful features to evaluate.

4. Implementing data pipelines/logging to obtain the features needed for #3.

5. Production systems to evaluate/train ML systems. (multiple places in the stack).

Because the spectrum is so wide, if you are already an engineer, you can readily get into "ML" categories 3-5 and even 2. Andrew Ng's course is a valuable introduction and not that heavy of an investment -- I found that just with it (alongside my product and infra background), I could readily contribute to ML groups at my company.


>> 1. Cutting-edge academic research (do better on this test set)

It's interesting you put it this way. I think most machine learning researchers who aspire to do "cutting-edge" research would prefer to be the first one to do well on a new dataset, rather than push the needle forward by 0.005 on an old dataset that everyone else has already had a ball with. Or at the very least, they'd prefer to do significantly better than everyone else on that old dataset.

I bet you remember the names of the guys who pushed the accuracy on ImageNet up by ~11%, but not the names of the few thousand people who have since improved results by tiny little amounts.


> but as of right now, you'd be among "2,647,287 already enrolled!"

* Enrolling yourself in a free online course does not, at all, make you an ML expert. A very sizeable portion of those enrolled may not have gone further than the first chapter.

* The ML train (as in people actually knowing ML) is not, at all, crowded

* Even if the train was crowded, learning ML does not make you forget what you already know. Saying "I don't want to learn this skill because so many people already have it", just means there are that many people that now have one more skill than you (unless you decide to allocate this time to learning something else)


> learning ML does not make you forget what you already know

I often feel that learning something new takes such energy and mental rearrangement that it does crowd out what I already "know". The brain is a neural network whose weights are constantly being adjusted; just because you learned something at one point does not mean it is permanently there.

For example, I spent many years working in networking, but now that I've been out of that field for several years and working in embedded firmware and data, I would have to relearn much of what I "knew" in my old field in order to be professionally productive. And that's apart from the field itself advancing in ways I haven't kept up with.

It's like riding a reverse-steering bicycle. By learning something that's in direct competition with your existing neural structures, your neural structures change and you no longer "know" what you used to. Jump to 5:10 to see this person, who took 8 months to learn to ride a reverse bicycle, try to ride a straight bicycle: https://ed.ted.com/featured/bf2mRAfC


> but now that I've been out of that field for several years and working in embedded firmware and data, I would have to relearn much of what I "knew" in my old

It seems to me this is mostly caused by being out of the field for so long, not by doing something else while you were out.


That seems like a distinction without a difference. The reason time erases skills is because you're always doing stuff, and over a longer period of time, you've done more stuff, which has reprogrammed your brain.


You have to be joking. Demand for ML skills, especially scaling in production, is off the charts.


Scaling ML in production doesn't take much ML skill though.


The bulk of industry ML work isn't actually suited to ML PhDs without engineering skills. I work in FAANG and this is a huge problem: the ML PhDs communicate poorly with skilled engineers and lack engineering experience. They often even look down upon people who don't have fancy credentials. Unfortunately they just end up creating a money-wasting disaster of a system.


Cope


A problem is that the current mode of input for neural networks (one floating point value per input node) makes neural networks problematic for tackling problems with variable input size (read: most problems). My current research looks into how we can maybe fix that in certain cases, but in general it's a big limitation. Recurrent networks fix this to a certain extent, but they have not had the revolution that CNN's have enjoyed just yet. We can also do things like represent things in binary, but this adds complexity, and often greatly inhibits learning.
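
To make the recurrent point concrete, here is a minimal sketch of how a recurrent cell folds a sequence of any length into a fixed-size state. It is plain Python with untrained random weights; `make_rnn` and `encode` are made-up names, and this is a bare Elman-style update, not any particular library's API:

```python
import math
import random


def make_rnn(input_size, hidden_size, seed=0):
    """Create random (untrained) input and recurrent weight matrices."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
            for _ in range(hidden_size)]
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
           for _ in range(hidden_size)]
    return w_in, w_h


def encode(sequence, w_in, w_h):
    """Apply h = tanh(W_in x + W_h h) once per step; return the final state."""
    hidden = [0.0] * len(w_h)
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_h[j], hidden))
            )
            for j in range(len(w_h))
        ]
    return hidden


w_in, w_h = make_rnn(input_size=1, hidden_size=4)
short = encode([[0.1], [0.2]], w_in, w_h)
long = encode([[0.1], [0.2], [0.3], [0.4], [0.5]], w_in, w_h)
print(len(short), len(long))  # 4 4
```

The same weights handle two steps or five, which is exactly what a fixed layer with one node per input value cannot do without padding or truncation.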


Once you realize there's "artificial" in AI, it quickly dispels the interest.

Science cannot even agree on a practical definition of intelligence. Unless you let biologists, psychologists and neurologists work with computer scientists to advance the discussion, I don't think it's worth it to explore machine learning techniques to pursue intelligence.

Don't call it ML or AI, call it "improved statistical prediction". The buzz will quickly fade.


In my experience companies just aren't willing to invest, and get turned off when they realize just how long the timelines on bringing DL to production are.

It certainly works, but magic it ain't. You need an experienced practitioner and lots of patience.


This year, I’ve been working on representing electronic wave functions with neural networks: https://arxiv.org/abs/1909.08423 I come from a physics background, and my impression is that NNs are a cool computational tool that every computational physicist probably should have in their toolbox (or at least have a sense of their capability). But so far I haven’t seen any truly revolutionary advance with NNs across the whole of materials science.


When I started with computers back in 1992, in Eastern Europe, no one around me had the slightest idea what computers could do. I sort of imagined (and hoped) they would (magically) solve math problems for me so I wouldn't need to actually make the hard effort to understand.

27 years later, 99.99% not just of humans in general but of my "fellow" programmers, if I can call them so, are still stuck at this petty level of ignorance.


Is that really so? I think this field has been hyped pretty hard and lots of people want a piece of that deep neural cake.


Question for the collective wisdom:

What, in your estimation, are some current or near-future trends that represent an objective improvement to the state of the art?


"You can see the computer age everywhere but in the productivity statistics." This was true for a few years, until it wasn't.


I would like to humbly submit, from my nose-bleed seat, that ML would serve us best by adding it to our arsenal of fuzzing techniques.

Train a model and then set it loose to do exploratory testing, looking for valid-looking inputs that cause the wrong answer to be returned from our entirely pedestrian Plain Old Business Logic. The AI proposes a scenario, but the human is the final arbiter.
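
A minimal sketch of that loop, with loud assumptions: `apply_discount` is a hypothetical piece of business logic with a planted bug, and `propose` is a plain random generator standing in for the trained model, which is not implemented here. Anything that breaks the invariant is queued for a human rather than auto-reported:

```python
import random


def apply_discount(total_cents, percent):
    """Hypothetical business logic under test, with a deliberate bug."""
    discounted = total_cents - total_cents * percent // 100
    # Bug: the "minimum charge" rule can exceed the original total.
    return max(discounted, 100)


def invariant_holds(total_cents, percent, result):
    """Property a correct answer must satisfy: never charge more than the total."""
    return 0 <= result <= total_cents


def propose(rng):
    """Stand-in for the trained model: emits valid-looking (total, percent) pairs."""
    return rng.randint(0, 10_000), rng.randint(0, 100)


def fuzz(rounds, seed=0):
    """Run proposed scenarios and collect the ones that violate the invariant."""
    rng = random.Random(seed)
    suspects = []
    for _ in range(rounds):
        total, percent = propose(rng)
        result = apply_discount(total, percent)
        if not invariant_holds(total, percent, result):
            suspects.append((total, percent, result))
    return suspects


suspects = fuzz(rounds=2_000)
for case in suspects[:3]:
    print("needs human review:", case)
```

Swapping `propose` for a learned generator is where the ML would come in; the rest is ordinary property-based fuzzing.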


Most ML will eventually be commoditized, so it is unlikely that specializing only in ML will make things better for people who don't come from premier schools. If you have a so-called MiML from any premier school, you may not know jackshit about production ML and still earn hundreds of thousands.


Deep fakes are pretty good. It's kind of interesting: ML has allowed computers to generate realistic images that humans cannot discern, while image recognition remains woefully hopeless compared to human perception.

So rather than competing with the human brain, AI will try to trick the human brain instead.


>ML has allowed computers to generate realistic images that humans cannot discern

I've yet to see a deep fake that's even remotely indiscernible from the real thing.


You probably haven't been noticing them in movies though. Such as when they stuck Margot Robbie's entire head on an actual skater for I, Tonya.

https://www.youtube.com/watch?v=bqgD6lHrQ_8


Was that a DF though?


Gemini man is a better example of where this is heading.

https://www.youtube.com/watch?v=k5y4kxhZIBA

A DF has a pretty specific meaning if you are diving into the technical aspects. It's an approach that has a lot of specific limitations that put a practical ceiling on how good it can be. For example, it's a 2D system that won't capture lighting differences with good accuracy. The limitations become more apparent as resolution increases.

Newer systems use source material to generate 3D facial models. The face in the target video is also 'motion captured.' The target video's lighting is estimated and a resulting face image is rendered and composited into the target. That's a lot closer to the pipeline of movies such as Gemini man.

So while DF isn't a technique used in movies, machine learning is heavily involved. It's built into model generation, motion capture, animation, etc. Artists will start with captured/generated assets and hand-tweak them to get the desired result.


That video shows a CG head and says CG face replacement. Typically sequences like this are done on a shot by shot basis and end up being a very healthy mix of compositing, cg face replacements, CG head replacements, etc. These types of effects have been used for literally decades across a wide range of movies and have nothing to do with the recent trend and techniques of 'deep fakes'.


Right, so not a DF, which is what we're talking about here.


I was answering your question.


All it takes is a little filtering, some camera shake, and you can hide the tells of a fake.


How do you know?


Well, technically I don't, for the same reason I don't know whether the abominable snowman exists. I have yet to see one I knew to be a DF that was even close to being convincing.


Have you seen CTRL Shift Face, the YouTube channel?


>tendency among science journalists ... to overstate the significance of new advances in AI and machine learning

It's very hard to not overstate when overstating brings you clicks, likes, retweets, and dopamine spikes.


Machine learning = stats + computing power.

The key thing here is that the core ingenuity of ML is statistics - the limits of which are well known.

ML hype is false promises based on uninformed wild extrapolation.


Not if we measure the revolution in games and kitten pictures! There has been tremendous progress in those domains!


What revolution in games do you have in mind? Is there any particular video game that is making really interesting use of ML?


All the production tools for making games/video are cramming ml tools in. It's one of the first places to be swimming in it.

The two minute papers guy made a material predictor to cut down on preview render time. Quixel uses ml for search. Substance and Houdini are gaining tools soon. Blender has used it behind the scenes for its new neblua tool. Mocap & animation are bathed in it. There's probably a lot more I can't recall right now.


I am referring to the playing of said games like with alphago and alphazero.


Still waiting for a revolution which has the optimal amount of hype. I wonder if this is an application for GANs.


The share of job candidates I see nowadays who are interested in ML or majored in it is staggering; it appears to be over 70%.


People doing research in ML have been saying something along these lines for years.


I'm not an expert, but I have a feeling this article will age poorly.


DeepFake issues have definitely been exaggerated



