(Lots of feature engineering based on domain expertise. This is not end-to-end DL)
Do a smaller set of new experiments to explore a small subset of the solution space.
Retrain the model with these new experiments.
Perform another smaller set of experiments, this time over a more varied sample of the solution space.
Overall, a 10x improvement in predicting the glass-forming ability of an untested sample (although the entire process is biased toward positive samples). A toy sketch of this loop is below.
Conclusion: classical ML still rocks.
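For concreteness, here's a runnable toy version of that explore/retrain loop. Everything here (the data, the hidden rule, the sampling widths) is made up for illustration; the real work used engineered composition features and high-throughput experiments:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

def run_experiment(x):
    # Stand-in for real synthesis + measurement: a hidden rule on composition
    return ((x[:, 0] + x[:, 1] > 0.9) & (x[:, 2] < 0.5)).astype(int)

# "Historical" dataset, analogous to the decades of prior experiments
X = rng.random((500, 3))
y = run_experiment(X)
model = RandomForestClassifier(random_state=0).fit(X, y)

# First a narrow slice of the space, then a more varied one
for width in (0.1, 0.5):
    batch = 0.5 + width * (rng.random((100, 3)) - 0.5)
    X = np.vstack([X, batch])
    y = np.concatenate([y, run_experiment(batch)])
    model = RandomForestClassifier(random_state=0).fit(X, y)  # retrain each round
```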
I really don't see any reason why this couldn't have been done 10 or even 20 years ago.
The advancements in tooling, infrastructure, and accessibility of ML in the last 3 years alone have made the difference. That seems obvious.
Maybe your point is that the underlying techniques haven't changed, and thus it would have been possible to have made this discovery decades ago. But isn't that true of even the greatest inventions? Much of what's created or discovered is a function of the environment and conditions surrounding it.
In other words, it's not surprising to see a halo effect in other sectors as a result of tech investment in ML.
One of the biggest roadblocks to this happening more today is that people don't know how to perform feature engineering to prepare raw data for existing machine learning algorithms. If we could automate this step, it would be a lot easier for subject matter experts to use ML.
For example, I work on an open source Python library called featuretools (https://github.com/featuretools/featuretools/) that aims to automate feature engineering for relational datasets. We've seen a lot of non-ML people use it to make their first machine learning models. We also have demos for people interested in trying it themselves: https://www.featuretools.com/demos.
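As a rough sketch of what that looks like, using the library's bundled mock dataset (API details as of the 0.x releases; they may differ in newer versions):

```python
import featuretools as ft

# Load the built-in mock retail EntitySet (customers, sessions, transactions)
es = ft.demo.load_mock_customer(return_entityset=True)

# Deep Feature Synthesis: stacks aggregation and transform primitives across
# the related tables to generate candidate features for each customer
feature_matrix, feature_defs = ft.dfs(entityset=es, target_entity="customers")
print(feature_defs[:5])
```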
I expect to see a lot more work in the automated feature engineering space going forward.
And how close in performance could this get vs. code-based solutions?
Performance is a tricky thing to answer. If you care about machine learning metrics such as AUC, RMSE, or F1, then I think the answer is 80-90% of what hand-coded features achieve. If you care about the time to build a first solution, then I think the automation is 5-10x better.
By the same logic, nothing in modern CMOS logic or its production process requires physics or chemistry of a vintage later than the 1940s to explain, so why did it take us three quarters of a century to get where we are? Because it's hard. Knowing how it works and figuring out how to do it are two different things.
Your paper was referenced after all. 
Exactly. This is what is required to make machine learning work well.
For most people, the issue with machine learning isn't that it doesn't work but that it's hard to use.
I suspect that if we gave domain experts, who often don't know how to code, more power to do feature engineering, then we'd see a lot more applied machine learning research like this.
With all the recent talk about highly paid AI whiz kids, I wonder whether it wouldn't be much more promising to bring basic ML techniques to a really wide range of day-to-day businesses, given how many small businesses are still completely left out.
I liked this example very much. A small family business of a handful of people used standard ML to automate their process of classifying cucumbers.
Just imagine how many people we could free from manual labour to seek higher education if even a fraction of family businesses had a use case like this, and every farmer or small shop owner bogged down by repetitive classification tasks could free up the time of a family member or two. That must be tens of millions of people on the whole planet, if not more.
Because nobody knew enough about both subjects to build an experiment.
That's the good thing about ML getting popular: the easier it is to use, the more people can try to solve multidisciplinary issues with it.
Example: the release candidate of the newest version of GIMP added a "new" type of smart blurring: symmetric nearest neighbor, which is surprisingly effective. I looked it up: it is a super simple algorithm (the original paper is from 1987), yet the only mention of it I found outside the GIMP page describing it was a wiki for "subsurface science", so a specialisation within geology, I guess.
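For anyone curious, here's a minimal sketch of the symmetric-nearest-neighbor idea for a grayscale image. This is my own reading of the algorithm, not GIMP's implementation:

```python
import numpy as np

def snn_filter(img, radius=2):
    """From each pair of point-symmetric neighbors around a pixel, keep the
    one closer in value to the center, then average the picks. This smooths
    noise while tending to preserve edges."""
    img = img.astype(float)
    h, w = img.shape
    pad = np.pad(img, radius, mode="edge")
    total = np.zeros_like(img)
    # One offset from each symmetric pair (dy, dx) / (-dy, -dx)
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if dy > 0 or (dy == 0 and dx > 0)]
    for dy, dx in offsets:
        a = pad[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
        b = pad[radius - dy:radius - dy + h, radius - dx:radius - dx + w]
        total += np.where(np.abs(a - img) <= np.abs(b - img), a, b)
    return total / len(offsets)
```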
So my point is that even the humanities and social sciences are becoming more empirical (at least in subfields, and the retort that a lot of statistics was founded in the humanities is well taken), and they are using the tools that are popular and widely known.
"They started with a trove of materials data dating back more than 50 years, including the results of 6,000 experiments that searched for metallic glass. The team combed through the data with advanced machine learning algorithms developed by Wolverton and Logan Ward, a graduate student in Wolverton’s laboratory who served as co-first author of the paper."
And the actual publication: http://advances.sciencemag.org/content/4/4/eaaq1566
The paper is the first scientific result associated with a DOE-funded pilot project where SLAC is working with a Silicon Valley AI company, Citrine Informatics, to transform the way new materials are discovered and make the tools for doing that available to scientists everywhere.
Generally you'll get absorption (opacity) when the photon energy exceeds an energy gap, allowing valence band electrons to be bumped into conduction bands, creating a corresponding hole in the valence. In the case of metals, there is effectively no gap at points in k-space, so there is absorption throughout the spectrum.
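The arithmetic behind that fits in a few lines; the ~9 eV gap used here for fused silica is an approximate textbook figure:

```python
H_EV_S = 4.1357e-15   # Planck's constant, eV*s
C_M_S = 2.9979e8      # speed of light, m/s

def photon_energy_ev(wavelength_nm):
    # E = h * c / lambda
    return H_EV_S * C_M_S / (wavelength_nm * 1e-9)

band_gap_ev = 9.0  # approximate band gap of fused silica
for wl in (400, 550, 700):  # visible wavelengths, nm
    e = photon_energy_ev(wl)
    print(f"{wl} nm -> {e:.2f} eV, absorbed: {e > band_gap_ev}")
# Visible photons (~1.8-3.1 eV) sit well below the gap, so silica glass is
# transparent; a metal's gap is effectively zero, so it absorbs throughout.
```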
But anything this orderly doesn't look like a glass anymore.
A hybrid alumina-silica glass can be created. The hybrid glass has a higher elastic modulus than other silica glass systems, and sheets of it can bend further before fractures form. This glass can be further toughened by creating it with high sodium content, then later exchanging the sodium ions with potassium ions. This type of glass is probably in your pocket right now.
Doped alumina-yttria hybrids are also useful in lasers.
There's a decent discussion of the 5th place entry. Judging by a very quick read it looks like performance for methods like this could improve dramatically with larger amounts of data.
Why would this arrangement of atoms be stronger than a crystal lattice of steel?
Yield strength means "how much stress a material can handle before it starts to deform permanently rather than springing back" (fracture comes later, at the ultimate strength).
In crystalline metals, a crack that forms anywhere can propagate through the lattice quickly and lead to bulk fracture (see the recent Southwest engine failure). In an amorphous material, the deformation caused by a local crack can be "absorbed" by the surrounding atoms because they're able to reposition more easily.
Strength, hardness, and flexibility all mean different things when engineers talk about them.
Glass is a brittle material, which means it will not deform permanently (change dimensions, i.e. stretch) before it fractures when you apply an external force. The opposite of brittle is ductile: a ductile material, like steel, yields and begins to deform once you apply a load above its yield threshold. Depending on the application, either ductility or brittleness can be the desired property.
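A toy illustration of the distinction, with rough textbook-order numbers rather than measured data:

```python
def describe(stress_mpa, modulus_gpa, yield_mpa=None, fracture_mpa=None):
    # Hooke's law for the elastic part: strain = stress / E
    strain = stress_mpa / (modulus_gpa * 1e3)
    if fracture_mpa is not None and stress_mpa > fracture_mpa:
        return "brittle fracture: breaks with no permanent stretch first"
    if yield_mpa is not None and stress_mpa > yield_mpa:
        return f"yields: deforms permanently (elastic strain alone would be ~{strain:.2%})"
    return f"elastic: strains {strain:.2%} and springs back when unloaded"

print("mild steel @ 300 MPa:", describe(300, 200, yield_mpa=250))
print("window glass @ 100 MPa:", describe(100, 70, fracture_mpa=50))
```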
I work kind of adjacent to two of the supervisors on this project at NIST, and the workflow for these types of projects usually goes:
(1) Build a predictive model of chemical composition -> superconductivity from past experiments or databases of simulations
(2) Build and program automated sample prep (this is actually the hardest part, not the machine learning)
(3) Build and program automated structural characterization and superconductivity measurement
The difficulty is finding a system whose design space can be explored and measured with automated tools; otherwise the machine learning isn't used effectively. As others have noted, the models are decades old in some cases. What's driving this is researchers who know how to automate traditional materials science workflows and who know enough about machine learning to pick the automated runs optimally.
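A minimal sketch of step (1) plus the "pick the runs optimally" part, assuming a composition -> property regression and using the spread across a random forest's trees as a stand-in for model uncertainty. All data here is random, and the score weighting is made up for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_tested = rng.random((200, 3))        # e.g. element fractions from past runs
y_tested = rng.random(200)             # measured property for those runs
X_candidates = rng.random((5000, 3))   # compositions not yet synthesized

# Step (1): predictive model trained on past experiments
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_tested, y_tested)

# Per-candidate mean and spread across the forest's individual trees
per_tree = np.stack([tree.predict(X_candidates) for tree in model.estimators_])
mean, std = per_tree.mean(axis=0), per_tree.std(axis=0)

# UCB-style score: favor promising predictions the model is unsure about
score = mean + 1.0 * std
next_runs = np.argsort(score)[-10:]    # the 10 compositions to synthesize next
print(X_candidates[next_runs])
```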