Artificial intelligence accelerates discovery of metallic glass (northwestern.edu)
237 points by rbanffy 10 months ago | hide | past | web | favorite | 65 comments

This is random-forest supervised learning from a set of 4000 historical experiments

(Lots of feature engineering based on domain expertise. This is not end-to-end DL)

Do a smaller set of new experiments to explore a small subset of the solution space.

Retrain the model with these new experiments.

Perform another smaller set of experiments, this time over a more varied sample of the solution space.

Overall, a 10x improvement in predicting the glass property of an untested sample (although the entire process is biased toward positive samples).
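For anyone who wants to see the shape of that train / test / retrain loop, here is a minimal sketch. Everything in it is an assumption for illustration: `run_experiment` is a toy stand-in for a lab measurement, the 2-D "composition space" is made up, and a simple nearest-neighbour vote is swapped in for the random forest so that the example needs only the standard library. It is not the paper's method or data.

```python
import random


def run_experiment(x, y):
    """Hypothetical stand-in for a lab measurement: True if the sample forms a glass."""
    # Assumed toy ground truth: glass forms inside a small disc of composition space.
    return (x - 0.7) ** 2 + (y - 0.3) ** 2 < 0.04


def predict_score(point, labelled, k=5):
    """Fraction of the k nearest labelled samples that were glasses.

    This k-NN vote is a stand-in for the random forest mentioned above.
    """
    nearest = sorted(
        labelled,
        key=lambda s: (s[0][0] - point[0]) ** 2 + (s[0][1] - point[1]) ** 2,
    )[:k]
    return sum(1 for _, is_glass in nearest if is_glass) / len(nearest)


def discovery_loop(n_initial=200, n_candidates=1000, batch=50, rounds=3, seed=0):
    rng = random.Random(seed)
    # "Historical" experiments: random compositions with known outcomes.
    labelled = []
    for _ in range(n_initial):
        p = (rng.random(), rng.random())
        labelled.append((p, run_experiment(*p)))
    # Untested compositions the model is allowed to choose from.
    candidates = [(rng.random(), rng.random()) for _ in range(n_candidates)]

    hit_rates = []
    for _ in range(rounds):
        # Rank untested candidates by predicted glass-forming score.
        candidates.sort(key=lambda p: predict_score(p, labelled), reverse=True)
        chosen, candidates = candidates[:batch], candidates[batch:]
        # Run the new batch of experiments and record the fraction of glasses found.
        results = [(p, run_experiment(*p)) for p in chosen]
        hit_rates.append(sum(1 for _, ok in results if ok) / batch)
        # "Retraining" here just means adding the new results to the labelled set.
        labelled.extend(results)
    return hit_rates, len(labelled)


if __name__ == "__main__":
    rates, n_labelled = discovery_loop()
    print("hit rate per round:", rates)
    print("labelled samples after all rounds:", n_labelled)
```

Note that this loop always tests the highest-scoring candidates, so it has the same bias toward positive samples as the process described above. A real pipeline would probably mix in some exploratory picks.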

Conclusion: classical ML still rocks.

I really don't see any reason why this could not have been done 10 or even 20 years ago.

> I really don't see any reason why this could not have been done 10 or even 20 years ago.

The advancements in tooling, infrastructure and accessibility of ML in the last 3 years alone have made the difference. That seems obvious.

Maybe your point is that the underlying techniques haven't changed, and thus it would have been possible to have made this discovery decades ago. But isn't that true of even the greatest inventions? Much of what's created or discovered is a function of the environment and conditions surrounding it.

In other words, it's not surprising to see a halo effect in other sectors as a result of tech investment in ML.

I agree that it is exactly this. New tooling has made machine learning easier to use. As a result, people with deep domain knowledge but less machine learning expertise are starting to apply ML to the problems they understand best.

One of the biggest roadblocks to this happening more today is that people don't know how to perform feature engineering to prepare raw data for existing machine learning algorithms. If we could automate this step, it would be a lot easier for subject matter experts to use ML.

For example, I work on an open source Python library called featuretools (https://github.com/featuretools/featuretools/) that aims to automate feature engineering for relational datasets. We've seen a lot of non-ML people use it to make their first machine learning models. We also have demos for people interested in trying it themselves: https://www.featuretools.com/demos.

I expect to see a lot more work in the automated feature engineering space going forward.
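To make "automated feature engineering for relational datasets" a bit more concrete, here is a hand-rolled sketch of the general idea: take a parent table and a child table, and mechanically apply a fixed set of aggregations to every numeric child column. This is not the featuretools API. The function name, the table layout and the column names are all hypothetical, chosen only for illustration.

```python
from statistics import mean

# A fixed menu of aggregations to apply to every numeric child column.
AGGREGATIONS = {"sum": sum, "mean": mean, "max": max, "count": len}


def aggregate_features(parent_ids, children, key, numeric_columns):
    """For each parent id, apply every aggregation to every numeric child column."""
    features = {}
    for parent_id in parent_ids:
        rows = [child for child in children if child[key] == parent_id]
        row_features = {}
        for col in numeric_columns:
            values = [row[col] for row in rows]
            for name, fn in AGGREGATIONS.items():
                # Parents with no child rows get None rather than a made-up value.
                row_features[f"{name}({col})"] = fn(values) if values else None
        features[parent_id] = row_features
    return features


# Hypothetical toy data: customers (parent) and their transactions (child).
customers = [1, 2]
transactions = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 5.0},
]

features = aggregate_features(customers, transactions, "customer_id", ["amount"])
print(features[1])
```

The resulting per-customer dictionary can be fed to any standard model as a feature row. The point is only that the feature list is generated from the table structure instead of being written by hand.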

That's interesting! So could we see GUI-based machine learning for non-programmers becoming a reality soon?

And how close in performance could this get to code-based solutions?

Yes, I think so. Featuretools is actually the core of my company's commercial product.

Performance is a tricky thing to answer. If you care about machine learning performance metrics such as AUC, RMSE or F1, then I think the answer would be 80%-90% of coding. If you care about building a first solution, then I think the automation would be 5-10x better.

+1 for this tool.

Yeah, the grandparent is hung up on the theory vs. application delay.

By the same logic, nothing in modern CMOS logic or its production process requires physics or chemistry of a vintage later than the 1940's to explain, so why did it take us three quarters of a century to get where we are? Because it's hard. Knowing how it works and figuring out how to do it are two different things.

Traditional engineering has been using machine learning for years for condition monitoring... http://infolab.stanford.edu/~maluf/papers/flairs97.pdf

Shameless plug: I published almost the same thing in a very nearby field 2 years ago:


Not shameless at all.

Your paper was referenced after all. [17]

That’s a terrific idea. Would this be applicable to drug discovery?

Widely used in drug discovery

> Lots of feature engineering based on domain expertise

Exactly. This is what is required to make machine learning work well.

For most people, the issue with machine learning isn’t that it doesn’t work but that it’s hard to use.

I suspect that if we gave domain experts who often don’t know how to code more power to do feature engineering then we’d see a lot more applied machine learning research like this.

Ultimately, yes, more power means time, aka money, to pursue a target freely while messing with feature engineering. Brute force a la full DL stack is not there yet for two reasons: on one side, the space to search for novel materials is immense; on the other side, novel materials found through ML methods must be stable somewhere in their phase diagram, synthesizable so they can be manufactured properly, and cheap enough to be worth engineering deployment. The x10 process acceleration (from 20-30 years to 2-3 years) is actually in the search of that space, thanks to ML methods working through several thousand experiments like in the linked article, not in the engineering readiness protocol that takes a candidate material from lab confirmation to real application. Outsiders can help as well by implementing their own pipeline after collecting their niche-specific datasets from journal papers, conference contributions and meeting minutes. I, for example, am interested in novel alloys or steels for Gen IV nuclear and am now creating my own dataset for a first shot, having already got a benchmark from a known, validated and successfully deployed material.

> I suspect that if we gave domain experts who often don’t know how to code more power to do feature engineering then we’d see a lot more applied machine learning research like this.

With a lot of talk about high paying AI whiz kids recently I wonder whether it is not much more promising to try to bring basic ML techniques into a really wide field of day-to-day business, given how many small businesses are still completely left out.

Do small businesses have enough data to do something useful with ML/DL? Like what?


I liked this example very much. A small family business of a handful of people used standard ML to automate their process of classifying cucumbers for their business.

Just imagine how many people we could free from manual labour to seek higher education if even only a fraction of family businesses had a use case like this and every one of those farmers or small shop owners who is bogged down by repetitive classification tasks could free up the time of a family member or two. That must be tens of millions if not more people on the whole planet.

> I really don't see any reason why this could not have been done 10 or even 20 years ago.

Because nobody knew enough about both subjects to build an experiment.

That's the good thing about ML getting popular. The easier it is to use, the more people can try to solve multidisciplinary issues with it.

I'm sure that there is a treasure trove of ready to be applied knowledge spread out over many sciences.

Example: the release candidate of the newest version of GIMP added a "new" type of smart blurring: symmetric nearest neighbor, which is surprisingly effective. I looked it up: it is a super simple algorithm, and the original paper is from 1987, yet for some reason the only mention of it that I found outside of the GIMP page describing it was a wiki for "subsurface science", so a specialisation within geology I guess.
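To show just how simple it is, here is a minimal sketch of a symmetric nearest neighbor filter on a grayscale image stored as a list of rows. For every pixel, each pair of neighbours that sit opposite each other across the centre is examined, the one closer in value to the centre is kept, and the kept values are averaged. This is a sketch of the textbook idea, not GIMP's implementation. Clamping at the image border and leaving the centre pixel out of the average are assumptions made here.

```python
def snn_filter(image, radius=1):
    """Symmetric nearest neighbor smoothing of a 2-D grayscale image (list of rows)."""
    height, width = len(image), len(image[0])

    def px(r, c):
        # Clamp coordinates so that border pixels reuse the nearest edge value.
        return image[min(max(r, 0), height - 1)][min(max(c, 0), width - 1)]

    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            centre = image[r][c]
            picked = []
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    # Visit each symmetric pair (dr, dc) / (-dr, -dc) exactly once
                    # and skip the centre itself.
                    if (dr, dc) <= (0, 0):
                        continue
                    a = px(r + dr, c + dc)
                    b = px(r - dr, c - dc)
                    # Keep whichever of the two is closer in value to the centre.
                    picked.append(a if abs(a - centre) <= abs(b - centre) else b)
            out[r][c] = sum(picked) / len(picked)
    return out


# A clean step edge should survive unchanged (no blurring across the edge).
step = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
print(snn_filter(step))

# A single outlier in a flat region should be smoothed away.
spike = [[5, 5, 5], [5, 50, 5], [5, 5, 5]]
print(snn_filter(spike))
```

Because the selection always favours the neighbour on the same side of an edge as the centre pixel, flat regions get smoothed while edges stay sharp, which is why it works well as a "smart" blur.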

It also has a German wiki article, and has had one for 6 years: https://de.wikipedia.org/wiki/Symmetric_Nearest_Neighbour

Which is still odd: why German but not English?

That's not odd. German Wikipedia is one of the largest and is about even in quality with the English one, so you'll find an article that's only in German just as frequently as you'll find one that's only in English.

I meant that the paper is originally by English-speaking authors, meaning one would expect it to be better known in English-speaking scientific circles.

German science is mostly done in English, so German scientists are typically very fluent.

I agree with the sentiment (nothing new methodically) but have a thought: these methods were in the field of computer science and operations research (maybe). The popularity of ML and data science is taking place in the same 20 yrs that every non-beta science is becoming more quantified. It takes a novel generation of researchers to combine the old with the new. ML's popularity, and ease of entry (in a broad sense, with tools and information easily available) is only helping the spreading.

What is non-beta science?

Sorry, that might be too local: Beta = natural science, alpha = humanities, gamma = social science.

So my point is even humanities and social sciences are becoming more empirical (at least in subfields, and the retort that a lot of statistics got founded in humanities is well taken) and they are using the tools that are popular and widely known.

My guess was “established” science, like something taught in schools.

Thank you so much for mentioning random-forest supervised learning, I did some duckduckgo'ing and came across this [1] and am excited to try it out.

[1] https://labs.genetics.ucla.edu/horvath/RFclustering/RFcluste...

> I really don't see any reason why this could not have been done 10 or even 20 years ago.

"They started with a trove of materials data dating back more than 50 years, including the results of 6,000 experiments that searched for metallic glass. The team combed through the data with advanced machine learning algorithms developed by Wolverton and Logan Ward, a graduate student in Wolverton’s laboratory who served as co-first author of the paper."

As a researcher in this field I'll just add that in many cases, automating the mat sci workflows (the sample prep and the characterization) is a massive leap in and of itself, even without adding machine learning. The benefit of machine learning in many of these projects is to pick the automated runs optimally (choose the right neighborhood of composition space), which probably adds a 10-100x speedup on top of the already 100-1000x speedup gained from just not making and characterizing samples manually. It's truly a synergistic combination of advancements in both fields and has great potential for accelerated discovery. /shill

Long live the Singularity! (For which these types of exponentials bode well...)

Found the original article more informative: https://www6.slac.stanford.edu/news/2018-04-13-scientists-us...

And the actual publication: http://advances.sciencemag.org/content/4/4/eaaq1566

The paper is the first scientific result associated with a DOE-funded pilot project where SLAC is working with a Silicon Valley AI company, Citrine Informatics, to transform the way new materials are discovered and make the tools for doing that available to scientists everywhere.


This field is ebullient! Bad joke, I know, but glorious times for materials scientists, especially when not grant-constrained and free to go deep into domain applications. One recent state-of-the-art lit review is here: https://www.nature.com/articles/s41524-017-0056-5 (already outdated here and there).

Transparent aluminum?

Will be very useful for transportation of whales

More like a "Valyrian steel"

Dragon glass

No, alloys with some of the properties of glass in terms of atomic arrangement but not transparent.

Breakable, and not transparent? Sounds like the worst of two worlds ... ;)

Harder, hard to corrode. Those are important properties and you could always make a composite (sandwich) of more than one material.

"Glass" means non-crystalline, basically. It doesn't mean that it is like glass.

better thermal and energy transfer properties too.

It's very hard to make a metal (where electrons can easily jump from atom to atom) transparent.

Transparency is a function of wavelength/photon energy as well as the underlying material. Glass (the silica version) is opaque to portions of the UV spectrum.

Generally you'll get absorption (opacity) when the photon energy exceeds an energy gap, allowing valence band electrons to be bumped into conduction bands, creating a corresponding hole in the valence. In the case of metals, there is effectively no gap at points in k-space, so there is absorption throughout the spectrum.
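As a rough numerical illustration of that rule of thumb, the sketch below converts a wavelength to a photon energy (E = hc/λ) and compares it with a band gap. The gap values are assumptions for illustration only: roughly 9 eV is used as a round figure for silica glass, and 0 eV stands in for a metal. The model deliberately ignores reflection, defects and every other absorption mechanism.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a given vacuum wavelength in nm (E = hc / lambda)."""
    return HC_EV_NM / wavelength_nm


def is_absorbed(wavelength_nm, band_gap_ev):
    """Simplified rule: absorption happens when the photon energy exceeds the gap."""
    return photon_energy_ev(wavelength_nm) > band_gap_ev


SILICA_GAP_EV = 9.0  # assumed round figure for silica glass, illustration only
METAL_GAP_EV = 0.0   # "effectively no gap", as described above

for wavelength_nm in (550, 100):  # green visible light and deep UV
    print(
        f"{wavelength_nm} nm ({photon_energy_ev(wavelength_nm):.2f} eV): "
        f"silica absorbs={is_absorbed(wavelength_nm, SILICA_GAP_EV)}, "
        f"metal absorbs={is_absorbed(wavelength_nm, METAL_GAP_EV)}"
    )
```

With these assumed numbers, visible light passes through the wide-gap material but deep UV does not, while the zero-gap case absorbs at both wavelengths.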

I imagine a clever arrangement of atoms where non-metallic regions alternate with metallic ones could do the trick, as long as the transparent regions line up enough.

But anything this orderly doesn't look like a glass anymore.

For aluminum, you purify it, oxidize it, and then fuse it into a single crystal of corundum.

A hybrid alumina-silica glass can be created. The hybrid glass has a higher elastic modulus than other silica glass systems, and sheets of it can bend further before fractures form. This glass can be further toughened by creating it with high sodium content, then later exchanging the sodium ions with potassium ions. This type of glass is probably in your pocket right now.

Doped alumina-yttria hybrids are also useful in lasers.

That's the ticket, laddie!

This was the first thing I thought of when I read the title. Some ML nerd went "Hmm, I bet I could use ML to figure out how to make the mystical Star Trek alloy."

I always thought sapphire fulfilled the requirements of "transparent aluminium". It's used as a lens material for a lot of phone cameras.

Hello, computer.

How quaint!

I read about efforts to discover compounds using random methods back in the 90s, and have been trying to research it lately, to see if the "shake and bake" method is still a thing. Can anyone point me to relevant research? I was surprised by estimates given about the number of possible compounds, so many that there would not be enough time in the universe to make and test them all, even limiting the primary elements to a dozen or so. I guess there are a multitude of ways that the same atoms can fit together. I've tried to find research on computer simulations. Apparently, only rough predictions can be made. My searches have been pretty fruitless, though, and I'd welcome help.

Bulk metallic glass based mainly on Fe is very interesting. It could likely be injection molded into very complex shapes.

There's an old Kaggle competition[1] "Predicting Transparent Conductors" which had a similar objective.

There's a decent discussion of the 5th place entry[2]. Judging by a very quick read it looks like performance for methods like this could improve dramatically with larger amounts of data.

[1] https://www.kaggle.com/c/nomad2018-predict-transparent-condu...

[2] https://www.kaggle.com/c/nomad2018-predict-transparent-condu...

The amorphous material’s atoms are arranged every which way, much like the atoms of the glass in a window. Its glassy nature makes it stronger and lighter than today’s best steel

Why would this arrangement of atoms be stronger than a crystal lattice of steel?

From the actual paper: """ For example, the absence of deformation pathways based on gliding dislocations leads to exceptional yield strength and wear resistance """

Yield strength means "how much stress a material can handle before it starts to deform permanently".

In crystalline metals, a crack that forms anywhere can propagate through the lattice quickly and lead to bulk fracture (see the Southwest engine failure recently). In an amorphous material, the deformation caused by a local crack can be "absorbed" by the surrounding atoms because they're able to reposition more easily.

How come this does not make regular glass strong?

It does, check out its properties compared to other materials. The process of turning silica into glass makes it much stronger than other materials made of silica.




Material properties can be confusing.

Strength, hardness and flexibility all mean different things when engineers talk about them.

Glass is a brittle material, which means it will not noticeably deform (change dimensions, i.e. stretch) before it breaks when you apply an external force. The opposite of brittle is a ductile material, like steel, which yields and begins to deform once you apply a load above the material's yield threshold. Depending on the material's application, ductility or brittleness can be a desired property.

My only credentials are a lifetime of Wikipedia, but I think it's because breaks in metals propagate along crystal boundaries. You can harden metal by cooling it quickly, which results in small crystals, or by compressing or stretching it (e.g. with a hammer), which breaks larger crystals into smaller crystals. Maybe with a completely amorphous structure it's like the smallest crystals ever.

Maybe a dumb question, but speedier in reference to what? Design of Experiments?

Wonder if the same thing could be done with high temperature superconductors.

Definitely: https://arxiv.org/abs/1709.02727

I work kind of adjacent to two of the supervisors on this project at NIST, and the workflow for these types of projects usually goes:

(1) Build a predictive model of chemical composition -> superconductivity from past experiments or databases of simulations.

(2) Build and program automated sample prep (this is actually the hardest part, not the machine learning).

(3) Build and program automated structural characterization and superconductivity measurement.

The difficulty is finding a system whose design space can be explored and measured with automated tools; otherwise the machine learning isn't used effectively. As others have noted, the models are decades old in some cases. What's driving this is researchers who know how to automate traditional mat sci workflows and know enough about machine learning to pick the automated runs optimally.

Every time I see "artificial intelligence" in a headline I mentally replace it with "a sleep deprived computer science grad student". The result is usually much more accurate.
