This is actually a very cool demo! I ran some simulations similar to this one, but at some point I stopped trying so many different models on the same problem and moved on to more difficult tasks (multi-class classification, for example).
In case it is of interest to anyone: the problem with scaling this model lies in what would be the analogue of "activation functions" in classical neural networks. The nice thing about the sigmoid is that it is monotonic, whereas everyday quantum gates are deeply rooted in periodic functions, so the chances of getting stuck in a bad local minimum rise absurdly fast as deeper or wider circuits are considered.
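To make that concrete, here is a toy numpy sketch (my own construction for this comment, not code from the demo) comparing a 1D loss built on a monotonic sigmoid with one built on a periodic, rotation-gate-style activation. Even with a single parameter, the periodic landscape already develops interior local minima:

    import numpy as np

    x = np.linspace(-2.0, 2.0, 200)      # toy 1D inputs
    y = (x > 0).astype(float)            # toy binary labels

    def loss_sigmoid(w):
        # monotonic activation: the loss is well behaved in w
        p = 1.0 / (1.0 + np.exp(-w * x))
        return np.mean((p - y) ** 2)

    def loss_periodic(w):
        # rotation-gate-style activation cos^2(w*x/2), periodic in w
        p = np.cos(w * x / 2) ** 2
        return np.mean((p - y) ** 2)

    ws = np.linspace(-10, 10, 1000)
    for name, f in [("sigmoid", loss_sigmoid), ("periodic", loss_periodic)]:
        vals = np.array([f(w) for w in ws])
        # count strict interior local minima of the sampled landscape
        minima = np.sum((vals[1:-1] < vals[:-2]) & (vals[1:-1] < vals[2:]))
        print(name, "local minima found:", minima)

On this toy, the sigmoid loss is monotone over the scanned range while the periodic one picks up several minima; deeper or wider circuits stack more frequencies on top and make it worse.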
Actually, the problem shown in the demo was pretty much identical to the initial project for my bachelor's thesis, and some PhD students took it and raised it to a whole new level (https://arxiv.org/abs/1907.02085). To understand the purpose of the cost functions and so on, some notions of quantum mechanics come in handy, but if you take on faith that what we did makes some underlying sense, the end result is striking: there is an implementation that circumvents the problem of local minima and produces meaningful results at attainably small scales.
Oh yes, this is an amazing paper. I actually decided to include single-qubit circuits after reading it.
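For readers who have not seen the paper: the core idea is to re-introduce ("re-upload") the data point between trainable rotations of a single qubit. A minimal numpy sketch of that forward pass (my own toy rendering, not the demo's implementation; the gate is the standard single-qubit rotation U(theta, phi, lambda)):

    import numpy as np

    def u3(theta, phi, lam):
        # generic single-qubit rotation U(theta, phi, lambda)
        return np.array([
            [np.cos(theta / 2), -np.exp(1j * lam) * np.sin(theta / 2)],
            [np.exp(1j * phi) * np.sin(theta / 2),
             np.exp(1j * (phi + lam)) * np.cos(theta / 2)],
        ])

    def classify(x, params):
        # data re-uploading: alternate data-encoding and trainable rotations;
        # x is a 3-dim feature vector, params has shape (layers, 3)
        state = np.array([1.0 + 0j, 0.0])    # start in |0>
        for w in params:
            state = u3(*x) @ state           # re-upload the data point
            state = u3(*w) @ state           # trainable rotation
        return np.abs(state[0]) ** 2         # P(measuring |0>) as class score

    rng = np.random.default_rng(0)
    params = rng.uniform(0, 2 * np.pi, size=(4, 3))   # 4 layers
    print(classify(np.array([0.3, -1.2, 0.7]), params))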
Moving to higher dimensions and more complex problems is indeed challenging. I think one key to the problem is how you design the circuits. Some elementary points are explored here: https://qml.entropicalabs.io/native_gates.pdf
But there are also things that are hard to grasp on paper, with theoretical reasoning alone. This "demo" is actually a derivative of our "quantum machine learning lab", which we use to explore circuits and visualize landscapes. You can see it in the current demo: when you change circuits and use data re-uploading, as in the article, the convexity breaks apart and local minima appear everywhere. As a consequence, the learning does not converge well, if at all.
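Here is a self-contained toy of the kind of landscape scan the lab does (the circuit and all names are my own simplification, not the lab's code): a single qubit, two re-uploading layers, and the cost evaluated over a 2D slice of parameter space:

    import numpy as np

    def ry(t):
        return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                         [np.sin(t / 2),  np.cos(t / 2)]], dtype=complex)

    def rz(t):
        return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

    xs = np.linspace(-np.pi, np.pi, 40)            # toy 1D inputs
    ys = (np.abs(xs) < np.pi / 2).astype(float)    # toy labels

    def prob0(w1, w2, x):
        # re-uploading-style chain: encode x, trainable Rz, encode x, Rz, encode x
        state = ry(x) @ rz(w2) @ ry(x) @ rz(w1) @ ry(x) @ np.array([1, 0], dtype=complex)
        return np.abs(state[0]) ** 2

    def cost(w1, w2):
        return np.mean([(prob0(w1, w2, x) - y) ** 2 for x, y in zip(xs, ys)])

    grid = np.linspace(-np.pi, np.pi, 60)
    C = np.array([[cost(a, b) for b in grid] for a in grid])
    inner = C[1:-1, 1:-1]
    n_minima = np.sum((inner < C[:-2, 1:-1]) & (inner < C[2:, 1:-1]) &
                      (inner < C[1:-1, :-2]) & (inner < C[1:-1, 2:]))
    print("grid-local minima in this 2D slice:", int(n_minima))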
We are currently working on the MNIST dataset (handwritten digits) with 5-qubit circuits and have some first encouraging results. And you are right: well-chosen cost functions are an important part of the equation :-)
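To give a sense of the preprocessing such small circuits force (this is my guess at a pipeline, not what we actually do): 28x28 grayscale MNIST has 784 raw features, so you would squeeze it down first, for instance with average pooling:

    import numpy as np

    def downsample(img28, k=7):
        # average-pool a 28x28 grayscale image down to (28/k) x (28/k)
        h = 28 // k
        return img28.reshape(h, k, h, k).mean(axis=(1, 3))

    img = np.random.rand(28, 28)        # stand-in for an MNIST digit
    feats = downsample(img).ravel()     # 16 features for k=7
    print(feats.shape)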
We are still in the infancy of the domain, but I think we will be able to scale these techniques to really hard, big problems. Classical perceptrons date back to the sixties, and a few more decades were needed to evolve the idea into something practical. The good thing is that quantum machine learning can follow in the footsteps of classical machine learning and progress a lot faster.
btw: we are hiring at Entropica Labs (both interns and long-term positions). If you want to continue working on these problems, you can definitely apply.
I'm truly convinced that exploring landscapes is the way to develop better abstractions for circuit design. Still, when I discussed this point with someone in the field, they told me we're still lacking a major breakthrough: right now we only have several (more or less interesting or successful, but still) arbitrary approaches to the problem, each better suited to one type of problem.
Hopefully someone in academia will develop the full mathematical apparatus the rest of us need to keep missteps to a minimum. Until that point, I don't believe we will get much closer to a general solution, although of course the more knowledge we accumulate about particular cases, the better!
I'll contact you regarding those positions :D
Cheers!
PS: out of curiosity, which version of MNIST do you work with? Binary (B&W, 0 or 1) or grayscale? And at what image resolution?
What sort of mechanism is being used for the learning step? To my eyes, it isn't gradient descent, or, if it is, it wasn't immediately visually obvious to me what gradient is being descended.
I also ask this because, again at least visually, quantum perceptrons clearly have a different classification shape than conventional perceptrons. The first question I'd ask is whether that difference requires the "full" quantum formulation, or whether you could get the same shape from something less than exponentially complicated, perhaps amenable to some other mathematical analysis possible in the classical regime. (See also the previous work challenging D-Wave a couple of years ago, where they "beat" classical algorithms and, within a week, the classical algorithms had been improved to parity, simply because nobody had ever really looked at them with an eye to optimization before.)
Yes, this is actually gradient descent, a modified version of AdaGrad.
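For anyone curious what that looks like in practice, here is a bare-bones sketch (my own, and hedged: the demo's "modified AdaGrad" is surely different, and I am assuming gradients come from something like the parameter-shift rule, which is exact for standard rotation-gate expectation values):

    import numpy as np

    def parameter_shift_grad(expval, params, shift=np.pi / 2):
        # parameter-shift rule: exact gradient of a rotation-gate expectation
        grad = np.zeros_like(params)
        for i in range(len(params)):
            up, down = params.copy(), params.copy()
            up[i] += shift
            down[i] -= shift
            grad[i] = (expval(up) - expval(down)) / 2
        return grad

    def adagrad(expval, params, lr=0.5, steps=100, eps=1e-8):
        # vanilla AdaGrad: per-parameter rates from accumulated squared grads
        g2 = np.zeros_like(params)
        for _ in range(steps):
            g = parameter_shift_grad(expval, params)
            g2 += g ** 2
            params = params - lr * g / (np.sqrt(g2) + eps)
        return params

    # toy objective: <Z> after Ry(w) on |0>, i.e. cos(w); minimum at w = pi
    expval = lambda w: np.cos(w[0])
    w = adagrad(expval, np.array([2.0]))
    print(w, expval(w))   # should approach [pi] and -1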
We do not know yet whether this could perform faster or better than classical machine learning. There are some fundamental difficulties in comparing two ML approaches: what are the criteria? If it takes exponentially less time but gives lower accuracy, is that OK? What if we need fewer data points to learn?
If the shape is exponentially more complex, can we find a way of learning at all? We need at least some local convexity near the minima.
Sorry, more questions than answers; this is where we stand now in quantum machine learning. We are able to do things, but we do not have a clear view.
OTOH, you are saying it's a "Quantum Perceptron", and I totally parsed "QML" as "Quantum Machine Learning", given that the perceptron is a classic ML technique...
That is a brilliant demo. I get the impression we could scale it out to many dimensions and the quantum computation would be really fast. Could you give some indication of the current time and cost to run this on a quantum computer, and the rate of change of both? I would like a feel for when this would be both possible and economical, i.e., how big does my problem have to be before I give Entropica Labs a call?
:-) we would love to get a call to solve problems.
But quantum computers are still in their infancy: the most powerful ones have ~50 qubits, and those are 50 unstable, noisy qubits.
Meaning that we can only perform 10 to 20 operations per qubit before the noise starts blurring everything.
With the method described here, on an actual 50-qubit computer we can process a feature vector of at most maybe 500 dimensions. This is a small problem for classical machine learning...
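For what it's worth, here is my back-of-envelope reading of where the ~500 comes from (the exact accounting is an assumption on my part, not Entropica's):

    # rough capacity estimate; all numbers are assumptions for illustration
    qubits = 50
    ops_per_qubit = 20        # depth budget before noise dominates (upper end)
    encoding_fraction = 0.5   # suppose half the gates re-upload data
    features_per_gate = 1     # one feature angle per encoding gate

    capacity = qubits * ops_per_qubit * encoding_fraction * features_per_gate
    print(f"~{capacity:.0f} feature dimensions")   # ~500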
However, the industry strongly believes that in a few years we will have quantum computers with thousands of qubits. And things have been improving steadily.