(In the current Bluesky link, only people who have a Bluesky account would be able to watch it, but skyview.social allows anybody to view a Bluesky post even without an account, running directly in the browser, IIRC.)
I discovered it via the recent Rob Pike AI-related HN post. What do you think?
For more context, the idea that any biological protein responds to magnetic fields is (at least slightly) controversial. Our observation (a commonly used, well-known fluorescent protein gets slightly but unmistakably dimmer in response to a handheld magnet) is surprising, but it's straightforward to reproduce.
It's not an immediately useful result, but there's a long history of successfully engineering small effects into big effects in fluorescent proteins. If that's possible here (unproven), we could imagine noninvasive control of arbitrary biological machinery ("magnetogenetics").
Even if it's technologically useless, this observation also raises interesting questions about the effects of magnetic fields on biochemistry. Is this a unique edge case, or an instance of a more general phenomenon? What are the necessary/sufficient conditions for magnetic fields to affect life? Given that MRI is nonlethal, there are bounds on how strong these effects can be... but are they really zero?
"...multiplying a vector by the matrix returned by dft is mathematically equivalent to (but much less efficient than) the calculation performed by scipy.fft.fft."
fft-as-a-matrix-multiply is much slower to compute than a standard fft, especially on large input.
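A minimal sketch of that equivalence, using scipy's `linalg.dft` helper (the variable names are mine) to show the O(n^2) matrix multiply and the O(n log n) FFT agree:

```python
import numpy as np
from scipy.fft import fft
from scipy.linalg import dft

n = 1024
x = np.random.default_rng(0).standard_normal(n)

# O(n^2): multiply by the explicit n-by-n DFT matrix
X_matmul = dft(n) @ x

# O(n log n): the standard fast Fourier transform
X_fft = fft(x)

# Same result, very different cost profiles as n grows
assert np.allclose(X_matmul, X_fft)
```

Timing the two (e.g. with `%timeit` at n = 8192 or so) makes the asymptotic gap obvious.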
You can do the linear parts of neural nets in the frequency domain, but AFAIK you can't do the nonlinearity, so you have to inverse transform back to the spatial domain for that. The nonlinearity is an absolutely essential part of pretty much every neural net layer, so there is no big win to be had unfortunately. For convolutional nets in particular there are other ways of going faster than a naive matrix multiply, e.g. winograd convolution.
But if your network contains a layer whose linear part performs an (approximate) DFT, you will get an efficiency gain by replacing it with an exact FFT.
You wouldn't want to use an FFT for most CNNs anyway because the kernels have very small support. Convolution with them is O(n) in the spatial domain as long as you recognize the sparsity.
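The small-support point can be illustrated in 1-D (a sketch, not CNN code): direct convolution with a length-3 kernel costs O(n * k), which is effectively O(n), while the FFT route pays O(n log n) no matter how tiny the kernel is.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(1)
signal = rng.standard_normal(10_000)
kernel = rng.standard_normal(3)   # tiny support, like a typical CNN kernel

# Direct spatial-domain convolution: O(n * k), cheap for small kernels
direct = np.convolve(signal, kernel, mode="full")

# FFT-based convolution: O(n log n) regardless of kernel size
via_fft = fftconvolve(signal, kernel, mode="full")

# Identical outputs (up to floating-point error); only the cost differs
assert np.allclose(direct, via_fft)
```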
Ever since I got my own lab, I've been skipping "traditional" publication, for these reasons and more. I've had great success and satisfaction sharing my research via "DIY" publishing:
Advancing my field is my life's mission, and disseminating my research is too important to outsource.
Believe it or not, Twitter has been crucial to the process. It's not great for nuanced discussion, but it's AMAZING for advertising the existence of technical information. For example:
I hate publishers as much as the next guy, but playing the twitter high-school popularity game is the last thing I want to do with my time, and IMO it's leading to the click-baitification of research in AI. This year there was even an instance of literal ASTs being hailed by deep learning hordes as some amazing new idea.
If science gets attention according to its level of twitter amplification, then scientific publishing is going to start looking a lot like journalism. That's already happening. Ask journalists how their search for truth is going.
As opposed to the traditional publishing high-school popularity game? I'm partially joking, but traditional publishing is very much a popularity contest. You're free to ignore this, but I don't recommend it.
My personal experience (twelve years of traditional publishing followed by five years of DIY publishing) is that I spend substantially less of my time on publishing/dissemination, have higher impact, and produce higher quality work. You should give it a try!
That's a good example! I never bothered soliciting formal peer review for that article, but many of the principles we simulated there have since been demonstrated:
FPBase and Talley Lambert ( twitter.com/talleyjlambert ) are both awesome. I'm a physicist working with fluorescence microscopy, and I use tools that Talley developed or contributed to all the time.
Which reminds me, also check out napari.org for a nice viewing/annotation tool for N-dimensional numpy arrays.
The short answer: Nonlinearity isn't just important for deep neural networks - nonlinearity is deep neural networks. Without a nonlinear element in between linear layers, the "deep" is meaningless - a "deep linear network" is precisely equivalent in power to a simple, one-layer linear classifier[1]. (Because if all you're doing is a bunch of linear transformations, you can't do anything you couldn't do with a single linear transformation.)
As far as I can tell, your understanding that this is just a linear function is precisely correct - which means it can't do anything that a simple linear classifier can't.
[1] I suspect that the reason this has multiple layers is because of the physical constraints of the system that prevent a single layer from being an arbitrary linear function of the inputs. The light from a specific pixel can only get effectively diffracted so far, so they need to cascade multiple layers to make sure that all the inputs can contribute to all the outputs. It still ends up being equivalent to a single linear transformation.
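The collapse argument is easy to verify numerically (a toy sketch; the matrices here are arbitrary stand-ins for the layers):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)

# A "deep" network of three linear layers with no nonlinearities...
W1, W2, W3 = (rng.standard_normal((16, 16)) for _ in range(3))
deep_out = W3 @ (W2 @ (W1 @ x))

# ...is exactly equivalent to one layer with the pre-multiplied matrix.
W_single = W3 @ W2 @ W1
assert np.allclose(deep_out, W_single @ x)
```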
True, but all the nonlinear optical effects I'm aware of only really start to matter at very high intensities - so wouldn't really be applicable to the kinds of scenarios they envision, like directly feeding it images seen from ambient light.
Um, speed-of-light differences in a modified crystal lattice are constant nonlinearities that are reasonable to produce. They do not need high-intensity light, but they would need additional circuitry for scaling. Plus the network would have to work on phase angle and not magnitude. Mostly the Kerr effect (high voltage) and cross-wave polarization (e.g. via a Pockels cell) are useful there.
I asked this on Twitter, but maybe folks here can answer better: how important is nonlinearity for deep neural networks? This method's output seems to be a linear function of its (complex) input. Does that put important bounds on performance?
https://mobile.twitter.com/AndrewGYork/status/10228414045888...
Echoing the other respondents: if you don't have a nonlinearity, your whole network is just a sequence of linear transforms, which (multiplied out) is the same as a single linear transform. Meaning that removing the nonlinearities gives you (effectively) a one-layer network.
Mind boggled that this article comes up now. I've been working on similar tech recently, and the question of non-linearity arose right away. The discussed conclusion was "impossible". Yet I was able to design a crude NAND gate, so there has to be non-linearity in the quantum nature of diffraction and interference.
Though the other comments are correct, I want to point out that you can get some nontrivial behavior with only linear functions. For example, low-rank matrix factorization is kind of like a neural network
f(x) = U * V * x,
where U is an n by k matrix and V is a k by m matrix, where k is much smaller than n and m. Basically, we are constraining the set of allowed linear transformations, which is a form of regularization. Convolutional layers in neural networks similarly restrict the allowed linear transformations.
Nevertheless, the power of linear neural networks is far less than that of nonlinear networks.
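The rank constraint described above is easy to see directly (a sketch with made-up dimensions): the composed matrix U @ V is still one linear map, but it can never have rank greater than k.

```python
import numpy as np

n, m, k = 50, 40, 5          # k much smaller than n and m
rng = np.random.default_rng(0)
U = rng.standard_normal((n, k))
V = rng.standard_normal((k, m))

# f(x) = U @ V @ x is still linear, but the composed n-by-m matrix
# has rank at most k -- a constrained (regularized) family of maps.
W = U @ V
assert np.linalg.matrix_rank(W) <= k
```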
Nonlinearity is very important and is the only reason why neural nets can approximate arbitrary functions. You can’t do that with linear transformations alone. Though from briefly skimming the paper they do seem to achieve similar effects through phase modulation. Otherwise even MNIST would be out of the question.
No paywall, no delay, straight to the web. Open data, open code, interactive/animated figures. Transparent, rolling peer review, version control, CC-BY license, citable DOIs. I was worried no one would read it, but it turns out science twitter is awesome. Very positive experience, so far.
Sounds like we're pulling in the same direction re: improving scientific publishing. At the risk of debating an ally, do you mean to imply that preprint servers are "more archival"? I'm guessing you're familiar with CERN/Zenodo; you trust bioRxiv's continued existence more than CERN's? I rate them as similar, arguably with an edge to CERN.
My experience with publishing is that discoverability is a stronger function of advertising and (especially) getting cited than the publication venue. I do agree that bioRxiv and arXiv are nice advertising venues, but there are lots of ways to skin that cat.
I'm actually only familiar with the arXiv, for physics and computer science research. I don't know much about bioRxiv or Zenodo, but they both seem much better than a personal website or GitHub page.
The arXiv has overlay sites, such as http://www.arxiv-sanity.com/ or https://scirate.com/ , that improve discoverability over the basic arXiv interface; for example, you can browse all the papers in a given area posted in the last two months, sorted by some proxy for "interest." Of course there is also Google Scholar, and perhaps there are better ways.
https://www.science.org/content/article/magnetically-sensiti...