Hi - thanks for pointing this out. I've added a note clarifying that, as you correctly write, the quantum interpretation makes sense at the single-photon level. Obviously it's hard to generate and manipulate single photons without the right equipment (especially with a phone, as in my case), but I do believe this still provides a nice intuition for what's going on. Thanks for your suggestion!
Hey, you're right, you could use a stick on a piece of paper, etc. Totally fair. That being said, this is in fact a real application where a qubit can model things a standard bit can't. Professor Aaronson describes it in this paper: https://www.scottaaronson.com/papers/qcoin13.pdf. Additionally, it's described in his lecture notes here: https://www.scottaaronson.com/qclec/5.pdf
Thanks for the links. It doesn't seem like your experiment captures the interesting part, which is that you don't need more qubits to measure a more subtle bias.
As I understand the experiment now, it seems like the more subtle the bias in the coin, the more times you would need to rotate the polarizer to detect the bias.
If there is something about using the polarizing filters to keep track of tries that is more efficient than using something like a stick, then I would emphasize that in your write-up.
Yup. As Greek to me as the paper is, at least it makes very clear what it sets out to achieve and why (and when) it differs. I suppose it's implicit, but I feel the article really ought to explain that in the demonstrated case of heavy bias, few attempts, and fixed, coarse steps there is of course no advantage - apart from the stick in the ground, one could also best its resolution with a counter starting at 0b1000000 and ++/--.
It's a nice explainer on polarization, but it tries to be more than that and doesn't achieve it. With further work (not in the form of added caveats, but rather a new approach to tying the two concepts together) I'm sure it could.
Hey - you totally could mock up this exact setup like you're suggesting. However, in a real quantum computer setup you would take one qubit and apply either the positive or negative rotation gate to it again and again, rotating the state of that qubit. The "stick" in the quantum computer would be the qubit (or photon in this case). So the post is meant to show what would happen in a real setup. Hope that makes sense.
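To make that concrete, here's a rough classical simulation of the idea (a minimal sketch of my own; the bias p_heads, step theta, and flip count are made-up parameters, not from the post):

```python
import math
import random

# Sketch: track the qubit's polarization angle, rotating +theta for heads
# and -theta for tails, then "measure" against the original basis.
def simulate_biased_coin_qubit(p_heads, theta=math.pi / 64, flips=200):
    angle = 0.0
    for _ in range(flips):
        angle += theta if random.random() < p_heads else -theta
    # Probability of measuring the orthogonal state grows as sin^2 of the
    # accumulated angle, so a drifting angle reveals a biased coin.
    return math.sin(angle) ** 2

print(simulate_biased_coin_qubit(0.5))  # fair coin: angle stays near 0
print(simulate_biased_coin_qubit(0.6))  # biased coin: angle drifts, probability rises
```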
Hi! Author here - If you have any feedback on what can improve please let me know! Thanks for reading and feel free to shoot me a note at dhruv.parthasarathy@gmail.com if you'd like to see something edited.
It is a very nice article, and very well articulated.
However, this article falls into a pet peeve of mine which is that the behavior exhibited here can also be completely explained classically -- this is also a standard demo when explaining how polarization works classically. I feel that it is worth it to at least include a footnote to that effect. The reason that I bring it up is that I (as someone who first learned classical optics, but is now learning quantum optics) personally suffered from some deep rooted misunderstandings about quantum mechanics due to having seen so many of these simplified demos which do not actually capture the quantum nature of light.
The way this article is presented, it implies that one can also model quantum phenomena using Maxwell's equations -- which is obviously not true. In this specific case you get the same answer, but as soon as you start looking at the individual photon statistics your answers will start to diverge. This is where the actually quantum things like Bell's inequality and the Hong–Ou–Mandel effect come into play. If people had just been up front with their descriptions - 'oh by the way, when you look at the aggregate behavior of photons they look perfectly classical; it is only when you look at the statistics that they behave any differently' - it would have saved me a lot of soul searching and misguided contempt for the quantum community.
Hey - this is perfectly reasonable and constructive feedback - thank you! I see your point that the polarization example can be explained using classical approaches. I wanted to explain it in terms of individual photons as I wanted to use this to help provide some visual intuition for qubits. Photon polarization is a nice, visual way of interpreting qubits and as such lent itself well to the task.
EDIT: I've gone ahead and added the footnote. Thanks for the suggestion!
This article does point out a lot of things that have failed this decade - all fair points. But in focusing on failures it fails to capture that casual education (not formal, school education) has fundamentally shifted this decade.
Today, I can learn almost anything I want for free on the internet.
- I can get a world class mathematical education from 3Blue1Brown (seriously this guy's videos are just out of this world good).
- I can get incredible guitar lessons on any song I want from YouTube (I've taught myself guitar this year through it!).
- I can find scores of language education podcasts to help me learn Spanish (I use Coffee Break Spanish).
- I can get world class SAT prep from Khan Academy (for free!).
- I can get awesome yoga classes online for free (did this also this year).
Seriously the wealth and quality of information available to us today is insane - it's the Library of Alexandria a search bar away.
As someone who enjoys learning, there's no way I would ever go back 10 years in time.
For those wondering, here’s how GoodRx works and makes money:
1. GoodRx obtains prices of drugs at all pharmacies through contracts it has with PBMs (the middlemen that determine insurance payouts to pharmacies).
2. Pharmacies have wildly different prices from one another - going to one pharmacy over another can save you and your insurance a lot of money.
3. When you use a GoodRx coupon, the insurance company pays GoodRx a small amount because it had to spend less on its patient thanks to the coupon (i.e., they didn't use a super expensive pharmacy).
4. The pharmacy pays GoodRx a referral fee (kind of like a restaurant pays Yelp a referral fee when you use a Yelp coupon).
Over the last 5 years, insurance companies have increasingly been pushing high-deductible plans onto employers. As a result, patients are way more price sensitive than they used to be. If you had no deductible, you probably wouldn't care too much about drug price differences because the copays wouldn't be that different. However, due to these new plans you care a lot more, and hence services like GoodRx and Blink become a lot more important.
Hey! I wrote this article - “we” is referring to people who had a similar educational experience to me. I was introduced to matrices as a tool for solutions to systems of equations. I always wish I was taught the functional perspective from the beginning.
Thanks for the feedback. I go into this in the next post on eigenvectors here: https://www.dhruvonmath.com/2019/02/25/eigenvectors/. I start by discussing basis vectors which I believe is what you’re looking for in your comment.
When I was first introduced to matrices (high school), it was in the context of systems of equations. Matrices were a shorthand for writing out the equations and happened to have interesting rules for addition etc. It took me a while to think about them as functions in their own right and not just tables. This post is my attempt to relearn them as functions, which has helped me develop a much stronger intuition for linear algebra. That's my motivation for this post and why I decided to work on it. Feedback is more than welcome.
What got me for a while was the concept of a tensor:
For example: What is a tensor?
Wrong way to answer it: Well, the number 5 is a tensor. So's a row vector. So's a column vector. So's the dot product and the cross product. So's a two-dimensional matrix. So's a four-dimensional matrix, just... don't ask me to write one on the board, eh? So's this Greek letter with smaller Greek letters arranged on its top right and bottom right. Literally anything you can think of is a tensor, now... try to find some conceptual unity.
Then coordinate-free fanaticism kicked in, robbing the purported explanations of any explanatory power in terms of practical applications of tensors. The only thing they could do was shift indices around.
What finally made it stick is decomposing every mathematical concept into three parts:
1. Intuition, or why we have the concept to begin with.
2. Definitions, or the axioms which "are" the concept in some formal sense.
3. Implementations, or how we write specific instances of the concept down, including things like the source code of software which implements the concept.
If you ask a mathematician, a tensor is an element of a tensor product, just like a vector is an element of a vector space. This moves the question to "what is a tensor product?", which you can think of as a way to turn bilinear maps into linear maps (this is an informal statement of the universal property of the tensor product; you also need a proof that such an object exists, but that's easy for vector spaces and manageable for modules after seeing enough algebra).
Crikey, I hope I never have to talk to that mathematician! That's a terse, unintuitive definition that isn't very helpful unless you're already familiar with the concepts. (Also maybe you meant linear maps into bilinear?)
Reminds me of the time an algebraist mentioned to me that he was working on profinite group theory. I asked what a profinite group was, and he immediately replied 'an inverse limit of an inverse system', with no follow-up. Well thanks buddy, that really opened my eyes.
Math is just a much deeper topic than most others. The things people do in research level math can take a really long time to explain to a lay person because of the many layers of abstraction involved.
It is a very deep and specialised topic. However, there are ways to convey intuition to a 'mathematically mature' audience, and there are quick definitions that are correct but unenlightening. I much prefer the former :)
No, it turns bilinear maps into linear ones! If you have three R-modules (one can read K-vector spaces if unfamiliar with modules) N, M, P and a bilinear map N×M→P, then there is a unique linear map N⊗M→P compatible with the map N×M→N⊗M that is part of the structure of the tensor product. (What's really going on here in fancy terms is the so-called Hom-Tensor adjunction, because the _⊗M functor is adjoint to the Hom(M,_) functor, but just thinking about bilinear and linear maps is much clearer.)
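In symbols, the universal property described above (notation mine):

```latex
% For every bilinear map \varphi there is a unique linear map \tilde{\varphi}
% factoring through the canonical map N \times M \to N \otimes M.
\varphi : N \times M \to P \ \text{bilinear}
\;\Longrightarrow\;
\exists!\ \tilde{\varphi} : N \otimes M \to P \ \text{linear with}\quad
\tilde{\varphi}(n \otimes m) = \varphi(n, m).
```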
I think the ideas behind the coordinate-free formulation of tensor calculus make it relatively easy though.
A tensor is a function that takes an ordered set of N covariant vectors (i.e. row vectors) and M contravariant vectors (i.e. column vectors) and spits out a real number. It has to be linear in each of its arguments.
I'm pretty sure all the complicated transforms follow from that definition (though you may have to assume the Leibniz rule - I can't remember), and from ordinary calculus.
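In symbols, that definition reads roughly as follows (notation mine, identifying row vectors with elements of the dual space V*):

```latex
% An (N, M)-tensor: N covectors and M vectors in, one real number out,
% linear in each argument separately.
T : \underbrace{V^{*} \times \cdots \times V^{*}}_{N} \times \underbrace{V \times \cdots \times V}_{M} \to \mathbb{R}
```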
As a layman, the word "tensor" always intimidated me. As a programmer, I was surprised then when I found out that a tensor is just a multi-dimensional array (where the number of dimensions can be as small as 0). That was a concept I was already quite comfortable with.
That's a bit like saying a vector is 'a row of numbers'. Not incorrect, but missing the point. What matters is what vectors do. It's the properties like pointwise addition, scalar multiplication, and existence of an inverse that make vectors vectors.
You're confusing a tensor with its representation. Tensors are objects which must obey a certain set of rules. (Which rules depends on whether you're talking to a mathematician or a physicist.)
It's a nice article - you focus on matrices as a kind of operator that takes a vector as input and produces another vector. This is one side of the coin.
The other interpretation is that matrices are functions that take two arguments (a row vector and a column vector) and produce a real number. IMO this interpretation opens the door to deeper mathematics. It links in to the idea that a column vector is a functional on a row vector (and vice versa), giving you the notion of dual space, and ultimately leading on to differential forms. It also makes tensor analysis much more natural in general.
If you're going to attempt a definition like this you need some more conditions (approaching concepts like linearity, for instance). Otherwise you can have a black box like x^3y^3-u^3v^3 that takes in [u,v] and [x,y]^T and spits out a real number; that's not a matrix-y operation.
I'm describing a somewhat unusual way of thinking about vectors, matrices etc. At least, it's unusual from the perspective of someone with an engineering / CS background.
First think about row and column vectors. A row vector and a column vector can be combined via standard matrix multiplication to produce a real number. From that perspective, a row vector is a function that takes a column vector and returns a real number. Similarly, column vectors take row vectors as arguments and produce real numbers.
It turns out that row (column) vectors are the only linear functions on column (row) vectors. This result is known as the Riesz representation theorem. If I give you a linear function on row vectors, you can find a column vector so that computing my function is equivalent to calculating a matrix multiply with your column vector.
Now on to matrices. Matrices take one row vector and one column vector and produce a real number. I can feed a matrix a single argument - the row vector, say - so that it becomes a function that takes one more argument (the column vector) before it returns a real number. Sort of like currying in functional programming. But as we said, the only linear functions that map column vectors into real numbers are row vectors. So by feeding our matrix one row vector, we've produced another row vector. This is the "matrices transform vectors" perspective in the OP's article. But I think the "Matrices are linear functions" perspective is more general and more powerful.
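Here's a tiny numerical illustration of that currying idea (my own toy numbers, using numpy):

```python
import numpy as np

# A matrix as a function of two arguments: (row vector, column vector) -> real number.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
row = np.array([[1.0, 0.0]])    # 1x2 row vector
col = np.array([[2.0],
                [5.0]])         # 2x1 column vector

# Fully applied: a single real number.
print((row @ A @ col).item())   # 12.0

# "Curried": feed only the row vector and get back another row vector,
# i.e. a linear function still waiting for a column vector.
partially_applied = row @ A     # [[1. 2.]]
print(partially_applied @ col)  # [[12.]]
```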
This perspective of vectors, matrices, etc... as functions might seem needlessly convoluted. But I think it's the right way to think about these objects. Tricky concepts like the tensor product and vector space duality become relatively trivial once you come to see all these objects as functions.
I appreciate you trying to explain it, however I believe it would have really helped if you started from the examples where this way of thinking is useful: you mentioned tensor product and vector space duality, however unless someone is already familiar with these concepts (I'm not) then it does seem needlessly convoluted. Are there any practical applications of these concepts that you can describe?
I haven't been involved in abstract math in close to a decade, but I think it's a description of the (general) inner product. So, a generalization of the dot product. The classic dot product is that operation with the identity matrix. My understanding is that using matrices that way is very common in physics.
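In symbols, that generalized inner product (notation mine):

```latex
% A bilinear form defined by a matrix A; with A = I it reduces to the ordinary dot product.
\langle u, v \rangle_A = u^{\mathsf{T}} A \, v,
\qquad
\langle u, v \rangle_I = u^{\mathsf{T}} v = u \cdot v
```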
Yes they can. This follows from singular value decomposition. Let S be the matrix representation of a shear transformation. There exist rotation matrices R, B and a diagonal matrix D such that S = RDC, where C is the transpose of B. D is the matrix representation of a scaling transformation and R, B are the matrix representations of rotation transformations. Since S is a product of rotation and scaling matrices, its corresponding linear transformation is a composition of rotations and scalings.
It would ordinarily be weird to represent shear transformations using rotations and scalings because shear matrices are elementary. But it checks out.
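If it helps, here's a quick numerical check of that decomposition with numpy (the specific shear matrix is my own example, not from the thread):

```python
import numpy as np

# A 2x2 shear matrix (my example).
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# Singular value decomposition: S = U @ diag(sigma) @ Vt,
# with U and Vt orthogonal and diag(sigma) a pure scaling.
U, sigma, Vt = np.linalg.svd(S)

print(np.allclose(S, U @ np.diag(sigma) @ Vt))  # True: shear = rotation-like * scaling * rotation-like
print(np.allclose(U @ U.T, np.eye(2)))          # True: U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(2)))        # True: Vt is orthogonal
print(sigma)                                     # scale factors, roughly [1.618, 0.618]
```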
OK, point taken. I considered "scaling" in a less general sense (scalar multiple of the unit matrix), while you want to allow arbitrary diagonal entries. My definition is to my knowledge the common one in linear algebra textbooks because in yours, the feasible maps depend on the chosen basis.
EDIT: To state my point more clearly: in textbooks, "scaling" is the linear map that is induced by the "scalar multiplication" in the definition of the vector space (that is why both terms start with "scal").
Same for the orthogonal matrices, or the diagonal matrices, or the symmetric matrices, or the unit-determinant matrices, or the singular matrices ... They are all sets of Lebesgue measure zero.
Orthogonal, diagonal, symmetric, and unit-determinant matrices are all subgroups though, which makes them 'more special' than all shearing matrices.
Singular matrices are special in the sense that they keep the matrix monoid from being a group. My category theory isn't strong enough to characterize it, but this probably also has a name.
Edit: I think the singular matrices are the 'kernel' of the right adjoint of the forgetful functor from the category of groups to the category of monoids.
Though I must admit a lot of that sentence is my stringing together words I only vaguely know.
No, I wasn't, but I did confuse the terms. Shear can be done without the extra dimension. Skew transforms require the extra dimension, as does translation.