Coding the Matrix - Philip Klein 
It used to have a Coursera course, but I think it's been taken down. The website has videos of the course taught at Brown I think.
The associated website is: http://codingthematrix.com
It is legal, not a pirated version.
My question is: What motivated you to create this freely available book?
For motivation: briefly, I was at Union College and was asked to teach Linear. I saw a second edition of Strang's book, thought it was wonderful (still do), and adopted it. But the students, who were perfectly good students, did not get it. They just couldn't solve the problems that a person should solve in that class. I decided after some reflection that the folks I had need to be brought along to where they are ready for a higher level understanding. So I put together a presentation that uses lots of examples, motivation, and naturalness to try and develop maturity in people who do not yet have it.
(Again, I'm not talking Strang's book down, it is very fine indeed. But the students that I have in front of me at this point are just not yet ready.)
Also: some people just want "math for deep learning" and that's mostly baby calculus: the chain rule and gradient descent.
They do this in several ways.
1) They recommend textbooks to beginners that are too advanced for their skill level.
2) These textbooks do not include solutions and the advice given is that solutions would somehow rob them of the experience (don't let those that lack self-discipline ruin learning for non-traditional students that aren't in a formal classroom with access to professors and TAs). The self-learner needs some sort of feedback system.
3) They tell the self-learner that solutions aren't provided because everyone has the ability to know if their solutions are correct without having their work checked. Would you write a complicated program and write no test cases, or could you instantly know your program is bug free the first time you write it? Why is math suddenly any different?
Despite popular elitist opinions, I'd recommend this book over Axler for the beginner that doesn't know linear algebra. Everyone says Axler is the perfect first book in linear algebra. It really isn't. Even Axler admits this himself in the preface. He assumes that it is a second approach to the field.
But people like to be elitist and recommend books to beginners that aren't always best from a pedagogical perspective.
Those are the same class of people that recommend Rudin or Spivak to someone that wants to study elementary calculus 1 material.
Here's a simple example. The distributive property says:
a (b + c) = ab + ac
I can teach students to expand 3(x + 2). With some practice almost all of them will get this. They'll say
3(x+2) = 3x + 3(2) = 3x + 6
Do you know how hard it is to convince people that due to the nature of equality you can reverse the steps? Some never understand that
3x + 6 = 3(x + 2)
Some will never understand that 3x + 2x = 5x because of the distributive property and that combining like terms for the expression 3x + ax is the same process.
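One way to see that expanding and factoring are the same equality read in two directions is to test it at many values. A minimal sketch in Python; the function names are invented for illustration:

```python
import random

def expanded(a, b, c):
    # Right-hand side of the distributive property: ab + ac
    return a * b + a * c

def factored(a, b, c):
    # Left-hand side of the distributive property: a(b + c)
    return a * (b + c)

def like_terms_agree(coef1, coef2, x):
    # Combining like terms is the distributive property read right to left,
    # with x as the common factor: coef1*x + coef2*x = (coef1 + coef2)*x
    return coef1 * x + coef2 * x == (coef1 + coef2) * x

random.seed(0)
samples = [tuple(random.randint(-50, 50) for _ in range(3)) for _ in range(1000)]

# Equality has no direction: if this holds, both "expand" and "factor" are justified.
all_equal = all(expanded(a, b, c) == factored(a, b, c) for a, b, c in samples)
```

A numeric check like this is evidence, not a proof, but it makes the symmetry concrete: the same test covers 3(x + 2) = 3x + 6 and 3x + 6 = 3(x + 2).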
I don't think there is any desire to be elitist in terms of having good books that are appropriate. It's just hard to do. We've gone through various cycles of reforming calculus, introducing old concepts in a new way. Through all of the changes one thing has remained constant in my experience. The percentage of people who can get it remains the same.
I do agree with your point on providing solutions.
While you are right, a related issue is that most maths textbooks are atrocious at most aspects that aren't writing pages of equations. There is almost nobody sitting at the 3-way intersection of great mathematician, great writer and great educator who can then write great textbooks.
I've found books with terms like "History" and "Philosophy" in the title are much better places to learn about mathematics, combined with Wikipedia for formulas and details. Any book with exercises but no solutions or historical context has turned out to be basically useless to me, even as a reference (Wikipedia is usually better for simple stuff).
I've got something like 4 books on statistics on my shelf at the moment. The only one that I've actually managed to read and learn something from has been Chatterjee's Philosophy of Statistics, because it talks about which techniques were developed in the context of which problem, covers failed alternative approaches, explains what was confusing to some of the greatest minds in the history of statistics, etc. This has been vastly more useful in setting up a framework for what the world of statistics looks like, one that I can attach a whole bunch of proofs and suchlike to. It has been enlightening in a way that textbooks can't really manage.
A person going into Calculus 1 who doesn't understand the distributive property is not the person who is going to understand Rudin or Spivak on their own without guidance. Yet people in mathematics communities would recommend Spivak or Rudin anyway and look down on them if they used Stewart or Thomas.
To them, if you aren't learning a subject in the fullest rigor possible, then it isn't worth learning. It doesn't matter if the person doesn't even know the basics; they blindly recommend these types of books regardless.
These are excellent books once you have already seen the material, not so much for the average person who doesn't know what the distributive property is.
There's a difference between a person starting off in mathematics and someone who has done several undergrad/grad level courses and knows the "system". If you've gone through math formally earlier in your life, you should be able to pick up a math textbook and learn the subject. Unfortunately, many math textbooks do not make this easy.
That's why Khan Academy is such a fantastic resource. They really build a very clear explanation of the concepts. Every teacher should watch it before classes.
It's not free, but it's an amazing comic-book textbook with fun, interesting, challenging problems for kids.
Buy the books and share them with your neighbors, and share an online account if you can't afford your own.
So far, I like it a lot, thanks to ALEKS. Give it a shot; it's free.
Diestel's Graph Theory, for example, is great; it's the book you want to read if you're serious about the subject (D.B. West's book has exercises with solutions, I suppose). But there are absolutely no solutions to be found anywhere. The hints to solutions are now relegated to the "Professional Edition" as well, so you can only find hints for old editions online. It makes me question whether or not I understand the material fully, since I can't check my work.
Then pick up any calculus textbook and chug through it (Thomas is good from what I've heard). Even if the questions don't have answers in the back of the book you can check your computational steps for free using: https://www.symbolab.com.
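In the same spirit, a derivative computed by hand can be sanity-checked numerically without any answer key. A minimal sketch in Python, assuming an example function f(x) = x^3 sin(x) chosen purely for illustration (it is not taken from any of the books mentioned); the helper names are hypothetical:

```python
import math

def f(x):
    # Example function chosen for illustration only
    return x**3 * math.sin(x)

def hand_derivative(x):
    # Product rule: (x^3)' sin(x) + x^3 (sin(x))'
    return 3 * x**2 * math.sin(x) + x**3 * math.cos(x)

def wrong_derivative(x):
    # A typical slip: differentiating each factor and multiplying the results
    return 3 * x**2 * math.cos(x)

def numeric_derivative(g, x, h=1e-5):
    # Central difference approximation of g'(x)
    return (g(x + h) - g(x - h)) / (2 * h)

def matches(g, dg, points, tol=1e-6):
    # True if the claimed derivative dg agrees with the numeric estimate at every point
    return all(
        abs(numeric_derivative(g, x) - dg(x)) <= tol * max(1.0, abs(dg(x)))
        for x in points
    )

points = [-2.0, -0.5, 0.3, 1.0, 2.5]
correct_ok = matches(f, hand_derivative, points)
wrong_ok = matches(f, wrong_derivative, points)
```

Agreement at a handful of points does not prove the answer, but a disagreement reliably flags a mistake, which is the kind of feedback a self-learner is missing.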
I also recommend lots of practice, so Khan Academy is good for drilling, plus any problem book with lots of calculus problems, such as "Schaum's 3,000 Solved Problems in Calculus" or "Essential Calculus Skills Practice Workbook". They don't have to be huge calculus texts; doing problems is more important than reading through thousands of pages of colorful examples. You can find shorter calculus books that focus primarily on drilling calculus techniques. Focus on those to nail the techniques.
Math is an art of making a specific type of rigorous, formal arguments that are meant to convince, beyond all doubt, another human being that one statement follows from another. If an argument like that fails to convince even the proof author themselves, then you know the argument isn't good even without comparing it to another solution.
In practice, after you've written down your proof, you examine it for weak spots. If any particular step raises doubts or objections then you recursively spell it out in greater detail (or replace with an alternative approach) until eventually all doubt dissipates. At that point you don't need to compare it to another solution.
In fact, as you recognized yourself, your solution is likely to be different from the book author's, so comparison isn't helpful. Similarly, a problem like "write a small application to..." in a coding book or an assignment like "write an essay on..." in a writing book don't include solutions. Code, proofs and essays are all too free-form to subject them to correctness verification by comparison with a standard.
Perhaps what you have in mind are examples. All good math textbooks include worked-out proofs that illustrate the techniques. In fact these are generally the bulk of the book! Perhaps you're asking for more examples. That's fair. Proof problems OTOH should be given without solutions, to encourage independent creative thought in the reader and to avoid the incorrect implication that there is a single correct solution.
(Disclaimer: Not a mathematician.)
Finding a teacher isn't at all mandatory, you can do math yourself by completing exercises, checking your work, and looking at the solution afterwards. There are plenty of places to ask questions if you get stuck.
I firmly disagree.
What a teacher can give you is perspective that you, not knowing the subject, cannot have -- and augmenting any particular viewpoint expressed in a book.
If people could learn everything from texts, we wouldn't have universities (for students) and conferences (for working mathematicians).
In theory, one can write a Great Text that explains an Idea. In practice, it's damn hard to do that, and it's far easier to impart understanding in a conversation, filling in any blank spots the audience might have on the spot, and guiding the way in the jungle.
That's why all texts are kind of bad. Either they are too narrow to give a wide perspective, or too huge to be absorbed!
As one of my advisors said: mathematics, like food, is best shared. Don't go into it alone; and whether you have or don't have a mentor, try to find someone else to join you on your journey (a friend who wants to learn the same subject).
Axler's approach for determinants.
Axler defines determinant as (up to a sign) the constant term of the characteristic polynomial, and he needs two different definitions for characteristic polynomial, one over R and one over C. Now what if the ground field is something else? Do we need yet another definition of characteristic polynomial in order to define the determinant? What if you are doing linear algebra over a commutative ring?
LADR actually presents a very narrow view of linear algebra: it treats linear algebra merely as finite-dimensional functional analysis. Readers can be hit hard when they need to do other (computational or theoretical) things. Similar concerns have been voiced on the internet before. In particular, I think Darij Grinberg's comments (below the answer https://mathoverflow.net/a/16996) on LADR are rather spot on.
It's fine if you find LADR helpful. The book does have its merits (I like its clear and fluent writing and its neat proofs), but it also has its share of problems, and there are other nice choices of books in the wild.
(I'm more familiar with this phenomenon in philosophy, where the greater the philosopher, the more they have entirely their own way of looking at things, untranslatable into another tongue, which you just have to come to understand on its own terms. A summary of their views leaves out the personal aspect, the style, the way of thinking, and will seem dead.)
You don't start kids with complex numbers until they can handle reals. You don't even start negative numbers until they can handle positive numbers.
This is probably the correct generalization of the volume definition.
The only beneficial traditional treatment I can think of is saying "determinant is volume", and that's what Axler does.
From there, one can look into alternating forms, convince oneself that an alternating n-linear form does the same thing, and obtain a formula for it (e.g. summing up signed products of numbers on the diagonals over all permutations of columns).
If by "traditional" you mean: "Here's an insanely complicated formula that does something magical. Learn to compute it. On page 5, we'll prove that it tells something about independence. Oh, and we'll mention volume on page 10 briefly" -- then I not only think this is not useful, I think it's outright harmful.
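For reference, the permutation formula mentioned above (summing signed products over all permutations of columns) fits in a few lines. A rough sketch in Python, with function names of my own choosing; it takes n! steps, so it is only useful for seeing what the formula says on small matrices:

```python
from itertools import permutations

def sign(perm):
    # Sign of a permutation: +1 for an even number of inversions, -1 for odd
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det_by_permutations(A):
    # Sum over all permutations p of sign(p) * A[0][p[0]] * ... * A[n-1][p[n-1]]
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row in range(n):
            term *= A[row][perm[row]]
        total += term
    return total

A = [[3, 1], [1, 2]]
A_swapped = [[1, 3], [2, 1]]  # the same matrix with its two columns exchanged
B = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
```

Exchanging the two columns of A flips the sign of the result, which is the alternating property in action.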
Also I recall reading Axler, and I think it covered some more advanced content towards the end. But this book looks equally solid.
You could solve the problem with something like a wiki, but I've seen that usually results in quality problems. It isn't like a wiki article, where all the writers can basically see what has been written. With a wikibook, though, you need all the writers to know where things have already been explained; otherwise you just get a bunch of people explaining x here and then other people explaining the same thing in a later chapter.
Linear Algebra Abridged, a free compactified version of Linear Algebra Done Right, 2016. http://linear.axler.net/LinearAbridged.html
Linear Algebra Done Right videos, free videos to accompany the book, 2017.
For those who want a free alternative, behold: Treil's Linear Algebra Done Wrong
The book is downloadable as a free PDF. The name is an answer to Axler's book (dry mathematician's humor), and it offers the opposite approach (getting to determinants first).
While I agree with Axler and disagree with Treil, LADW offers way more examples and applications, and together LADR and LADW offer complete, excellent course material.
As for the linked text: not a bad text, but I wouldn't pick it over LADR + LADW.
1. size: it's larger than LADR+LADW taken together. It's hard to see the forest behind the trees.
2. exposition: it follows the structure of many other texts that I don't like because they terribly confuse the students (that I'd have to re-teach afterwards): starting with solving systems of linear equations, then jumping into vector spaces, for example.
3. I don't like how key concepts (matrix product, determinant) are introduced. If you already know the material, it will be hard to see what's wrong with the approach of throwing a definition at the reader, and then talking about why that definition was made. But the opposite should be the case.
After teaching Linear Algebra, here's my litmus test for a good book. At a glance, it should make the following clear first and foremost:
1. A matrix of a linear map F is simply writing down the image of the standard basis F(e_1), F(e_2), ... F(e_n). These vectors are the columns of the matrix. If you know them, you can compute F(v) for any v by linearity. That's called "multiplying a vector by matrix"; we write Mv = F(v).
2. The product of matrices is simply the matrix of composition of linear maps that they represent. The student can figure out what that matrix should be (or should be able to do so); here's how. If M is the matrix of F, and N is the matrix of G (where F and G are linear maps), then the first column of MN is F(G(e_1)) = M x (first column of N). Same for other columns. Ta-dah.
3. The determinant of v_1, ..., v_n is simply the volume of the lopsided box formed by these vectors (mathematicians call the box a "parallelepiped"). In particular, in a plane, the area of the triangle formed by vectors A and B is half the determinant. This area can have a minus sign; switching any pair of vectors flips the sign.
4. Eigenvectors and eigenvalues are fancy words that allow us to describe linear maps like this: "Stretch this picture along these directions by this much". Directions are eigenvectors, by how much - eigenvalues.
5. Rotation and scaling are linear maps. That's all any linear map does: rotates and stretches. Writing a map down in this way is called singular value decomposition.
6. Shears are linear maps that don't change the volume. Any box can be made rectangular by applying a bunch of shears to it. That's called Gaussian elimination or row reduction when you look at what happens to matrices (and apply scaling as the last step). This is also an explanation of why the determinant gives volume (if you define it as an alternating n-linear form).
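Points 1 and 2 of the list above can be tried out directly. A minimal sketch in plain Python, where a matrix is stored as a list of its columns; the maps F and G and all helper names are made-up examples, not taken from any of the books:

```python
def matrix_of(F, n):
    # Point 1: the columns of the matrix are F(e_1), ..., F(e_n)
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        columns.append(F(e_j))
    return columns

def apply(M, v):
    # Mv = v_1 * (column 1) + ... + v_n * (column n), by linearity
    height = len(M[0])
    return [sum(v[j] * M[j][i] for j in range(len(M))) for i in range(height)]

def product(M, N):
    # Point 2: column j of MN is M applied to column j of N
    return [apply(M, column) for column in N]

def F(v):
    # Example map: rotate the plane by 90 degrees counterclockwise
    x, y = v
    return [-y, x]

def G(v):
    # Example map: stretch x by 2, then add y to it
    x, y = v
    return [2 * x + y, y]

M = matrix_of(F, 2)   # columns F(e_1), F(e_2)
N = matrix_of(G, 2)   # columns G(e_1), G(e_2)
MN = product(M, N)

v = [3, -4]
same_as_map = apply(M, v) == F(v)               # Mv = F(v)
same_as_composition = apply(MN, v) == F(G(v))   # (MN)v = F(G(v))
```

Nothing here uses the row-times-column rule explicitly; it falls out of asking what the composition does to each basis vector.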
That's the beginning of a solid understanding of the subject.
From my experience, LADR+LADW leave the student with an understanding of 1-4, and other texts, due to being organized badly, don't (even when they contain all the information in some order).
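Points 3 and 6 (determinant as signed volume, computed by shearing the box) can also be checked by hand or by machine. A sketch in Python using exact fractions, with names of my own invention. It assumes the vectors are given as rows, uses only row shears plus sign-flipping swaps, and stops at triangular form, since further shears would clear the entries above the diagonal without changing the diagonal itself:

```python
from fractions import Fraction

def det_by_shears(rows):
    # Work on an exact copy so the caller's data is untouched
    A = [[Fraction(x) for x in row] for row in rows]
    n = len(A)
    sign = 1
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k
        pivot = next((r for r in range(k, n) if A[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)  # the box is flat: zero volume
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            sign = -sign  # switching two vectors flips the sign
        for r in range(k + 1, n):
            # Shear: subtract a multiple of row k from row r (volume unchanged)
            factor = A[r][k] / A[k][k]
            A[r] = [a - factor * b for a, b in zip(A[r], A[k])]
    volume = Fraction(sign)
    for k in range(n):
        volume *= A[k][k]  # side lengths of the straightened box
    return volume

a, b = [3, 1], [1, 2]
signed_area = det_by_shears([a, b])       # parallelogram spanned by a and b
flipped_area = det_by_shears([b, a])      # same vectors in the other order
triangle_area = abs(signed_area) / 2      # triangle formed by a and b
shear_volume = det_by_shears([[1, 5], [0, 1]])  # a shear itself has determinant 1
```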
Books I recommend:
Just to nitpick, but this sentence might be read ambiguously by some people, who would understand it as the definition of shear. You may want to say "Shears are an example of ...".