Don't know about matrix algebra books in general, but Strang is mostly elementary and incomplete: there are few proofs and the focus is on simple examples. It builds intuition for basic things like bases, rows/columns in matrix multiplication, subspaces, kernel/image of a matrix and its transpose (not even thought of as the adjoint, and the complex case is reduced to a syntactic analogy with the real one), the action of elementary transformations, and change of basis. Axler then provides a detailed explanation of what's really going on (which builds on the intuition formed by Strang) and extends the picture (enough to train intuition about things like the complex case, invariant subspaces, polar decomposition, generalized eigenvectors).
This seems like a natural order to me. How does the other way around work (taboo "personality")?
I prefer to be told "what's really going on" before practicing computations; this is both more intrinsically pleasant (I find) and better for memory. See my post on Bayes' theorem, where I contrast my abstract approach with the usual one, which starts with concrete examples.
Perhaps an even more extreme example would be multivariable calculus: I was never able to properly remember, let alone understand or apply, the theorems of Gauss, Green, and Stokes until I had learned the formalism of differential forms and the generalized theorem (itself arbitra...
This will not be a long post; I have a simple question to ask: if you wanted to educate yourself to graduate level in mathematics, but didn't actually want to go to university, what would you do? I would ask for textbook recommendations, but I don't want to limit your responses (however, bear in mind that the Wikipedia articles on, say, cardinality or well-ordering go over my head – they may skim my hairline, but over they go). Also bear in mind that while I personally have A-levels (British qualifications) in both Maths and Further Maths (which is to say, I know some calculus at least), there are probably plenty of people on LessWrong who don't and who desire the same information – so assume as much ignorance as you feel necessary (it's a shame, actually, that there isn't a sequence here on LessWrong for maths). What do you advise? (If you think the query ill-defined, I would like to know that as well.)