Now, citing axioms and theorems to justify a step in a proof is not a mere social convention to make mathematicians happy. It is a useful constraint on your cognition, allowing you to make only inferences that are actually valid.
When you are trying to build up a new argument, temporarily accepting steps of uncertain correctness can be helpful (if mentally tagged as such). This strategy can move you out of local optima by prompting you to think about what further assumptions would be required to make the steps correct.
Techniques based on this kind of reasoning are used in the simulation of physical systems and in machine inference more generally (tempering). Instead of exploring the state space of a system using the temperature you are actually interested in, which permits only very particular moves between states ("provably correct reasoning steps"), you explore using a higher temperature that makes it easier to move between different states ("arguments"). Afterwards, you check how probable the state is that you moved to when evaluated using the original temperature.
When you are trying to build up a new argument, temporarily accepting steps of uncertain correctness can be helpful (if mentally tagged as such).
Agreed. You just have to remember once you've figured out all those steps leading you your conclusion, you have an outline, not a completed proof. Being able to produce such outlines that most of the time can be successfully turned into proofs, or at least interesting reasons why the proof failed, is an important skill.
The above is an incorrect "proof" that 2=1. Even for those who know where the flaw is, it might seem reasonable to react to the existence of this "proof" by distrusting mathematical reasoning, which might contain such flaws that lead to erroneous results. But done properly, mathematical reasoning does not look like this "proof". It is more explicit, making each step obviously correct that an incorrect step cannot meet the standard. Let's take a look at what would happen when attempting to present this "proof" following this virtue:
The set of real numbers is known to be non empty, so let y be a real number. Let x = y.
By an axiom of the real numbers, a real number multiplied by a real number is a real number, so x*x = x*y.α
By the same axiom y*y is a real number. By an axiom y*y has an additive inverse -y*y.
By an axiom, adding a real number to a real number is a real number, so x*x + -y*y = x*y + -y*y.
By axioms defining addition, multiplication, and additive inverse, (x+y)*(x+-y) and y*(x+-y) are real numbers.
By the distributive law of multiplication over addition, (x+y)*(x+-y) = (x+y)*x + (x+y)*(-y) = x*x + y*x + x*(-y) + y*(-y)
By commutativity and associativity of addition and multiplication, and distribution of multiplication over additive inverses,
x*x + y*x + x*(-y) + y*(-y) = x*x + (x*y + -x*y) + - y*y = x*x + -y*y
Similarly y*(x+-y) = y*x + y*(-y) = x*y + -y*y
So, by transitivity of equality, (x+y)*(x+-y) = y*(x+-y)
By an axiom a real number not equal to zero has a multiplicative inverse. I want x+-y to have a multiplicative inverse, so I want to prove that x+-y is not zero, but I can't, since it is not true. By construction x = y, so x+-y= y+-y, and by the definition of an additive inverse, y+-y = 0. The proof fails.
Now, citing axioms and theorems to justify a step in a proof is not a mere social convention to make mathematicians happy. It is a useful constraint on your cognition, allowing you to make only inferences that are actually valid. Similarly, there is a virtue in the art of programming that constrains you from making mistakes in code. In Making Wrong Code Look Wrong, Joel Spolsky describes a solution to the problem of keeping track of which strings are already HTML encoded and ready to be written to a web page, and which strings are unformatted and may have come from a malicious user who wants to inject an evil script into your website (or a normal user who wants to display a message with characters that happen to be HTML control characters):
Just like the single step in the attempted mathematical proof that can't be explicitly justified, a single line of code violating Spolsky's convention is obviously wrong. In both these cases, it is possible to be sufficiently explicit and detailed that an automated process can verify correctness, a proof verifier for mathematics and compilers and static analysis for computer programs.
What about in domains where everything is not so clearly defined, where you have to deal with uncertainty? Well, first of all, probability theory is a mathematical description of uncertainty. You can be rigorous in keeping track of what the ideal amount of uncertainty is for a particular claim. Though that is a lot of work, and may be worthwhile only for beliefs that inform important decisions. But even in your everyday reasoning, beware local mistakes. Just as Spolsky would advise you to scream "Error!" wherever an unsafe string is stored in a variable meant for safe strings, even if that unsafe string never ends up written to a web page (until someone makes some locally correct modifications), don't wait for a global failure to hit you over the head with the wrongness of your local mistake. Nip that problem in the bud, back up, and do it right. And try to have a simply verified concept of locally correct that still adds up to globally correct.
α. I am being extra explicit in that I am avoiding using common notations that have been carefully defined with rules of manipulation derived from more basic rules of inference. This is for the benefit of non mathematicians who may have learned these notations (like subtraction, exponents) but not their justifications.