This is, in fact, close to being the worst system ever devised. The fact that something is widely used does not mean that it is any good. Examining the results of this kind of system shows that, when applied to unfamiliar material, it consistently gives the best marks to the worst students. If the best students can't do every problem with extreme ease, they tend to venture answers where poor students do not. This results in the best students dropping towards the median score and the highest scores going to poor students who were lucky. Applying the system to familiar material should produce a similar, though less pronounced, effect. Adding penalties lowers the dispersion about the mean, which always makes an exam less useful.
Exam systems that have no penalty for wrong answers are better than ones that do, but are still imperfect. The only reliable way to gauge students' ability is to have far more questions (preferably taken as several papers), to reduce the effect of mistakes relative to ignorance and to increase the number of areas examined. This is generally cost-prohibitive. It also tests students' ability to answer exam questions, rather than testing their understanding. There is, fortunately, a way to test understanding: a student understands material when they can rediscover the ideas that draw on it.
This is, in fact, close to being the worst system ever devised.
Not really; it teaches calibration as well as correctness. Are you more than 50% sure? No? Then don't guess.
In fact, it shares several properties with the best system ever devised (for multiple choice questions, at least): the test-taker assigns a probability to each of the answers (and the total probability doled out must sum to one), and is graded based on the logarithm of the probability they assigned to the correct answer. (Typically, there's an offset so that assigning equal probabilit...
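The core of that scoring rule can be sketched in a few lines. This is a minimal illustration only: the function name `log_score` and the example probabilities are made up for the sketch, and no offset is applied, so every score here is zero or negative.

```python
import math

def log_score(probabilities, correct_index):
    """Return the natural log of the probability given to the correct answer."""
    total = sum(probabilities)
    if any(p < 0 for p in probabilities) or not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must be non-negative and sum to one")
    p = probabilities[correct_index]
    # Putting zero probability on the truth is penalized without bound.
    return math.log(p) if p > 0 else float("-inf")

# Hypothetical four-option question; the correct answer is option 0.
confident_right = log_score([0.85, 0.05, 0.05, 0.05], 0)
uniform = log_score([0.25, 0.25, 0.25, 0.25], 0)
# Same confident distribution, but the correct answer turns out to be option 1.
confident_wrong = log_score([0.85, 0.05, 0.05, 0.05], 1)

print(confident_right, uniform, confident_wrong)
```

A confident correct answer beats admitting ignorance, and admitting ignorance beats a confident wrong answer, which is the calibration incentive described above.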
Once upon a time, there was an instructor who taught physics students. One day the instructor called them into the classroom and showed them a wide, square plate of metal, next to a hot radiator. The students each put their hand on the plate and found the side next to the radiator cool, and the distant side warm. And the instructor said, Why do you think this happens? Some students guessed convection of air currents, and others guessed strange metals in the plate. They devised many creative explanations, none stooping so low as to say “I don’t know” or “This seems impossible.”
And the answer was that before the students entered the room, the instructor turned the plate around.[1]
Consider the student who frantically stammers, “Eh, maybe because of the heat conduction and so?” I ask: Is this answer a proper belief? The words are easily enough professed—said in a loud, emphatic voice. But do the words actually control anticipation?
Ponder that innocent little phrase, “because of,” which comes before “heat conduction.” Ponder some of the other things we could put after it. We could say, for example, “Because of phlogiston,” or “Because of magic.”
“Magic!” you cry. “That’s not a scientific explanation!” Indeed, the phrases “because of heat conduction” and “because of magic” are readily recognized as belonging to different literary genres. “Heat conduction” is something that Spock might say on Star Trek, whereas “magic” would be said by Giles in Buffy the Vampire Slayer.
However, as Bayesians, we take no notice of literary genres. For us, the substance of a model is the control it exerts on anticipation. If you say “heat conduction,” what experience does that lead you to anticipate? Under normal circumstances, it leads you to anticipate that, if you put your hand on the side of the plate near the radiator, that side will feel warmer than the opposite side. If “because of heat conduction” can also explain the radiator-adjacent side feeling cooler, then it can explain pretty much anything.
And as we all know by this point (I do hope), if you are equally good at explaining any outcome, you have zero knowledge. “Because of heat conduction,” used in such fashion, is a disguised hypothesis of maximum entropy. It is anticipation-isomorphic to saying “magic.” It feels like an explanation, but it’s not.
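To make "maximum entropy" concrete, here is a small sketch comparing two toy hypotheses about which side of the plate feels warmer. The specific probabilities (0.99 and 0.01) are illustrative assumptions, not measurements, and the names `conduction` and `explains_anything` are invented for the example.

```python
import math

def entropy_bits(dist):
    """Shannon entropy, in bits, of a distribution given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Heat conduction used as a real model: it strongly anticipates one outcome.
conduction = {"near side warmer": 0.99, "near side cooler": 0.01}

# "Heat conduction" used to explain whichever outcome shows up:
# both outcomes are treated as equally expected.
explains_anything = {"near side warmer": 0.5, "near side cooler": 0.5}

max_entropy = math.log2(len(conduction))  # 1 bit for two outcomes

print(entropy_bits(conduction))         # well under 1 bit
print(entropy_bits(explains_anything))  # the full 1 bit
```

The hypothesis that can accommodate either observation sits at the maximum possible entropy for two outcomes, which is the same distribution you would write down if you knew nothing at all.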
Suppose that instead of guessing, we measured the heat of the metal plate at various points and various times. Seeing a metal plate next to the radiator, we would ordinarily expect the point temperatures to satisfy an equilibrium of the diffusion equation with respect to the boundary conditions imposed by the environment. You might not know the exact temperature of the first point measured, but after measuring the first points—I’m not physicist enough to know how many would be required—you could take an excellent guess at the rest.
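As a rough illustration of how a few measurements can pin down the rest, consider a deliberately simplified version of the problem: a one-dimensional strip running from the radiator side to the far side, with both ends held at fixed temperatures. The temperatures, grid size, and function names below are assumptions made for the sketch; a real plate is two-dimensional and loses heat to the air. In this idealized case the equilibrium profile is a straight line, so two measurements suffice.

```python
def relax_to_equilibrium(hot, cold, n_points=11, sweeps=5000):
    """Repeat a discrete 1D diffusion step until the profile settles."""
    temps = [cold] * n_points
    temps[0] = hot    # edge next to the radiator (held fixed)
    temps[-1] = cold  # far edge (held fixed)
    for _ in range(sweeps):
        new = temps[:]
        for i in range(1, n_points - 1):
            # Each interior point moves to the average of its neighbors.
            new[i] = 0.5 * (temps[i - 1] + temps[i + 1])
        temps = new
    return temps

def predict_from_two(i0, t0, i1, t1, i):
    """Extrapolate a straight-line profile through two measured points."""
    slope = (t1 - t0) / (i1 - i0)
    return t0 + slope * (i - i0)

# Hypothetical boundary temperatures in degrees Celsius.
profile = relax_to_equilibrium(hot=60.0, cold=20.0)

# "Measure" only points 2 and 7, then guess all the others.
predicted = [predict_from_two(2, profile[2], 7, profile[7], i)
             for i in range(len(profile))]
worst_error = max(abs(p - t) for p, t in zip(predicted, profile))
print(worst_error)
```

In this toy model the temperature falls steadily with distance from the radiator, so a near side that feels cooler than the far side is exactly the kind of measurement that says the plate is not in equilibrium.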
A true master of the art of using numbers to constrain the anticipation of material phenomena—a “physicist”—would take some measurements and say, “This plate was in equilibrium with the environment two and a half minutes ago, turned around, and is now approaching equilibrium again.”
The deeper error of the students is not simply that they failed to constrain anticipation. Their deeper error is that they thought they were doing physics. They said the phrase “because of,” followed by the sort of words Spock might say on Star Trek, and thought they thereby entered the magisterium of science.
Not so. They simply moved their magic from one literary genre to another.
[1] Joachim Verhagen, Science Jokes, 2001, http://web.archive.org/web/20060424082937/http://www.nvon.nl/scheik/best/diversen/scijokes/scijokes.txt