The basic design you described is typically analysed as a mixed 2×2 factorial: test (pre-/post-test, within subjects) × intervention (yes/no, between subjects), with the interaction term being the evidence for an intervention effect (a greater pre-to-post increase in the intervention condition). This is often done with a mixed ANOVA (participants as a random effect), though nonparametric equivalents may be more appropriate depending on the distributions.
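A minimal sketch of that interaction test, using made-up data (the group sizes, means, and variable names are all illustrative assumptions, not anything from the study): for a 2×2 mixed design, the test-by-intervention interaction is equivalent to an independent-samples test on the gain scores (post minus pre).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: 20 participants per group, scores out of 100.
pre_ctrl = rng.normal(50, 10, 20)
post_ctrl = pre_ctrl + rng.normal(2, 5, 20)   # small practice effect
pre_int = rng.normal(50, 10, 20)
post_int = pre_int + rng.normal(10, 5, 20)    # larger gain if it works

gain_ctrl = post_ctrl - pre_ctrl
gain_int = post_int - pre_int

# t-test on the gains = the interaction term of the 2x2 mixed ANOVA
t, p = stats.ttest_ind(gain_int, gain_ctrl)
print(f"t = {t:.2f}, p = {p:.4f}")

# Nonparametric alternative if the gain scores look non-normal:
u, p_u = stats.mannwhitneyu(gain_int, gain_ctrl)
```

The gain-score framing is often easier to explain than the full ANOVA table and gives the same answer for this particular design.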
More complex models are also worth considering, e.g. adding question type as a factor/predictor rather than treating the different questions as separate dependent variables: this would indicate whether the improvement after intervention differs across question types, as you've predicted. It doesn't speak to bimodality, but it does let you test your predictions about the relative degree of improvement more directly (if the intervention works).
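The practical step here is putting the data in long format, with one row per participant per question type; a sketch with entirely hypothetical participant IDs, question types, and gains (this is not the study's actual data or model):

```python
# Long format: question type becomes a factor rather than a separate DV.
rows = []
for pid, group, gains_by_qtype in [
    ("p01", "intervention", {"near": 12, "far": 3}),
    ("p02", "control", {"near": 2, "far": 1}),
]:
    for qtype, gain in gains_by_qtype.items():
        rows.append({"participant": pid, "group": group,
                     "qtype": qtype, "gain": gain})

# `rows` can now be fed to a mixed model, e.g. gain ~ group * qtype
# with a random intercept per participant; the group x qtype
# interaction tests whether improvement differs by question type.
print(len(rows))  # 4 rows: 2 participants x 2 question types
```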
Correlations between your different dependent measures: by all means - but make sure you examine the characteristics of the distributions rather than just zooming ahead with a matrix of correlation coefficients. And be aware of the multiple comparisons problem: with many pairwise tests, Type I errors become very likely.
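A sketch of the multiple-comparisons point, with four hypothetical measures (the measure names and the simple Bonferroni correction are illustrative choices, not a prescription):

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical measures; inspect their distributions before this step.
measures = {name: rng.normal(size=30) for name in ["q1", "q2", "q3", "transfer"]}

pairs = list(combinations(measures, 2))
alpha = 0.05 / len(pairs)  # Bonferroni-adjusted threshold for 6 tests

for a, b in pairs:
    r, p = stats.pearsonr(measures[a], measures[b])
    flag = "*" if p < alpha else ""
    print(f"{a} vs {b}: r = {r:+.2f}, p = {p:.3f} {flag}")
```

Bonferroni is conservative; less strict corrections (e.g. Holm) exist, but the point is to decide on one before looking at the matrix.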
Excluding participants on the basis of overly high pretest performance is appropriate. If possible, I suggest setting this criterion before formal testing (even an educated guess is fine, as it doesn't harm the conclusions you can draw: it can be justified as leaving room for improvement if the intervention works) - or at the very least, set it before analysing anything else of a participant's performance, to avoid biasing your choice of threshold.
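Mechanically this is trivial; the discipline is in fixing the cutoff first. A sketch where the 80% ceiling and the participant IDs are arbitrary illustrations, not recommendations:

```python
# Pre-registered ceiling criterion, decided before data collection.
CEILING = 80

# Hypothetical pretest scores (percent correct).
pretest_scores = {"p01": 62, "p02": 95, "p03": 48, "p04": 71}

included = {pid: s for pid, s in pretest_scores.items() if s < CEILING}
excluded = sorted(set(pretest_scores) - set(included))
print(f"excluded for ceiling effects: {excluded}")  # ['p02']
```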
... don't want to elaborate too much on what tasks we'll give to the subjects, in case I recruit someone reading this as one of my test subjects.
I'm afraid you've said too much already - and if you're looking for people who are naive about the principles involved, LW is probably not a great place for recruiting anyway.
Please feel free to private-message me if you'd like clarification of anything I've posted - this sort of thing is very much part of my day job.
Thanks a lot!
I'm afraid you've said too much already
Could you elaborate on that? Something like "so we're going to test the impact of traditional instruction versus this prototype educational game on your ability to do these tasks" is what I'd have expected to say to the test subjects anyway, and that's mostly the content of what I said here. (Though I do admit that the bit about expecting a bimodal distribution depending on whether or not the subjects pay attention to something was a bit of an unnecessary tipoff here.)
In particular, I expect...
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.