I've heard about the medical datasets.
Noise is a pretty interesting thing, and the possibility of "denoising" depends a lot on the kind of noise. White noise is the easiest to get rid of; malicious noise, which isn't random but targeted to be "worst-case," can thwart denoising methods that were designed for white noise.
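To make the white-versus-malicious distinction concrete, here is a minimal sketch. It is a toy setup of my own, not anything from the datasets mentioned above: the "signal" is a single constant, the "denoiser" is plain averaging, and the names `TRUE_VALUE`, `white_noise`, and `worst_case_noise` are all hypothetical. Both noise sources are given the same per-sample magnitude (`SIGMA`), so the only difference is whether the noise is random or chosen to hurt the averaging step.

```python
import random
import statistics

# Assumed toy setup: we want to recover one constant from many noisy readings.
TRUE_VALUE = 5.0
N = 10_000
SIGMA = 1.0


def mean_estimate(samples):
    """Denoise by averaging, which is well suited to zero-mean white noise."""
    return statistics.fmean(samples)


def white_noise(n, sigma, rng):
    """Independent zero-mean Gaussian noise: errors tend to cancel when averaged."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def worst_case_noise(n, sigma):
    """Noise of the same per-sample size, but every error points the same way,
    so averaging cannot cancel any of it."""
    return [sigma] * n


rng = random.Random(0)  # fixed seed so the run is repeatable

white = [TRUE_VALUE + e for e in white_noise(N, SIGMA, rng)]
adversarial = [TRUE_VALUE + e for e in worst_case_noise(N, SIGMA)]

white_error = abs(mean_estimate(white) - TRUE_VALUE)
adversarial_error = abs(mean_estimate(adversarial) - TRUE_VALUE)

print(f"error after averaging white noise:      {white_error:.4f}")
print(f"error after averaging worst-case noise: {adversarial_error:.4f}")
```

With white noise the averaging error shrinks roughly like `SIGMA / sqrt(N)`, while the targeted noise leaves an error of about `SIGMA` no matter how many samples are averaged. Real adversarial noise would be tailored to whatever denoiser is actually in use; the constant offset here is just the simplest case that defeats a plain mean.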
This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.